nlp-word-model

Statistics

Model summary

Publish Date: 2020-10-23

Update Date: 2020-10-23


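The hidden Markov model (HMM) approach implements word segmentation as a sequence-labeling task over the characters of a sentence. The basic idea is that each character occupies a specific position, called its word position, within the word it helps form. A four-tag scheme is generally used: B (word beginning), M (word middle), E (word end), and S (single-character word). For example: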

‘中文/分词/是/文本处理/不可或缺/的/一步/！’ ("Chinese word segmentation is an indispensable step in text processing!"),

After tagging, it becomes:

‘中/B 文/E 分/B 词/E 是/S 文/B 本/M 处/M 理/E 不/B 可/M 或/M 缺/E 的/S 一/B 步/E ！/S’。
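The mapping from a segmented sentence to B/M/E/S tags can be sketched in a few lines of Python. This is a minimal illustration, not code from the original post; the function name `words_to_tags` is hypothetical.

```python
def words_to_tags(words):
    """Map segmented words to (character, tag) pairs using the B/M/E/S scheme."""
    pairs = []
    for word in words:
        if not word:
            continue  # skip empty strings produced by stray separators
        if len(word) == 1:
            pairs.append((word, "S"))  # single-character word
        else:
            pairs.append((word[0], "B"))  # word beginning
            for ch in word[1:-1]:
                pairs.append((ch, "M"))  # word middle
            pairs.append((word[-1], "E"))  # word end
    return pairs


segmented = "中文/分词/是/文本处理/不可或缺/的/一步/！"
tagged = words_to_tags(segmented.split("/"))
# Prints the same tagged form as the example above.
print(" ".join(f"{ch}/{tag}" for ch, tag in tagged))
```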

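Here, the sequence of word-position tags is the HMM's unobservable hidden state sequence, while the text in the training set is the observable observation sequence. Segmentation thus becomes the HMM problem of finding the unknown hidden sequence given a known observation sequence.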
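The post does not say how the hidden sequence is found. A standard choice for this decoding problem is the Viterbi algorithm, and the sketch below assumes it. The function names (`train`, `viterbi`, `tags_to_words`), the add-one smoothing, and the one-sentence training set are all illustrative assumptions rather than the author's implementation.

```python
import math
from collections import Counter

STATES = "BMES"
# Tag transitions that are structurally valid under the B/M/E/S scheme.
ALLOWED = {"B": "ME", "M": "ME", "E": "BS", "S": "BS"}
NEG_INF = float("-inf")


def train(tagged_sentences):
    """Estimate log-probabilities by counting, with add-one smoothing (an assumption)."""
    start = Counter()
    trans = {s: Counter() for s in STATES}
    emit = {s: Counter() for s in STATES}
    vocab = set()
    for sent in tagged_sentences:
        prev = None
        for ch, tag in sent:
            vocab.add(ch)
            emit[tag][ch] += 1
            if prev is None:
                start[tag] += 1
            else:
                trans[prev][tag] += 1
            prev = tag
    n_start = start["B"] + start["S"]
    # A sentence can only start with B or S.
    start_lp = {s: math.log((start[s] + 1) / (n_start + 2)) for s in "BS"}
    trans_lp = {}
    for s in STATES:
        total = sum(trans[s][t] for t in ALLOWED[s])
        trans_lp[s] = {t: math.log((trans[s][t] + 1) / (total + 2)) for t in ALLOWED[s]}

    def emit_lp(state, ch):
        # The extra +1 in the denominator reserves mass for unseen characters.
        total = sum(emit[state].values())
        return math.log((emit[state][ch] + 1) / (total + len(vocab) + 1))

    return start_lp, trans_lp, emit_lp


def viterbi(chars, model):
    """Return the most probable tag sequence for chars under the model."""
    if not chars:
        return []
    start_lp, trans_lp, emit_lp = model
    # score[-1][s]: best log-probability of any tag path ending in state s so far.
    score = [{s: start_lp.get(s, NEG_INF) + emit_lp(s, chars[0]) for s in STATES}]
    back = [{}]
    for ch in chars[1:]:
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: score[-1][p] + trans_lp[p].get(s, NEG_INF))
            row[s] = score[-1][best] + trans_lp[best].get(s, NEG_INF) + emit_lp(s, ch)
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    # A sentence must end on a word boundary, so only E or S may be last.
    tags = [max("ES", key=lambda s: score[-1][s])]
    for ptr in reversed(back[1:]):
        tags.append(ptr[tags[-1]])
    return tags[::-1]


def tags_to_words(chars, tags):
    """Cut the character sequence after every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in "ES":
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words


# Toy training set: the single tagged sentence from the example in this post.
tagged_text = "中/B 文/E 分/B 词/E 是/S 文/B 本/M 处/M 理/E 不/B 可/M 或/M 缺/E 的/S 一/B 步/E ！/S"
model = train([[tuple(tok.split("/")) for tok in tagged_text.split()]])
sentence = "中文分词"
tags = viterbi(sentence, model)
print(tags, tags_to_words(sentence, tags))
```

Working in log space avoids numerical underflow on long sentences, and restricting transitions to the `ALLOWED` table guarantees that the decoded tags always form a well-formed segmentation. With one training sentence the estimated probabilities are only illustrative; a real segmenter would be trained on a large tagged corpus.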

Weiruohe

https://weiruohe.github.io/2020/10/23/nlp-word-model/

Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce them, please credit the source: Weiruohe.
