Latest Papers and Datasets on Chinese Word Segmentation

Updated: 2024-10-21 13:37:35

Chinese Word Segmentation

Task

Chinese word segmentation is the task of splitting Chinese text (a sequence of Chinese characters) into words.

Example:

'上海浦东开发与建设同步' → ['上海', '浦东', '开发', '与', '建设', '同步']

Systems

♠ marks systems that use character unigrams as input.
♣ marks systems that use character bigrams as input.

  • Huang et al. (2019): BERT + model compression + multi-criteria learning ♠
  • Yang et al. (2018): Lattice LSTM-CRF + BPE subword embeddings ♠♣
  • Ma et al. (2018): BiLSTM-CRF + hyperparameter search ♠♣
  • Yang et al. (2017): Transition-based + beam search + rich pretraining ♠♣
  • Zhou et al. (2017): Greedy search + word context ♠
  • Chen et al. (2017): BiLSTM-CRF + adversarial loss ♠♣
  • Cai et al. (2017): Greedy search + span representation ♠
  • Kurita et al. (2017): Transition-based + joint model ♠
  • Liu et al. (2016): Neural semi-CRF ♠
  • Cai and Zhao (2016): Greedy search ♠
  • Chen et al. (2015a): Gated Recursive NN ♠♣
  • Chen et al. (2015b): BiLSTM-CRF ♠♣
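
To make the two input types marked in the list concrete, here is a minimal sketch of extracting character unigrams and character bigrams from a sentence. The function names and the `</s>` padding symbol are assumptions made for illustration; the listed systems may build their inputs differently.

```python
def char_unigrams(sentence):
    """Return one unigram feature per character."""
    return list(sentence)


def char_bigrams(sentence, pad="</s>"):
    """Return one bigram per character: the character plus its right neighbour.

    The last character is paired with a padding symbol (an assumed convention)
    so that the bigram sequence has the same length as the unigram sequence.
    """
    chars = list(sentence)
    padded = chars + [pad]
    return [padded[i] + padded[i + 1] for i in range(len(chars))]


sentence = "上海浦东开发与建设同步"
unigrams = char_unigrams(sentence)
bigrams = char_bigrams(sentence)

print(unigrams[:4])  # ['上', '海', '浦', '东']
print(bigrams[:4])   # ['上海', '海浦', '浦东', '东开']
```

Aligning the two sequences position by position lets a model look up a unigram and a bigram embedding for every character and combine them, which is one common way to use both inputs together.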

Evaluation

Metrics

F1-score
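
As a rough illustration of how a word-level F1-score can be computed, the sketch below turns each segmentation into a set of character spans and counts a predicted word as correct only when both of its boundaries match the gold segmentation. This is the conventional definition, assumed here; the helper names `to_spans` and `segmentation_f1` are hypothetical, and the scoring scripts used by the individual papers may differ in detail.

```python
def to_spans(words):
    """Convert a list of words into a set of (start, end) character offsets."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def segmentation_f1(gold_sentences, pred_sentences):
    """Return (precision, recall, F1) over a corpus of segmented sentences."""
    n_gold = n_pred = n_correct = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        if "".join(gold) != "".join(pred):
            raise ValueError("gold and predicted words must cover the same text")
        gold_spans = to_spans(gold)
        pred_spans = to_spans(pred)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
        n_correct += len(gold_spans & pred_spans)  # words with both boundaries right
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [["上海", "浦东", "开发", "与", "建设", "同步"]]
pred = [["上海浦东", "开发", "与", "建设", "同步"]]  # first two words wrongly merged
precision, recall, f1 = segmentation_f1(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")  # P=0.800 R=0.667 F1=0.727
```

In this example four of the five predicted words match the six gold words, so precision is 4/5 and recall is 4/6. The tables in this document report F1 as a percentage.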

Dataset

Chinese Treebank 6

| Model | F1 | Paper / Source | Code |
| --- | --- | --- | --- |
| Huang et al. (2019) | 97.6 | Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning | |
| Ma et al. (2018) | 96.7 | State-of-the-art Chinese Word Segmentation with Bi-LSTMs | |
| Yang et al. (2018) | 96.3 | Subword Encoding in Lattice LSTM for Chinese Word Segmentation | Github |
| Yang et al. (2017) | 96.2 | Neural Word Segmentation with Rich Pretraining | Github |
| Zhou et al. (2017) | 96.2 | Word-Context Character Embeddings for Chinese Word Segmentation | |
| Chen et al. (2017) | 96.2 | Adversarial Multi-Criteria Learning for Chinese Word Segmentation | Github |
| Liu et al. (2016) | 95.5 | Exploring Segment Representations for Neural Segmentation Models | Github |
| Chen et al. (2015b) | 96.0 | Long Short-Term Memory Neural Networks for Chinese Word Segmentation | Github |

Chinese Treebank 7

| Model | F1 | Paper / Source | Code |
| --- | --- | --- | --- |
| Ma et al. (2018) | 96.6 | State-of-the-art Chinese Word Segmentation with Bi-LSTMs | |
| Kurita et al. (2017) | 96.2 | Neural Joint Model for Transition-based Chinese Syntactic Analysis | |

AS

| Model | F1 | Paper / Source | Code |
| --- | --- | --- | --- |
| Huang et al. (2019) | 96.6 | Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning | |
| Ma et al. (2018) | 96.2 | State-of-the-art Chinese Word Segmentation with Bi-LSTMs | |
| Yang et al. (2017) | 95.7 | Neural Word Segmentation with Rich Pretraining | Github |
| Cai et al. (2017) | 95.3 | Fast and Accurate Neural Word Segmentation for Chinese | Github |
| Chen et al. (2017) | 94.8 | Adversarial Multi-Criteria Learning for Chinese Word Segmentation | Github |

CityU

| Model | F1 | Paper / Source | Code |
| --- | --- | --- | --- |
| Huang et al. (2019) | 97.6 | Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning | |
| Ma et al. (2018) | 97.2 | State-of-the-art Chinese Word Segmentation with Bi-LSTMs | |
| Yang et al. (2017) | 96.9 | Neural Word Segmentation with Rich Pretraining | Github |
| Cai et al. (2017) | 95.6 | Fast and Accurate Neural Word Segmentation for Chinese | Github |
| Chen et al. (2017) | 95.6 | Adversarial Multi-Criteria Learning for Chinese Word Segmentation | Github |

PKU

| Model | F1 | Paper / Source | Code |
| --- | --- | --- | --- |
| Huang et al. (2019) | 96.6 | Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning | |
| Yang et al. (2017) | 96.3 | Neural Word Segmentation with Rich Pretraining | Github |
| Ma et al. (2018) | 96.1 | State-of-the-art Chinese Word Segmentation with Bi-LSTMs | |
| Yang et al. (2018) | 95.9 | Subword Encoding in Lattice LSTM for Chinese Word Segmentation | Github |
| Cai et al. (2017) | 95.8 | Fast and Accurate Neural Word Segmentation for Chinese | Github |
| Chen et al. (2017) | 94.3 | Adversarial Multi-Criteria Learning for Chinese Word Segmentation | Github |
| Liu et al. (2016) | 95.7 | Exploring Segment Representations for Neural Segmentation Models | Github |
| Cai and Zhao (2016) | 95.7 | Neural Word Segmentation Learning for Chinese | Github |

MSR

| Model | F1 | Paper / Source | Code |
| --- | --- | --- | --- |
| Ma et al. (2018) | 98.1 | State-of-the-art Chinese Word Segmentation with Bi-LSTMs | |
| Huang et al. (2019) | 97.9 | Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning | |
| Yang et al. (2018) | 97.8 | Subword Encoding in Lattice LSTM for Chinese Word Segmentation | Github |
| Yang et al. (2017) | 97.5 | Neural Word Segmentation with Rich Pretraining | Github |
| Cai et al. (2017) | 97.1 | Fast and Accurate Neural Word Segmentation for Chinese | Github |
| Chen et al. (2017) | 96.0 | Adversarial Multi-Criteria Learning for Chinese Word Segmentation | Github |
| Liu et al. (2016) | 97.6 | Exploring Segment Representations for Neural Segmentation Models | Github |
| Cai and Zhao (2016) | 96.4 | Neural Word Segmentation Learning for Chinese | Github |

Published: 2024-02-27 14:22:12
Article link: https://www.elefans.com/category/jswz/34/1706875.html