8 | 8 | - [2.3 Efficient Model Tuning Solution](#高效模型调优方案)
9 | 9 | - [2.4 Industrial-Grade End-to-End Solution](#产业级全流程方案)
10 | 10 | - [3. Quick Start](#快速开始)
| 11 | +- [4. Common Chinese Classification Datasets](#常用中文分类数据集) |
11 | 12 |
12 | 13 | <a name="文本分类应用简介"></a>
13 | 14 |
|
233 | 234 | - Get started with multi-label classification 👉 [Multi-label guide](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/text_classification/multi_label#readme)
234 | 235 |
235 | 236 | - Get started with hierarchical classification 👉 [Hierarchical classification guide](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/applications/text_classification/hierarchical#readme)
| 237 | + |
| 238 | +<a name="常用中文分类数据集"></a> |
| 239 | + |
| 240 | +## 4. Common Chinese Classification Datasets |
| 241 | + |
| 242 | +**Multi-class datasets:** |
| 243 | + |
| 244 | +- [THUCNews news classification dataset](http://thuctc.thunlp.org/) |
| 245 | + |
| 246 | +- [Baike Q&A classification dataset](https://github.com/brightmart/nlp_chinese_corpus#3%E7%99%BE%E7%A7%91%E7%B1%BB%E9%97%AE%E7%AD%94json%E7%89%88baike2018qa) |
| 247 | + |
| 248 | +- [Toutiao news headline dataset (TNEWS)](https://github.com/aceimnorstuvwxz/toutiao-text-classfication-dataset) |
| 249 | + |
| 250 | +- [Fudan news text dataset](https://www.heywhale.com/mw/dataset/5d3a9c86cf76a600360edd04) |
| 251 | + |
| 252 | +- [IFLYTEK app description classification dataset](https://storage.googleapis.com/cluebenchmark/tasks/iflytek_public.zip) |
| 253 | + |
| 254 | +- [CAIL2022 event detection](https://cloud.tsinghua.edu.cn/d/6e911ff1286d47db8016/) |
| 256 | +**Sentiment classification datasets (multi-class):** |
| 257 | + |
| 258 | +- [Amazon product review sentiment dataset](https://github.com/SophonPlus/ChineseNlpCorpus/blob/master/datasets/yf_amazon/intro.ipynb) |
| 259 | + |
| 260 | +- [Financial news sentiment classification dataset](https://github.com/wwwxmu/Dataset-of-financial-news-sentiment-classification) |
| 261 | + |
| 262 | +- [ChnSentiCorp hotel review sentiment classification dataset](https://github.com/SophonPlus/ChineseNlpCorpus/tree/master/datasets/ChnSentiCorp_htl_all) |
| 263 | + |
| 264 | +- [Food delivery review sentiment classification dataset](https://github.com/SophonPlus/ChineseNlpCorpus/blob/master/datasets/waimai_10k/intro.ipynb) |
| 265 | + |
| 266 | +- [Weibo binary sentiment classification dataset](https://github.com/SophonPlus/ChineseNlpCorpus/blob/master/datasets/weibo_senti_100k/intro.ipynb) |
| 267 | + |
| 268 | +- [Weibo four-class sentiment classification dataset](https://github.com/SophonPlus/ChineseNlpCorpus/blob/master/datasets/simplifyweibo_4_moods/intro.ipynb) |
| 269 | + |
| 270 | +- [Product review sentiment classification dataset](https://github.com/SophonPlus/ChineseNlpCorpus/blob/master/datasets/online_shopping_10_cats/intro.ipynb) |
| 271 | + |
| 272 | +- [Movie review sentiment classification dataset](https://github.com/SophonPlus/ChineseNlpCorpus/blob/master/datasets/dmsc_v2/intro.ipynb) |
| 273 | + |
| 274 | +- [Dianping review classification dataset](https://github.com/SophonPlus/ChineseNlpCorpus/blob/master/datasets/yf_dianping/intro.ipynb) |
| 276 | +**Multi-label datasets:** |
| 277 | + |
| 278 | +- [Student comment classification dataset](https://github.com/FBI1314/textClassification/tree/master/multilabel_text_classfication/data) |
| 279 | + |
| 280 | +- [CAIL2019 marriage element recognition](https://aistudio.baidu.com/aistudio/projectdetail/3996601) |
| 281 | + |
| 282 | +- [CAIL2018 prison term prediction, law article prediction, and charge prediction](https://cail.oss-cn-qingdao.aliyuncs.com/CAIL2018_ALL_DATA.zip) |
| 284 | +**Hierarchical classification datasets:** |
| 285 | + |
| 286 | +- [Toutiao news headline classification (upgraded version of TNEWS)](https://github.com/aceimnorstuvwxz/toutiao-multilevel-text-classfication-dataset) |
| 287 | + |
| 288 | +- [Web page hierarchical classification dataset](https://csri.scu.edu.cn/info/1012/2827.htm) |
| 289 | + |
| 290 | +- [Medical intent dataset (CMID)](https://github.com/liutongyang/CMID) |
| 291 | + |
| 292 | +- [2020 Language and Intelligence Challenge event classification](https://github.com/percent4/keras_bert_multi_label_cls/tree/master/data) |