|
378 | 378 | |🔥coig-cqia|[AI-ModelScope/COIG-CQIA](https://modelscope.cn/datasets/AI-ModelScope/COIG-CQIA/summary)|chinese_traditional<br>coig_pc<br>exam<br>finance<br>douban<br>human_value<br>logi_qa<br>ruozhiba<br>segmentfault<br>wiki<br>wikihow<br>xhs<br>zhihu|44694|703.8±654.2, min=33, max=19288|general|-|
|
379 | 379 | |🔥ruozhiba|[AI-ModelScope/ruozhiba](https://modelscope.cn/datasets/AI-ModelScope/ruozhiba/summary)|post-annual<br>title-good<br>title-norm|85658|39.9±13.1, min=21, max=559|pretrain|-|
|
380 | 380 | |long-alpaca-12k|[AI-ModelScope/LongAlpaca-12k](https://modelscope.cn/datasets/AI-ModelScope/LongAlpaca-12k/summary)||11998|9619.0±8295.8, min=36, max=78925|longlora, QA|[Yukang/LongAlpaca-12k](https://huggingface.co/datasets/Yukang/LongAlpaca-12k)|
|
| 381 | +|lmsys-chat-1m|[AI-ModelScope/lmsys-chat-1m](https://modelscope.cn/datasets/AI-ModelScope/lmsys-chat-1m/summary)||-|Dataset is too huge, please click the original link to view the dataset stat.|chat, em|[lmsys/lmsys-chat-1m](https://huggingface.co/datasets/lmsys/lmsys-chat-1m)| |
381 | 382 | |🔥ms-agent|[iic/ms_agent](https://modelscope.cn/datasets/iic/ms_agent/summary)||26336|650.9±217.2, min=209, max=2740|chat, agent, multi-round|-|
|
382 | 383 | |🔥ms-agent-for-agentfabric|[AI-ModelScope/ms_agent_for_agentfabric](https://modelscope.cn/datasets/AI-ModelScope/ms_agent_for_agentfabric/summary)|default<br>addition|30000|617.8±199.1, min=251, max=2657|chat, agent, multi-round|-|
|
383 | 384 | |ms-agent-multirole|[iic/MSAgent-MultiRole](https://modelscope.cn/datasets/iic/MSAgent-MultiRole/summary)||9500|447.6±84.9, min=145, max=1101|chat, agent, multi-round, role-play, multi-agent|-|
|
384 | 385 | |🔥toolbench-for-alpha-umi|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|backbone<br>caller<br>planner<br>summarizer|1448337|1439.7±853.9, min=123, max=18467|chat, agent|-|
|
385 | 386 | |damo-agent-zh|[damo/MSAgent-Bench](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary)||386984|956.5±407.3, min=326, max=19001|chat, agent, multi-round|-|
|
386 | 387 | |damo-agent-zh-mini|[damo/MSAgent-Bench](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary)||20845|1326.4±329.6, min=571, max=4304|chat, agent, multi-round|-|
|
387 | 388 | |agent-instruct-all-en|[huangjintao/AgentInstruct_copy](https://modelscope.cn/datasets/huangjintao/AgentInstruct_copy/summary)|alfworld<br>db<br>kg<br>mind2web<br>os<br>webshop|1866|1144.3±635.5, min=206, max=6412|chat, agent, multi-round|-|
|
| 389 | +|🔥msagent-pro|[iic/MSAgent-Pro](https://modelscope.cn/datasets/iic/MSAgent-Pro/summary)||21905|1524.5±921.3, min=64, max=16770|chat, agent, multi-round|-| |
| 390 | +|toolbench|[swift/ToolBench](https://modelscope.cn/datasets/swift/ToolBench/summary)||124345|3669.5±1600.9, min=1047, max=22581|chat, agent, multi-round|-| |
388 | 391 | |code-alpaca-en|[wyj123456/code_alpaca_en](https://modelscope.cn/datasets/wyj123456/code_alpaca_en/summary)||20016|100.2±60.1, min=29, max=1776|-|[sahil2801/CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)|
|
389 | 392 | |🔥leetcode-python-en|[AI-ModelScope/leetcode-solutions-python](https://modelscope.cn/datasets/AI-ModelScope/leetcode-solutions-python/summary)||2359|727.1±235.9, min=259, max=2146|chat, coding|-|
|
390 | 393 | |🔥codefuse-python-en|[codefuse-ai/CodeExercise-Python-27k](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary)||27224|483.6±193.9, min=45, max=3082|chat, coding|-|
|
|
427 | 430 | |orpo-dpo-mix-40k|[AI-ModelScope/orpo-dpo-mix-40k](https://modelscope.cn/datasets/AI-ModelScope/orpo-dpo-mix-40k/summary)|default|43666|548.3±397.4, min=28, max=8483|dpo, orpo, en, quality|[mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k)|
|
428 | 431 | |stack-exchange-paired|[AI-ModelScope/stack-exchange-paired](https://modelscope.cn/datasets/AI-ModelScope/stack-exchange-paired/summary)||4483004|534.5±594.6, min=31, max=56588|hfrl, dpo, pairwise|[lvwerra/stack-exchange-paired](https://huggingface.co/datasets/lvwerra/stack-exchange-paired)|
|
429 | 432 | |shareai-llama3-dpo-zh-en-emoji|[hjh0119/shareAI-Llama3-DPO-zh-en-emoji](https://modelscope.cn/datasets/hjh0119/shareAI-Llama3-DPO-zh-en-emoji/summary)|default|2449|334.0±162.8, min=36, max=1801|rlhf, dpo, pairwise|-|
|
| 433 | +|ultrafeedback-kto|[AI-ModelScope/ultrafeedback-binarized-preferences-cleaned-kto](https://modelscope.cn/datasets/AI-ModelScope/ultrafeedback-binarized-preferences-cleaned-kto/summary)|default|230720|11.0±0.0, min=11, max=11|rlhf, kto|-| |
430 | 434 | |pileval|[huangjintao/pile-val-backup](https://modelscope.cn/datasets/huangjintao/pile-val-backup/summary)||214670|1612.3±8856.2, min=11, max=1208955|text-generation, awq|[mit-han-lab/pile-val-backup](https://huggingface.co/datasets/mit-han-lab/pile-val-backup)|
|
431 | 435 | |mantis-instruct|[swift/Mantis-Instruct](https://modelscope.cn/datasets/swift/Mantis-Instruct/summary)|birds-to-words<br>chartqa<br>coinstruct<br>contrastive_caption<br>docvqa<br>dreamsim<br>dvqa<br>iconqa<br>imagecode<br>llava_665k_multi<br>lrv_multi<br>multi_vqa<br>nextqa<br>nlvr2<br>spot-the-diff<br>star<br>visual_story_telling|655351|825.7±812.5, min=284, max=13563|chat, multi-modal, vision, quality|[TIGER-Lab/Mantis-Instruct](https://huggingface.co/datasets/TIGER-Lab/Mantis-Instruct)|
|
432 | 436 | |llava-data-instruct|[swift/llava-data](https://modelscope.cn/datasets/swift/llava-data/summary)|llava_instruct|364100|189.0±142.1, min=33, max=5183|sft, multi-modal, quality|[TIGER-Lab/llava-data](https://huggingface.co/datasets/TIGER-Lab/llava-data)|
|
433 | 437 | |midefics|[swift/MideficsDataset](https://modelscope.cn/datasets/swift/MideficsDataset/summary)||3800|201.3±70.2, min=60, max=454|medical, en, vqa|[WinterSchool/MideficsDataset](https://huggingface.co/datasets/WinterSchool/MideficsDataset)|
|
434 | 438 | |gqa|[None](https://modelscope.cn/datasets/None/summary)|train_all_instructions|-|Dataset is too huge, please click the original link to view the dataset stat.|multi-modal, en, vqa, quality|[lmms-lab/GQA](https://huggingface.co/datasets/lmms-lab/GQA)|
|
435 | 439 | |text-caps|[swift/TextCaps](https://modelscope.cn/datasets/swift/TextCaps/summary)||18145|38.2±4.4, min=31, max=73|multi-modal, en, caption, quality|[HuggingFaceM4/TextCaps](https://huggingface.co/datasets/HuggingFaceM4/TextCaps)|
|
| 440 | +|refcoco-unofficial-caption|[swift/refcoco](https://modelscope.cn/datasets/swift/refcoco/summary)||46215|44.7±3.2, min=36, max=71|multi-modal, en, caption|[jxu124/refcoco](https://huggingface.co/datasets/jxu124/refcoco)| |
| 441 | +|refcoco-unofficial-grounding|[swift/refcoco](https://modelscope.cn/datasets/swift/refcoco/summary)||46215|45.2±3.1, min=37, max=69|multi-modal, en, grounding|[jxu124/refcoco](https://huggingface.co/datasets/jxu124/refcoco)| |
| 442 | +|refcocog-unofficial-caption|[swift/refcocog](https://modelscope.cn/datasets/swift/refcocog/summary)||44799|49.7±4.7, min=37, max=88|multi-modal, en, caption|[jxu124/refcocog](https://huggingface.co/datasets/jxu124/refcocog)| |
| 443 | +|refcocog-unofficial-grounding|[swift/refcocog](https://modelscope.cn/datasets/swift/refcocog/summary)||44799|50.1±4.7, min=37, max=90|multi-modal, en, grounding|[jxu124/refcocog](https://huggingface.co/datasets/jxu124/refcocog)| |
436 | 444 | |a-okvqa|[swift/A-OKVQA](https://modelscope.cn/datasets/swift/A-OKVQA/summary)||18201|45.8±7.9, min=32, max=100|multi-modal, en, vqa, quality|[HuggingFaceM4/A-OKVQA](https://huggingface.co/datasets/HuggingFaceM4/A-OKVQA)|
|
437 | 445 | |okvqa|[swift/OK-VQA_train](https://modelscope.cn/datasets/swift/OK-VQA_train/summary)||9009|34.4±3.3, min=28, max=59|multi-modal, en, vqa, quality|[Multimodal-Fatima/OK-VQA_train](https://huggingface.co/datasets/Multimodal-Fatima/OK-VQA_train)|
|
438 | 446 | |ocr-vqa|[swift/OCR-VQA](https://modelscope.cn/datasets/swift/OCR-VQA/summary)||186753|35.6±6.6, min=29, max=193|multi-modal, en, ocr-vqa|[howard-hou/OCR-VQA](https://huggingface.co/datasets/howard-hou/OCR-VQA)|
|
|
443 | 451 | |guanaco|[AI-ModelScope/GuanacoDataset](https://modelscope.cn/datasets/AI-ModelScope/GuanacoDataset/summary)|default|31561|250.1±70.3, min=89, max=1436|chat, zh|[JosephusCheung/GuanacoDataset](https://huggingface.co/datasets/JosephusCheung/GuanacoDataset)|
|
444 | 452 | |mind2web|[swift/Multimodal-Mind2Web](https://modelscope.cn/datasets/swift/Multimodal-Mind2Web/summary)||1009|297522.4±325496.2, min=8592, max=3499715|agent, multi-modal|[osunlp/Multimodal-Mind2Web](https://huggingface.co/datasets/osunlp/Multimodal-Mind2Web)|
|
445 | 453 | |sharegpt-4o-image|[AI-ModelScope/ShareGPT-4o](https://modelscope.cn/datasets/AI-ModelScope/ShareGPT-4o/summary)|image_caption|57289|638.7±157.9, min=47, max=4640|vqa, multi-modal|[OpenGVLab/ShareGPT-4o](https://huggingface.co/datasets/OpenGVLab/ShareGPT-4o)|
|
| 454 | +|pixelprose|[swift/pixelprose](https://modelscope.cn/datasets/swift/pixelprose/summary)||-|Dataset is too huge, please click the original link to view the dataset stat.|caption, multi-modal, vision|[tomg-group-umd/pixelprose](https://huggingface.co/datasets/tomg-group-umd/pixelprose)| |
446 | 455 | |m3it|[AI-ModelScope/M3IT](https://modelscope.cn/datasets/AI-ModelScope/M3IT/summary)|coco<br>vqa-v2<br>shapes<br>shapes-rephrased<br>coco-goi-rephrased<br>snli-ve<br>snli-ve-rephrased<br>okvqa<br>a-okvqa<br>viquae<br>textcap<br>docvqa<br>science-qa<br>imagenet<br>imagenet-open-ended<br>imagenet-rephrased<br>coco-goi<br>clevr<br>clevr-rephrased<br>nlvr<br>coco-itm<br>coco-itm-rephrased<br>vsr<br>vsr-rephrased<br>mocheg<br>mocheg-rephrased<br>coco-text<br>fm-iqa<br>activitynet-qa<br>msrvtt<br>ss<br>coco-cn<br>refcoco<br>refcoco-rephrased<br>multi30k<br>image-paragraph-captioning<br>visual-dialog<br>visual-dialog-rephrased<br>iqa<br>vcr<br>visual-mrc<br>ivqa<br>msrvtt-qa<br>msvd-qa<br>gqa<br>text-vqa<br>ocr-vqa<br>st-vqa<br>flickr8k-cn|-|Dataset is too huge, please click the original link to view the dataset stat.|chat, multi-modal, vision|-|
|
447 | 456 | |sharegpt4v|[AI-ModelScope/ShareGPT4V](https://modelscope.cn/datasets/AI-ModelScope/ShareGPT4V/summary)|ShareGPT4V<br>ShareGPT4V-PT|-|Dataset is too huge, please click the original link to view the dataset stat.|chat, multi-modal, vision|-|
|
448 | 457 | |llava-instruct-150k|[AI-ModelScope/LLaVA-Instruct-150K](https://modelscope.cn/datasets/AI-ModelScope/LLaVA-Instruct-150K/summary)||624610|490.4±180.2, min=288, max=5438|chat, multi-modal, vision|-|
|
|
467 | 476 | |dolphin|[swift/dolphin](https://modelscope.cn/datasets/swift/dolphin/summary)|flan1m-alpaca-uncensored<br>flan5m-alpaca-uncensored|-|Dataset is too huge, please click the original link to view the dataset stat.|en|[cognitivecomputations/dolphin](https://huggingface.co/datasets/cognitivecomputations/dolphin)|
|
468 | 477 | |evol-instruct-v2|[AI-ModelScope/WizardLM_evol_instruct_V2_196k](https://modelscope.cn/datasets/AI-ModelScope/WizardLM_evol_instruct_V2_196k/summary)||109184|480.9±333.1, min=26, max=4942|chat, en|[WizardLM/WizardLM_evol_instruct_V2_196k](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k)|
|
469 | 478 | |fineweb|[None](https://modelscope.cn/datasets/None/summary)||-|Dataset is too huge, please click the original link to view the dataset stat.|pretrain, quality|[HuggingFaceFW/fineweb](https://huggingface.co/datasets/HuggingFaceFW/fineweb)|
|
| 479 | +|gen-qa|[swift/GenQA](https://modelscope.cn/datasets/swift/GenQA/summary)||-|Dataset is too huge, please click the original link to view the dataset stat.|qa, quality, multi-task|[tomg-group-umd/GenQA](https://huggingface.co/datasets/tomg-group-umd/GenQA)| |
470 | 480 | |github-code|[swift/github-code](https://modelscope.cn/datasets/swift/github-code/summary)||-|Dataset is too huge, please click the original link to view the dataset stat.|pretrain, quality|[codeparrot/github-code](https://huggingface.co/datasets/codeparrot/github-code)|
|
471 | 481 | |gpt4v-dataset|[swift/gpt4v-dataset](https://modelscope.cn/datasets/swift/gpt4v-dataset/summary)||12356|217.9±68.3, min=35, max=596|en, caption, multi-modal, quality|[laion/gpt4v-dataset](https://huggingface.co/datasets/laion/gpt4v-dataset)|
|
472 | 482 | |guanaco-belle-merge|[AI-ModelScope/guanaco_belle_merge_v1.0](https://modelscope.cn/datasets/AI-ModelScope/guanaco_belle_merge_v1.0/summary)||693987|134.2±92.0, min=24, max=6507|QA, zh|[Chinese-Vicuna/guanaco_belle_merge_v1.0](https://huggingface.co/datasets/Chinese-Vicuna/guanaco_belle_merge_v1.0)|
|
| 483 | +|infinity-instruct|[swift/Infinity-Instruct](https://modelscope.cn/datasets/swift/Infinity-Instruct/summary)||-|Dataset is too huge, please click the original link to view the dataset stat.|qa, quality, multi-task|[BAAI/Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct)| |
473 | 484 | |llava-med-zh-instruct|[swift/llava-med-zh-instruct-60k](https://modelscope.cn/datasets/swift/llava-med-zh-instruct-60k/summary)||56649|207.7±67.6, min=37, max=657|zh, medical, vqa|[BUAADreamer/llava-med-zh-instruct-60k](https://huggingface.co/datasets/BUAADreamer/llava-med-zh-instruct-60k)|
|
474 |
| -|lmsys-chat-1m|[AI-ModelScope/lmsys-chat-1m](https://modelscope.cn/datasets/AI-ModelScope/lmsys-chat-1m/summary)||-|Dataset is too huge, please click the original link to view the dataset stat.|chat, en|[lmsys/lmsys-chat-1m](https://huggingface.co/datasets/lmsys/lmsys-chat-1m)| |
475 | 485 | |math-instruct|[AI-ModelScope/MathInstruct](https://modelscope.cn/datasets/AI-ModelScope/MathInstruct/summary)||262283|254.4±183.5, min=11, max=4383|math, cot, en, quality|[TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct)|
|
476 | 486 | |math-plus|[TIGER-Lab/MATH-plus](https://modelscope.cn/datasets/TIGER-Lab/MATH-plus/summary)|train|893929|287.1±158.7, min=24, max=2919|qa, math, en, quality|[TIGER-Lab/MATH-plus](https://huggingface.co/datasets/TIGER-Lab/MATH-plus)|
|
477 | 487 | |moondream2-coyo-5M|[swift/moondream2-coyo-5M-captions](https://modelscope.cn/datasets/swift/moondream2-coyo-5M-captions/summary)||-|Dataset is too huge, please click the original link to view the dataset stat.|caption, pretrain, quality|[isidentical/moondream2-coyo-5M-captions](https://huggingface.co/datasets/isidentical/moondream2-coyo-5M-captions)|
|
|
0 commit comments