Home
Softono
p

paddlepaddle

Professional software vendor delivering innovative solutions on the Softono platform. Specialized in both open-source and proprietary software development.

Total Products
11

Software by paddlepaddle

PaddleNLP
Open Source

PaddleNLP

**简体中文**🀄 | [English🌎](./README_en.md) <p align="center"> <img src="https://user-images.githubusercontent.com/1371212/175816733-8ec25eb0-9af3-4380-9218-27c154518258.png" align="middle" width="500" /> </p> ------------------------------------------------------------------------------------------ <p align="center"> <a href="https://paddlenlp.readthedocs.io/en/latest/?badge=latest"><img src="https://readthedocs.org/projects/paddlenlp/badge/?version=latest"> <a href="https://github.com/PaddlePaddle/PaddleNLP/releases"><img src="https://img.shields.io/github/v/release/PaddlePaddle/PaddleNLP?color=ffa"></a> <a href=""><img src="https://img.shields.io/badge/python-3.7+-aff.svg"></a> <a href=""><img src="https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg"></a> <a href="https://github.com/PaddlePaddle/PaddleNLP/graphs/contributors"><img src="https://img.shields.io/github/contributors/PaddlePaddle/PaddleNLP?color=9ea"></a> <a href="https://github.com/PaddlePaddle/PaddleNLP/commits"><img src="https://img.shields.io/github/commit-activity/m/PaddlePaddle/PaddleNLP?color=3af"></a> <a href="https://pypi.org/project/paddlenlp/"><img src="https://img.shields.io/pypi/dm/paddlenlp?color=9cf"></a> <a href="https://github.com/PaddlePaddle/PaddleNLP/issues"><img src="https://img.shields.io/github/issues/PaddlePaddle/PaddleNLP?color=9cc"></a> <a href="https://github.com/PaddlePaddle/PaddleNLP/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/PaddleNLP?color=ccf"></a> <a href="./LICENSE"><img src="https://img.shields.io/badge/license-Apache%202-dfd.svg"></a> </p> <h4 align="center"> <a href=#特性> 特性 </a> | <a href=#模型支持> 模型支持 </a> | <a href=#安装> 安装 </a> | <a href=#快速开始> 快速开始 </a> | <a href=#社区交流> 社区交流 </a> </h4> **PaddleNLP**是一款基于飞桨深度学习框架的大语言模型(LLM)开发套件,支持在多种硬件上进行高效的大模型训练、无损压缩以及高性能推理。PaddleNLP 具备**简单易用**和**性能极致**的特点,致力于助力开发者实现高效的大模型产业级应用。 <a href="https://trendshift.io/repositories/2246" target="_blank"><img src="https://trendshift.io/api/badge/repositories/2246" alt="PaddlePaddle%2FPaddleNLP | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> ## News 📢 * **2025.04.29 PaddleNLP 现已支持 Qwen3 系列模型**: Qwen3 系列模型支持持两种思考模式,预训练约 36 万亿个 token、119 种语言和方言。包括六个 Dense 模型, Qwen3-32B、Qwen3-14B、Qwen3-8B、Qwen3-4B、Qwen3-1.7B 和 Qwen3-0.6B。两个 MoE 模型的权重:Qwen3-235B-A22B,Qwen3-30B-A3B。 * **2025.03.12 [PaddleNLP v3.0 Beta4](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta4)**:全面支持 DeepSeek V3/R1/R1-Distill, 及 QwQ-32B 等热门思考模型。**DeepSeek V3/R1完整版支持 FP8、INT8、4-bit 量化推理,MTP 投机解码**。单机 FP8推理输出超**1000 tokens/s**; 4-bit 推理输出超**2100 tokens/s**! 发布新版推理部署镜像,热门模型[一键部署](https://paddlenlp.readthedocs.io/zh/latest/llm/server/docs/general_model_inference.html)。推理部署[使用文档](https://paddlenlp.readthedocs.io/zh/latest/llm/docs/predict/index.html)全面更新,体验全面提升!自研下一代通用信息抽取模型 PP-UIE [全新发布](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/application/information_extraction),支持8K 长度信息抽取。新增大模型 Embedding 训练,支持 INF-CL 超大 batch size 训练。新增[MergeKit](https://paddlenlp.readthedocs.io/zh/latest/llm/docs/mergekit.html)模型融合工具,缓解对齐代价。低资源训练全面优化,16G 小显存可以流畅训练。 * **2025.02.10 PaddleNLP 现已支持 DeepSeek-R1系列模型,[在线使用](https://aistudio.baidu.com/projectdetail/8775758)**:依托全新的 PaddleNLP 3.0套件,DeepSeek-R1系列模型现已全面支持。凭借数据并行、数据分组切分并行、模型并行、流水线并行以及专家并行等一系列先进的分布式训练能力,结合 Paddle 框架独有的列稀疏注意力掩码表示技术——FlashMask 方法,DeepSeek-R1系列模型在训练过程中显著降低了显存消耗,同时取得了卓越的训练性能提升。 <details><summary> <b>点击展开</b> </summary><div> * **2025.03.17 《DeepSeek-R1满血版单机部署实测》** 🔥🔥🔥 飞桨框架3.0大模型推理部署全面升级,支持多款主流大模型,DeepSeek-R1满血版实现单机部署,吞吐提升一倍!欢迎广大用户开箱体验~现已开启有奖活动:完成 DeepSeek-R1-MTP 单机部署任务、提交高质量测评 blog,即可实时赢取奖金!💰💰💰 报名[地址](https://www.wjx.top/vm/OlzzmbG.aspx#), 活动详情:https://github.com/PaddlePaddle/PaddleNLP/issues/10166 , 参考文档:https://github.com/PaddlePaddle/PaddleNLP/issues/10157 。 * **2025.03.06 PaddleNLP 现已支持 Qwen/QwQ-32B 模型**: 其模型参数仅有 32B,但其数学推理、编程能力和通用能力可与具备 671B 参数(其中 37B 被激活)的 DeepSeek-R1 媲美。借助 PaddleNLP 3.0套件,现可实现多种并行策略[微调训练](./llm/README.md)、[高性能推理、低比特量化](./llm/docs/predict/qwen.md)和[服务化部署](./llm/server/README.md)。 * **2025.02.20 🔥🔥《PP-UIE 信息抽取智能引擎全新升级》** 强化零样本学习能力,支持极少甚至零标注数据实现高效冷启动与迁移学习,显著降低数据标注成本;具备处理长文本能力,支持 8192 个 Token 长度文档信息抽取,实现跨段落识别关键信息,形成完整理解;提供完整可定制化的训练和推理全流程,训练效率相较于 LLama-Factory 实现了1.8倍的提升。 2月26日(周三)19:00为您深度解析全新 PP-UIE 技术方案及在部署方面的功能、优势与技巧。报名链接:https://www.wjx.top/vm/mBKC6pb.aspx?udsid=606418 * **2024.12.16 [PaddleNLP v3.0 Beta3](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta3)**:大模型功能全新升级,新增了 Llama-3.2、DeepSeekV2模型,升级了 TokenizerFast,快速分词,重构了 SFTTrainer,一键开启 SFT 训练。此外,PaddleNLP 还支持了优化器状态的卸载和重载功能,实现了精细化的重新计算,训练性能提升7%。在 Unified Checkpoint 方面,进一步优化了异步保存逻辑,新增 Checkpoint 压缩功能,可节省78.5%存储空间。 最后,在大模型推理方面,升级 Append Attention,支持了 FP8量化,支持投机解码。 * **2024.12.13 📚《飞桨大模型套件 Unified Checkpoint 技术》**,加速模型存储95%,节省空间78%。支持全分布式策略调整自适应转换,提升模型训练的灵活性与可扩展性。训练-压缩-推理统一存储协议,无需手动转换提升全流程体验。Checkpoint 无损压缩结合异步保存,实现秒级存储并降低模型存储成本。适用于智能制造、指挥交通、医疗健康、金融服务等产业实际场景。12月24日(周二)19:00直播为您详细解读该技术如何优化大模型训练流程。报名链接:https://www.wjx.top/vm/huZkHn9.aspx?udsid=787976 * **2024.11.28 📚《FlashRAG-Paddle | 基于 PaddleNLP 的高效开发与评测 RAG 框架》**,为文本更快更好构建准确嵌入表示、加速推理生成速度。PaddleNLP 支持超大 Batch 嵌入表示学习与多硬件高性能推理,涵盖 INT8/INT4量化技术及多种高效注意力机制优化与 TensorCore 深度优化。内置全环节算子融合技术,使得 FlashRAG 推理性能相比 transformers 动态图提升70%以上,结合检索增强知识输出结果更加准确,带来敏捷高效的使用体验。直播时间:12月3日(周二)19:00。报名链接:https://www.wjx.top/vm/eaBa1vA.aspx?udsid=682361 * **2024.08.08 📚《飞桨产业级大语言模型开发利器 PaddleNLP 3.0 重磅发布》**,训压推全流程贯通,主流模型全覆盖。大模型自动并行,千亿模型训推全流程开箱即用。提供产业级高性能精调与对齐解决方案,压缩推理领先,多硬件适配。覆盖产业级智能助手、内容创作、知识问答、关键信息抽取等应用场景。直播时间:8月22日(周四)19:00。报名链接:https://www.wjx.top/vm/Y2f7FFY.aspx?udsid=143844 * **2024.06.27 [PaddleNLP v3.0 Beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta0)**:拥抱大模型,体验全升级。统一大模型套件,实现国产计算芯片全流程接入;全面支持飞桨4D 并行配置、高效精调策略、高效对齐算法、高性能推理等大模型产业级应用流程;自研极致收敛的 RsLoRA+算法、自动扩缩容存储机制 Unified Checkpoint 和通用化支持的 FastFFN、FusedQKV 助力大模型训推;主流模型持续支持更新,提供高效解决方案。 * **2024.04.24 [PaddleNLP v2.8](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.8.0)**:自研极致收敛的 RsLoRA+算法,大幅提升 PEFT 训练收敛速度以及训练效果;引入高性能生成加速到 RLHF PPO 算法,打破 PPO 训练中生成速度瓶颈,PPO 训练性能大幅领先。通用化支持 FastFFN、FusedQKV 等多个大模型训练性能优化方式,大模型训练更快、更稳定。 </div></details> ## 特性 ### <a href=#多硬件训推一体> 🔧 多硬件训推一体 </a> 支持英伟达 GPU、昆仑 XPU、昇腾 NPU、燧原 GCU 和海光 DCU 等多个硬件的大模型和自然语言理解模型训练和推理,套件接口支持硬件快速切换,大幅降低硬件切换研发成本。 当前支持的自然语言理解模型:[多硬件自然语言理解模型列表](./docs/zh/model_zoo/model_list_multy_device.md) ### <a href=#高效易用的预训练> 🚀 高效易用的预训练 </a> 支持纯数据并行策略、分组参数切片的数据并行策略、张量模型并行策略和流水线模型并行策略的4D 高性能训练,Trainer 支持分布式策略配置化,降低复杂分布式组合带来的使用成本; [Unified Checkpoint 大模型存储工具](./llm/docs/unified_checkpoint.md)可以使得训练断点支持机器资源动态扩缩容恢复。此外,异步保存,模型存储可加速95%,Checkpoint 压缩,可节省78.5%存储空间。 ### <a href=#高效精调> 🤗 高效精调 </a> 精调算法深度结合零填充数据流和 [FlashMask](./llm/docs/flashmask.md) 高性能算子,降低训练无效数据填充和计算,大幅提升精调训练吞吐。 ### <a href=#无损压缩和高性能推理> 🎛️ 无损压缩和高性能推理 </a> 大模型套件高性能推理模块内置动态插入和全环节算子融合策略,极大加快并行推理速度。底层实现细节封装化,实现开箱即用的高性能并行推理能力。 ## 文档 更多详细文档, 请访问 [PaddleNLP Documentation](https://paddlenlp.readthedocs.io/). ------------------------------------------------------------------------------------------ ## 模型支持 * 模型参数已支持 LLaMA 系列、Baichuan 系列、Bloom 系列、ChatGLM 系列、Gemma 系列、Mistral 系列、OPT 系列和 Qwen 系列,详细列表👉【LLM】模型参数支持列表如下: | 模型系列 | 模型名称 | |:-------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | [PP-UIE](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/application/information_extraction) | paddlenlp/PP-UIE-0.5B, paddlenlp/PP-UIE-1.5B, paddlenlp/PP-UIE-7B, paddlenlp/PP-UIE-14B | | [LLaMA](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | facebook/llama-7b, facebook/llama-13b, facebook/llama-30b, facebook/llama-65b | | [Llama2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Llama-2-7b, meta-llama/Llama-2-7b-chat, meta-llama/Llama-2-13b, meta-llama/Llama-2-13b-chat, meta-llama/Llama-2-70b, meta-llama/Llama-2-70b-chat | | [Llama3](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct | | [Llama3.1](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Meta-Llama-3.1-8B, meta-llama/Meta-Llama-3.1-8B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3.1-70B-Instruct, meta-llama/Meta-Llama-3.1-405B, meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Llama-Guard-3-8B | | [Llama3.2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Llama-3.2-1B, meta-llama/Llama-3.2-1B-Instruct, meta-llama/Llama-3.2-3B, meta-llama/Llama-3.2-3B-Instruct, meta-llama/Llama-Guard-3-1B | | [Llama3.3](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/llama) | meta-llama/Llama-3.3-70B-Instruct | | [Baichuan](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/baichuan) | baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat | | [Baichuan2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/baichuan) | baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat | | [Bloom](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/bloom) | bigscience/bloom-560m, bigscience/bloom-560m-bf16, bigscience/bloom-1b1, bigscience/bloom-3b, bigscience/bloom-7b1, bigscience/bloomz-560m, bigscience/bloomz-1b1, bigscience/bloomz-3b, bigscience/bloomz-7b1-mt, bigscience/bloomz-7b1-p3, bigscience/bloomz-7b1, bellegroup/belle-7b-2m | | [ChatGLM](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/chatglm/) | THUDM/chatglm-6b, THUDM/chatglm-6b-v1.1 | | [ChatGLM2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/chatglm2) | THUDM/chatglm2-6b | | [ChatGLM3](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/chatglm2) | THUDM/chatglm3-6b | | [DeepSeekV2](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/config/deepseek-v2) | deepseek-ai/DeepSeek-V2, deepseek-ai/DeepSeek-V2-Chat, deepseek-ai/DeepSeek-V2-Lite, deepseek-ai/DeepSeek-V2-Lite-Chat, deepseek-ai/DeepSeek-Coder-V2-Base, deepseek-ai/DeepSeek-Coder-V2-Instruct, deepseek-ai/DeepSeek-Coder-V2-Lite-Base, deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct | | [DeepSeekV3](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/config/deepseek-v2) | deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-V3-Base | | [DeepSeek-R1](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/llm/config/deepseek-v2) | deepseek-ai/DeepSeek-R1, deepseek-ai/DeepSeek-R1-Zero, deepseek-ai/DeepSeek-R1-Distill-Llama-70B, deepseek-ai/DeepSeek-R1-Distill-Llama-8B, deepseek-ai/DeepSeek-R1-Distill-Qwen-14B, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, deepseek-ai/DeepSeek-R1-Distill-Qwen-32B, deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | | [Gemma](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/gemma) | google/gemma-7b, google/gemma-7b-it, google/gemma-2b, google/gemma-2b-it | | [Mistral](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/mistral) | mistralai/Mistral-7B-Instruct-v0.3, mistralai/Mistral-7B-v0.1 | | [Mixtral](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/mixtral) | mistralai/Mixtral-8x7B-Instruct-v0.1 | | [OPT](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/opt) | facebook/opt-125m, facebook/opt-350m, facebook/opt-1.3b, facebook/opt-2.7b, facebook/opt-6.7b, facebook/opt-13b, facebook/opt-30b, facebook/opt-66b, facebook/opt-iml-1.3b, opt-iml-max-1.3b | | [Qwen](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | qwen/qwen-7b, qwen/qwen-7b-chat, qwen/qwen-14b, qwen/qwen-14b-chat, qwen/qwen-72b, qwen/qwen-72b-chat, | | [Qwen1.5](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen1.5-0.5B, Qwen/Qwen1.5-0.5B-Chat, Qwen/Qwen1.5-1.8B, Qwen/Qwen1.5-1.8B-Chat, Qwen/Qwen1.5-4B, Qwen/Qwen1.5-4B-Chat, Qwen/Qwen1.5-7B, Qwen/Qwen1.5-7B-Chat, Qwen/Qwen1.5-14B, Qwen/Qwen1.5-14B-Chat, Qwen/Qwen1.5-32B, Qwen/Qwen1.5-32B-Chat, Qwen/Qwen1.5-72B, Qwen/Qwen1.5-72B-Chat, Qwen/Qwen1.5-110B, Qwen/Qwen1.5-110B-Chat, Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat | | [Qwen2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-72B, Qwen/Qwen2-72B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen2-57B-A14B-Instruct | | [Qwen2-Math](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2-Math-1.5B, Qwen/Qwen2-Math-1.5B-Instruct, Qwen/Qwen2-Math-7B, Qwen/Qwen2-Math-7B-Instruct, Qwen/Qwen2-Math-72B, Qwen/Qwen2-Math-72B-Instruct, Qwen/Qwen2-Math-RM-72B | | [Qwen2.5](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-0.5B, Qwen/Qwen2.5-0.5B-Instruct, Qwen/Qwen2.5-1.5B, Qwen/Qwen2.5-1.5B-Instruct, Qwen/Qwen2.5-3B, Qwen/Qwen2.5-3B-Instruct, Qwen/Qwen2.5-7B, Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-7B-Instruct-1M, Qwen/Qwen2.5-14B, Qwen/Qwen2.5-14B-Instruct, Qwen/Qwen2.5-14B-Instruct-1M, Qwen/Qwen2.5-32B, Qwen/Qwen2.5-32B-Instruct, Qwen/Qwen2.5-72B, Qwen/Qwen2.5-72B-Instruct | | [Qwen2.5-Math](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-Math-1.5B, Qwen/Qwen2.5-Math-1.5B-Instruct, Qwen/Qwen2.5-Math-7B, Qwen/Qwen2.5-Math-7B-Instruct, Qwen/Qwen2.5-Math-72B, Qwen/Qwen2.5-Math-72B-Instruct, Qwen/Qwen2.5-Math-RM-72B | | [Qwen2.5-Coder](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen2.5-Coder-1.5B, Qwen/Qwen2.5-Coder-1.5B-Instruct, Qwen/Qwen2.5-Coder-7B, Qwen/Qwen2.5-Coder-7B-Instruct | | [Qwen3](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/Qwen3-0.6B, Qwen/Qwen3-1.7B, Qwen/Qwen3-4B, Qwen/Qwen3-8B, Qwen/Qwen3-14B, Qwen/Qwen3-32B, Qwen/Qwen3-30B-A3B, Qwen/Qwen3-235B-A22B, Qwen/Qwen3-0.6B-Base, Qwen/Qwen3-1.7B-Base, Qwen/Qwen3-4B-Base, Qwen/Qwen3-8B-Base, Qwen/Qwen3-14B-Base, Qwen/Qwen3-30B-A3B-Base | | [QwQ](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/qwen/) | Qwen/QwQ-32B, Qwen/QwQ-32B-Preview | | [Yuan2](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/config/yuan/) | IEITYuan/Yuan2-2B, IEITYuan/Yuan2-51B, IEITYuan/Yuan2-102B | * 4D 并行和算子优化已支持 LLaMA 系列、Baichuan 系列、Bloom 系列、ChatGLM 系列、Gemma 系列、Mistral 系列、OPT 系列和 Qwen 系列,【LLM】模型4D 并行和算子支持列表如下: | 模型名称/并行能力支持 | 数据并行 | 张量模型并行 | | 参数分片并行 | | | 流水线并行 | |:---------------------:|:--------:|:------------:|:--------:|:------------:|:------:|:------:|:----------:| | | | 基础能力 | 序列并行 | stage1 | stage2 | stage3 | | | Llama | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | Qwen | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | Qwen1.5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | Qwen2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | Mixtral(moe) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | | Mistral | ✅ | ✅ | 🚧 | ✅ | ✅ | ✅ | 🚧 | | Baichuan | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | Baichuan2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | ChatGLM | ✅ | ✅ | 🚧 | ✅ | ✅ | ✅ | 🚧 | | ChatGLM2 | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ | 🚧 | | ChatGLM3 | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ | 🚧 | | Bloom | ✅ | ✅ | 🚧 | ✅ | ✅ | ✅ | 🚧 | | GPT-2/GPT-3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | OPT | ✅ | ✅ | 🚧 | ✅ | ✅ | ✅ | 🚧 | | Gemma | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | Yuan2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | * 大模型预训练、精调(包含 SFT、PEFT 技术)、对齐、量化已支持 LLaMA 系列、Baichuan 系列、Bloom 系列、ChatGLM 系列、Mistral 系列、OPT 系列和 Qwen 系列,【LLM】模型预训练、精调、对齐、量化支持列表如下: | Model | Pretrain | SFT | LoRA | FlashMask | Prefix Tuning | DPO/SimPO/ORPO/KTO | RLHF | Mergekit | Quantization | |--------------------------------------------|:--------:|:---:|:----:|:---------:|:-------------:|:------------------:|:----:|:--------:|:------------:| | [Llama](./llm/config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [Qwen](./llm/config/qwen) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | | [Mixtral](./llm/config/mixtral) | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | ✅ | 🚧 | | [Mistral](./llm/config/mistral) | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | 🚧 | ✅ | 🚧 | | [Baichuan/Baichuan2](./llm/config/llama) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | | [ChatGLM-6B](./llm/config/chatglm) | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | 🚧 | ✅ | ✅ | | [ChatGLM2/ChatGLM3](./llm/config/chatglm2) | ✅ | ✅ | ✅ | 🚧 | ✅ | ✅ | 🚧 | ✅ | ✅ | | [Bloom](./llm/config/bloom) | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | 🚧 | ✅ | ✅ | | [GPT-3](./llm/config/gpt-3) | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ✅ | 🚧 | | [OPT](./llm/config/opt) | ✅ | ✅ | ✅ | 🚧 | 🚧 | 🚧 | 🚧 | ✅ | 🚧 | | [Gemma](./llm/config/gemma) | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | ✅ | 🚧 | | [Yuan](./llm/config/yuan) | ✅ | ✅ | ✅ | 🚧 | 🚧 | ✅ | 🚧 | ✅ | 🚧 | * [大模型推理](./llm/docs/predict/inference.md)已支持 LLaMA 系列、Qwen 系列、DeepSeek 系列、Mistral 系列、ChatGLM 系列、Bloom 系列和 Baichuan 系列,支持 Weight Only INT8及 INT4推理,支持 WAC(权重、激活、Cache KV)进行 INT8、FP8量化的推理,【LLM】模型推理支持列表如下: | 模型名称/量化类型支持 | FP16/BF16 | WINT8 | WINT4 | INT8-A8W8 | FP8-A8W8 | INT8-A8W8C8 | |:------------------------------------------:|:---------:|:-----:|:-----:|:---------:|:--------:|:-----------:| | [LLaMA](./llm/docs/predict/llama.md) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [Qwen](./llm/docs/predict/qwen.md) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [DeepSeek](./llm/docs/predict/deepseek.md) | ✅ | ✅ | ✅ | 🚧 | ✅ | 🚧 | | [Qwen-Moe](./llm/docs/predict/qwen.md) | ✅ | ✅ | ✅ | 🚧 | 🚧 | 🚧 | | [Mixtral](./llm/docs/predict/mixtral.md) | ✅ | ✅ | ✅ | 🚧 | 🚧 | 🚧 | | ChatGLM | ✅ | ✅ | ✅ | 🚧 | 🚧 | 🚧 | | Bloom | ✅ | ✅ | ✅ | 🚧 | 🚧 | 🚧 | | BaiChuan | ✅ | ✅ | ✅ | ✅ | ✅ | 🚧 | ## 安装 ### 环境依赖 * python >= 3.8 * paddlepaddle >= 3.0.0rc1 如果您尚未安装 PaddlePaddle,请参考 [飞桨官网](https://www.paddlepaddle.org.cn/) 进行安装。 ### pip 安装 ```shell pip install --upgrade paddlenlp==3.0.0b4 ``` 或者可通过以下命令安装最新 develop 分支代码: ```shell pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html ``` 更多关于 PaddlePaddle 和 PaddleNLP 安装的详细教程请查看[Installation](./docs/zh/get_started/installation.rst)。 ------------------------------------------------------------------------------------------ ## 快速开始 ### 大模型文本生成 PaddleNLP 提供了方便易用的 Auto API,能够快速的加载模型和 Tokenizer。这里以使用 `Qwen/Qwen2-0.5B` 模型做文本生成为例: ```python from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B") # if using CPU, please change float16 to float32 model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B", dtype="float16") input_features = tokenizer("你好!请自我介绍一下。", return_tensors="pd") outputs = model.generate(**input_features, max_new_tokens=128) print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True)) # ['我是一个AI语言模型,我可以回答各种问题,包括但不限于:天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗?'] ``` ### 大模型预训练 ```shell git clone https://github.com/PaddlePaddle/PaddleNLP.git && cd PaddleNLP # 如已clone或下载PaddleNLP可跳过 mkdir -p llm/data && cd llm/data wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.bin wget https://bj.bcebos.com/paddlenlp/models/transformers/llama/data/llama_openwebtext_100k.idx cd .. # change folder to PaddleNLP/llm # 如需使用use_fused_rms_norm=true,需要前往slm/model_zoo/gpt-3/external_ops安装fused_ln python -u run_pretrain.py ./config/qwen/pretrain_argument_0p5b.json ``` ### 大模型 SFT 精调 ```shell git clone https://github.com/PaddlePaddle/PaddleNLP.git && cd PaddleNLP # 如已clone或下载PaddleNLP可跳过 mkdir -p llm/data && cd llm/data wget https://bj.bcebos.com/paddlenlp/datasets/examples/AdvertiseGen.tar.gz && tar -zxvf AdvertiseGen.tar.gz cd .. # change folder to PaddleNLP/llm python -u run_finetune.py ./config/qwen/sft_argument_0p5b.json ``` 更多大模型全流程步骤,请参考[飞桨大模型套件](./llm)介绍。 另外我们还提供了快速微调方式, 无需 clone 源代码: ```python from paddlenlp.trl import SFTConfig, SFTTrainer from datasets import load_dataset dataset = load_dataset("ZHUI/alpaca_demo", split="train") training_args = SFTConfig(output_dir="Qwen/Qwen2.5-0.5B-SFT", device="gpu") trainer = SFTTrainer( args=training_args, model="Qwen/Qwen2.5-0.5B-Instruct", train_dataset=dataset, ) trainer.train() ``` 更多 PaddleNLP 内容可参考: * [精选模型库](./slm/model_zoo),包含优质预训练模型的端到端全流程使用。 * [多场景示例](./slm/examples),了解如何使用 PaddleNLP 解决 NLP 多种技术问题,包含基础技术、系统应用与拓展应用。 * [交互式教程](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995),在🆓免费算力平台 AI Studio 上快速学习 PaddleNLP。 ------------------------------------------------------------------------------------------ ## 社区交流 * 微信扫描二维码并填写问卷,即可加入交流群与众多社区开发者以及官方团队深度交流. <div align="center"> <img src="https://github.com/user-attachments/assets/3a58cc9f-69c7-4ccb-b6f5-73e966b8051a" width="150" height="150" /> </div> ## Citation 如果 PaddleNLP 对您的研究有帮助,欢迎引用 ```bibtex @misc{=paddlenlp, title={PaddleNLP: An Easy-to-use and High Performance NLP Library}, author={PaddleNLP Contributors}, howpublished = {\url{https://github.com/PaddlePaddle/PaddleNLP}}, year={2021} } ``` ## Acknowledge 我们借鉴了 Hugging Face 的[Transformers](https://github.com/huggingface/transformers)🤗关于预训练模型使用的优秀设计,在此对 Hugging Face 作者及其开源社区表示感谢。 ## License PaddleNLP 遵循[Apache-2.0开源协议](./LICENSE)。

AI & Machine Learning ML Frameworks
13K Github Stars
ERNIE
Open Source

ERNIE

<p align="center"> <img src="https://github.com/user-attachments/assets/9ad1ffce-2310-4f80-a3cd-7a117bfb4f17" width="300px"></a> </p> <div align="center"> [ERNIE Bot](https://ernie.baidu.com/) | [🤗Hugging Face](https://huggingface.co/baidu) | [AI Studio](https://aistudio.baidu.com/modelsoverview) 📑 [Blog](https://yiyan.baidu.com/blog/posts/ernie4.5) | 📚 [Cookbook](./cookbook/) | 📑 [Paper](https://yiyan.baidu.com/blog/publication/) | 🛠️ [Training](./docs/erniekit.md) | ⚡️ [Deploy](https://github.com/PaddlePaddle/FastDeploy) <a href="https://trendshift.io/repositories/14169" target="_blank"><img src="https://trendshift.io/api/badge/repositories/14169" alt="PaddlePaddle%2FERNIE | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> </div> ## 📣 Recent updates **[2025-11] 🔥 Released ERNIEKit v1.5:** - **New Features** - [ERNIE-4.5-VL-28B-A3B-Thinking] Supports SFT training and function call training for ERNIE-4.5-VL-28B-A3B-Thinking (https://huggingface.co/baidu/ERNIE-4.5-VL-28B-A3B-Thinking). **[2025-10] 🔥 Released ERNIEKit v1.4:** - **New Features** - VL Model Training: Support SFT for [PaddleOCR-VL-0.9B]((https://huggingface.co/PaddlePaddle/PaddleOCR-VL/tree/main/PaddleOCR-VL-0.9B)) model. More details in [PaddleOCR-VL-0.9B SFT](./docs/paddleocr_vl_sft.md). - Dataflow : Support padding-free startegy. - Packing data within a batch into a sequence to avoid padding, thereby reducing GPU memory usage and accelerating training. **[2025-09] 🔥 Released ERNIEKit v1.3:** - **New Features** - [ERNIE-4.5-21B-A3B-Thinking] Supports SFT training and function call training for ERNIE-4.5-21B-A3B-Thinking (https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking). - **Bug Fixes:** - [VL Model Training] Optimization of multimodal video data processing speed (#1266). **[2025-09] 🔥 Released ERNIEKit v1.2:** - **New Features** - [WebUI] Added support for training and conversation functionalities with ERNIE 28b/424b VL models. - [VL Model Training] Introduced support for query-response format in training data. - [Command-Line Tool] Added iluvatar GPU hardware support. - **Bug Fixes:** - [AutoParallel] Fix use_intermediate_api pp+recompute+moe bug (#1250) - [AutoParallel] Fix save checkpoint bug (#1242) - [VL Model Training] Fix lora 128k training bug (#1234) **[2025-09] 🔥 Released ERNIEKit v1.1:** ERNIEKit now supports SFT/LoRA for ERNIE-4.5-VL series. **[2025-06] 🔥 Released ERNIEKit v1.0:** We're excited to announce ERNIEKit v1.0, the most powerful and efficient toolkit yet for developing with the latest ERNIE models! ## Introduction to ERNIE 4.5 We introduce ERNIE 4.5, a new family of large-scale multimodal models comprising 10 distinct variants. The model family consist of Mixture-of-Experts (MoE) models with 47B and 3B active parameters, with the largest model having 424B total parameters, as well as a 0.3B dense model. For the MoE architecture, we propose a novel heterogeneous modality structure, which supports parameter sharing across modalities while also allowing dedicated parameters for each individual modality. This MoE architecture has the advantage to enhance multimodal understanding without compromising, and even improving, performance on text-related tasks. All of our models are trained with optimal efficiency using the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) deep learning framework, which also enables high-performance inference and streamlined deployment for them. We achieve 47% Model FLOPs Utilization (MFU) in our largest ERNIE 4.5 language model pre-training. Experimental results show that our models achieve state-of-the-art performance across multiple text and multimodal benchmarks, especially in instruction following, world knowledge memorization, visual understanding and multimodal reasoning. All models are publicly accessible under Apache 2.0 to support future research and development in the field. Additionally, we open source the development toolkits for ERNIE 4.5, featuring industrial-grade capabilities, resource-efficient training and inference workflows, and multi-hardware compatibility. </br> <div align="center"> **ERNIE 4.5** <table style="table-layout: auto; border-collapse: collapse; border: 1px solid #ddd; text-align: center;"> <thead class="ant-table-thead"> <tr> <th colspan="2" style="border: 1px solid #ddd;text-align: center;background: lightgray;vertical-align: middle;color:black" >ERNIE 4.5 Models </th> <th colspan="3" style="border: 1px solid #ddd;text-align: center;background: lightgray;vertical-align: middle;color:black">Model Information</th> </tr> <tr> <th style="border: 1px solid #ddd;width: 100px;text-align: center;background: lightgray;vertical-align: middle;color:black">Model Category</th> <th style="border: 1px solid #ddd;width: 250px;text-align: center;background: lightgray;vertical-align: middle;color:black">Model</th> <th style="border: 1px solid #ddd; width: 100px;text-align: center;background: lightgray;vertical-align: middle;color:black">Input Modality</th> <th style="border: 1px solid #ddd; width: 100px;text-align: center;background: lightgray;vertical-align: middle;color:black">Output Modality</th> <th style="border: 1px solid #ddd; width: 100px;text-align: center;background: lightgray;vertical-align: middle;color:black">Context Window </th> </tr> </thead> <tbody class="ant-table-tbody"> <tr> <td rowspan="4" style="border: 1px solid #ddd;vertical-align: middle;">Large Language Models (LLMs)</td> <td style="border: 1px solid #ddd;">ERNIE-4.5-300B-A47B-Base</td> <td rowspan="4"style="border: 1px solid #ddd;">Text</td> <td rowspan="4"style="border: 1px solid #ddd;">Text</td> <td rowspan="10" style="border: 1px solid #ddd;">128K</td> </tr> <tr> <td style="border: 1px solid #ddd;">ERNIE-4.5-300B-A47B</td> </tr> <tr> <td style="border: 1px solid #ddd;">ERNIE-4.5-21B-A3B-Base</td> </tr> <tr> <td style="border: 1px solid #ddd;">ERNIE-4.5-21B-A3B</td> </tr> <tr> <td rowspan="4" style="border: 1px solid #ddd;vertical-align: middle;"> Vision-Language Models (VLMs)</td> <td style="border: 1px solid #ddd;">ERNIE-4.5-VL-424B-A47B-Base</td> <td rowspan="4"style="border: 1px solid #ddd;">Text/Image/Video</td> <td rowspan="4"style="border: 1px solid #ddd;">Text</td> </tr> <tr> <td style="border: 1px solid #ddd;">ERNIE-4.5-VL-424B-A47B</td> </tr> <tr> <td style="border: 1px solid #ddd;">ERNIE-4.5-VL-28B-A3B-Base</td> </tr> <tr> <td style="border: 1px solid #ddd;">ERNIE-4.5-VL-28B-A3B</td> </tr> <tr> <td rowspan="2" style="border: 1px solid #ddd;vertical-align: middle;">Dense Models</td> <td style="border: 1px solid #ddd;">ERNIE-4.5-0.3B-Base</td> <td rowspan="2"style="border: 1px solid #ddd;">Text</td> <td rowspan="2"style="border: 1px solid #ddd;">Text</td> </tr> <tr> <td style="border: 1px solid #ddd;">ERNIE-4.5-0.3B</td> </tr> </tbody> </table> </div> _Note: All models (including pre-trained weights and inference code) have been released on [🤗Hugging Face](https://huggingface.co/baidu), and [AI Studio](https://aistudio.baidu.com/index). Check our [blog](https://yiyan.baidu.com/blog/posts/ernie4.5) for more details._ </br> ## Highlights Our model family is characterized by three key innovations: 1. **Multimodal Heterogeneous MoE Pre-Training:** Our models are jointly trained on both textual and visual modalities to better capture the nuances of multimodal information and improve performance on tasks involving text understanding and generation, image understanding, and cross-modal reasoning. To achieve this without one modality hindering the learning of another, we designed a *heterogeneous MoE structure*, incorporated *modality-isolated routing*, and employed *router orthogonal loss* and *multimodal token-balanced loss*. These architectural choices ensure that both modalities are effectively represented, allowing for mutual reinforcement during training. 2. **Scaling-Efficient Infrastructure:** We propose a novel heterogeneous hybrid parallelism and hierarchical load balancing strategy for efficient training of ERNIE 4.5 models. By using intra-node expert parallelism, memory-efficient pipeline scheduling, FP8 mixed-precision training and finegrained recomputation methods, we achieve remarkable pre-training throughput. For inference, we propose *multi-expert parallel collaboration* method and *convolutional code quantization* algorithm to achieve 4-bit/2-bit lossless quantization. Furthermore, we introduce PD disaggregation with dynamic role switching for effective resource utilization to enhance inference performance for ERNIE 4.5 MoE models. Built on [PaddlePaddle](https://github.com/PaddlePaddle/Paddle), ERNIE 4.5 delivers high-performance inference across a wide range of hardware platforms. 3. **Modality-Specific Post-Training:** To meet the diverse requirements of real-world applications, we fine-tuned variants of the pre-trained model for specific modalities. Our LLMs are optimized for general-purpose language understanding and generation. The VLMs focuses on visuallanguage understanding and supports both thinking and non-thinking modes. Each model employed a combination of *Supervised Fine-tuning (SFT)*, *Direct Preference Optimization (DPO)* or a modified reinforcement learning method named *Unified Preference Optimization (UPO)* for post-training. </br> ## Performance and Benchmark Results ERNIE-4.5-300B-A47B-Base surpasses DeepSeek-V3-671B-A37B-Base on 22 out of 28 benchmarks, demonstrating leading performance across all major capability categories. This underscores the substantial improvements in generalization, reasoning, and knowledge-intensive tasks brought about by scaling up the ERNIE-4.5-Base model relative to other state-of-the-art large models. With a total parameter size of 21B (approximately 70% that of Qwen3-30B), ERNIE-4.5-21B-A3B-Base outperforms Qwen3-30B-A3B-Base on several math and reasoning benchmarks, including BBH and CMATH. ERNIE-4.5-21B-A3B-Base remains highly competitive given its significantly smaller model size, demonstrating notable parameter efficiency and favorable performance trade-offs. ERNIE-4.5-300B-A47B, the post trained model, demonstrates significant strengths in instruction following and knowledge tasks, as evidenced by the state-of-the-art scores on benchmarks such as IFEval, Multi-IF, SimpleQA, and ChineseSimpleQA. The lightweight model ERNIE-4.5-21B-A3B achieves competitive performance compared to Qwen3-30B-A3B, despite having approximately 30% fewer total parameters. In the non-thinking mode, ERNIE-4.5-VL exhibits outstanding proficiency in visual perception, document and chart understanding, and visual knowledge, performing strongly across a range of established benchmarks. Under the thinking mode, ERNIE-4.5-VL not only demonstrates enhanced reasoning abilities compared to the non-thinking mode, but also retains the strong perception capabilities of the latter. ERNIE-4.5-VL-424B-A47B delivers consistently strong results across the full multimodal evaluation suite. Its thinking mode provides a distinct advantage on reasoning-centric tasks, narrowing or even surpassing the gap to OpenAI-o1 on challenging benchmarks such as MathVista, MMMU, and VisualPuzzle, while maintaining competitive performance on perception-focused datasets like CV-Bench and RealWorldQA. The lightweight vision-language model ERNIE-4.5-VL-28B-A3B achieves competitive or even superior performance compared to Qwen2.5-VL-7B and Qwen2.5-VL-32B across most benchmarks, despite using significantly fewer activation parameters. Notably, our lightweight model also supports both thinking and non-thinking modes, offering functionalities consistent with ERNIE-4.5-VL-424B-A47B. ### Performace of ERNIE-4.5 pre-trained models <div align="center"> <img src="https://yiyan.baidu.com/blog/posts/ernie4.5/base_model_benchmark.png" style="max-width: 80%; height: auto;"> </div> ### Performance of post-trained model ERNIE-4.5-300B-A47B <div align="center"> <img src="https://yiyan.baidu.com/blog/posts/ernie4.5/chat_model_benchmark1.png" style="max-width: 80%; height: auto;"> </div> ### Performance of post-trained model ERNIE-4.5-21B-A3B <div align="center"> <img src="https://github.com/user-attachments/assets/5bacaae8-ef27-494d-8c65-589ba187a084" style="max-width: 80%; height: auto;"> </div> ### Performance of post-trained multimodal models in thinking mode <div align="center"> <img src="https://yiyan.baidu.com/blog/posts/ernie4.5/vl_model_thinking_benchmark.png" style="max-width: 80%; height: auto;"> </div> ### Performance of post-trained multimodal models in non-thinking mode <div align="center"> <img src="https://github.com/user-attachments/assets/3ad69a9d-1233-48be-a7c4-b816d3aa17ca" style="max-width: 80%; height: auto;"> </div> </br> ## Model Development ERNIE 4.5 models are trained and deployed for inference using the [PaddlePaddle]((https://github.com/PaddlePaddle/Paddle)) framework. The full workflow of training, compression, and inference for ERNIE 4.5 is supported through the [ERNIEKit](./docs/erniekit.md) and [FastDeploy](https://github.com/PaddlePaddle/FastDeploy) toolkit. The table below details the feature matrix of the ERNIE 4.5 model family for training and inference. <div align="center"> | Model | Training | Inference | | ------------------------------ | ------------------------- | -------------------------------- | | ERNIE-4.5-300B-A47B-Base | SFT/SFT-LoRA/DPO/DPO-LoRA | BF16 / W4A16C16 / W8A16C16 / FP8 | | ERNIE-4.5-300B-A47B | SFT/SFT-LoRA/DPO/DPO-LoRA/QAT | BF16 / W4A16C16 / W8A16C16 / W4A8C8 / FP8 / 2Bits | | ERNIE-4.5-21B-A3B-Base | SFT/SFT-LoRA/DPO/DPO-LoRA | BF16 / W4A16C16 / W8A16C16 / FP8 | | ERNIE-4.5-21B-A3B | SFT/SFT-LoRA/DPO/DPO-LoRA | BF16 / W4A16C16 / W8A16C16 / FP8 | | ERNIE-4.5-VL-424B-A47B-Base | Coming Soon | BF16 / W4A16C16 / W8A16C16 / FP8 | | ERNIE-4.5-VL-424B-A47B | Coming Soon | BF16 / W4A16C16 / W8A16C16 / FP8 | | ERNIE-4.5-VL-28B-A3B-Base | Coming Soon | BF16 / W4A16C16 / W8A16C16 / FP8 | | ERNIE-4.5-VL-28B-A3B | Coming Soon | BF16 / W4A16C16 / W8A16C16 / FP8 | | ERNIE-4.5-0.3B-Base | SFT/SFT-LoRA/DPO/DPO-LoRA | BF16 / W8A16C16 / FP8 | | ERNIE-4.5-0.3B | SFT/SFT-LoRA/DPO/DPO-LoRA | BF16 / W8A16C16 / FP8 | </div> _Note: For different ERNIE 4.5 model, we provide diverse quantization schemes using the notation WxAxCx, where: W indicates weight precision, A indicates activation precision, C indicates KV Cache precision, x represents numerical precision._ ### ERNIEKit: ERNIE Development Toolkit Based on PaddlePaddle **ERNIEKit** is an industrial-grade training and compression development toolkit for ERNIE models based on PaddlePaddle, offering full-cycle development support for the ERNIE 4.5 model family. Key capabilities include: * High-performance pre-training implementation * Full-parameter supervised fine-tuning (SFT) * Direct Preference Optimization (DPO) * Parameter-efficient fine-tuning and alignment (SFT-LoRA/DPO-LoRA) * Quantization-Aware Training (QAT) * Post-Training Quantization (PTQ) [WIP] Minimum hardware requirements for training each model are documented [here](./docs/erniekit.md). #### Quick Start When you install ERNIEKit successfully, you can start training ERNIE 4.5 models with the following command: ```bash # download model from huggingface huggingface-cli download baidu/ERNIE-4.5-0.3B-Paddle --local-dir baidu/ERNIE-4.5-0.3B-Paddle # 8K Sequence Length, SFT erniekit train examples/configs/ERNIE-4.5-0.3B/sft/run_sft_8k.yaml ``` For detailed guides on installation, CLI usage, WebUI, multi-node training, and advanced features, please refer to [ERNIEKit Training Document](./docs/erniekit.md). For detailed guides on High-performance pre-training, please refer to [Pre-Training Document](./examples/pre-training/README.md). **ERNIEKit WebUI demo:** https://github.com/user-attachments/assets/6d44cb92-0826-42df-aa80-7656445e0f73 ### FastDeploy:High-performance Inference and Deployment Toolkit for LLMs and VLMs Based on PaddlePaddle **FastDeploy** is an inference and deployment toolkit for large language models and visual language models, developed based on PaddlePaddle. It delivers production-ready, easy-to-use multi-hardware deployment solutions with multi-level load-balanced PD disaggregation, comprehensive quantization format support, OpenAI API server and vLLM compatible etc. For installation please refer to [FastDeploy](https://github.com/PaddlePaddle/FastDeploy). #### Offline Inference ```python from fastdeploy import LLM, SamplingParams prompt = "Write me a poem about large language model." sampling_params = SamplingParams(temperature=0.8, top_p=0.95) llm = LLM(model="baidu/ERNIE-4.5-0.3B-Paddle", max_model_len=32768) outputs = llm.generate(prompt, sampling_params) ``` #### Online Serving ```bash python -m fastdeploy.entrypoints.openai.api_server \ --model "baidu/ERNIE-4.5-0.3B-Paddle" \ --max-model-len 32768 \ --port 9904 ``` For more inference and deployment guides, please refer to [FastDeploy](https://github.com/PaddlePaddle/FastDeploy). </br> ## Cookbooks Discover best-practice guides showcasing ERNIE’s capabilities across multiple domains: <div align="center"> | Cookbook | Description | Gradio Demo | | --- | --- | --- | | [Conversation](/cookbook/notebook/conversation_demo_en.ipynb) | Building conversational applications. | [conversation_demo.py](/cookbook/conversation_demo.py) | | [Simple ERNIE Bot](/cookbook/notebook/simple_ernie_bot_demo_en.ipynb) | Creating a lightweight web-based ERNIE Bot. |[simple_ernie_bot_demo.py](/cookbook/simple_ernie_bot_demo.py) | | [Web-Search-Enhanced Conversation](/cookbook/notebook/web_search_demo_en.ipynb) | Building conversational apps with integrated web search. | [web_search_demo.py](/cookbook/web_search_demo.py) | | [Knowledge Retrieval-based Q&A](/cookbook/notebook/knowledge_retrieval_demo_en.ipynb) | Building intelligent Q&A systems with private knowledge bases. | [knowledge_retrieval_demo.py](/cookbook/knowledge_retrieval_demo.py) | | [Advanced Search](/cookbook/notebook/advanced_search_demo_en.ipynb) | Building article-generation applications using deep information extraction. | [advanced_search_demo.py](/cookbook/advanced_search_demo.py) | | [SFT tutorial](/cookbook/notebook/sft_tutorial_en.ipynb) | Optimizing task performance through supervised fine-tuning with ERNIEKit. | - | | [DPO tutorial](/cookbook/notebook/dpo_tutorial_en.ipynb) | Aligning models with human preferences using ERNIEKit. | - | | [Text Recognition](/cookbook/notebook/text_recognition_tutorial_en.ipynb) | A Comprehensive Guide to Developing Text Recognition for Non-Chinese and Non-English Languages Using ERNIE and PaddleOCR. | - | | [Document Translation](/cookbook/notebook/document_translation_tutorial_en.ipynb) | Document Translation Practice Based on ERNIE and PaddleOCR. | - | | [Key Information Extraction](/cookbook/notebook/key_information_extraction_tutorial_en.ipynb) | Key Information Extraction in Contract Scenarios Based on ERNIE and PaddleOCR. | - | </div> </br> ## Community | PaddlePaddle WeChat official account | Join the tech discussion group | | :---: | :---: | | <img src="https://github.com/user-attachments/assets/864a45ec-0773-44b2-a2f1-c0e21e157792" width="150"> | <img src="https://github.com/user-attachments/assets/52e05674-7143-4207-8b19-67247fe88f55" width="150"> | ## License The ERNIE 4.5 models are provided under the Apache License 2.0. This license permits commercial use, subject to its terms and conditions. </br> ## Citation If you find ERNIE 4.5 useful or wish to use it in your projects, please kindly cite our technical report: ```bibtex @misc{ernie2025technicalreport, title={ERNIE 4.5 Technical Report}, author={Baidu-ERNIE-Team}, year={2025}, eprint={}, archivePrefix={arXiv}, primaryClass={cs.CL}, url={} } ```

AI Tools ML Frameworks
7.7K Github Stars
FastDeploy
Open Source

FastDeploy

[English](README_EN.md) | 简体中文 <p align="center"> <a href="https://github.com/PaddlePaddle/FastDeploy/releases"><img src="https://github.com/user-attachments/assets/42b0039f-39e3-4279-afda-6d1865dfbffb" width="500"></a> </p> <p align="center"> <a href=""><img src="https://img.shields.io/badge/python-3.10-aff.svg"></a> <a href=""><img src="https://img.shields.io/badge/os-linux-pink.svg"></a> <a href="https://github.com/PaddlePaddle/FastDeploy/graphs/contributors"><img src="https://img.shields.io/github/contributors/PaddlePaddle/FastDeploy?color=9ea"></a> <a href="https://github.com/PaddlePaddle/FastDeploy/commits"><img src="https://img.shields.io/github/commit-activity/m/PaddlePaddle/FastDeploy?color=3af"></a> <a href="https://github.com/PaddlePaddle/FastDeploy/issues"><img src="https://img.shields.io/github/issues/PaddlePaddle/FastDeploy?color=9cc"></a> <a href="https://github.com/PaddlePaddle/FastDeploy/stargazers"><img src="https://img.shields.io/github/stars/PaddlePaddle/FastDeploy?color=ccf"></a> </p> <p align="center"> <a href="https://trendshift.io/repositories/4046" target="_blank"><img src="https://trendshift.io/api/badge/repositories/4046" alt="PaddlePaddle%2FFastDeploy | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a></br> <a href="https://paddlepaddle.github.io/FastDeploy/zh/get_started/installation/nvidia_gpu/"><b> 安装指导 </b></a> | <a href="https://paddlepaddle.github.io/FastDeploy/zh/get_started/quick_start"><b> 快速入门 </b></a> | <a href="https://paddlepaddle.github.io/FastDeploy/zh/supported_models/"><b> 支持模型列表 </b></a> </p> -------------------------------------------------------------------------------- # FastDeploy 飞桨大模型高效部署套件 ## 最新活动 **[2026-03] FastDeploy v2.5 全新发布!** 新增Qwen3-VL与Qwen3-VL MoE模型部署支持,新增W4AFP8量化方法,增强强化学习训练支持能力,包含170+项Bug修复与性能优化,升级全部内容参阅 [v2.5 ReleaseNote](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.5.0)。 **[2026-01] FastDeploy v2.4**: 新增 DeepSeek V3 与 Qwen3-MoE 模型的 PD 分离部署,增强MTP 投机解码能力,全面优化多硬件平台上的 MoE 推理与多模态前缀缓存性能,升级全部内容参阅 [v2.4 ReleaseNote](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.4.0)。 **[2025-11] FastDeploy v2.3**: 新增[ERNIE-4.5-VL-28B-A3B-Thinking](docs/zh/get_started/ernie-4.5-vl-thinking.md)与[PaddleOCR-VL-0.9B](docs/zh/best_practices/PaddleOCR-VL-0.9B.md)两大重磅模型在多硬件平台上的部署支持,进一步优化全方位推理性能,以及带来更多部署功能和易用性的提升,升级全部内容参阅[v2.3 ReleaseNote](https://github.com/PaddlePaddle/FastDeploy/releases/tag/v2.3.0)。 **[2025-09] FastDeploy v2.2**: HuggingFace生态模型兼容,性能进一步优化,更新增对[baidu/ERNIE-21B-A3B-Thinking](https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking)支持! **[2025-08] FastDeploy v2.1**:全新的KV Cache调度策略,更多模型支持PD分离和CUDA Graph,昆仑、海光等更多硬件支持增强,全方面优化服务和推理引擎的性能。 ## 关于 **FastDeploy** 是基于飞桨(PaddlePaddle)的大语言模型(LLM)与视觉语言模型(VLM)推理部署工具包,提供**开箱即用的生产级部署方案**,核心技术特性包括: - 🚀 **负载均衡式PD分解**:工业级解决方案,支持上下文缓存与动态实例角色切换,在保障SLO达标和吞吐量的同时优化资源利用率 - 🔄 **统一KV缓存传输**:轻量级高性能传输库,支持智能NVLink/RDMA选择 - 🤝 **OpenAI API服务与vLLM兼容**:单命令部署,兼容[vLLM](https://github.com/vllm-project/vllm/)接口 - 🧮 **全量化格式支持**:W8A16、W8A8、W4A16、W4A8、W2A16、FP8等 - ⏩ **高级加速技术**:推测解码、多令牌预测(MTP)及分块预填充 - 🖥️ **多硬件支持**:NVIDIA GPU、昆仑芯XPU、海光DCU、天数智芯GPU、燧原GCU、沐曦GPU、英特尔Gaudi等 ## 要求 - 操作系统: Linux - Python: 3.10 ~ 3.12 ## 安装 FastDeploy 支持在**英伟达(NVIDIA)GPU**、**昆仑芯(Kunlunxin)XPU**、**天数(Iluvatar)GPU**、**燧原(Enflame)GCU**、**海光(Hygon)DCU** 以及其他硬件上进行推理部署。详细安装说明如下: - [英伟达 GPU](./docs/zh/get_started/installation/nvidia_gpu.md) - [昆仑芯 XPU](./docs/zh/get_started/installation/kunlunxin_xpu.md) - [天数 CoreX](./docs/zh/get_started/installation/iluvatar_gpu.md) - [燧原 S60](./docs/zh/get_started/installation/Enflame_gcu.md) - [海光 DCU](./docs/zh/get_started/installation/hygon_dcu.md) - [沐曦 GPU](./docs/zh/get_started/installation/metax_gpu.md) - [英特尔 Gaudi](./docs/zh/get_started/installation/intel_gaudi.md) ## 入门指南 通过我们的文档了解如何使用 FastDeploy: - [10分钟快速部署](./docs/zh/get_started/quick_start.md) - [ERNIE-4.5 部署](./docs/zh/get_started/ernie-4.5.md) - [ERNIE-4.5-VL 部署](./docs/zh/get_started/ernie-4.5-vl.md) - [离线推理](./docs/zh/offline_inference.md) - [在线服务](./docs/zh/online_serving/README.md) - [最佳实践](./docs/zh/best_practices/README.md) ## 支持模型列表 通过我们的文档了解如何下载模型,如何支持torch格式等: - [模型支持列表](./docs/zh/supported_models.md) ## 进阶用法 - [量化](./docs/zh/quantization/README.md) - [分离式部署](./docs/zh/features/disaggregated.md) - [投机解码](./docs/zh/features/speculative_decoding.md) - [前缀缓存](./docs/zh/features/prefix_caching.md) - [分块预填充](./docs/zh/features/chunked_prefill.md) - [负载均衡调度Router](./docs/zh/online_serving/router.md) - [全局Cache池化](./docs/zh/features/global_cache_pooling.md) ## 致谢 FastDeploy 依据 [Apache-2.0 开源许可证](./LICENSE). 进行授权。在开发过程中,我们参考并借鉴了 [vLLM](https://github.com/vllm-project/vllm) 的部分代码,以保持接口兼容性,在此表示衷心感谢。

DevOps & Infrastructure LLM Tools & Chat UIs
3.7K Github Stars
Paddle
Open Source

Paddle

Ruby library for the Paddle API

ML Frameworks Payment & Checkout API Tools
27 Github Stars
Paddle-Lite
Open Source

Paddle-Lite

# Paddle Lite [English](README_en.md) | 简体中文 [![Documentation Status](https://img.shields.io/badge/中文文档-最新-brightgreen.svg)](https://www.paddlepaddle.org.cn/lite) [![Release](https://img.shields.io/github/release/PaddlePaddle/Paddle-Lite.svg)](https://github.com/PaddlePaddle/Paddle-Lite/releases) [![License](https://img.shields.io/badge/license-Apache%202-blue.svg)](LICENSE) Paddle Lite 是一个高性能、轻量级、灵活性强且易于扩展的深度学习推理框架,定位于支持包括移动端、嵌入式以及边缘端在内的多种硬件平台。 当前 Paddle Lite 不仅在百度内部业务中得到全面应用,也成功支持了众多外部用户和企业的生产任务。 ## 快速入门 使用 Paddle Lite,只需几个简单的步骤,就可以把模型部署到多种终端设备中,运行高性能的推理任务,使用流程如下所示: **一. 准备模型** Paddle Lite 框架直接支持模型结构为 [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) 深度学习框架产出的模型格式。目前 PaddlePaddle 用于推理的模型是通过 [save_inference_model](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/api/paddle/static/save_inference_model_cn.html#save-inference-model) 这个 API 保存下来的。 如果您手中的模型是由诸如 Caffe、Tensorflow、PyTorch 等框架产出的,那么您可以使用 [X2Paddle](https://github.com/PaddlePaddle/X2Paddle) 工具将模型转换为 PaddlePaddle 格式。 **二. 模型优化** Paddle Lite 框架拥有优秀的加速、优化策略及实现,包含量化、子图融合、Kernel 优选等优化手段。优化后的模型更轻量级,耗费资源更少,并且执行速度也更快。 这些优化通过 Paddle Lite 提供的 opt 工具实现。opt 工具还可以统计并打印出模型中的算子信息,并判断不同硬件平台下 Paddle Lite 的支持情况。您获取 PaddlePaddle 格式的模型之后,一般需要通过该 opt 工具做模型优化。opt 工具的下载和使用,请参考[模型优化方法](https://www.paddlepaddle.org.cn/lite/develop/user_guides/model_optimize_tool.html)。 **三. 下载或编译** Paddle Lite 提供了 Android/iOS/x86/macOS 平台的官方 Release 预测库下载,我们优先推荐您直接下载 [Paddle Lite 预编译库](https://www.paddlepaddle.org.cn/lite/develop/quick_start/release_lib.html),或者从 Release notes 处获取最新的[预编译编译库](https://github.com/PaddlePaddle/Paddle-Lite/releases)。 Paddle Lite 已支持多种环境下的源码编译,为了避免复杂、繁琐的环境搭建过程,我们建议您使用 [Docker 统一编译环境搭建](https://www.paddlepaddle.org.cn/lite/develop/source_compile/docker_env.html) 进行编译。当然,您也可以根据宿主机和目标设备的 CPU 架构和操作系统,在[源码编译](https://www.paddlepaddle.org.cn/lite/develop/source_compile/compile_env.html)中找到相应的环境搭建及编译指南,自行完成编译环境的搭建。 **四. 预测示例** Paddle Lite 提供了 C++、Java、Python 三种 API,并且提供了相应 API 的完整使用示例: - [C++ 完整示例](https://www.paddlepaddle.org.cn/lite/develop/user_guides/cpp_demo.html) - [Java 完整示例](https://www.paddlepaddle.org.cn/lite/develop/user_guides/java_demo.html) - [Python 完整示例](https://www.paddlepaddle.org.cn/lite/develop/user_guides/python_demo.html) 您可以参考示例中的说明快速了解使用方法,并集成到您自己的项目中去。 针对不同的硬件平台,Paddle Lite 提供了各个平台的完整示例: - [Android apps](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/android_app_demo.html) [[图像分类]](https://paddlelite-demo.bj.bcebos.com/apps/android/mobilenet_classification_demo.apk) [[目标检测]](https://paddlelite-demo.bj.bcebos.com/apps/android/yolo_detection_demo.apk) [[口罩检测]](https://paddlelite-demo.bj.bcebos.com/apps/android/mask_detection_demo.apk) [[人脸关键点]](https://paddlelite-demo.bj.bcebos.com/apps/android/face_keypoints_detection_demo.apk) [[人像分割]](https://paddlelite-demo.bj.bcebos.com/apps/android/human_segmentation_demo.apk) - [iOS apps](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/ios_app_demo.html) - [Linux apps](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/linux_arm_demo.html) - [Arm](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/arm_cpu.html) - [x86](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/x86.html) - [OpenCL](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/opencl.html) - [Metal](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/metal.html) - [华为麒麟 NPU](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/huawei_kirin_npu.html) - [华为昇腾 NPU](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/huawei_ascend_npu.html) - [昆仑芯 XPU](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/kunlunxin_xpu.html) - [昆仑芯 XTCL](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/kunlunxin_xtcl.html) - [高通 QNN](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/qualcomm_qnn.html) - [寒武纪 MLU](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/cambricon_mlu.html) - [(瑞芯微/晶晨/恩智浦) 芯原 TIM-VX](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/verisilicon_timvx.html) - [Android NNAPI](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/android_nnapi.html) - [联发科 APU](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/mediatek_apu.html) - [颖脉 NNA](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/imagination_nna.html) - [Intel OpenVINO](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/intel_openvino.html) - [亿智 NPU](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/eeasytech_npu.html) ## 主要特性 - 支持多平台:涵盖 Android、iOS、嵌入式 Linux 设备、Windows、macOS 和 Linux 主机 - 支持多种语言:包括 Java、Python、C++ - 轻量化和高性能:针对移动端设备的机器学习进行优化,压缩模型和二进制文件体积,高效推理,降低内存消耗 ## 持续集成 | System | x86 Linux | ARM Linux | Android (GCC/Clang) | iOS | |:-:|:-:|:-:|:-:|:-:| | CPU(32bit) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | | CPU(64bit) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | | OpenCL | - | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | | Metal | - | - | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | | 华为麒麟 NPU | - | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | | 华为昇腾 NPU | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | - | | 昆仑芯 XPU | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | - | | 昆仑芯 XTCL | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | - | | 高通 QNN | - | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | | 寒武纪 MLU | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | - | - | | (瑞芯微/晶晨/恩智浦) 芯原 TIM-VX | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | | Android NNAPI | - | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | | 联发科 APU | - | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | | 颖脉 NPU | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | - | | Intel OpenVINO | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | - | - | | 亿智 NPU | - | ![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg) | - | - | ## 架构设计 Paddle Lite 的架构设计着重考虑了对多硬件和平台的支持,并且强化了多个硬件在一个模型中混合执行的能力,多个层面的性能优化处理,以及对端侧应用的轻量化设计。 <p align="center"><img width="500" src="https://paddlelite-demo.bj.bcebos.com/devices/generic/paddle_lite_with_nnadapter.jpg"/></p> 其中,Analysis Phase 包括了 MIR(Machine IR) 相关模块,能够对原有的模型的计算图针对具体的硬件列表进行算子融合、计算裁剪 在内的多种优化。Execution Phase 只涉及到 Kernel 的执行,且可以单独部署,以支持极致的轻量级部署。 ## 进一步了解 Paddle Lite 如果您想要进一步了解 Paddle Lite,下面是进一步学习和使用 Paddle Lite 的相关内容: ### 文档和示例 - 完整文档: [Paddle Lite 文档](https://www.paddlepaddle.org.cn/lite) - API文档: - [C++ API 文档](https://www.paddlepaddle.org.cn/lite/develop/api_reference/cxx_api_doc.html) - [Java API 文档](https://www.paddlepaddle.org.cn/lite/develop/api_reference/java_api_doc.html) - [Python API 文档](https://www.paddlepaddle.org.cn/lite/develop/api_reference/python_api_doc.html) - [CV 图像处理 API 文档](https://www.paddlepaddle.org.cn/lite/develop/api_reference/cv.html) - Paddle Lite 工程示例: [Paddle-Lite-Demo](https://github.com/PaddlePaddle/Paddle-Lite-Demo) ### 关键技术 - 模型量化: - [静态离线量化](https://www.paddlepaddle.org.cn/lite/develop/user_guides/quant/quant_post_static.html) - [动态离线量化](https://www.paddlepaddle.org.cn/lite/develop/user_guides/quant/quant_post_dynamic.html) - 调试分析:[调试和性能分析工具](https://www.paddlepaddle.org.cn/lite/develop/user_guides/profiler.html) - 移动端模型训练:点击[了解一下](https://www.paddlepaddle.org.cn/lite/develop/demo_guides/cpp_train_demo.html) - 飞桨预训练模型库:试试在 [PaddleHub](https://www.paddlepaddle.org.cn/hublist?filter=hot&value=1) 浏览和下载 Paddle 的预训练模型 - 飞桨推理 AI 硬件统一适配框架 NNAdapter:点击[了解一下](https://www.paddlepaddle.org.cn/lite/develop/develop_guides/nnadapter.html) ### FAQ - FAQ:常见问题,可以访问 [FAQ](https://www.paddlepaddle.org.cn/lite/develop/quick_start/faq.html)、搜索 Issues、或者通过页面底部的联系方式联系我们 ### 贡献代码 - 贡献代码:如果您想一起参与 Paddle Lite 的开发,贡献代码,请访问[开发者共享文档](https://www.paddlepaddle.org.cn/lite/develop/develop_guides/for-developer.html) ## 交流与反馈 * AIStudio 实训平台端测部署系列课程:https://aistudio.baidu.com/aistudio/course/introduce/22690 * 欢迎您通过 [Github Issues](https://github.com/PaddlePaddle/Paddle-Lite/issues) 来提交问题、报告与建议 * 技术交流微信群:添加 wechat id:baidupaddle或扫描下方微信二维码,添加并回复小助手“端侧”,系统自动邀请加入;技术群 QQ 群: 一群696965088(已满) ;二群,959308808; <p align="center"><img width="200" height="200" src="https://user-images.githubusercontent.com/63448337/162189409-6c0ef74f-82fd-48c9-9fa7-fc3473428a63.png"/>&#8194;&#8194;&#8194;&#8194;&#8194;<img width="200" height="200" margin="500" src="https://github.com/PaddlePaddle/Paddle-Lite/blob/develop/docs/images/qq-group-chat.png"/></p> <p align="center">&#8194;&#8194;&#8194;微信公众号&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;&#8194;官方技术交流QQ群 * 如果您对我们的工作感兴趣,也欢迎[加入我们](https://github.com/PaddlePaddle/Paddle-Lite/issues/6091) ! ## 版权和许可证 Paddle Lite由 [Apache-2.0 license](LICENSE) 提供。

IoT & Embedded ML Frameworks
7.3K Github Stars
models
Open Source

models

# 欢迎使用飞桨产业级开源模型库 ## 简介 飞桨的产业级模型库,包含大量经过产业实践长期打磨的主流模型以及在国际竞赛中的夺冠模型;提供面向语义理解、图像分类、目标检测、图像分割、文字识别、语音合成等场景的多个端到端开发套件,满足企业低成本开发和快速集成的需求。飞桨的模型库是围绕国内企业实际研发流程量身定制打造的产业级模型库,服务企业遍布能源、金融、工业、农业等多个领域。 ## 近期更新 **`2022-11-29`**: 更新`release/2.4`分支,飞桨官方模型超过600个,生态模型超过260个(数量持续更新中). **`2022-5-17`**: 更新`release/2.3`分支,飞桨官方模型超过500个,生态模型超过170个. **`2021-11-30`**: 更新`release/2.2`分支,系统的梳理了飞桨官方模型、学术模型和社区模型的清单,其中官方模型超过400个,生态模型超过100个 **`Note`**:`release/2.2`以后分支模型均基于动态图实现,目前`dev-static`分支中仍有一些静态图模型代码,有需要的开发者可以继续切换到`dev-static`分支使用. ## 主要内容 | 目录 | 说明 | | --- | --- | | [官方模型(official)](docs/official/README.md) |• 面向产业实践,数量超过600个<br />• [飞桨PP系列模型](docs/official/PP-Models.md),效果与精度最佳平衡<br />• 支持使用动态图开发视觉、自然语言、语音和推荐等领域模型<br />• 飞桨官方实现并提供持续技术支持及答疑<br />• 与飞桨核心框架版本对齐,已经经过充分的测试保证 | |[学术模型(research)](docs/research/README.md) |• 面向学术前沿,侧重对于问题的持续更新<br />• 主要由飞桨相关的学术生态合作伙伴贡献| |[社区模型(community)](docs/community/README.md) | • 面向更多丰富场景,侧重对于学术论文的覆盖<br />• 主要由飞桨生态开发者贡献,持续更新中| ## 欢迎加入飞桨模型库技术交流群 - 如果你希望了解飞桨模型库最新进展,或者希望与资深开发者一起讨论产业实践关注的重点模型,欢迎扫码加入飞桨模型库交流群: - (微信扫码填写简单问卷即可添加小助手,加上好友之后回复"model" 即可入群) <div align="center"> <img src="https://user-images.githubusercontent.com/23690325/165911212-cda07629-1bab-4cc3-8228-e5b69320fe4d.jpg" width = "200" height = "200" /> </div> <a name="致谢"></a> ## 致谢开发者 感谢所有为飞桨产业级模型库贡献代码的开发者,也期待你的加入。 ## 许可证书 此向导由[PaddlePaddle](https://github.com/PaddlePaddle/Paddle)贡献,受[Apache-2.0 license](LICENSE)许可认证。

AI & Machine Learning ML Frameworks
6.9K Github Stars
awesome-DeepLearning
Open Source

awesome-DeepLearning

# 一、项目简介 本项目是[飞桨官方](https://www.paddlepaddle.org.cn/?fr=paddleEdu_github)出品的一站式深度学习在线百科,飞桨致力于让深度学习技术的创新与应用更简单,更多飞桨内容欢迎访问[飞桨官网](https://www.paddlepaddle.org.cn/?fr=paddleEdu_github)。本项目内容涵盖: 📒课程类:[**零基础实践深度学习**](https://aistudio.baidu.com/aistudio/course/introduce/1297)、**产业实践深度学习**、**[特色课程](https://aistudio.baidu.com/aistudio/education/group/info/24322)、飞桨套件课程汇总资料** 📒书籍类:**《动手学深度学习》飞桨版** 📒宝典类:[**深度学习百问**](https://paddlepedia.readthedocs.io/en/latest/index.html)、**面试宝典** 📒案例类:**[飞桨产业实践范例库](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/Paddle_Industry_Practice_Sample_Library)**(包含智慧城市:[火灾烟雾检测](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/Paddle_Industry_Practice_Sample_Library/Fire_and_Smoke_Detection)、 [安全帽检测](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/Paddle_Industry_Practice_Sample_Library/Hemtle%20Detection) ;智能制造:[钢材缺陷检测](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/Paddle_Industry_Practice_Sample_Library/paddlex_steel_defect_seg-master) 、 [机械手抓取](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/Paddle_Industry_Practice_Sample_Library/robot_grab);互联网:[财报识别与关键字段抽取](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/Paddle_Industry_Practice_Sample_Library/Report_Recognition_and_Analysis) 等。 从理论到实践,从科研到产业应用,各类学习材料一应俱全,旨在帮助开发者高效地学习和掌握深度学习知识,快速成为AI跨界人才。 <center><img src="./docs/images/cover/repo_cover1.png" width=60%></center> * **内容全面**:无论您是深度学习初学者,还是资深用户,都可以在本项目中快速获取到需要的学习材料。 * **形式丰富**:材料形式多样,包括可在线运行的notebook、视频、书籍、B站直播等,满足您随时随地学习的需求。 * **实时更新**:本项目中涉及到的代码均匹配Paddle最新发布版本,开发者可以实时学习最新的深度学习任务实现方案。 * **前沿分享**:定期分享顶会最新论文解读和代码复现,开发者可以实时掌握最新的深度学习算法。 #### <span id = '0'>如果本项目对您有帮助,欢迎点击网页右上方进行star❤️</span> --- ## 👨‍🏫我是高校用户 | 我希望: | 我可以学习: | | ------------ | ------------------------------------------------------------ | | 入门深度学习 | 零基础实践深度学习[:arrow_heading_down:](#1)、深度学习百问[:arrow_heading_down:](#2)、动手学深度学习paddle版[:arrow_heading_down:](#dive) | | 进阶深度学习 | 产业实践深度学习、深度学习百问[:arrow_heading_down:](#2)、面试宝典[:arrow_heading_down:](#6) | | 趣味深度学习 | 特色课程[:arrow_heading_down:](#3)、[飞桨产业实践范例库](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/Paddle_Industry_Practice_Sample_Library) | ## 👨‍💻我是企业用户 | 我希望: | 我可以学习: | | ------------ | ------------------------------------------------------------ | | 入门深度学习 | 零基础实践深度学习[:arrow_heading_down:](#1)、深度学习百问[:arrow_heading_down:](#2)、动手学深度学习paddle版[:arrow_heading_down:](#dive) | | 进阶深度学习 | 产业实践深度学习、特色课程[:arrow_heading_down:](#3)、面试宝典[:arrow_heading_down:](#6) | | 实践深度学习 | [飞桨产业实践范例库](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/Paddle_Industry_Practice_Sample_Library)、飞桨各产品课程[:arrow_heading_down:](#fj) | --- # 二、项目内容 # 👉课程类 ## <span id =1> 零基础实践深度学习</span> - **AI Studio在线课程:[《零基础实践深度学习》](https://aistudio.baidu.com/aistudio/course/introduce/1297 )**:理论和代码结合、实践与平台结合,包含20小时视频课程,由百度杰出架构师、飞桨产品负责人和资深研发人员共同打造。 <center><img src="./docs/images/cover/0_cover.png"/></center><br></br> - **《零基础实践深度学习》书籍**:本课程配套书籍,由清华出版社2020年底发行,京东/当当等电商均有销售。 <center><img src="https://github.com/ZhangHandi/images-for-paddledocs/blob/main/images/readme/book.png?raw=true"/></center><br></br> ## <span id ='3'>特色课 - Transformer系列</span> 飞桨教育官方出品的Transformer系列内容解读可以参考以下两个平台。 * Transformer原理和实践系列课:https://aistudio.baidu.com/aistudio/education/group/info/24683 * 飞桨教育官方账号:https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086 | 领域 | **章节名称** | 课程简介 | notebook链接 | | ----------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | | NLP | 经典的预训练语言模型(上)-预训练模型发展历史 | 介绍预训练语言模型的发展历史,word2vec,elmo,bert,gpt,bert一些拓展。 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2287294) | | NLP | 经典的预训练模型(上)-ELMo | 全面详细的介绍ELMo模型结构,优缺点等。 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2287335) | | NLP | 经典的预训练模型(上)-Transformer | 讲解Transformer的基本原理,包括Embedding,self-attention,encoder,decoder,复杂度计算,共享机制等内容。 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2287386) | | NLP | 经典的预训练模型(下)-GPT | 全面详细的介绍GPT的原理,预训练和finetune模式,GPT模型结构,优缺点等。 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2295114) | | NLP | 经典的预训练模型(下)-BERT | 全面详细的介绍BERT的基本原理,预训练任务和fine tune的方式,BERT本身的模型结构,优缺点等。 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2297740) | | NLP | 预训练模型之自然语言理解-RoBERTa | 讲解预训练模型在自然语言理解方面的改进--RoBERTa | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2299099) | | NLP | 预训练模型之自然语言理解-ERNIE | 讲解预训练模型之自然语言理解的改进:ERNIE | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2299380) | | NLP | 预训练模型之自然语言理解-KBERT | 讲解预训练模型之自然语言理解的改进:KBERT | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2307309) | | NLP | 预训练模型之自然语言理解-THU-ERNIE | 讲解预训练模型之自然语言理解的改进:THU-ERNIE | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2307342) | | NLP | 预训练模型之长序列建模-Transformer-XL | 讲解预训练模型之长序列建模的改进:Transformer-XL | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2307389) | | NLP | 预训练模型之长序列建模-XLNet | 讲解自然语言理解之长序列建模的改进:XLNet | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2307494) | | NLP | 预训练模型之长序列建模-Longformer | 讲解预训练模型之长序列建模的改进:Longformer | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2307544) | | 模型优化 | 预训练模型-高效结构 | 基于ELECTRA的标点符号预测 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2294324) | | 模型优化 | 预训练模型-蒸馏 | 预训练模型蒸馏算法:Patient-KD、DistilBERT、TinyBERT、DynaBERT模型详解,以及使用DynaBERT策略对TinyBERT进行模型蒸馏 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2258091) | | CV | 图像领域的Transformer-Vit,DeiT | 详细讲解ViT 以及 DeiT原理 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2299267) | | CV | 图像领域的Transformer-Swin Transformer | 详细讲解Swin Transformer原理 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2292148) | | CV | CV领域的Transformer模型DETR在目标检测任务中的应用 | 详细讲解DETR原理及代码解析 | [notebook链接](https://aistudio.baidu.com/aistudio/projectdetail/2290729) | 返回[:arrow_heading_up:](#0) ----- # 👉书籍类 ## <span id ='dive'>《动手学深度学习》paddle版</span> 本项目将《[动手学深度学习](http://zh.d2l.ai/)》原书中MXNet代码实现改为PaddlePaddle实现。原书作者:阿斯顿·张、李沐、扎卡里 C. 立顿、亚历山大 J. 斯莫拉以及其他社区贡献者,GitHub地址:https://github.com/d2l-ai/d2l-zh。 本项目面向对深度学习感兴趣,尤其是想使用PaddlePaddle进行深度学习的童鞋。本项目并不要求你有任何深度学习或者机器学习的背景知识,你只需了解基础的数学和编程,如基础的线性代数、微分和概率,以及基础的Python编程。 <div align=center> <img width="500" src="./Dive-into-DL-paddlepaddle/docs/img/cover.jpg"> </div> 返回[:arrow_heading_up:](#0) ---- # 👉宝典类 ## <span id ='2'>深度学习百问</span> 深度学习百问内容包含深度学习基础篇、深度学习进阶篇、深度学习应用篇、强化学习篇以及面试宝典,详细信息请参阅[Paddle知识点文档平台](https://paddlepedia.readthedocs.io/en/latest/index.html)。 * **深度学习基础篇** 1. [深度学习](https://paddlepedia.readthedocs.io/en/latest/tutorials/deep_learning/index.html#) 2. [卷积神经网络](https://paddlepedia.readthedocs.io/en/latest/tutorials/CNN/index.html) 3. [序列模型](https://paddlepedia.readthedocs.io/en/latest/tutorials/sequence_model/index.html) * **深度学习进阶篇** 1. [预训练模型](https://paddlepedia.readthedocs.io/en/latest/tutorials/pretrain_model/index.html) 2. [对抗神经网络](https://paddlepedia.readthedocs.io/en/latest/tutorials/generative_adversarial_network/index.html) * **深度学习应用篇** 1. [计算机视觉](https://paddlepedia.readthedocs.io/en/latest/tutorials/computer_vision/index.html) 2. [自然语言处理](https://paddlepedia.readthedocs.io/en/latest/tutorials/natural_language_processing/index.html) 3. [推荐系统](https://paddlepedia.readthedocs.io/en/latest/tutorials/recommendation_system/index.html) * **产业实践篇** 1. [模型压缩](https://paddlepedia.readthedocs.io/en/latest/tutorials/model_compress/index.html) 2. [模型部署](https://paddlepedia.readthedocs.io/en/latest/tutorials/model_deployment/index.html) * **强化学习篇** 1. [强化学习](https://paddlepedia.readthedocs.io/en/latest/tutorials/reinforcement_learning/index.html) * <span id ='6'>**面试宝典**</span> 1. [深度学习基础常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html) 2. [卷积模型常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html#id2) 3. [预训练模型常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html#id3) 4. [对抗神经网络常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html#id4) 5. [计算机视觉常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html#id5) 6. [自然语言处理常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html#id6) 7. [推荐系统常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html#id7) 8. [模型压缩常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html#id8) 9. [强化学习常见面试题](https://paddlepedia.readthedocs.io/en/latest/tutorials/interview_questions/interview_questions.html#id9) 返回[:arrow_heading_up:](#0) ----- # 👉案例类 ## <span id ='5'>飞桨应用案例集</span> | 领域 | 产业案例 | 来源 | 更多内容 | | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | | **智能工业** | [厂区传统仪表统计监测](https://paddlex.readthedocs.io/zh_CN/develop/examples/meter_reader.html) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智能工业** | [新能源汽车锂电池隔膜质检](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2104) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智能工业** | [天池铝材表面缺陷检测](https://paddlex.readthedocs.io/zh_CN/develop/examples/industrial_quality_inspection/README.html) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智能工业** | [安全帽检测](https://github.com/PaddleCV-FAQ/PaddleDetection-FAQ/blob/main/Lite%E9%83%A8%E7%BD%B2/yolov3_for_raspi.md) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧城市** | [高尔夫球场遥感监测](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2103) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧城市** | [积雪语义分割](https://paddlex.readthedocs.io/zh_CN/develop/examples/multi-channel_remote_sensing/README.html) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧城市** | [戴口罩的人脸识别](https://aistudio.baidu.com/aistudio/projectdetail/267322?channelType=0&channel=0) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧交通** | [车道线分割和红绿灯安全检测](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/configs/vehicle/README_cn.md) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧交通** | [【PaddleDetection2.0专项】PP-YOLOv2](https://aistudio.baidu.com/aistudio/projectdetail/1922155?channelType=0&channel=0) | 飞桨PaddleDet | [更多paddleDet案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/330600) | | **智慧交通** | [PaddleX助力无人驾驶(基于YOLOv3的车辆检测和车道线分割)](https://aistudio.baidu.com/aistudio/projectdetail/464339?channelType=0&channel=0) | 开发者[BIT可达鸭](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/67156) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧交通** | [eblite_标志物检测](https://aistudio.baidu.com/aistudio/projectdetail/596152?channelType=0&channel=0) | 开发者[TobeWell](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/59591) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧交通** | [PaddleOCR: 车牌识别](https://aistudio.baidu.com/aistudio/projectdetail/739559?channelType=0&channel=0) | 飞桨开发者[寂寞你快进去](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/180581) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧农林** | [耕地地块识别](https://mp.weixin.qq.com/s/JlDVmYlhN7sF0hpRlncDNw) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧农林** | [AI识虫](https://aistudio.baidu.com/aistudio/projectdetail/439888) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧农林** | [更快更强! 高效快速的PP-YOLO实战演练](https://aistudio.baidu.com/aistudio/projectdetail/708923?channelType=0&channel=0) | 飞桨PaddleDet | [更多paddleDet案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/330600) | | **智慧农林** | [PaddleX快速上手-Faster RCNN目标检测](https://aistudio.baidu.com/aistudio/projectdetail/439888?channelType=0&channel=0) | 飞桨PaddleX | [更多PaddleX案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/189619) | | **智慧农林** | [AI识虫检测分享](https://aistudio.baidu.com/aistudio/projectdetail/289616?channelType=0&channel=0) | 开发者[aaaLKgo](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/110992) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧农林** | [基于PaddleX实现森林火灾监测](https://aistudio.baidu.com/aistudio/projectdetail/1968964?channelType=0&channel=0) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧医疗** | [医学常见中草药分类](https://aistudio.baidu.com/aistudio/projectdetail/1434738?channelType=0&channel=0) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧医疗** | [眼疾识别](https://www.paddlepaddle.org.cn/tutorials/projectdetail/1630501) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧医疗** | [基于Paddle的肝脏CT影像分割](https://aistudio.baidu.com/aistudio/projectdetail/250994?channelType=0&channel=0) | 开发者[代码生成器](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/33061) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **智慧医疗** | [PaddleHub 肺炎CT影像分析](https://aistudio.baidu.com/aistudio/projectdetail/289819?channelType=0&channel=0) | 飞桨PaddleHub | [更多PaddleHub案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/79927) | | **智慧医疗** | [基于飞桨PGL的高致病性传染病的传播趋势预测基线系统](https://aistudio.baidu.com/aistudio/projectdetail/457185?channelType=0&channel=0) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [人摔倒检测](https://aistudio.baidu.com/aistudio/projectdetail/2071768) | 开发者[Niki_173](https://github.com/Niki173) | [该开发者更多案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/474269) | | **其他** | [足球比赛动作定位](https://github.com/PaddlePaddle/PaddleVideo/tree/application/FootballAction) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [基于强化学习的飞行器仿真](https://github.com/PaddlePaddle/PARL/tree/develop/examples/tutorials/homework/lesson5/ddpg_quadrotor) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [基于ERNIE-Gram实现语义匹配](https://aistudio.baidu.com/aistudio/projectdetail/2247755) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [『NLP打卡营』实践课5:文本情感分析](https://aistudio.baidu.com/aistudio/projectdetail/1968542?channelType=0&channel=0) | 飞桨PaddleNLP | [更多飞桨PaddleNLP案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995) | | **其他** | [『NLP经典项目集』03:利用情感分析选择年夜饭](https://aistudio.baidu.com/aistudio/projectdetail/1468469?channelType=0&channel=0) | 飞桨PaddleNLP | [更多飞桨PaddleNLP案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995) | | **其他** | [分类任务:如何在客服对话中,识别客户情绪的好坏](https://aistudio.baidu.com/aistudio/projectdetail/121630?channelType=0&channel=0) | 开发者[中大bbking](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/34238) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [『NLP打卡营』实践课3:使用预训练模型实现快递单信息抽取](https://aistudio.baidu.com/aistudio/projectdetail/1329361?channelType=0&channel=0) | 飞桨PaddleNLP | [更多飞桨PaddleNLP案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/574995) | | **其他** | [发愁七夕文案?PaddleHub情话生成送给你 (文内含七夕抽奖)](https://aistudio.baidu.com/aistudio/projectdetail/746002?channelType=0&channel=0) | 飞桨PaddleHub | [更多PaddleHub案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/79927) | | **其他** | [基于PaddleDetection的PCB瑕疵检测](https://aistudio.baidu.com/aistudio/projectdetail/2240725) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [基于百度飞桨的单/多镜头行人追踪(非官方Baseline)](https://aistudio.baidu.com/aistudio/projectdetail/1411754?channelType=0&channel=0) | 开发者[BIT可达鸭](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/67156) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [PaddleLite树莓派从0到1:安全帽检测小车部署(一)](https://aistudio.baidu.com/aistudio/projectdetail/1059610?channelType=0&channel=0) | 开发者[深渊上的炕](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/90149) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [PaddleX、PP-Yolo:手把手教你训练、加密、部署目标检测模型](https://aistudio.baidu.com/aistudio/projectdetail/920753?channelType=0&channel=0) | 开发者[深渊上的炕](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/90149) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [中文语音识别](https://aistudio.baidu.com/aistudio/projectdetail/2280562) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [PaddleHub一键OCR中文识别(超轻量8.1M模型,火爆)](https://aistudio.baidu.com/aistudio/projectdetail/507159?channelType=0&channel=0) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [老北京城影像修复](https://aistudio.baidu.com/aistudio/projectdetail/1161285?channelType=0&channel=0) | 飞桨PaddleGAN | [更多PaddleGAN案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/52570) | | **其他** | [飞桨创意之星 宋代诗人念诗的秘密——PaddleGAN实现精准唇形合成](https://aistudio.baidu.com/aistudio/projectdetail/1463208?channelType=0&channel=0) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [通过OCR实现验证码识别](https://aistudio.baidu.com/aistudio/projectdetail/1100507?channelType=0&channel=0) | 飞桨官方 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [PaddleHub一键OCR中文识别(超轻量8.1M模型,火爆)](https://aistudio.baidu.com/aistudio/projectdetail/507159?channelType=0&channel=0) | 飞桨PaddleHub | [更多PaddleHub案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/79927) | | **其他** | [全流程,从零搞懂基于PaddlePaddle的图像分割](https://aistudio.baidu.com/aistudio/projectdetail/1674328?channelType=0&channel=0) | 开发者[nanting03](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/129509) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [负荷预测0.1](https://aistudio.baidu.com/aistudio/projectdetail/2183242?channelType=0&channel=0) | 开发者[gaomaosheng0](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/29822) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | [AI 实现皮影戏,传承正在消失的艺术](https://aistudio.baidu.com/aistudio/projectdetail/764130?channelType=0&channel=0) | 开发者[Zohar](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/331031) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **其他** | 『[深度学习7日打卡营』人脸关键点检测](https://aistudio.baidu.com/aistudio/projectdetail/1487972?channelType=0&channel=0) | 开发者[TC.Long](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/157251) | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase) | | **强化学习** | [DDPG算法应用于股票量化交易](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/examples/DDPG%20for%20Stock%20Trading) | 开发者 | [更多飞桨案例](https://www.paddlepaddle.org.cn/customercase?fr=paddleEdu_github) | ## <span id ='5'>飞桨学术案例集</span> | 技术方向 | 学术案例 | 来源 | 更多内容 | | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | | 机器学习 | [鸢尾花分类](https://aistudio.baidu.com/aistudio/projectdetail/78918?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 前馈神经网络 | [波士顿房价预测](https://aistudio.baidu.com/aistudio/projectdetail/79112?channelType=0&channel=0) | [开发者AIStudioHelper](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/47528) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分类 | [手写数字识别](https://aistudio.baidu.com/aistudio/projectdetail/325575?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分类 | [猫狗分类](https://aistudio.baidu.com/aistudio/projectdetail/78960?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分类 | [图像分类网络VGG在多表情识别任务中的应用](https://aistudio.baidu.com/aistudio/projectdetail/2369842) | [开发者之雍Jerry](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/530098) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分类 | [图像分类-ResNet](https://aistudio.baidu.com/aistudio/projectdetail/56779?channelType=0&channel=0) | [开发者笨笨](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/39) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分类 | [用PaddlePaddle实现图像分类-SE_ResNeXt](https://aistudio.baidu.com/aistudio/projectdetail/169410?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分类 | [深入理解图像分类中的Transformer-Vit,DeiT](https://aistudio.baidu.com/aistudio/projectdetail/2293050) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分类 | [Swin Transformer](https://aistudio.baidu.com/aistudio/projectdetail/2280436) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分类 | [小样本学习(Few-Shot Learning)](https://aistudio.baidu.com/aistudio/projectdetail/2342018?channelType=0&channel=0) | [开发者DeepGeGe](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/746341) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分割 | [经典实例分割模型Mask RCNN](https://aistudio.baidu.com/aistudio/projectdetail/122273?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分割 | [PaddleSeg_DeepLabv3+](https://aistudio.baidu.com/aistudio/projectdetail/226703?channelType=0&channel=0) | [飞桨PaddleSeg](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/96056) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像分割 | [基于PaddlePaddle的语义分割DeepLabV3+实现](https://aistudio.baidu.com/aistudio/projectdetail/124366?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像检测 | [深度学习进阶-目标检测](https://aistudio.baidu.com/aistudio/projectdetail/78972?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像检测 | [一文详解yolov3目标检测算法](https://aistudio.baidu.com/aistudio/projectdetail/2240328) | [开发者AIStudio96069](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/96069) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 图像检测 | [CV领域的Transformer模型DETR在目标检测任务中的应用](https://aistudio.baidu.com/aistudio/projectdetail/2290729) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 视频分类 | [TSN视频分类](https://aistudio.baidu.com/aistudio/projectdetail/2280460) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 视频分类 | [Paddle2.1实现视频理解经典模型 — TSM](https://aistudio.baidu.com/aistudio/projectdetail/2311166?contributionType=1) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 视频分类 | [基于Attention和Bi-LSTM实现视频分类](https://aistudio.baidu.com/aistudio/projectdetail/2313514) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 视频分类 | [CV领域的Transformer模型TimeSformer实视频理解](https://aistudio.baidu.com/aistudio/projectdetail/2291410) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | GAN | [一文搞懂生成对抗网络之经典GAN(动态图、VisualDL2.0)](https://aistudio.baidu.com/aistudio/projectdetail/551962?channelType=0&channel=0) | [开发者FutureSI](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/76563) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | GAN | [基于PaddlePaddle的StarGAN,AttGAN,STGAN算法](https://aistudio.baidu.com/aistudio/projectdetail/169439?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | OCR | [文字识别-CRNN](https://aistudio.baidu.com/aistudio/projectdetail/190025?channelType=0&channel=0) | [开发者哦吼](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/34706) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | NLP | [基于ERNIE实现9项GLUE任务](https://aistudio.baidu.com/aistudio/projectdetail/2345396) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | NLP | [NLP领域的XLNet模型在情感分析中的应用](https://aistudio.baidu.com/aistudio/projectdetail/2333184) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | NLP | [NLP领域中的ERNIE模型在阅读理解中的应用](https://aistudio.baidu.com/aistudio/projectdetail/2333137) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | NLP | [NLP领域的ELECTRA在符号预测上的应用](https://aistudio.baidu.com/aistudio/projectdetail/2311092) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | NLP | [NLP领域的Transformer在机器翻译上的应用](https://aistudio.baidu.com/aistudio/projectdetail/2311016) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | NLP | [【Paddle打比赛】讯飞赛题—中文问题相似度挑战赛0.9+Baseline](https://aistudio.baidu.com/aistudio/projectdetail/2271498) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | NLP | [用PaddlePaddle实现BERT](https://aistudio.baidu.com/aistudio/projectdetail/122282?channelType=0&channel=0) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 多模态 | [【Paddle CLIP】你写啥他画啥,一个专属于你的小画家](https://aistudio.baidu.com/aistudio/projectdetail/2332016?channelType=0&channel=0) | [PaddleFleet](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/940489) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 强化学习 | [从代码到论文理解并复现MADDPG算法(PARL)](https://aistudio.baidu.com/aistudio/projectdetail/637951?channelType=0&channel=0) | [开发者Mr.郑先生_](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/147378) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 推荐 | [基于DeepFM 模型的点击率预估](https://github.com/PaddlePaddle/awesome-DeepLearning/tree/master/examples/DeepFM for CTR Prediction) | [PaddleEdu](https://github.com/PaddlePaddle/awesome-DeepLearning) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 推荐 | [基于DSSM的电影推荐](https://aistudio.baidu.com/aistudio/projectdetail/2324144) | [AIStudio官方](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | | 知识蒸馏 | [基于CIFAR100的SSLD蒸馏实验](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.2/docs/zh_CN/advanced_tutorials/distillation/distillation.md) | [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) | [更多飞桨案例](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086) | 返回[:arrow_heading_up:](#0) ---- # 👉竞赛类 | 领域 | 竞赛案例 | 来源 |介绍 | | -------- | ---------------| ---- | ---- | | 机器学习 | [【Paddle打比赛】个贷违约预测Baseline+ 0.607](https://aistudio.baidu.com/aistudio/projectdetail/2466206) |[开发者w5688414](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/169515)| DataFountain个贷违约预测,参考官方的baseline并用paddle进行改进 | | NLP | [【Paddle打比赛】讯飞赛题—中文问题相似度挑战赛0.9+Baseline](https://aistudio.baidu.com/aistudio/projectdetail/2271498) |[PaddleEdu](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/908086)| 中文问题相似度挑战赛paddle版本Baseline,基于paddlenlp通过预训练模型的微调完成问题相似度评定任务 | | NLP | [基于PaddleHub的疫情期间网民情绪识别](https://aistudio.baidu.com/aistudio/projectdetail/294224?channelType=0&channel=0) | [开发者CChan](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/82456)| 本项目为疫情期间网民情绪识别比赛的解决方案。使用了PaddleHub和ERNIE实现对疫情期间微博文本的情绪识别。 | | NLP | [【Paddle打比赛】产品评论观点提取竞赛baseline](https://aistudio.baidu.com/aistudio/projectdetail/2417709) | [开发者w5688414](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/169515) | DataFountain基于BERT的产品评论观点提取竞赛baseline,增加了优化方法| | NLP | [【Paddle打比赛】剧本角色情感识别baseline-精度0.676](https://aistudio.baidu.com/aistudio/projectdetail/2423977) | [开发者w5688414](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/169515) | 剧本角色情感识别baseline,使用bert模型| | 语音 |[【Paddle打比赛】语音合成](https://aistudio.baidu.com/aistudio/projectdetail/2793102?contributionType=1) | [开发者XYZ_916](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/812202)| 2021 新网银行智能语音大赛baseline。截止2021.11.17,该方案在总分榜第一,作品榜第二 | | CV | [中文场景文字识别挑战赛baseline](https://aistudio.baidu.com/aistudio/projectdetail/229728?channelType=0&channel=0) | [小度AIStudio](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/7)| 中文场景文字识别挑战赛的baseline项目, 用于参赛选手借鉴参考 | |CV|[【Paddle打比赛】手写字体OCR识别竞赛baseline](https://aistudio.baidu.com/aistudio/projectdetail/2606211)| [开发者Pink peach](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/532066)| 2021世界人工智能创新大赛,手写字体OCR识别竞赛baseline| | CV | [2020 CCF BDCI: 遥感影像地块分割baseline](https://aistudio.baidu.com/aistudio/projectdetail/1090790?channelType=0&channel=0) | [开发者lxastro](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/349179)| 2020 CCF BDCI: 遥感影像地块分割的baseline模型库,包括baseline模型的训练方法和比赛的评测脚本。 | | CV | [第三届中国AI+创新创业大赛:半监督学习目标定位竞赛第1名方案](https://aistudio.baidu.com/aistudio/projectdetail/2210815)| [开发者张牙舞爪](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/635490)| 半监督学习目标定位竞赛第一名方案分享 A榜得分0.81425 B榜得分0.80428 | |数据挖掘|[【Padddle打比赛】心电图智能诊断竞赛Baseline-0.6765](https://aistudio.baidu.com/aistudio/projectdetail/2712180)|[开发者w5688414](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/169515) | AIWIN 心电图智能诊断竞赛| 返回​[:arrow_heading_up:](#0) # 👉汇总 ## <span id='fj'>飞桨各产品学习资料汇总</span> | 产品 | 视频课程 | 学习文档 | | -------------------------------- | ------------------------------------------------------------ | -------- | | PaddleGAN | [生成对抗网络七日打卡营](https://aistudio.baidu.com/aistudio/course/introduce/16651) | | | PaddleOCR | [OCR自动标注小工具讲解](https://www.bilibili.com/video/BV1uX4y1K7PW)、[3.5M超轻量实用OCR模型解读](https://www.bilibili.com/video/BV1p54y1y7CM)、[OCR应用与部署实战](https://www.bilibili.com/video/BV1Zz4y1C7MW) | | | PaddleClas | [PaddleClas系列直播课](https://aistudio.baidu.com/aistudio/course/introduce/24519) | | | PaddleDetection | [目标检测7日打卡营](https://aistudio.baidu.com/aistudio/course/introduce/1617) | | | PaddleX | [PaddleX实例分割任务详解](https://www.bilibili.com/video/BV1M44y1r7s6)、[PaddleX目标检测任务详解](https://www.bilibili.com/video/BV1HB4y1A73b)、[PaddleX语义分割任务详解](https://www.bilibili.com/video/BV1qQ4y1Z7co)、[PaddleX图像分类任务详解](https://www.bilibili.com/video/BV1nK411F7J9)、[PaddleX客户端操作指南](https://www.bilibili.com/video/BV1bz4y1C7wr)、[飞桨全流程开发工具PaddleX](https://www.bilibili.com/video/BV17i4y1b7TZ) | | | <span id ='hub'>PaddleHub</span> | [手把手教你转换PaddleHub模型教程](https://www.bilibili.com/video/BV1YK411V71d) | | | <span id = 'vdl'>VDL</span> | [可视化分析工具助力AI算法快速开发](https://www.bilibili.com/video/BV1uy4y137iH)、[深度学习算法可视化调优实战演示](https://www.bilibili.com/video/BV1iD4y1o7Pf) | | | 高层API | [高层API助你快速上手深度学习](https://aistudio.baidu.com/aistudio/course/introduce/6771) | | | <span id='nlp'>PaddleNLP</span> | [基于深度学习的自然语言处理](https://www.bilibili.com/video/BV1fB4y1M7A3) | | 返回​[:arrow_heading_up:](#0) # 三、技术交流 非常感谢您使用本项目。您在使用过程中有任何建议或意见,可以在 **[Issue](https://github.com/PaddlePaddle/tutorials/issues)** 上反馈给我们,也可以通过扫描下方的二维码联系我们,飞桨的开发人员非常高兴能够帮助到您,并与您进行更深入的交流和技术探讨。 <center><img src="https://github.com/ZhangHandi/images-for-paddledocs/blob/main/images/readme/qr_code.png?raw=true"/></center><br></br> # 四、许可证书 本项目的发布受[Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0.txt)许可认证。 # 五、贡献内容 本项目的不断成熟离不开各位开发者的贡献,如果您对深度学习知识分享感兴趣,非常欢迎您能贡献给我们,让更多的开发者受益。 本项目欢迎任何贡献和建议,大多数贡献都需要你同意参与者许可协议(CLA)来声明你有权并实际上授权我们可以使用你的贡献。 ### 代码贡献规范 > pip install pre-commit > > pre-commit install 添加修改的代码后,对修改的文件进行代码规范,pre-commit 会自动调整代码格式,执行一次即可,后续commit不需要再执行。提交pr流程,详见:[awesome-DeepLearning 提交 pull request 流程](./examples/awesome-DeepLearning_pr_procedure.md)。 ### 贡献者 以下是awesome-DeepLearning贡献者列表: [yang zhou](https://youngzhou1999.github.io/cv/),[Niki_173](https://github.com/Niki173),[Twelveeee](https://github.com/Twelveeee),[buriedms](https://github.com/buriedms),[AqourAreA](https://github.com/AqourAreA),[zhangjin12138](https://github.com/zhangjin12138),[rerny](https://github.com/rerny),[LiuCongNLP](https://www.zhihu.com/people/LiuCongNLP),[LemonCherryFu](https://github.com/LemonCherryFu), [lutianhao](https://github.com/lutianhao)

Education & Learning ML Frameworks
3.6K Github Stars
Research
Open Source

Research

# Research 发布基于飞桨的前沿研究工作,包括CV、NLP、KG、STDM等领域的顶会论文和比赛冠军模型。 ## 目录 * [计算机视觉(Computer Vision)](#计算机视觉) * [自然语言处理(Natrual Language Processing)](#自然语言处理) * [知识图谱(Knowledge Graph)](#知识图谱) * [时空数据挖掘(Spatial-Temporal Data-Mining)](#时空数据挖掘) ## 计算机视觉 | 任务类型 | 目录 | 简介 | 论文链接 | | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | | 图像检索 | [GNN-Re-Ranking](CV/GNN-Re-Ranking/) | 基于GNN的快速图像检索Re-Ranking。 | https://arxiv.org/abs/2012.07620v2 | | 车流统计 | [VehicleCounting](CV/VehicleCounting/) | AICITY2020 车流统计竞赛datasetA TOP1 方案。 | - | | 车辆再识别 | [PaddleReid](CV/PaddleReid) | 给定目标车辆,在检索库中检索同id车辆,支持多种特征子网络。 | - | | 车辆异常检测 | [AICity2020-Anomaly-Detection](CV/AICity2020-Anomaly-Detection) | 在监控视频中检测车辆异常情况,例如车辆碰撞、失速等。| - | | 医学图像分析 | [AGEchallenge](CV/AGEchallenge) | 任务:在AS-OCT图像的公共数据集上进行闭角型分类和巩膜突点定位;基线模型:对应以上各任务的基线模型。 | - | | 光流估计 | [PWCNet](CV/PWCNet) | 基于金字塔式处理,逐层学习细部光流,设计代价容量函数三原则的CNN模型,用于光流估计。 | https://arxiv.org/abs/1709.02371 | | 语义分割 | [SemSegPaddle](CV/SemSegPaddle) | 针对多个数据集的图像语义分割模型的实现,包括Cityscapes、Pascal Context和ADE20K。 | - | | 轻量化检测 | [astar2019](CV/astar2019) | 百度之星轻量化检测比赛评测工具。 | - | | 地标检索与识别 | [landmark](CV/landmark) | 基于检索的地标检索与识别系统,支持地标型与非地标型识别、识别与检索结果相结合的多重识别结果投票和重新排序。 | https://arxiv.org/abs/1906.03990 | | 图像分类 | [webvision2018](CV/webvision2018) | 模型利用重加权网络(URNet)缓解web数据中偏倚和噪声的影响,进行web图像分类。 | https://arxiv.org/abs/1811.00700 | | 图像分类 | [CLPI](CV/CLPI-Collaborative-Learning-for-Diabetic-Retinopathy-Grading) | 模型利用一个Lesion Generator改善了糖尿病视网膜病变图像分级的模型性能,理论上可用于所有希望实现局部+整体模型分析的场景 | - | | 图像分类 | [RSNA-IHD](CV/Effective Transformer-based Solution for RSNA Intracranial Hemorrhage Detection) | 提出了一种有效的颅内出血检测(IHD)方法,其性能超过了在RSNA-IHD竞赛(2019)中获胜的解决方案。与此同时,与获胜者的解决方案相比,我们的模型只有其20%的参数量和10%的FLOPs | https://arxiv.org/abs/2205.07556 | | 小样本学习 | [PaddleFSL](CV/PaddleFSL) | 小样本学习工具包,可复现多个常用基线方法在多个图片分类数据集上的汇报效果 | - | | 迁移学习 | [SMILE](CV/SMILE) | 提出了一种自蒸馏样本混合迁移学习框架,适用于小样本图片分类 | https://arxiv.org/abs/2103.13941 | ## 自然语言处理 | 任务类型 | 目录 | 简介 | 论文链接 | | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | | 中文词法分析 | [LAC(Lexical Analysis of Chinese)](https://github.com/baidu/lac) | 百度自主研发中文特色模型词法分析任务,集成了中文分词、词性标注和命名实体识别任务。输入是一个字符串,而输出是句子中的词边界和词性、实体类别。 | - | | 主动对话 | [DuConv](NLP/ACL2019-DuConv) | 机器根据给定知识信息主动引领对话进程完成设定的对话目标。 |https://www.aclweb.org/anthology/P19-1369/| | 语义解析 | [Text2SQL-BASELINE](NLP/Text2SQL-BASELINE) | 输入自然语言问题和相应的数据库,生成与问题对应的 SQL 查询语句,通过执行该 SQL 可得到问题的答案。 | - | | 多轮对话 | [DAM](NLP/ACL2018-DAM) | 开放领域多轮对话匹配的深度注意力机制模型,根据多轮对话历史和候选回复内容,排序出最合适的回复。 | http://aclweb.org/anthology/P18-1103 | | 阅读理解 | [DuReader](NLP/ACL2018-DuReader) | 数据集:大规模、面向真实应用、由人类生成的中文阅读理解数据集,聚焦于真实世界中的不限定领域的问答任务;基线系统:针对DuReader数据集实现的经典BiDAF模型。 | https://www.aclweb.org/anthology/W18-2605/ | | 关系抽取 | [ARNOR](NLP/ACL2019-ARNOR) | 数据集:用于对远程监督关系提取模型进行句子级别的评价;模型:基于注意力正则化识别噪声数据,通过bootstrap方法逐步选择出高质量的标注数据。| https://www.aclweb.org/anthology/P19-1135/ | | 机器翻译 | [JEMT](NLP/ACL2019-JEMT) | 模型的输入端包括文字信息及发音信息,嵌入层融合文字信息和发音信息进行翻译。 | https://arxiv.org/abs/1810.06729 | | 阅读理解 | [KTNET](NLP/ACL2019-KTNET) | 模型将知识库中的知识整合到预先训练好的上下文表示中,利用丰富的知识增强机器阅读理解的预训练语言表示。 | https://www.aclweb.org/anthology/P19-1226 | | 对话生成 | [PLATO](NLP/Dialogue-PLATO) | 基于隐空间的端到端的预训练对话生成模型,可以灵活支持多种对话,包括闲聊、知识聊天、对话问答等。 | http://arxiv.org/abs/1910.07931 | | 阅读理解 | [DuReader-Robust-BASELINE](NLP/DuReader-Robust-BASELINE) | 数据集:DuReader-robust,中文数据集,用于全面评价机器阅读理解模型的鲁棒性;基线系统:针对该数据集,基于[ERNIE](https://arxiv.org/abs/1904.09223)实现的阅读理解基线系统。 | https://arxiv.org/abs/2004.11142 | | 对话生成 | [AKGCM](NLP/EMNLP2019-AKGCM) | 包含知识增强图、知识选择和知识感知响应生成器的聊天机器人。 | https://www.aclweb.org/anthology/D19-1187/ | | 机器翻译 | [MAL](NLP/EMNLP2019-MAL) | 多智能体端到端联合学习框架,通过多个智能体的互相学习提升翻译质量。 | https://arxiv.org/abs/1909.01101 | | 对话生成 | [MMPMS](NLP/IJCAI2019-MMPMS) | 针对开放域对话中一对多问题,利用多映射机制和后验映射选择模块进行多样性、丰富化的对话生成。 | https://arxiv.org/abs/1906.01781 | | 阅读理解 | [MRQA2019-BASELINE](NLP/MRQA2019-BASELINE) | 机器阅读理解任务的基线模型,基于[ERNIE](https://arxiv.org/abs/1904.09223)预训练模型,支持多GPU微调预测。 | - | | 阅读理解 | [D-NET](NLP/MRQA2019-D-NET) | 预训练及微调框架,包含多任务学习及多预训练模型的融合,用于阅读理解模型的生成。 | https://www.aclweb.org/anthology/D19-5828/ | | 建议挖掘 | [MPM](NLP/NAACL2019-MPM) | 利用多视角架构来学习表示和双向transformer编码器进行论坛评论建议挖掘。 | https://www.aclweb.org/anthology/S19-2216/ | | 多文档摘要 | [ACL2020-GraphSum](NLP/ACL2020-GraphSum) | 基于图表示的生成式多文档摘要模型,将显式图结构信息引入到端到端摘要生成过程中。 | https://www.aclweb.org/anthology/2020.acl-main.555.pdf | | 融合多种对话类型的对话式推荐 | [ACL2020-DuRecDial](NLP/ACL2020-DuRecDial) | 提出新任务:融合闲聊、任务型对话、问答和推荐等多种对话类型的对话式推荐,构建DuRecDial数据集,提出具有多对话目标驱动策略机制的对话生成框架。 | https://www.aclweb.org/anthology/2020.acl-main.98/ | | 面向推荐的对话 | [Conversational-Recommendation-BASELINE](NLP/Conversational-Recommendation-BASELINE) | 融合人机对话系统和个性化推荐系统,定义新一代智能推荐技术,该系统先通过问答或闲聊收集用户兴趣和偏好,然后主动给用户推荐其感兴趣的内容,比如餐厅、美食、电影、新闻等。 | - | | 稠密段落检索 | [ACL2021-PAIR](NLP/ACL2021-PAIR) | 基于以段落相似度为中心的相似度关系提升稠密段落检索,基于知识蒸馏进行采样,采用两阶段训练方式。 | https://aclanthology.org/2021.findings-acl.191/ | | 任务式对话 | [EMNLP2022-Q-TOD](NLP/EMNLP2022-Q-TOD) | 自然语言查询驱动的任务式对话系统,提出由查询生成、知识检索和回复生成组成的三阶段新框架。 | https://arxiv.org/abs/2210.07564 | ## 知识图谱 | 任务类型 | 目录 | 简介 | 论文链接 | | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | | 知识图谱表示学习 | [CoKE](KG/CoKE) | 百度自主研发语境化知识图谱表示学习框架CoKE,在知识图谱链接预测和多步查询任务上取得学界领先效果。| [https://arxiv.org/abs/1911.02168](https://arxiv.org/abs/1911.02168) | | 关系抽取 | [DuIE\_Baseline](KG/DuIE_Baseline) | 语言与智能技术竞赛关系抽取任务DuIE 2.0基线系统,通过设计结构化标注体系,实现基于[ERNIE](https://arxiv.org/abs/1904.09223)的端到端SPO抽取模型。| - | | 事件抽取 |[DuEE\_baseline](hKG/DuEE_baseline)| 语言与智能技术竞赛事件抽取任务DuEE 1.0基线系统,实现基于[ERNIE](https://arxiv.org/abs/1904.09223)+CRF的Pipeline事件抽取模型。| - | | 实体链指 |[DuEL\_Baseline](KG/DuEL_Baseline)| 面向中文短文本的实体链指任务(CCKS 2020)的基线系统,实现基于[ERNIE](https://arxiv.org/abs/1904.09223)和多任务机制的实体链指模型。| - | | 辅助诊断 |[SignOrSymptom\_Relationship](KG/ACL2020_SignOrSymptom_Relationship)| 针对EMR具有无结构化文本和结构化信息并存的特点,结合医疗NLU,以深度学习模型实现EMR的向量化表示、诊断预分类和概率计算。| - | | 文档级关系抽取 | [SSAN](KG/AAAI2021_SSAN) | 引入并建模实体间的依赖结构,在文档级关系抽取任务上取得学界领先效果。| [https://arxiv.org/abs/2102.10249](https://arxiv.org/abs/2102.10249) | ## 时空数据挖掘 | 任务类型 | 目录 | 简介 | 论文链接 | | ------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | -------- | | 固定资产价值估计 |[MONOPOLY](ST_DM/CIKM2019-MONOPOLY)| 实用的POI商业智能算法,对大量其他的固定资产进行价值估计,包括城市居民对不同公共资产价格评估、私有房价评估偏好的发现与量化分析,以及对评估固定资产价格需考虑的空间范围的确定。 | https://dl.acm.org/doi/10.1145/3357384.3357810 | | 兴趣点生成 |[P3AC](ST_DM/KDD2020-P3AC)| 具备个性化的前缀嵌入的POI自动生成。 | - | | 区域生成 |[P3AC](ST_DM/GenRegion)| 基于路网进行区域划分的方法, 实现对特定区域基于路网的全划分,区域之间无交叠,无空隙,算法支持对全球的区域划分。| - | ## 许可证书 此向导由[PaddlePaddle](https://github.com/PaddlePaddle/Paddle)贡献,受[Apache-2.0 license](LICENSE)许可认证。

Education & Learning Game Development ML Frameworks
1.8K Github Stars
RocketQA
Open Source

RocketQA

<p align=center> <img src="https://github.com/PaddlePaddle/RocketQA/blob/main/RocketQA_title.png" /> </p> <div align=center> ![](https://img.shields.io/badge/license-Apache%202-blue) ![](https://img.shields.io/badge/version-v1.0-green) ![](https://img.shields.io/badge/JupyterNotebook-Try%20%F0%9F%9A%80RocketQA%20Now!-orange) ![](https://img.shields.io/badge/requirements-up%20to%20date-brightgreen) ![](https://img.shields.io/badge/size-1.68MB-blue) </div> In recent years, the dense retrievers based on pre-trained language models have achieved remarkable progress. To facilitate more developers using cutting edge technologies, this repository provides an easy-to-use toolkit for running and fine-tuning the state-of-the-art dense retrievers, namely **🚀RocketQA**. This toolkit has the following advantages: * ***State-of-the-art***: 🚀RocketQA provides our well-trained models, which achieve SOTA performance on many dense retrieval datasets. And it will continue to update the [latest models](https://github.com/PaddlePaddle/RocketQA#news). * ***First-Chinese-model***: 🚀RocketQA provides the first open source Chinese dense retrieval model, which is trained on millions of manual annotation data from [DuReader](https://github.com/baidu/DuReader). * ***Easy-to-use***: By integrating this toolkit with [JINA](https://jina.ai/), 🚀RocketQA can help developers build an end-to-end retrieval system and question answering system with several lines of code. <img src="https://github.com/PaddlePaddle/RocketQA/blob/main/RocketQA_flow.png" alt="" align=center /> ## News * 🎉 Nov 27, 2022: Our survey paper on dense retrieval [Dense Text Retrieval based on Pretrained Language Models: A Survey](https://arxiv.org/pdf/2211.14876.pdf) was publicly available. * Oct 8, 2022: [DuReader<sub>retrieval</sub>](https://arxiv.org/abs/2203.10232) was accepted by EMNLP 2022. [[data]](https://github.com/baidu/DuReader/tree/master/DuReader-Retrieval); The latest version of DuReader<sub>retrieval</sub> contains cross-lingual retrieval benchmarks. Stay tuned! * Apr 29, 2022: **Training function** is added to RocketQA toolkit. And the baseline models of **DuReader<sub>retrieval</sub>** (both cross encoder and dual encoder) are available in RocketQA models. * Mar 30, 2022: We released **DuReader<sub>retrieval</sub>**, a large-scale Chinese benchmark for passage retrieval. The dataset contains over 90K questions and 8M passages from Baidu Search. [[paper]](https://arxiv.org/abs/2203.10232) [[data]](https://github.com/baidu/DuReader/tree/master/DuReader-Retrieval) ; The baseline of **DuReader<sub>retrieval</sub>** [leaderboard](https://aistudio.baidu.com/aistudio/competition/detail/157/0/introduction) was also released. [[code/model]](https://github.com/PaddlePaddle/RocketQA/tree/main/research/DuReader-Retrieval-Baseline) * Dec 3, 2021: The toolkit of dense retriever RocketQA was released, including the first chinese dense retrieval model trained on DuReader. * Aug 26, 2021: [RocketQA v2](https://arxiv.org/pdf/2110.07367.pdf) was accepted by EMNLP 2021. [[code/model]](https://github.com/PaddlePaddle/RocketQA/tree/main/research/RocketQAv2_EMNLP2021) * May 5, 2021: [PAIR](https://aclanthology.org/2021.findings-acl.191.pdf) was accepted by ACL 2021. [[code/model]](https://github.com/PaddlePaddle/RocketQA/tree/main/research/PAIR_ACL2021) * Mar 11, 2021: [RocketQA v1](https://arxiv.org/pdf/2010.08191.pdf) was accepted by NAACL 2021. [[code/model]](https://github.com/PaddlePaddle/RocketQA/tree/main/research/RocketQA_NAACL2021) ## Installation We provide two installation methods: ***Python Installation Package*** and ***Docker Environment*** ### Install with Python Package First, install [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html). ```bash # GPU version: $ pip install paddlepaddle-gpu # CPU version: $ pip install paddlepaddle ``` Second, install rocketqa package (latest version: 1.1.0): ```bash $ pip install rocketqa ``` NOTE: this toolkit MUST be running on Python3.6+ with [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html) 2.0+. ### Install with Docker ```bash docker pull rocketqa/rocketqa docker run -it docker.io/rocketqa/rocketqa bash ``` ## Getting Started Refer to the examples below, you can build and run your own Search Engine with several lines of code. We also provide a [Playground](https://aistudio.baidu.com/aistudio/projectdetail/3225255?contributionType=1) with JupyterNotebook. Try 🚀RocketQA straight away in your browser! ### Running with JINA [JINA](https://jina.ai/) is a cloud-native neural search framework to build SOTA and scalable deep learning search applications in minutes. Here is a simple example to build a Search Engine based on JINA and RocketQA. ```bash cd examples/jina_example pip3 install -r requirements.txt # Generate vector representations and build a libray for your Documents # JINA will automaticlly start a web service for you python3 app.py index toy_data/test.tsv # Try some questions related to the indexed Documents python3 app.py query_cli ``` Please view [JINA example](https://github.com/PaddlePaddle/RocketQA/tree/main/examples/jina_example) to know more. ### Running with FAISS We also provide a simple example built on [Faiss](https://github.com/facebookresearch/faiss). ```bash cd examples/faiss_example/ pip3 install -r requirements.txt # Generate vector representations and build a libray for your Documents python3 index.py zh ../data/dureader.para test_index # Start a web service on http://localhost:8888/rocketqa python3 rocketqa_service.py zh ../data/dureader.para test_index # Try some questions related to the indexed Documents python3 query.py ``` ## API You can also easily integrate 🚀RocketQA into your own task. We provide two types of models, ERNIE-based dual encoder for answer retrieval and ERNIE-based cross encoder for answer re-ranking. For running our models, you can use the following functions. ### Load model #### [`rocketqa.available_models()`](https://github.com/PaddlePaddle/RocketQA/blob/3a99cf2720486df8cc54acc0e9ce4cbcee993413/rocketqa/rocketqa.py#L17) Returns the names of the available RocketQA models. To know more about the available models, please see the code comment. #### [`rocketqa.load_model(model, use_cuda=False, device_id=0, batch_size=1)`](https://github.com/PaddlePaddle/RocketQA/blob/3a99cf2720486df8cc54acc0e9ce4cbcee993413/rocketqa/rocketqa.py#L52) Returns the model specified by the input parameter. It can initialize both dual encoder and cross encoder. By setting input parameter, you can load either RocketQA models returned by "available_models()" or your own checkpoints. ### Dual encoder Dual-encoder returned by "load_model()" supports the following functions: #### [`model.encode_query(query: List[str])`](https://github.com/PaddlePaddle/RocketQA/blob/1746b938d659c7f8d0b9f960e3199dcbd945adac/rocketqa/encoder/dual_encoder.py#L151) Given a list of queries, returns their representation vectors encoded by model. #### [`model.encode_para(para: List[str], title: List[str])`](https://github.com/PaddlePaddle/RocketQA/blob/1746b938d659c7f8d0b9f960e3199dcbd945adac/rocketqa/encoder/dual_encoder.py#L179) Given a list of paragraphs and their corresponding titles (optional), returns their representations vectors encoded by model. #### [`model.matching(query: List[str], para: List[str], title: List[str])`](https://github.com/PaddlePaddle/RocketQA/blob/1746b938d659c7f8d0b9f960e3199dcbd945adac/rocketqa/encoder/dual_encoder.py#L212) Given a list of queries and paragraphs (and titles), returns their matching scores (dot product between two representation vectors). #### [`model.train(train_set: str, epoch: int, save_model_path: str, args)`](https://github.com/PaddlePaddle/RocketQA/blob/1746b938d659c7f8d0b9f960e3199dcbd945adac/rocketqa/encoder/dual_encoder.py#L247) Given the hyperparameters `train_set`, `epoch` and `save_model_path`, you can train your own dual encoder model or finetune our models. Other settings like `save_steps` and `learning_rate` can also be set in `args`. Please refer to examples/example.py for detail. ### Cross encoder Cross-encoder returned by "load_model()" supports the following function: #### [`model.matching(query: List[str], para: List[str], title: List[str])`](https://github.com/PaddlePaddle/RocketQA/blob/1746b938d659c7f8d0b9f960e3199dcbd945adac/rocketqa/encoder/cross_encoder.py#L156) Given a list of queries and paragraphs (and titles), returns their matching scores (probability that the paragraph is the query's right answer). #### [`model.train(train_set: str, epoch: int, save_model_path: str, args)`](https://github.com/PaddlePaddle/RocketQA/blob/1746b938d659c7f8d0b9f960e3199dcbd945adac/rocketqa/encoder/cross_encoder.py#L193) Given the hyperparameters `train_set`, `epoch` and `save_model_path`, you can train your own cross encoder model or finetune our models. Other settings like `save_steps` and `learning_rate` can also be set in `args`. Please refer to examples/example.py for detail. ### Examples Following the examples below, you can retrieve the vector representations of your documents and connect 🚀RocketQA to your own tasks. #### Run RocketQA Model To run RocketQA models, you should set the parameter `model` in 'load_model()' with RocketQA model name returned by 'available_models()'. ```python import rocketqa query_list = ["trigeminal definition"] para_list = [ "Definition of TRIGEMINAL. : of or relating to the trigeminal nerve.ADVERTISEMENT. of or relating to the trigeminal nerve. ADVERTISEMENT."] # init dual encoder dual_encoder = rocketqa.load_model(model="v1_marco_de", use_cuda=True, device_id=0, batch_size=16) # encode query & para q_embs = dual_encoder.encode_query(query=query_list) p_embs = dual_encoder.encode_para(para=para_list) # compute dot product of query representation and para representation dot_products = dual_encoder.matching(query=query_list, para=para_list) ``` #### Train Your Own Model To train your own models, you can use `train()` function with your dataset and parameters. Training data contains 4 columns: query, title, para, label (0 or 1), separated by "\t". For detail about parameters and dataset, please refer to './examples/example.py' ```python import rocketqa # init cross encoder, and set device and batch_size cross_encoder = rocketqa.load_model(model="zh_dureader_ce", use_cuda=True, device_id=0, batch_size=32) # finetune cross encoder based on "zh_dureader_ce_v2" cross_encoder.train('./examples/data/cross.train.tsv', 2, 'ce_models', save_steps=1000, learning_rate=1e-5, log_folder='log_ce') ``` #### Run Your Own Model To run your own models, you should set parameter `model` in 'load_model()' with a JSON config file. ```python import rocketqa # init cross encoder cross_encoder = rocketqa.load_model(model="./examples/ce_models/config.json", use_cuda=True, device_id=0, batch_size=16) # compute relevance of query and para relevance = cross_encoder.matching(query=query_list, para=para_list) ``` config is a JSON file like this ``` { "model_type": "cross_encoder", "max_seq_len": 384, "model_conf_path": "zh_config.json", "model_vocab_path": "zh_vocab.txt", "model_checkpoint_path": ${YOUR_MODEL}, "for_cn": true, "share_parameter": 0 } ``` Folder `examples` provides more details. ## Citations If you find RocketQA v1 models helpful, feel free to cite our publication [RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering](https://arxiv.org/pdf/2010.08191.pdf) ``` @inproceedings{rocketqa_v1, title="RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering", author="Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Wayne Xin Zhao, Daxiang Dong, Hua Wu and Haifeng Wang", year="2021", booktitle = "In Proceedings of NAACL" } ``` If you find PAIR models helpful, feel free to cite our publication [PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval](https://aclanthology.org/2021.findings-acl.191.pdf) ``` @inproceedings{rocketqa_pair, title="PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval", author="Ruiyang Ren, Shangwen Lv, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang and Ji-Rong Wen", year="2021", booktitle = "In Proceedings of ACL Findings" } ``` If you find RocketQA v2 models helpful, feel free to cite our publication [RocketQAv2: A Joint Training Method for Dense Passage Retrieval and Passage Re-ranking](https://arxiv.org/pdf/2110.07367.pdf) ``` @inproceedings{rocketqa_v2, title="RocketQAv2: A Joint Training Method for Dense Passage Retrieval and Passage Re-ranking", author="Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang and Ji-Rong Wen", year="2021", booktitle = "In Proceedings of EMNLP" } ``` If you find DuReader<sub>retrieval</sub> dataset helpful, feel free to cite our publication [DuReader_retrieval: A Large-scale Chinese Benchmark for Passage Retrieval from Web Search Engine](https://arxiv.org/pdf/2203.10232.pdf) ``` @inproceedings{DuReader_retrieval, title="DuReader_retrieval: A Large-scale Chinese Benchmark for Passage Retrieval from Web Search Engine", author="Yifu Qiu, Hongyu Li, Yingqi Qu, Ying Chen, Qiaoqiao She, Jing Liu, Hua Wu and Haifeng Wang", booktitle = "In Proceedings of EMNLP" year="2022" } ``` If you find our survey useful for your work, please cite the following paper [Dense Text Retrieval based on Pretrained Language Models: A Survey](https://arxiv.org/pdf/2211.14876.pdf) ``` @article{DRSurvey, title={Dense Text Retrieval based on Pretrained Language Models: A Survey}, author={Wayne Xin Zhao, Jing Liu, Ruiyang Ren, Ji-Rong Wen}, year={2022}, journal={arXiv preprint arXiv:2211.14876} } ``` ## License This repository is provided under the [Apache-2.0 license](https://github.com/PaddlePaddle/RocketQA/blob/main/LICENSE). ## Contact Information For help or issues using RocketQA, please submit a Github issue. For other communication or cooperation, please contact Jing Liu ([email protected]) or scan the following QR Code. <img src="https://github.com/PaddlePaddle/RocketQA/blob/main/BaiduNLP-QRCode.png" width = "300" height = "300" alt="" align=center />

ML Frameworks
786 Github Stars
PaddleMIX
Open Source

PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

AI & Machine Learning ML Frameworks
724 Github Stars
PaddleOCR
Open Source

PaddleOCR

<div align="center"> <p> <img width="800" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/README/Banner.png" alt="Star-history"> </p> <h3>Global Leading OCR Toolkit & Document AI Engine</h3> English | [简体中文](./readme/README_cn.md) | [繁體中文](./readme/README_tcn.md) | [日本語](./readme/README_ja.md) | [한국어](./readme/README_ko.md) | [Français](./readme/README_fr.md) | [Русский](./readme/README_ru.md) | [Español](./readme/README_es.md) | [العربية](./readme/README_ar.md) <!-- icon --> [![PyPI Downloads](https://static.pepy.tech/badge/paddleocr)](https://pepy.tech/projects/paddleocr) [![Used by](https://img.shields.io/badge/Used%20by-6k%2B%20repositories-blue)](https://github.com/PaddlePaddle/PaddleOCR/network/dependents) ![python](https://img.shields.io/badge/python-3.8~3.12-aff.svg) ![os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-pink.svg) ![hardware](https://img.shields.io/badge/hardware-cpu%2C%20gpu%2C%20xpu%2C%20npu-yellow.svg) [![AI Studio](https://img.shields.io/badge/PaddleOCR-_Offiical_Website-1927BA?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAgAAAAIACAMAAADDpiTIAAAABlBMVEU2P+X///+1KuUwAAAHKklEQVR42u3dS5bjOAwEwALvf2fMavZum6IAImI7b2yYSqU+1Zb//gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADKCR/+fzly7rD92yVg69xh8zeLwOa5w+ZvFYHtc4ft3ykB++cOm79PAp6YO2z/Ngl4ZO5l+9+yT4QAvLqS748VF33Ylzdvzpl72f6z53YIGJ6SZdPeNHcIwOycaADdLgCSIgAIgCOAACAAykIAEAAEAAFAABCAT+WQuQVgeBqXhXQIQAAYegowLQBpbg3gZGFyAC6vgBQAMREA2/YfDPxyaDQNyTNz+3Zwn5J4ZG7PB2h0kHhi7plPCImmJwkPzO0RMa3OET0i5uGlzHFze0xcu0vE2Dq3J4U2vEPgSaHbFzPNDQAAAAAAAMBNovdw+cP/ny+uaf7w/+eYADy8kE+F4Offdjn6zZXhAXgiA78G4MNNsmnu1Xr7b3mbOL8T5Ja5bw/A35EC2LiWpzt1y9jRugBy30fLg3NvHPvnuZcC2NsCUXA/aRmA89V07Fwgt37uH8deCmBr6N44pP4UgaUATpdA7v/cMbIB8okliY65/SW5HhJ1ehPmM+8edwXgpbu4R88FayR32Y/P7oZZbOx13/Zr//ZHx27bAPnkFoyewYlbAhD3TvBobr95gaUAtr1EdNx1lgI4OcTTuR3z6+FZMEDRcu9ZCuDgGCdyGxMa4EgBRMvcjrkM7NgBZw5c0TwAUWUhZwRXA2xaya65Xa3jO2qYZ8bu2AD5w38tG5V8aZpoGN6Tz0bOfa9bceyWAciTO0jWyO1Tc5cLwJmF/JfPnXVyu3/slgHIg1n79O2O5fZv+1cHV7sC2HYqmUdHysNzX3sVkMcjUK5Gc+dMs28E5bGtm0V3gloBOP9vgZv+4sYn3RUaYFMCol5uN77g6lUApc8pWs69Zn7snS9Z9Q8G0S0AUTVUUTG3A54R1KSvo/diLAv5fKzynZeN6xogC75u93+AtBTA47OlAFSv6qY/vp3DAjD8iv2ZdFYJwKynMhTK1rInPfzaxW81LnvSgFP9KxrATaCLA3DxHpbFX31ZyNm5XRZyXG5bNkAWfP0rcrsUwOgC6NIAzgBcBiqAWwPgLrAGuGBP6jr2sifdfiJ6QQM4Bbw4AK4B3129ZSFn53ZZyA/GyFty27IBFMDFAXAG8PbyLQv5xULGPRl0K3h2AbwcgCZPhs+LD1zLnjS6AN4NwMU/DVFh7LyhASreTbvqrxdr/J4XT4Swz4FrTS+AGJ7bNbwAYkxuWzZAVljHrJfbjb9wviYXwFO/FJ8Vli4vaICsEMFyBbA3tmtsAUS0zG1c/bj4YwsZH2/+Whd0+1Nb+S7IE2sfPw4RL0XmsR8Nqvz7qFngmPHF34EqjP15AAofAkosZKPC/K6FVoeP02Ehi540NG6AK/4pYP3cLgVwXwHkDQ1QcSGb/uF4WwCmfX8u/+4vgLINcMUlQIfcLgXwXAF0+BGkpQDuuJx7/hwgpu//cWVuO3wxJOz/z8297vgYBwaIO3O7Kn+c194578ltywbIgu8fl+Z2lS+APvnLjnOv8hsgSqxjgwL4Ln9LAezaj98tgPzy7ZcC+GQzxrWxXQpgx370dm6/H7v6jaBoso5dY1swAFlwHWvfBf5pxVa93fCtdx64+1dsgCy4joWvAfPX9VoKYMs6Zse9/8Mlvv7LILlhAfKFFdsSutJXAdFkL3qlADJPrXFcXAC5KYaH586jO9mtAch9S3T0GQJ726ZWAE49kjP3rlDJuetdaL/1zeqZY9c7CRz7s0wCUPxienQBnAuAAtAAlxaAAAxfyBQABSAACkAAFIAAKAABUAACMEkKwL170oh7V8ueNLoAjgTAXWAN4BRwcABcA2oABTA4AApAAyiAwQFQABpAAQwOgALQADMWUgCuEmNyu15fSIY3gFPAiwPgFFADKIDBAVAAGkABCIACmBqAUAAaQAHMDUCMWkgBuMWw3K43F5LhDeAU8OIAuAmkARTA4AAoAA2gAARAAUwNgLvAGkABDA6Au8AaoKOJuV0vLSTDG8Ap4MUBcBNIAyiAwQFQABpAAQwOgALQAApAABTA1AC4C6wBOhqb23V+IRneAE4BLw6Aa0ANoAAGB0ABaAAFMDgACkADKAABUABTA+AusAboKATAQs4trjV+IYcfuJYCcA6gAATAQk69dFkKQANYyLkFcLIBFIDLQAVwawDsSRrAEWBwAJwCagAFMDgACkADKIDBAVAAGkABCIACmBoAzwXWAApgcADsSRrg0iNACoACEADXgAIwdCFTACykALgGFIAfl0kBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPBv/gN+IH8U6YveYgAAAABJRU5ErkJggg==&labelColor=white)](https://www.paddleocr.com) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/PaddlePaddle/PaddleOCR) [![License](https://img.shields.io/badge/license-Apache_2.0-green)](../LICENSE) </div> **PaddleOCR converts PDF documents and images into structured, LLM-ready data (JSON/Markdown) with industry-leading accuracy. With 70k+ Stars and trusted by top-tier projects like Dify, RAGFlow, and Cherry Studio, PaddleOCR is the bedrock for building intelligent RAG and Agentic applications.** ## 🚀 Key Features ### 📄 Intelligent Document Parsing (LLM-Ready) > *Transforming messy visuals into structured data for the LLM era.* * **SOTA Document VLM**: Featuring **PaddleOCR-VL-1.6 (0.9B)**, the industry's leading lightweight vision-language model for document parsing. It achieves 96.3% accuracy on OmniDocBench v1.6, leads in text, formula, and table recognition, and shows significantly enhanced capabilities in ancient documents, rare characters, seals, and charts, with structured outputs in **Markdown** and **JSON** formats. * **Structure-Aware Conversion**: Powered by **PP-StructureV3**, seamlessly convert complex PDFs and images into **Markdown** or **JSON**. Unlike the PaddleOCR-VL series models, it provides more fine-grained coordinate information, including table cell coordinates, text coordinates, and more. * **Production-Ready Efficiency**: Achieve commercial-grade accuracy with an ultra-small footprint. Outperforms numerous closed-source solutions in public benchmarks while remaining resource-efficient for edge/cloud deployment. ### 🔍 Universal Text Recognition (Scene OCR) > *The global gold standard for high-speed, multilingual text spotting.* * **100+ Languages Supported**: Native recognition for a vast global library. Our **PP-OCRv5** single-model solution elegantly handles multilingual mixed documents (Chinese, English, Japanese, Pinyin, etc.). * **Complex Element Mastery**: Beyond standard text recognition, we support **natural scene text spotting** across a wide range of environments, including IDs, street views, books, and industrial components * **Performance Leap**: PP-OCRv5 delivers a **13% accuracy boost** over previous versions, maintaining the "Extreme Efficiency" that PaddleOCR is famous for. <div align="center"> <p> <img width="100%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/README/Arch.jpg" alt="PaddleOCR Architecture"> </p> </div> ### 🛠️ Developer-Centric Ecosystem * **Seamless Integration**: The premier choice for the AI Agent ecosystem—deeply integrated with **Dify, RAGFlow, Pathway, and Cherry Studio**. * **LLM Data Flywheel**: A complete pipeline to build high-quality datasets, providing a sustainable "Data Engine" for fine-tuning Large Language Models. * **One-Click Deployment**: Supports various hardware backends (NVIDIA GPU, Intel CPU, Kunlunxin XPU, and diverse AI Accelerators). ## 📣 Recent updates ### 🔥 2026.05.28: Release of PaddleOCR 3.6.0 - PaddleOCR-VL-1.6 highlights: - **New SOTA Accuracy**: Achieves over 96.3% on OmniDocBench v1.6, also sets new SOTA on OmniDocBench v1.5 and Real5-OmniDocBench, leading both open-source and proprietary solutions in text, formula, and table recognition. - **Comprehensive Capability Upgrade**: Significant improvements in table, ancient document, and rare character recognition, with notably enhanced seal recognition, spotting, and chart understanding across multiple scenarios. - **Seamless Migration**: Model architecture is fully consistent with PaddleOCR-VL-1.5, enabling zero-cost adaptation—swap and go. - **Try it now**: Available on [HuggingFace](https://huggingface.co/PaddlePaddle/PaddleOCR-VL-1.6) or our [Official Website](https://www.paddleocr.com). <details> <summary><strong>2026.04.21: Release of PaddleOCR 3.5.0</strong></summary> * **Flexible inference backends**: Seamlessly switch between Paddle static graph, Paddle dynamic graph, or Transformers. PaddleOCR is now deeply integrated with the Hugging Face ecosystem, and 20 major models support Transformers as the inference backend. * **Office documents to Markdown**: Convert common document formats such as Word, Excel, and PowerPoint into Markdown. * **DOCX export for parsed results**: The `PaddleOCR-VL` series, `PP-StructureV3`, and `PP-DocTranslation` now support exporting parsed results to DOCX for convenient viewing and editing in Microsoft Word. * **Official browser inference SDK**: Released `PaddleOCR.js`, the official browser inference SDK that supports running `PP-OCRv5` directly in the browser. </details> <details> <summary><strong>2026.01.29: Release of PaddleOCR 3.4.0</strong></summary> * PaddleOCR-VL-1.5 (SOTA 0.9B VLM): Our latest flagship model for document parsing is now live! * **94.5% Accuracy on OmniDocBench**: Surpassing top-tier general large models and specialized document parsers. * **Real-World Robustness**: First to introduce the **PP-DocLayoutV3** algorithm for irregular shape positioning, mastering 5 tough scenarios: *Skew, Warping, Scanning, Illumination, and Screen Photography*. * **Capability Expansion**: Now supports **Seal Recognition**, **Text Spotting**, and expands to **111 languages** (including China’s Tibetan script and Bengali). * **Long Document Mastery**: Supports automatic cross-page table merging and hierarchical heading identification. * **Try it now**: Available on [HuggingFace](https://huggingface.co/PaddlePaddle/PaddleOCR-VL-1.5) or our [Official Website](https://www.paddleocr.com). </details> <details> <summary><strong>2025.10.16: Release of PaddleOCR 3.3.0</strong></summary> - Released PaddleOCR-VL: - **Model Introduction**: - **PaddleOCR-VL** is a SOTA and resource-efficient model tailored for document parsing. Its core component is PaddleOCR-VL-0.9B, a compact yet powerful vision-language model (VLM) that integrates a NaViT-style dynamic resolution visual encoder with the ERNIE-4.5-0.3B language model to enable accurate element recognition. **This innovative model efficiently supports 109 languages and excels in recognizing complex elements (e.g., text, tables, formulas, and charts), while maintaining minimal resource consumption**. Through comprehensive evaluations on widely used public benchmarks and in-house benchmarks, PaddleOCR-VL achieves SOTA performance in both page-level document parsing and element-level recognition. It significantly outperforms existing solutions, exhibits strong competitiveness against top-tier VLMs, and delivers fast inference speeds. These strengths make it highly suitable for practical deployment in real-world scenarios. The model has been released on [HuggingFace](https://huggingface.co/PaddlePaddle/PaddleOCR-VL). Everyone is welcome to download and use it! More introduction information can be found in [PaddleOCR-VL](https://www.paddleocr.ai/latest/version3.x/algorithm/PaddleOCR-VL/PaddleOCR-VL.html). - **Core Features**: - **Compact yet Powerful VLM Architecture**: We present a novel vision-language model that is specifically designed for resource-efficient inference, achieving outstanding performance in element recognition. By integrating a NaViT-style dynamic high-resolution visual encoder with the lightweight ERNIE-4.5-0.3B language model, we significantly enhance the model’s recognition capabilities and decoding efficiency. This integration maintains high accuracy while reducing computational demands, making it well-suited for efficient and practical document processing applications. - **SOTA Performance on Document Parsing**: PaddleOCR-VL achieves state-of-the-art performance in both page-level document parsing and element-level recognition. It significantly outperforms existing pipeline-based solutions and exhibiting strong competitiveness against leading vision-language models (VLMs) in document parsing. Moreover, it excels in recognizing complex document elements, such as text, tables, formulas, and charts, making it suitable for a wide range of challenging content types, including handwritten text and historical documents. This makes it highly versatile and suitable for a wide range of document types and scenarios. - **Multilingual Support**: PaddleOCR-VL Supports 109 languages, covering major global languages, including but not limited to Chinese, English, Japanese, Latin, and Korean, as well as languages with different scripts and structures, such as Russian (Cyrillic script), Arabic, Hindi (Devanagari script), and Thai. This broad language coverage substantially enhances the applicability of our system to multilingual and globalized document processing scenarios. - Released PP-OCRv5 Multilingual Recognition Model: - Improved the accuracy and coverage of Latin script recognition; added support for Cyrillic, Arabic, Devanagari, Telugu, Tamil, and other language systems, covering recognition of 109 languages. The model has only 2M parameters, and the accuracy of some models has increased by over 40% compared to the previous generation. </details> <details> <summary><strong>2025.08.21: Release of PaddleOCR 3.2.0</strong></summary> - **Significant Model Additions:** - Introduced training, inference, and deployment for PP-OCRv5 recognition models in English, Thai, and Greek. **The PP-OCRv5 English model delivers an 11% improvement in English scenarios compared to the main PP-OCRv5 model, with the Thai and Greek recognition models achieving accuracies of 82.68% and 89.28%, respectively.** - **Deployment Capability Upgrades:** - **Full support for PaddlePaddle framework versions 3.1.0 and 3.1.1.** - **Comprehensive upgrade of the PP-OCRv5 C++ local deployment solution, now supporting both Linux and Windows, with feature parity and identical accuracy to the Python implementation.** - **High-performance inference now supports CUDA 12, and inference can be performed using either the Paddle Inference or ONNX Runtime backends.** - **The high-stability service-oriented deployment solution is now fully open-sourced, allowing users to customize Docker images and SDKs as required.** - The high-stability service-oriented deployment solution also supports invocation via manually constructed HTTP requests, enabling client-side code development in any programming language. - **Benchmark Support:** - **All production lines now support fine-grained benchmarking, enabling measurement of end-to-end inference time as well as per-layer and per-module latency data to assist with performance analysis. [Here's](docs/version3.x/pipeline_usage/instructions/benchmark.en.md) how to set up and use the benchmark feature.** - **Documentation has been updated to include key metrics for commonly used configurations on mainstream hardware, such as inference latency and memory usage, providing deployment references for users.** - **Bug Fixes:** - Resolved the issue of failed log saving during model training. - Upgraded the data augmentation component for formula models for compatibility with newer versions of the albumentations dependency, and fixed deadlock warnings when using the tokenizers package in multi-process scenarios. - Fixed inconsistencies in switch behaviors (e.g., `use_chart_parsing`) in the PP-StructureV3 configuration files compared to other pipelines. - **Other Enhancements:** - **Separated core and optional dependencies. Only minimal core dependencies are required for basic text recognition; additional dependencies for document parsing and information extraction can be installed as needed.** - **Enabled support for NVIDIA RTX 50 series graphics cards on Windows; users can refer to the [installation guide](docs/version3.x/installation.en.md) for the corresponding PaddlePaddle framework versions.** - **PP-OCR series models now support returning single-character coordinates.** - Added AIStudio, ModelScope, and other model download sources, allowing users to specify the source for model downloads. - Added support for chart-to-table conversion via the PP-Chart2Table module. - Optimized documentation descriptions to improve usability. </details> [History Log](https://paddlepaddle.github.io/PaddleOCR/latest/en/update/update.html) ## 🚀 Quick Start ### Step 1: Try Online PaddleOCR official website provides interactive **Experience Center** and **APIs**—no setup required, just one click to experience. 👉 [Visit Official Website](https://www.paddleocr.com) ### Step 2: Local Deployment For local usage, please refer to the following documentation based on your needs: - **PP-OCR Series**: See [PP-OCR Documentation](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/OCR.html) - **PaddleOCR-VL Series**: See [PaddleOCR-VL Documentation](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/PaddleOCR-VL.html) - **PP-StructureV3**: See [PP-StructureV3 Documentation](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/PP-StructureV3.html) - **More Capabilities**: See [More Capabilities Documentation](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/pipeline_overview.html) ## 🧩 More Features - Convert models to ONNX format: [Obtaining ONNX Models](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/inference_deployment/others/obtaining_onnx_models.html). - Accelerate inference using engines like OpenVINO, ONNX Runtime, TensorRT, or perform inference using ONNX format models: [High-Performance Inference](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/inference_deployment/local_inference/high_performance_inference.html). - Accelerate inference using multi-GPU and multi-process: [Parallel Inference for Pipelines](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/pipeline_usage/instructions/parallel_inference.html). - Integrate PaddleOCR into applications written in C++, C#, Java, etc.: [Serving](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/inference_deployment/serving/serving.html). ## 🔄 Quick Overview of Execution Results ### PP-OCRv5 <div align="center"> <p> <img width="100%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/README/PP-OCRv5_demo.gif" alt="PP-OCRv5 Demo"> </p> </div> ### PP-StructureV3 <div align="center"> <p> <img width="100%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/README/PP-StructureV3_demo.gif" alt="PP-StructureV3 Demo"> </p> </div> ### PaddleOCR-VL <div align="center"> <p> <img width="100%" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/README/PaddleOCR-VL_demo.gif" alt="PP-StructureV3 Demo"> </p> </div> ## ✨ Stay Tuned ⭐ **Star this repository to keep up with exciting updates and new releases, including powerful OCR and document parsing capabilities!** ⭐ <div align="center"> <p> <img width="1200" src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/main/images/paddleocr/README/star_paddleocr2.en.gif" alt="Star-Project"> </p> </div> ## 👩‍👩‍👧‍👦 Community <div align="center"> | PaddlePaddle WeChat official account | Join the tech discussion group | | :---: | :---: | | <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qrcode_for_paddlepaddle_official_account.jpg" width="150"> | <img src="https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/paddleocr/README/qr_code_for_the_questionnaire.jpg" width="150"> | </div> ## 😃 Awesome Projects Leveraging PaddleOCR PaddleOCR wouldn't be where it is today without its incredible community! 💗 A massive thank you to all our longtime partners, new collaborators, and everyone who's poured their passion into PaddleOCR — whether we've named you or not. Your support fuels our fire! <div align="center"> | Project Name | Description | | ------------ | ----------- | | [Dify](https://github.com/langgenius/dify) <a href="https://github.com/langgenius/dify"><img src="https://img.shields.io/github/stars/langgenius/dify"></a>|Production-ready platform for agentic workflow development.| | [RAGFlow](https://github.com/infiniflow/ragflow) <a href="https://github.com/infiniflow/ragflow"><img src="https://img.shields.io/github/stars/infiniflow/ragflow"></a>|RAG engine based on deep document understanding.| | [pathway](https://github.com/pathwaycom/pathway) <a href="https://github.com/pathwaycom/pathway"><img src="https://img.shields.io/github/stars/pathwaycom/pathway"></a>|Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.| | [MinerU](https://github.com/opendatalab/MinerU) <a href="https://github.com/opendatalab/MinerU"><img src="https://img.shields.io/github/stars/opendatalab/MinerU"></a>|Multi-type Document to Markdown Conversion Tool| | [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR) <a href="https://github.com/hiroi-sora/Umi-OCR"><img src="https://img.shields.io/github/stars/hiroi-sora/Umi-OCR"></a>|Free, Open-source, Batch Offline OCR Software.| | [cherry-studio](https://github.com/CherryHQ/cherry-studio) <a href="https://github.com/CherryHQ/cherry-studio"><img src="https://img.shields.io/github/stars/CherryHQ/cherry-studio"></a>|A desktop client that supports for multiple LLM providers.| | [haystack](https://github.com/deepset-ai/haystack)<a href="https://github.com/deepset-ai/haystack"><img src="https://img.shields.io/github/stars/deepset-ai/haystack"></a> |AI orchestration framework to build customizable, production-ready LLM applications.| | [OmniParser](https://github.com/microsoft/OmniParser)<a href="https://github.com/microsoft/OmniParser"><img src="https://img.shields.io/github/stars/microsoft/OmniParser"></a> |OmniParser: Screen Parsing tool for Pure Vision Based GUI Agent.| | [QAnything](https://github.com/netease-youdao/QAnything)<a href="https://github.com/netease-youdao/QAnything"><img src="https://img.shields.io/github/stars/netease-youdao/QAnything"></a> |Question and Answer based on Anything.| | [Learn more projects](./awesome_projects.md) | [More projects based on PaddleOCR](./awesome_projects.md)| </div> ## 👩‍👩‍👧‍👦 Contributors <div align="center"> <a href="https://github.com/PaddlePaddle/PaddleOCR/graphs/contributors"> <img src="https://contrib.rocks/image?repo=PaddlePaddle/PaddleOCR&max=400&columns=20" width="800"/> </a> </div> ## 🌟 Star <div align="center"> <p> <img width="800" src="https://api.star-history.com/svg?repos=PaddlePaddle/PaddleOCR&type=Date" alt="Star-history"> </p> </div> ## 📄 License This project is released under the [Apache 2.0 license](LICENSE). ## 🎓 Citation ```bibtex @misc{cui2025paddleocr30technicalreport, title={PaddleOCR 3.0 Technical Report}, author={Cheng Cui and Ting Sun and Manhui Lin and Tingquan Gao and Yubo Zhang and Jiaxuan Liu and Xueqing Wang and Zelun Zhang and Changda Zhou and Hongen Liu and Yue Zhang and Wenyu Lv and Kui Huang and Yichao Zhang and Jing Zhang and Jun Zhang and Yi Liu and Dianhai Yu and Yanjun Ma}, year={2025}, eprint={2507.05595}, archivePrefix={arXiv}, primaryClass={cs.CV}, url={https://arxiv.org/abs/2507.05595}, } @misc{cui2025paddleocrvlboostingmultilingualdocument, title={PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model}, author={Cheng Cui and Ting Sun and Suyin Liang and Tingquan Gao and Zelun Zhang and Jiaxuan Liu and Xueqing Wang and Changda Zhou and Hongen Liu and Manhui Lin and Yue Zhang and Yubo Zhang and Handong Zheng and Jing Zhang and Jun Zhang and Yi Liu and Dianhai Yu and Yanjun Ma}, year={2025}, eprint={2510.14528}, archivePrefix={arXiv}, primaryClass={cs.CV}, url={https://arxiv.org/abs/2510.14528}, } @misc{cui2026paddleocrvl15multitask09bvlm, title={PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing}, author={Cheng Cui and Ting Sun and Suyin Liang and Tingquan Gao and Zelun Zhang and Jiaxuan Liu and Xueqing Wang and Changda Zhou and Hongen Liu and Manhui Lin and Yue Zhang and Yubo Zhang and Yi Liu and Dianhai Yu and Yanjun Ma}, year={2026}, eprint={2601.21957}, archivePrefix={arXiv}, primaryClass={cs.CV}, url={https://arxiv.org/abs/2601.21957}, } @misc{zhang2026paddleocrvl16expandingfrontierdocument, title={PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training}, author={Zelun Zhang and Hongen Liu and Suyin Liang and Yubo Zhang and Yiqing Xiang and Jiaxuan Liu and Ting Sun and Manhui Lin and Yue Zhang and Changda Zhou and Tingquan Gao and Cheng Cui and Yi Liu and Dianhai Yu and Yanjun Ma}, year={2026}, eprint={2606.03264}, archivePrefix={arXiv}, primaryClass={cs.CV}, url={https://arxiv.org/abs/2606.03264}, } ```

Data Labeling Knowledge Bases & RAG
81.6K Github Stars