Welcome to MindSpore Transformers (MindFormers)

1. Introduction

The MindSpore Transformers suite aims to provide a full-process development toolkit for large model training, fine-tuning, evaluation, inference, and deployment. It offers the industry's mainstream Transformer pre-trained models and SOTA downstream task applications, and covers a rich set of parallelism features, with the goal of helping users carry out large model training and innovative R&D with ease.

Built on MindSpore's built-in parallelism technologies and a componentized design, the MindSpore Transformers suite has the following features:

  • Seamless switching from single-card to large-scale cluster training with one line of code;
  • Flexible, easy-to-use, personalized parallelism configuration;
  • Automatic topology awareness that efficiently combines data parallelism and model parallelism;
  • One-click launch of single-card or multi-card training, fine-tuning, evaluation, and inference for any task;
  • Componentized configuration of any module, such as optimizers, learning-rate schedules, and network assembly;
  • Easy-to-use high-level interfaces such as Trainer, pipeline, and AutoClass;
  • Automatic download and loading of preset SOTA weights;
  • Seamless migration and deployment on AI computing centers.

If you have any suggestions for MindSpore Transformers, please contact us via an issue and we will address them promptly.

The currently supported models are listed below:

Model        Model names
LLama2       llama2_7b, llama2_13b, llama2_7b_lora, llama2_13b_lora, llama2_70b
GLM2         glm2_6b, glm2_6b_lora
GLM3         glm3_6b, glm3_6b_lora
GPT2         gpt2, gpt2_13b
Baichuan2    baichuan2_7b, baichuan2_13b, baichuan2_7b_lora, baichuan2_13b_lora
Qwen         qwen_7b, qwen_14b, qwen_7b_lora, qwen_14b_lora
Qwen1.5      qwen1.5-14b, qwen1.5-72b
CodeGeex2    codegeex2_6b
CodeLlama    codellama_34b
DeepSeek     deepseek-coder-33b-instruct
Internlm     internlm_7b, internlm_20b, internlm_7b_lora
Mixtral      mixtral-8x7b
Wizardcoder  wizardcoder_15b
Yi           yi_6b, yi_34b

2. Installing MindFormers

Installing from source on Linux

Installation from source is supported; run the following commands to install the package.

git clone -b r1.1.0 https://gitee.com/mindspore/mindformers.git
cd mindformers
bash build.sh
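
After the build finishes, a quick import check can confirm that the package installed correctly (a minimal sketch; a top-level __version__ attribute is assumed here, not a documented API):

# Verify the installation: import the package and print its version.
# __version__ at the top level is an assumption for illustration.
import mindformers
print(mindformers.__version__)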

3. Version Compatibility

The currently supported hardware is the Atlas 800 training server and the Atlas 800T A2 training server.

Python 3.9 is the recommended version for the current suite.

MindFormers  MindPet  MindSpore               CANN              Driver/Firmware   Image Link  Notes
dev          1.0.4    2.3 (not yet released)  Not yet released  Not yet released  /           Development branch (unstable)

The CANN and driver/firmware versions must match the machine in use; please identify the machine model and choose the version for the corresponding architecture.
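
To confirm that the installed MindSpore version matches the table above, MindSpore's built-in run_check utility can be used (a minimal sketch):

import mindspore
# Verifies the MindSpore installation and prints the installed version.
mindspore.run_check()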

4. Quick Start

MindFormers offers two modes of use and development, providing developers with both a flexible, concise usage pattern and high-level development interfaces.

Method 1: Launch with msrun (only for MindSpore 2.3 and later)

Users can clone the whole repository and follow the steps below to run any supported model task configuration file under configs, making it easy to get started with use and development:

Launching with msrun does not currently support specifying a device_id; the msrun command assigns rank_id in the order of all the cards on the current node.

  • Parameter description

    Parameter         Required (single-node)  Required (multi-node)  Default           Description
    WORKER_NUM        √                       √                      8                 Total number of compute cards used across all nodes
    LOCAL_WORKER      ×                       √                      8                 Number of compute cards used on the current node
    MASTER_ADDR       ×                       √                      127.0.0.1         IP address of the master node for the distributed launch
    MASTER_PORT       ×                       √                      8118              Port bound for the distributed launch
    NODE_RANK         ×                       √                      0                 Rank id of the current node
    LOG_DIR           ×                       √                      output/msrun_log  Log output path, created recursively if it does not exist
    JOIN              ×                       √                      False             Whether to wait for all distributed processes to exit
    CLUSTER_TIME_OUT  ×                       √                      600               Timeout for the distributed launch, in seconds

Single-node, multi-card

# Quick launch on a single node, 8 cards by default
bash scripts/msrun_launcher.sh "run_mindformer.py \
 --config {CONFIG_PATH} \
 --run_mode {train/finetune/eval/predict}"

# Quick launch on a single node, setting only the number of cards
bash scripts/msrun_launcher.sh "run_mindformer.py \
 --config {CONFIG_PATH} \
 --run_mode {train/finetune/eval/predict}" WORKER_NUM

# Custom launch on a single node
bash scripts/msrun_launcher.sh "run_mindformer.py \
 --config {CONFIG_PATH} \
 --run_mode {train/finetune/eval/predict}" \
 WORKER_NUM MASTER_PORT LOG_DIR JOIN CLUSTER_TIME_OUT
  • Usage examples

    # Quick launch on a single node, 8 cards by default
    bash scripts/msrun_launcher.sh "run_mindformer.py \
      --config path/to/xxx.yaml \
      --run_mode finetune"

    # Quick launch on a single node
    bash scripts/msrun_launcher.sh "run_mindformer.py \
      --config path/to/xxx.yaml \
      --run_mode finetune" 8

    # Custom launch on a single node
    bash scripts/msrun_launcher.sh "run_mindformer.py \
      --config path/to/xxx.yaml \
      --run_mode finetune" \
      8 8118 output/msrun_log False 300

Multi-node, multi-card

For distributed training across multiple nodes, run the script separately on each node and set MASTER_ADDR to the IP address of the master node. The IP address is set to the same value on every node; only the NODE_RANK parameter differs between nodes.

# Custom launch across multiple nodes
bash scripts/msrun_launcher.sh "run_mindformer.py \
 --config {CONFIG_PATH} \
 --run_mode {train/finetune/eval/predict}" \
 WORKER_NUM LOCAL_WORKER MASTER_ADDR MASTER_PORT NODE_RANK LOG_DIR JOIN CLUSTER_TIME_OUT
  • Usage examples

    # Node 0, with IP 192.168.1.1, acting as the master node; 8 cards in total with 4 cards per node
    bash scripts/msrun_launcher.sh "run_mindformer.py \
      --config {CONFIG_PATH} \
      --run_mode {train/finetune/eval/predict}" \
      8 4 192.168.1.1 8118 0 output/msrun_log False 300
    
    # Node 1, with IP 192.168.1.2; the launch commands of node 0 and node 1 differ only in NODE_RANK
    bash scripts/msrun_launcher.sh "run_mindformer.py \
      --config {CONFIG_PATH} \
      --run_mode {train/finetune/eval/predict}" \
      8 4 192.168.1.1 8118 1 output/msrun_log False 300

Single-card launch

Launch through the unified entry point; based on the model's config, this completes single-card training, fine-tuning, evaluation, or inference for any model.

# Launch a run. run_mode supports the four keywords train, finetune, eval, and predict, for model training, fine-tuning, evaluation, and inference respectively; by default the run_mode from the configuration file is used
python run_mindformer.py --config {CONFIG_PATH} --run_mode {train/finetune/eval/predict}
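
For example, a single-card gpt2 fine-tuning run can be launched with its bundled configuration file (an illustrative sketch; substitute the config path for your own model and task):

# Fine-tune gpt2 on a single card using the bundled configuration file
python run_mindformer.py --config configs/gpt2/run_gpt2.yaml --run_mode finetune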

Method 2: Launch via the API

For a detailed tutorial on the high-level APIs, see the MindFormers large model tutorial (docs: https://mindformers.readthedocs.io/zh-cn/latest/).

  • Preparation

    • Step 1: Install mindformers

      For installation details, see Chapter 2.

    • Step 2: Prepare data

      Prepare the dataset for the corresponding task; see the README.md of each model under the docs directory for dataset preparation.

  • Trainer quick start

    After installing the mindformers library as described above, users can run training, fine-tuning, evaluation, and inference for model tasks through the high-level Trainer interface.

    # Taking the gpt2 model as an example
    import mindspore; mindspore.set_context(mode=0, device_id=0)  # mode=0 is GRAPH_MODE
    from mindformers import Trainer

    # Initialize the pre-training task
    trainer = Trainer(task='text_generation',
                      model='gpt2',
                      train_dataset='path/to/train_dataset',
                      eval_dataset='path/to/eval_dataset')
    # Start pre-training
    trainer.train()

    # Start full-parameter fine-tuning
    trainer.finetune()

    # Start evaluation
    trainer.evaluate()

    # Start inference
    predict_result = trainer.predict(input_data="An increasing sequence: one,", do_sample=False, max_length=20)
    print(predict_result)
    # output result is: [{'text_generation_text': ['An increasing sequence: one, two, three, four, five, six, seven, eight,']}]

    # LoRA fine-tuning
    trainer = Trainer(task="text_generation", model="gpt2", pet_method="lora",
                      train_dataset="path/to/train_dataset")
    trainer.finetune(finetune_checkpoint="gpt2")
  • pipeline quick start

    MindFormers provides a pipeline inference interface for the integrated models, making it easy to try large model inference.

    A pipeline usage example:

    # Taking gpt2 small as an example
    import mindspore; mindspore.set_context(mode=0, device_id=0)
    from mindformers.pipeline import pipeline
    
    pipeline_task = pipeline(task="text_generation", model="gpt2")
    pipeline_result = pipeline_task("An increasing sequence: one,", do_sample=False, max_length=20)
    print(pipeline_result)

    Example of the printed result (inference output of the integrated gpt2 model weights):

    [{'text_generation_text': ['An increasing sequence: one, two, three, four, five, six, seven, eight,']}]
  • AutoClass quick start

    MindFormers provides four high-level AutoClass classes, AutoConfig, AutoModel, AutoProcessor, and AutoTokenizer, for convenient use by developers.

    • AutoConfig: get the configuration of any supported model

      from mindformers import AutoConfig

      # Get the gpt2 model configuration
      gpt2_config = AutoConfig.from_pretrained('gpt2')
      # Get the vit_base_p16 model configuration
      vit_base_p16_config = AutoConfig.from_pretrained('vit_base_p16')
    • AutoModel: get a supported network model

      from mindformers import AutoModel

      # Instantiate a model via from_pretrained (loads the matching weights by default)
      gpt2 = AutoModel.from_pretrained('gpt2')
      # Instantiate a model via from_config (loads the matching weights by default)
      gpt2_config = AutoConfig.from_pretrained('gpt2')
      gpt2 = AutoModel.from_config(gpt2_config)
      # Save the model's configuration via save_pretrained
      gpt2.save_pretrained('./gpt2', save_name='gpt2')
    • AutoProcessor: get a supported preprocessing method

      from mindformers import AutoProcessor

      # Get a model's preprocessing pipeline by model name keyword (instantiates gpt2 preprocessing, typically passed to Trainer/pipeline for inference)
      gpt2_processor_a = AutoProcessor.from_pretrained('gpt2')
      # Get the preprocessing pipeline from a yaml file
      gpt2_processor_b = AutoProcessor.from_pretrained('configs/gpt2/run_gpt2.yaml')
    • AutoTokenizer: get a supported tokenizer

      from mindformers import AutoTokenizer
      # Get a model's tokenizer by model name keyword (instantiates the gpt2 tokenizer, typically passed to Trainer/pipeline for inference)
      gpt2_tokenizer = AutoTokenizer.from_pretrained('gpt2')
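
      A short usage sketch follows (it assumes the tokenizer mirrors the Hugging Face style __call__/decode interface):

      # Encode a prompt into token ids, then decode the ids back to text.
      # The __call__/decode interface is assumed to follow Hugging Face conventions.
      ids = gpt2_tokenizer("An increasing sequence: one,")["input_ids"]
      print(ids)
      print(gpt2_tokenizer.decode(ids))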

5. Contributing

Community contributions are welcome; see the MindSpore contribution guidelines in the Contributor Wiki.

6. License

Apache 2.0 License

