name | about | labels |
---|---|---|
Bug Report | Use this template for reporting a bug | kind/bug |
llama2 does not support float32 initialization; loading the checkpoint fails with the following error:
RuntimeError: For 'load_param_into_net', model.layers.0.attention.wq.weight in the argument 'net' should have the same type as model.layers.0.attention.wq.weight in the argument 'parameter_dict'. but got its type Float32 in the argument 'net' and type BFloat16 in the argument 'parameter_dict'.May you need to check whether the checkpoint you loaded is correct.
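To make the failure mode concrete, here is a minimal sketch of the kind of per-parameter dtype check the message describes, together with a cast-before-load step that would avoid it. This is not MindSpore code: `Param`, `check_dtypes`, and `cast_to_net_dtype` are hypothetical stand-ins that only track a name and a dtype tag, and the real `load_param_into_net` may behave differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Param:
    """Hypothetical stand-in for a parameter: only its name and dtype tag."""
    name: str
    dtype: str


def check_dtypes(net, ckpt):
    """Raise if a parameter present in both dicts has different dtypes."""
    for name, net_param in net.items():
        ckpt_param = ckpt.get(name)
        if ckpt_param is not None and ckpt_param.dtype != net_param.dtype:
            raise RuntimeError(
                f"{name}: {net_param.dtype} in net, "
                f"{ckpt_param.dtype} in parameter_dict"
            )


def cast_to_net_dtype(net, ckpt):
    """Return a copy of ckpt whose entries carry the dtype the net expects."""
    return {
        name: replace(p, dtype=net[name].dtype) if name in net else p
        for name, p in ckpt.items()
    }


key = "model.layers.0.attention.wq.weight"
net = {key: Param(key, "Float32")}      # network initialized in float32
ckpt = {key: Param(key, "BFloat16")}    # checkpoint saved from bf16 training

try:
    check_dtypes(net, ckpt)
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True

casted = cast_to_net_dtype(net, ckpt)
check_dtypes(net, casted)  # no exception once the dtypes agree
```

In a real checkpoint the cast would also convert the tensor data; widening BFloat16 to Float32 is exact, so no values would change.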
Hardware Environment (Ascend/GPU/CPU):
Please delete the backend not involved:
/device ascend/GPU/CPU/kirin/other chips
Software Environment (Mandatory):
-- MindSpore version (e.g., 1.7.0.Bxxx) :
-- Python version (e.g., Python 3.7.5) :
-- OS platform and distribution (e.g., Linux Ubuntu 16.04):
-- GCC/Compiler version (if compiled from source):
Execute Mode (Mandatory) (PyNative/Graph):
Please delete the mode not involved:
/mode pynative
/mode graph
Please assign maintainer to check this issue.
@fangwenyi @chengxiaoli @Shawny
Thank you for your question. You can comment //mindspore-assistant to get help faster.
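Regression test date: 2024.3.19
Regression test version: MindSpore 2.2.12.B004
Regression test steps: train the llama2 network in bf16 and save a ckpt, then load that ckpt into an fp32 network and continue training
Regression test result: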
2024-03-19 12:35:24,349 - mindformers[mindformers/core/callback/callback.py:259] - WARNING - pipeline stages: 2 > 1, the loss on the last card is valid.
2024-03-19 12:35:24,350 - mindformers[mindformers/core/callback/callback.py:339] - INFO - { Epoch:[ 1/ 1], step:[ 34/ 200], loss: 9.903, per_step_time: 15583ms, lr: 4.9373055e-05, overflow cond: True, loss_scale: 1.0, TFLOPs: 1.74, global_norm: [2.3966413]
2024-03-19 12:35:24,351 - mindformers[mindformers/core/callback/callback.py:347] - INFO - 17.0% |████████ | 0.13 samples/s/p 0:43:06 }
2024-03-19 12:35:24,413 - mindformers[mindformers/core/callback/callback.py:582] - INFO - ......Saving ckpt......
2024-03-19 12:35:50,803 - mindformers[mindformers/core/callback/callback.py:259] - WARNING - pipeline stages: 2 > 1, the loss on the last card is valid.
2024-03-19 12:35:50,804 - mindformers[mindformers/core/callback/callback.py:339] - INFO - { Epoch:[ 1/ 1], step:[ 36/ 200], loss: 9.918, per_step_time: 11873ms, lr: 4.9373055e-05, overflow cond: True, loss_scale: 1.0, TFLOPs: 2.28, global_norm: [2.3488748]
2024-03-19 12:35:50,806 - mindformers[mindformers/core/callback/callback.py:347] - INFO - 18.0% |█████████ | 0.17 samples/s/p 0:32:27 }
2024-03-19 12:35:50,806 - mindformers[mindformers/core/callback/callback.py:582] - INFO - ......Saving ckpt......
2024-03-19 12:36:17,182 - mindformers[mindformers/core/callback/callback.py:259] - WARNING - pipeline stages: 2 > 1, the loss on the last card is valid.
2024-03-19 12:36:17,183 - mindformers[mindformers/core/callback/callback.py:339] - INFO - { Epoch:[ 1/ 1], step:[ 38/ 200], loss: 9.907, per_step_time: 11904ms, lr: 4.9373055e-05, overflow cond: True, loss_scale: 1.0, TFLOPs: 2.28, global_norm: [2.3897629]
2024-03-19 12:36:17,184 - mindformers[mindformers/core/callback/callback.py:347] - INFO - 19.0% |█████████ | 0.17 samples/s/p 0:32:08 }
2024-03-19 12:36:17,185 - mindformers[mindformers/core/callback/callback.py:582] - INFO - ......Saving ckpt......
2024-03-19 12:36:43,389 - mindformers[mindformers/core/callback/callback.py:259] - WARNING - pipeline stages: 2 > 1, the loss on the last card is valid.
2024-03-19 12:36:43,390 - mindformers[mindformers/core/callback/callback.py:339] - INFO - { Epoch:[ 1/ 1], step:[ 40/ 200], loss: 9.899, per_step_time: 11989ms, lr: 4.9373055e-05, overflow cond: True, loss_scale: 1.0, TFLOPs: 2.26, global_norm: [2.441086]
2024-03-19 12:36:43,392 - mindformers[mindformers/core/callback/callback.py:347] - INFO - 20.0% |██████████ | 0.17 samples/s/p 0:31:58 }
2024-03-19 12:36:43,393 - mindformers[mindformers/core/callback/callback.py:582] - INFO - ......Saving ckpt......
2024-03-19 12:37:09,711 - mindformers[mindformers/core/callback/callback.py:259] - WARNING - pipeline stages: 2 > 1, the loss on the last card is valid.
2024-03-19 12:37:09,712 - mindformers[mindformers/core/callback/callback.py:339] - INFO - { Epoch:[ 1/ 1], step:[ 42/ 200], loss: 9.908, per_step_time: 12002ms, lr: 4.9373055e-05, overflow cond: True, loss_scale: 1.0, TFLOPs: 2.26, global_norm: [2.3874996]
2024-03-19 12:37:09,713 - mindformers[mindformers/core/callback/callback.py:347] - INFO - 21.0% |██████████ | 0.17 samples/s/p 0:31:36 }
2024-03-19 12:37:09,714 - mindformers[mindformers/core/callback/callback.py:582] - INFO - ......Saving ckpt......
Regression test conclusion: passed
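For anyone repeating this check, the per-step lines in the log above can be parsed to confirm that training keeps advancing after the checkpoint is loaded. The following is a rough sketch using only the Python standard library; the regular expression is derived from the log lines quoted here and is an assumption about the format, not something mindformers guarantees, and `parse_step_line` is a hypothetical helper name.

```python
import re

# Pattern inferred from the quoted log lines; other versions may log differently.
STEP_RE = re.compile(
    r"step:\[\s*(?P<step>\d+)/\s*(?P<total>\d+)\],\s*"
    r"loss:\s*(?P<loss>[\d.]+),\s*"
    r"per_step_time:\s*(?P<ms>\d+)ms"
)


def parse_step_line(line):
    """Return step, total, loss and per-step time from a log line, or None."""
    m = STEP_RE.search(line)
    if m is None:
        return None
    return {
        "step": int(m["step"]),
        "total": int(m["total"]),
        "loss": float(m["loss"]),
        "per_step_time_ms": int(m["ms"]),
    }


sample = (
    "2024-03-19 12:35:24,350 - mindformers[mindformers/core/callback/callback.py:339]"
    " - INFO - { Epoch:[ 1/ 1], step:[ 34/ 200], loss: 9.903,"
    " per_step_time: 15583ms, lr: 4.9373055e-05, overflow cond: True,"
    " loss_scale: 1.0, TFLOPs: 1.74, global_norm: [2.3966413]"
)
record = parse_step_line(sample)
skipped = parse_step_line("INFO - ......Saving ckpt......")
```

Lines that carry no step information, such as the "Saving ckpt" messages, return `None` and can be filtered out.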