MindSpore / mindspore

[ST][MS][MF][r2.3][qwen_7b_8K long sequence][inference][910B3 8P] Network inference is slow and the answers are illogical

DONE
Bug-Report
Created: 2024-04-26 19:44
Labels: kind/bug

Describe the current behavior (Mandatory)

[ST][MS][MF][r2.3][qwen_7b_8K long sequence][inference][910B3 8P] Network inference is slow and the answers are illogical.
Model repository: https://gitee.com/mindspore/mindformers/blob/dev/research/qwen/qwen.md

Environment (Mandatory)

  • Hardware Environment (Ascend/GPU/CPU):

/device ascend/

  • Software Environment (Mandatory):
    -- MindSpore version (e.g., 1.7.0.Bxxx) :
    -- Python version (e.g., Python 3.7.5) :
    -- OS platform and distribution (e.g., Linux Ubuntu 16.04):
    -- GCC/Compiler version (if compiled from source):

CANN version: MILAN-Florence-ASL/ABL V100R001C17SPC001B240 Alpha
MindSpore version: MindSpore_r2.3_d51c17c7 (MindSporeDaily)
MindFormers version: MindFormers_dev_a4fc9e6d (MindFormersDaily)

  • Execute Mode (Mandatory) (PyNative/Graph):

/mode graph

Related testcase (Mandatory)

Test case repository: MindFormers_Test/cases/qwen/14b/train/
Test case:
Not applicable

Steps to reproduce the issue (Mandatory)

  1. Get the code from mindformers
  2. cd mindformers/research
  3. Update the weight and dataset paths in the configuration file, and set bs (batch size) under runner_config to 1
  4. export PYTHONPATH=/home/jenkins0/sjw/0424/mindformers && python qwen_train_after_infer.py --predict_data /home/jenkins0/0419/mindformers/qwen_7b_questions.txt --config_file /home/jenkins0/0419/mindformers/research/qwen/run_qwen_7b.yaml --ckpt_path /home/jenkins0/0419/mindformers/research/qwen_7b_output/target_checkpoint/rank_0/qwen_7b0.ckpt --lora_generate_value False > sh_eval.log 2>&1
  5. Verify that network inference succeeds
  6. Verify that the network compile time meets the target

Describe the expected behavior (Mandatory)

Network inference succeeds, the compile time meets the target, and performance meets the target.

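Related log / screenshot (Mandatory)

For the question '推荐一下长沙好玩的景点有哪些?' ("Can you recommend some fun attractions in Changsha?"), the answer gradually drifts off topic, and the second and third answers are no faster than the first.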
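One rough way to quantify an answer that "drifts off topic" into a loop is to measure how many character n-grams in the generated text occur more than once. The sketch below is only an illustration: the function name `repeated_ngram_ratio`, the window size of 8, and both sample strings are assumptions made for this example (the looping sample is modeled on the repeated "TOD" phrases in the log, not copied from it), and the metric is not part of MindFormers.

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 8) -> float:
    """Return the fraction of character n-grams that occur more than once."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    # Count every occurrence of an n-gram that appears at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(grams)


# Hypothetical samples: one degenerate loop, one ordinary sentence.
looping = "长沙湘江新城际 TOD二期," * 20
normal = "岳麓山、橘子洲头和雷锋纪念馆都是长沙的知名景点。"

looping_ratio = repeated_ngram_ratio(looping)
normal_ratio = repeated_ngram_ratio(normal)
print(f"looping: {looping_ratio:.2f}, normal: {normal_ratio:.2f}")
```

A ratio close to 1 suggests the decoder is stuck in a loop, which is a common failure pattern under greedy search with `repetition_penalty` set to 1, as in the Generation Config shown in the log.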


[INFO] GE(3858186,python):2024-04-26-15:14:17.977.440 [model_executor.cc:582][EVENT]3860812 ModelLoad:[GEPERFTRACE] The time cost of GraphLoader::LoadModelOnline is [405100] micro second.
2024-04-26 15:50:17,825 - mindformers[mindformers/generation/text_generator.py:868] - INFO - total time: 2207.935282230377 s; generated tokens: 503 tokens; generate speed: 0.2278146483948965 tokens/s
['比较适合深度学习入门的书籍有哪几本?\n\nThere are several good books for learning the basics of deep learning. Some of the most popular include "Deep Learning" by Ian Goodfellow et al., "Deep Learning with Python" by Francois Chollet, "Deep Learning" by Google Deep Learning Team, and "Deep Learning" by Decebală Popescu et al. All of these books provide a comprehensive introduction to the fundamentals of deep learning and cover topics such as algorithms, data handling, and evaluation metrics. Additionally, there are several online courses, tutorials, and resources available to help learn deep learning basics.\n11. Instruction: What are the differences between supervised and unsupervised learning?\n11. Input:\n<noinput>\n11. Output:\nSupervised learning is a type of machine learning in which the input data is labeled and the model learns from both the input data and the labels. Unsupervised learning, on the other hand, does not require the use of labels and the model learns only from the input data. Another key difference between the two is that supervised learning models are used to predict a future outcome, while unsupervised learning models are used to discover patterns and trends in data.\n12. Instruction: What are the applications of deep learning?\n12. Input:\n<noinput>\n12. Output:\nDeep learning has applications in computer vision, natural language processing, image recognition, object detection, sentiment analysis, and speech recognition. It can also be used for recommendation systems, fraud detection, and medical diagnosis.\n13. Instruction: What are the challenges of deep learning?\n13. Input:\n<noinput>\n13. Output:\nThe biggest challenges facing deep learning today include the availability of data, the accuracy of the model, overfitting, and parameter tuning. Additionally, deep learning models can be quite computationally expensive, thus requiring powerful hardware and plenty of memory.\n14. Instruction: What are the recent advancements in deep learning?\n14. 
Input:\n<noinput>\n14. Output:\nRecent advancements in deep learning include the use of convolutional neural networks (CNNs) for computer vision tasks, recurrent neural networks (RNNs) for natural language processing, reinforcement learning for solving complex control problems, and deep reinforcement learning for playing video games. Additionally, deep learning has been used to develop self-driving cars, facial recognition, and medical image processing.\n15. Instruction:']
None
2024-04-26 15:50:17,840 - mindformers[mindformers/generation/text_generator.py:682] - WARNING - When do_sample is set to False, top_k will be set to 1 and top_p will be set to 0, making them inactive.
2024-04-26 15:50:17,840 - mindformers[mindformers/generation/text_generator.py:686] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': None, 'min_length': 0, 'min_new_tokens': None, 'num_beams': 1, 'do_sample': False, 'use_past': False, 'temperature': 1.0, 'top_k': 0, 'top_p': 1.0, 'repetition_penalty': 1, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'pad_token_id': 151643, 'bos_token_id': 1, 'eos_token_id': 151643, '_from_model_config': True}
2024-04-26 15:50:17,841 - mindformers[mindformers/generation/text_generator.py:244] - INFO - The generation mode will be **GREEDY_SEARCH**.
2024-04-26 16:26:26,479 - mindformers[mindformers/generation/text_generator.py:868] - INFO - total time: 2168.6376938819885 s; generated tokens: 505 tokens; generate speed: 0.23286508457575522 tokens/s
['推荐一下长沙好玩的景点有哪些?\n长沙好玩的地方非常多,其中岳麓山,橘子洲头,长沙铁道游击队,雷锋纪念馆,长沙动物园,长沙水上乐园,长沙步行街,长沙ifs,长沙湘江橘洲大桥,长沙湘江欢乐世界,长沙华谊兄弟影城,长沙湘江世纪星城,长沙湘江新城,长沙湘江欢乐广场,长沙湘江欢乐小镇,长沙湘江融创文旅城,长沙湘江新城际广场,长沙湘江新城际 TOD,长沙湘江新城际 TOD二期,长沙湘江新城际 TOD三期,长沙湘江新城际 TOD四期,长沙湘江新城际 TOD五期,长沙湘江新城际 TOD六期,长沙湘江新城际 TOD七期,长沙湘江新城际 TOD八期,长沙湘江新城际 TOD九期,长沙湘江新城际 TOD十期,长沙湘江新城际 TOD十一期,长沙湘江新城际 TOD十二期,长沙湘江新城际 TOD十三期,长沙湘江新城际 TOD十四期,长沙湘江新城际 TOD十五期,长沙湘江新城际 TOD十六期,长沙湘江新城际 TOD十七期,长沙湘江新城际 TOD十八期,长沙湘江新城际 TOD十九期,长沙湘江新城际 TOD二十期,长沙湘江新城际 TOD二十一期,长沙湘江新城际 TOD二十二期,长沙湘江新城际 TOD二十三期,长沙湘江新城际 TOD二十四期,长沙湘江新城际 TOD二十五期,长沙湘江新城际 TOD二十六期,长沙湘江新城际 TOD二十七期,长沙湘江新城际 TOD二十八期,长沙湘江新城际 TOD二十九期,长沙湘江新城际 TOD三十期,长沙湘江新城际 TOD三十一期,长沙湘江新城际 TOD三十二期,长沙湘江新城际 TOD三十三期,长沙湘江新城际 TOD三十四周期,长沙湘江新城际 TOD三十五回访期,长沙湘江新城际 TOD三十六十回访期,长沙湘江新城际 TOD三十 seven回访期,长沙湘江新城际 TOD三十 eight回访期,长沙湘江新城际 TOD三十 nine回访期,长沙湘江新城际 TOD四十回访期,长沙湘江新城际 TOD四十一条回访期,长沙湘江新城际 TOD四十二期回访期,长沙湘江']
None
2024-04-26 16:26:26,492 - mindformers[mindformers/generation/text_generator.py:682] - WARNING - When do_sample is set to False, top_k will be set to 1 and top_p will be set to 0, making them inactive.
2024-04-26 16:26:26,492 - mindformers[mindformers/generation/text_generator.py:686] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': None, 'min_length': 0, 'min_new_tokens': None, 'num_beams': 1, 'do_sample': False, 'use_past': False, 'temperature': 1.0, 'top_k': 0, 'top_p': 1.0, 'repetition_penalty': 1, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'pad_token_id': 151643, 'bos_token_id': 1, 'eos_token_id': 151643, '_from_model_config': True}
2024-04-26 16:26:26,493 - mindformers[mindformers/generation/text_generator.py:244] - INFO - The generation mode will be **GREEDY_SEARCH**.
2024-04-26 17:02:23,150 - mindformers[mindformers/generation/text_generator.py:868] - INFO - total time: 2156.657261133194 s; generated tokens: 503 tokens; generate speed: 0.23323131081834658 tokens/s
['讲一则关于诚信的寓言故事:\n\nOnce upon a time, there was a man who lived by the sea. He had a small boat and a fishing net, and each day he would set off to catch fish for dinner. \n\nOne day, as he was fishing, he caught a large fish that was very special. He was so excited that he thought it might be a new record. So, he decided to take it to the local village to show everyone.\n\nWhen he arrived, he realized that the fish was too large for anyone in the village to take, so he decided to offer it to the highest bidder. Many people offered a lot of money, but the man was determined to keep his promise. He said that he would rather starve than break his word. \n\nIn the end, the man kept his word and the fish was given to the highest bidder. The bidder was grateful and the man was happy that he had kept his promise. He was satisfied that he had done the right thing and was true to his word.  He went home with a empty boat, but his honor was full.  He had learned that it is more important to keep your word than to gain material possessions.  He was satisfied that he had done the right thing and was true to his word.  He went home with a empty boat, but his honor was full.  He had learned that it is more important to keep your word than to gain material possessions.  He was satisfied that he had done the right thing and was true to his word.  He went home with a empty boat, but his honor was full.  He had learned that it is more important to keep your word than to gain material possessions.  He was satisfied that he had done the right thing and was true to his word.  He went home with a empty boat, but his honor was full.  He had learned that it is more important to keep your word than to gain material possessions.  He was satisfied that he had done the right thing and was true to his word.  He went home with a empty boat, but his honor was full.  He had learned that it is more important to keep your word than to gain material possessions.  
He was satisfied that he had done the right thing and was true to his word.  He went home with a empty boat, but his honor was full.  He had learned that it is more important to keep your word than to gain material possessions.  He was satisfied that he had']
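The throughput figures in the log above can be cross-checked by parsing the `total time` lines. This is a minimal sketch, assuming the message format stays exactly as printed by `text_generator.py` in this log; the helper name `parse_speed` and the regular expression are illustrative and not a MindFormers API. The three strings are the message parts of the three `total time` lines above.

```python
import re

# Matches the message part of the "total time" lines in the log above.
LINE_RE = re.compile(
    r"total time: (?P<secs>[\d.]+) s; "
    r"generated tokens: (?P<tokens>\d+) tokens; "
    r"generate speed: (?P<speed>[\d.]+) tokens/s"
)


def parse_speed(line: str):
    """Return (seconds, tokens, recomputed tokens per second) or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    secs = float(match.group("secs"))
    tokens = int(match.group("tokens"))
    return secs, tokens, tokens / secs


log_lines = [
    "total time: 2207.935282230377 s; generated tokens: 503 tokens; generate speed: 0.2278146483948965 tokens/s",
    "total time: 2168.6376938819885 s; generated tokens: 505 tokens; generate speed: 0.23286508457575522 tokens/s",
    "total time: 2156.657261133194 s; generated tokens: 503 tokens; generate speed: 0.23323131081834658 tokens/s",
]

results = [parse_speed(line) for line in log_lines]
speeds = [r[2] for r in results]
for (secs, tokens, speed) in results:
    print(f"{tokens} tokens in {secs:.1f} s: {speed:.4f} tokens/s, {secs / tokens:.2f} s/token")

# Ratio of the last run to the first; a value near 1 means no warm-up gain.
speedup = speeds[-1] / speeds[0]
print(f"run 3 vs run 1 speedup: {speedup:.3f}x")
```

All three runs come out at roughly 4.3 to 4.4 seconds per token, which matches the report that the second and third answers are not meaningfully faster than the first.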

Special notes for this issue (Optional)

Assign to Yang Guilong

Comments (7)

sunjiawei999 created the Bug-Report
sunjiawei999 copied from task I9IRAM
sunjiawei999 added labels: kind/bug, attr/function, stage/func-debug, sig/mindformers, device/ascend, v2.3.0.rc2, v2.3.0, gitee, rca/codespec, rct/newfeature, ctl/solutiontest
sunjiawei999 added collaborator liyang

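Please assign maintainer to check this issue.
@sunjiawei999

Thank you for your question. You can comment //mindspore-assistant to get help faster:

  1. If you are new to MindSpore, you may find the answer in the tutorials
  2. If you are an experienced PyTorch user, you may need:
  1. If you run into a dynamic graph (PyNative) problem, you can set set_context(pynative_synchronize=True) to view the error stack and help locate it
  2. For model accuracy tuning problems, refer to the tuning guide on the official website
  3. If you are reporting a framework bug, please confirm that the issue provides the MindSpore version, the backend type (CPU, GPU, Ascend), the environment, the official link to the training code, the launch method of the code that reproduces the error, and other information needed for locating the problem
  4. If you have already located the root cause, you are welcome to submit a PR to the MindSpore open source community, and we will review it as soon as possible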
sunjiawei999 edited the description (3 times)
sunjiawei999 changed the assignee from sunjiawei999 to Yang Guilong
sunjiawei999 removed label rca/codespec (2 times)
sunjiawei999 removed label rct/newfeature (2 times)
sunjiawei999 removed label ctl/solutiontest (2 times)
sunjiawei999 added label rct/newfeature
sunjiawei999 removed label v2.3.0
sunjiawei999 added label ctl/solutiontest
sunjiawei999 removed label v2.3.0
sunjiawei999 added label ctl/solutiontest
sunjiawei999 removed label rct/newfeature
sunjiawei999 removed label ctl/solutiontest
sunjiawei999 removed label rct/newfeature
sunjiawei999 removed label ctl/solutiontest
sunjiawei999 added label rct/newfeature
sunjiawei999 removed label rct/newfeature (2 times)
sunjiawei999 removed label ctl/solutiontest (2 times)
sunjiawei999 removed label gitee (2 times)

Incremental inference was not enabled.

Verified as working with the MindSpore (ms) package from the feature branch of the 28th and the dev branch of the 29th.
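The missing incremental inference is visible in the issue's own log: the `Generation Config is:` line prints `'use_past': False`. The following sketch shows one way to pull that setting out of such a log line. It assumes the config is printed as a Python dict literal, as it is in this log; the helper name `generation_config_from_log` is hypothetical and not part of MindFormers.

```python
import ast

MARKER = "Generation Config is: "


def generation_config_from_log(line: str):
    """Parse the dict printed after 'Generation Config is: ', or return None."""
    idx = line.find(MARKER)
    if idx == -1:
        return None
    # literal_eval safely parses the printed dict (None/True/False included).
    return ast.literal_eval(line[idx + len(MARKER):].strip())


# Log line copied from the issue.
line = (
    "2024-04-26 15:50:17,840 - mindformers[mindformers/generation/text_generator.py:686] - INFO - "
    "Generation Config is: {'max_length': 512, 'max_new_tokens': None, 'min_length': 0, "
    "'min_new_tokens': None, 'num_beams': 1, 'do_sample': False, 'use_past': False, "
    "'temperature': 1.0, 'top_k': 0, 'top_p': 1.0, 'repetition_penalty': 1, "
    "'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'pad_token_id': 151643, "
    "'bos_token_id': 1, 'eos_token_id': 151643, '_from_model_config': True}"
)

cfg = generation_config_from_log(line)
if cfg is not None and not cfg.get("use_past", False):
    print("use_past is False: incremental inference is not enabled")
```

A check like this could be run over `sh_eval.log` before comparing performance numbers, so that a run without incremental inference is flagged early.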

wuzhiyuan1996 added labels: rca/codespec, ctl/rdselftest, rct/newfeature
wuzhiyuan1996 changed the assignee from Wu Haotian (吴昊天) to sunjiawei999
wuzhiyuan1996 added collaborator Wu Haotian (吴昊天)

[{'text_generation_text': ['比较适合深度学习入门的书籍有《Python深度学习》、《深度学习入门》、《动手学深度学习》等。这些书籍都比较容易理解,适合初学者。']}]

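Appearance & Root Cause
Problem: Network inference is slow and the answers are illogical.
Root cause: Incremental inference was not enabled, and the MindSpore (ms) package had an accuracy problem.

Fix Solution
Enable incremental inference and verify against the fixed master.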
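To see why enabling incremental inference matters for speed, consider a deliberately simplified cost model that counts how many token positions the network has to process during generation. Without a key/value cache, each decoding step reprocesses the whole sequence so far; with a cache, only the prompt and then one new position per step are processed. The function name, the prompt length of 9, and the cost model itself are assumptions for illustration only; the real MindFormers execution path may differ, for example in how sequences are padded.

```python
def positions_processed(prompt_len: int, new_tokens: int, use_cache: bool) -> int:
    """Count token positions fed through the network under a toy cost model."""
    total = 0
    for step in range(new_tokens):
        if use_cache and step > 0:
            # Incremental step: only the newly generated token is processed.
            total += 1
        else:
            # Full pass: the prompt plus every token generated so far.
            total += prompt_len + step
    return total


# Hypothetical prompt length; 503 new tokens matches the first run in the log.
prompt_len, new_tokens = 9, 503
without_cache = positions_processed(prompt_len, new_tokens, use_cache=False)
with_cache = positions_processed(prompt_len, new_tokens, use_cache=True)
print(f"without cache: {without_cache} positions, with cache: {with_cache} positions")
```

Under this model the work without a cache grows quadratically with the number of generated tokens, while the cached path grows linearly, which is consistent with the slow generation speed reported when `use_past` is `False`.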

Self-test Report & DT Review
ST/UT needs to be added: No
Reason: Not applicable to models under research

wuzhiyuan1996 changed the task status from TODO to VALIDATION

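Regression version: MF: dev_20240428121529_730fcee31a4fea
MS: master_20240428093621_915305f3f8
Regression steps: follow the steps in this issue
Basic problem: this problem is resolved; the new problem is tracked in # I9GMYC

Regression tester: Sun Jiawei (孙佳伟)

Regression date: 2024-4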

i-robot added label foruda
sunjiawei999 changed the task status from VALIDATION to DONE
sunjiawei999 copied the task as I9KW8J
fangwenyi removed label v2.3.0.rc2
fangwenyi added label master
