[Question]: 昇腾910b运行llama_npu_sft_N1C8.sh时报错 #9469

feria-tu · 2024-11-21T04:25:25Z

请提出你的问题

修改了llama_npu_sft_N1C8.sh中的卡数和模型，修改后文件具体如下，开始训练时报了这个错误：OSError: The mask tensor dtype must be bool , but got bfloat16
[/paddle/backends/npu/custom_op/fused_attention_npu.cc:214]

llama_npu_sft_N1C8.sh文件内容：
set -x

max_steps=${1:-100}

export GLOG_v=0
export FLAGS_npu_storage_format=1
export FLAGS_use_stride_kernel=0
export HCCL_OP_BASE_FFTS_MODE_ENABLE=TRUE
export MULTI_STREAM_MEMORY_REUSE=1
export HCCL_INTRA_PCIE_ENABLE=0
export HCCL_INTRA_ROCE_ENABLE=1
export FLAGS_allocator_strategy=naive_best_fit
export ASCEND_RT_VISIBLE_DEVICES="0,1,2,3"
export FLAGS_NPU_MC2=1
export MC2_Recompute=1

source /usr/local/Ascend/ascend-toolkit/set_env.sh
export PYTHONPATH=../../../:$PYTHONPATH
ps aux | grep run_finetune.py | grep -v grep | awk '{print $2}' | xargs kill -9

python -u -m paddle.distributed.launch
--devices "0,1,2,3"
--log_dir "./sft_bf16_qwen_N1C4"
../../run_finetune.py
--device "npu"
--model_name_or_path "Qwen/Qwen2.5-7B-Instruct"
--dataset_name_or_path "data/"
--output_dir "./output/sft_bf16_qwen_N1C4"
--per_device_train_batch_size 2
--gradient_accumulation_steps 64
--per_device_eval_batch_size 1
--eval_accumulation_steps 1
--max_steps ${max_steps}
--learning_rate 3e-06
--warmup_steps 20
--save_steps 1000
--logging_steps 1
--evaluation_strategy "epoch"
--src_length 1024
--max_length 4096
--bf16 true
--fp16_opt_level "O2"
--do_train true
--tensor_parallel_output true
--disable_tqdm true
--eval_with_do_generation false
--metric_for_best_model "accuracy"
--recompute false
--tensor_parallel_degree 4
--pipeline_parallel_degree 1
--zero_padding 0
--amp_master_grad true
--fuse_attention_qkv true
--fuse_attention_ffn true
--sequence_parallel 1
--use_flash_attention 1
--use_fused_rope 1
--use_fused_rms_norm 1
--sharding_parallel_degree 1
--pad_to_multiple_of 4096
--sharding "stage1"
--sharding_parallel_config "enable_stage1_tensor_fusion enable_stage1_overlap"

feria-tu added the question Further information is requested label Nov 21, 2024

paddle-bot bot assigned lugimzzz Nov 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Question]: 昇腾910b运行llama_npu_sft_N1C8.sh时报错 #9469

[Question]: 昇腾910b运行llama_npu_sft_N1C8.sh时报错 #9469

feria-tu commented Nov 21, 2024

[Question]: 昇腾910b运行llama_npu_sft_N1C8.sh时报错 #9469

[Question]: 昇腾910b运行llama_npu_sft_N1C8.sh时报错 #9469

Comments

feria-tu commented Nov 21, 2024

请提出你的问题