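Running Large Models on RISC-V (Part 3): Chinese Extension for LLaMA

This is the third article in the Running Large Models on RISC-V series. The previous articles showed how to run LLaMA on RISC-V; this one shows how to add Chinese support to LLaMA.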

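1. Model Extension

The following steps are performed on an x86 machine:

1.1 Preparation

Install the latest version of Python and the following dependencies.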

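pip install protobuf==3.20.0    # structured data serialization format
pip install transformers        # converts the original model to HF format
pip install sentencepiece       # unsupervised text tokenizer and detokenizer
pip install peft                # tooling for working with LoRA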
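
Before moving on, it can help to confirm that the four packages are actually installed in the active environment. The following is a minimal sketch using only the Python standard library; the helper name check_packages is hypothetical and is not part of any of these libraries.

```python
from importlib import metadata

# Distribution names as passed to pip above.
REQUIRED = ["protobuf", "transformers", "sentencepiece", "peft"]


def check_packages(names):
    """Return {name: installed version, or None if the package is missing}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in check_packages(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED should be installed before continuing.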

1.2 Downloading the Models

Download the original LLaMA model and the Chinese extension.

Original LLaMA model:

https://ipfs.io/ipfs/Qmb9y5GCkTG7ZzbBWMu2BXwMkzyCKcUjtEKPpgdZ7GEFKm/

Chinese extension:

https://huggingface.co/ziqingyang/chinese-alpaca-lora-7b

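After downloading, the directory looks like this:

1.3 Merging the Models

(1) Convert the original LLaMA model to Hugging Face format. This step uses the convert_llama_weights_to_hf.py script provided by transformers.

Download link: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py

Run the following command: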

python convert_llama_weights_to_hf.py --input_dir path_to_original_llama_root_dir --model_size 7B --output_dir path_to_original_llama_hf_dir

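Command explanation: place the original LLaMA tokenizer.model in the directory given by --input_dir, and the remaining files under ${input_dir}/${model_size}. After the command above finishes, --output_dir will contain the converted Hugging Face weights.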
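
The directory layout described above can be checked before starting the conversion. This is a small sketch, not part of the transformers script; the function name missing_llama_inputs is hypothetical, and it only checks that tokenizer.model and the per-size subdirectory exist.

```python
from pathlib import Path


def missing_llama_inputs(input_dir, model_size="7B"):
    """Return the expected paths that do not exist under input_dir."""
    root = Path(input_dir)
    expected = [
        root / "tokenizer.model",  # tokenizer sits directly in --input_dir
        root / model_size,         # remaining files live in a per-size subdirectory
    ]
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    # "path_to_original_llama_root_dir" is the placeholder used in the command above.
    problems = missing_llama_inputs("path_to_original_llama_root_dir", "7B")
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("Layout looks ready for conversion.")
```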

(2) Merge the LoRA weights to generate the full Hugging Face model. This step uses merge_llama_with_chinese_lora.py.

Download link:

https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/merge_llama_with_chinese_lora.py

Run the command:

python merge_llama_with_chinese_lora.py --base_model path_to_original_llama_hf_dir --lora_model chinese-alpaca-lora-7b --output_dir path_to_output_dir

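Command explanation: --base_model is the Hugging Face directory produced by the previous step (its --output_dir), --lora_model is the downloaded chinese-alpaca-lora-7b directory, and --output_dir is where the merged model is written.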
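
Conceptually, merging a LoRA adapter folds a low-rank update into each adapted weight matrix, W' = W + (alpha / r) * B A, so the result can be loaded as an ordinary full-weight model without the adapter. The toy example below illustrates that arithmetic on tiny matrices in plain Python. It is a sketch of the general LoRA formula, not the code of merge_llama_with_chinese_lora.py; the function names and the values of W, A, B, alpha and r are made up for illustration.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A) for a d_out x d_in weight W."""
    scale = alpha / r
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [[wv + scale * dv for wv, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy 2x2 base weight with a rank-1 adapter (illustrative values only).
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 2.0]]  # r x d_in
B = [[0.5],
     [1.0]]       # d_out x r

merged = merge_lora(W, A, B, alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [2.0, 5.0]]
```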

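2. Porting the Model

After completing the steps above, you will have a path_to_output_dir directory with the following contents:

Upload consolidated.00.pth and params.json from that directory to the llama.cpp/models directory on the RISC-V machine. scp can do this: scp "source file path" user@address:destination path. For the remaining steps, see the second article in this series: Running Large Models on RISC-V (Part 2): A Zero-Background LLaMA Porting Tutorial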
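
The copy step can also be scripted. The sketch below only builds the scp argument list; the function name build_scp_command, the user name, the address 192.0.2.10 and the remote path are all placeholders, and the file names assume the merge step produced consolidated.00.pth and params.json.

```python
import shlex
from pathlib import Path


def build_scp_command(output_dir, user, host, remote_dir,
                      files=("consolidated.00.pth", "params.json")):
    """Build the scp argument list that copies the merged files to the board."""
    sources = [str(Path(output_dir) / name) for name in files]
    return ["scp", *sources, f"{user}@{host}:{remote_dir}"]


# All values below are placeholders for illustration only.
cmd = build_scp_command("path_to_output_dir", "user", "192.0.2.10", "llama.cpp/models")
print(shlex.join(cmd))
# To actually run the copy: subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems if a path contains spaces.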

Final result when running: