LLM Fine-tuning

Llama2-7b Fine-Tuning 4bit (QLoRA)

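This example shows how to fine-tune Llama2-7b to follow instructions. Instruction tuning is the first step in adapting a general-purpose large language model into a chatbot.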

This example uses no distributed training or big data functionality. It is designed to run locally on any machine with a GPU.

Prerequisites

HuggingFace API Token
Access approval for Llama2-7b-hf
GPU with at least 12 GiB of VRAM (in our tests, we used an Nvidia T4)
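
As a quick sanity check of the prerequisites above, the following minimal sketch tests whether a token is exposed through an environment variable and whether the first GPU reports enough memory. It assumes the token is stored in `HF_TOKEN` or `HUGGING_FACE_HUB_TOKEN` (variables read by the `huggingface_hub` library) and that PyTorch is installed. The helper names are hypothetical and not part of Ludwig.

```python
import os

MIN_VRAM_GIB = 12  # threshold stated in the prerequisites


def has_hf_token(env=None):
    """Return True if a non-empty Hugging Face token variable is set."""
    env = os.environ if env is None else env
    # Assumption: the token is provided through one of these two variables.
    return any(env.get(name) for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"))


def enough_vram(total_bytes, min_gib=MIN_VRAM_GIB):
    """Return True if total_bytes is at least min_gib GiB."""
    return total_bytes / 2**30 >= min_gib


def gpu_total_bytes():
    """Return the total memory of GPU 0 in bytes, or None if unavailable."""
    try:
        import torch  # imported lazily; PyTorch may not be installed
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    print("token found:", has_hf_token())
    total = gpu_total_bytes()
    print("GPU OK:", total is not None and enough_vram(total))
```

This check does not confirm that access to Llama2-7b-hf has been approved; that only becomes visible when the model weights are downloaded.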

Install

pip install ludwig "ludwig[llm]"

Running

We'll use the Stanford Alpaca dataset, which will be formatted as a table-like file that looks like this:

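instruction	input	output
Give three tips for staying healthy.		1. Eat a balanced diet and make sure to include...
Arrange the items given below in the order to...	cake, me, eating	I eating cake.
Write an introductory paragraph about a famous...	Michelle Obama	Michelle Obama is an inspirational woman who...
...	...	...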
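
To make the role of the columns concrete: the configuration further down defines a prompt template that refers to the `instruction` and `input` columns by name, while `output` is the text the model learns to generate. The sketch below fills that template for one row using plain Python `str.format`. This is only an illustration of the substitution, under the assumption that it behaves like simple placeholder replacement. It is not Ludwig's actual implementation, and `render_prompt` is a hypothetical helper.

```python
# Same text as the prompt template in the configuration file.
TEMPLATE = (
    "### Instruction:\n"
    "{instruction}\n"
    "\n"
    "### Input:\n"
    "{input}\n"
    "\n"
    "### Response:\n"
)


def render_prompt(row):
    """Fill the template with the instruction and input columns of one row."""
    return TEMPLATE.format(instruction=row["instruction"], input=row["input"])


# First row of the table; its input column is empty.
row = {
    "instruction": "Give three tips for staying healthy.",
    "input": "",
    "output": "1. Eat a balanced diet and make sure to include...",
}

prompt = render_prompt(row)
print(prompt)
```

Rows with an empty `input` still produce the `### Input:` header, followed by an empty line.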

Create a YAML config file named model.yaml with the following:

model_type: llm
base_model: meta-llama/Llama-2-7b-hf

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    ### Instruction:
    {instruction}

    ### Input:
    {input}

    ### Response:

input_features:
  - name: prompt
    type: text

output_features:
  - name: output
    type: text

trainer:
  type: finetune
  learning_rate: 0.0001
  batch_size: 1
  gradient_accumulation_steps: 16
  epochs: 3
  learning_rate_scheduler:
    warmup_fraction: 0.01

preprocessing:
  sample_ratio: 0.1
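
The trainer and preprocessing values above interact: with `batch_size: 1` and `gradient_accumulation_steps: 16`, gradients from 16 examples are combined before each weight update, and `sample_ratio: 0.1` keeps roughly a tenth of the dataset. The arithmetic can be sketched as follows. The row count of 52,000 is a round illustrative figure rather than a measured value, the sketch treats every sampled row as a training row (ignoring any train/validation/test split), and the helper names are hypothetical.

```python
import math


def effective_batch_size(batch_size, accumulation_steps):
    """Number of examples contributing to one weight update."""
    return batch_size * accumulation_steps


def optimizer_steps(num_rows, sample_ratio, batch_size, accumulation_steps, epochs):
    """Approximate number of weight updates over the whole run."""
    sampled_rows = int(num_rows * sample_ratio)
    per_epoch = math.ceil(
        sampled_rows / effective_batch_size(batch_size, accumulation_steps)
    )
    return per_epoch * epochs


NUM_ROWS = 52_000  # assumption: illustrative dataset size, not taken from the config

print(effective_batch_size(1, 16))
print(optimizer_steps(NUM_ROWS, 0.1, 1, 16, 3))
```

Keeping the per-step batch at 1 limits peak memory on a small GPU, while accumulation recovers the smoother updates of a larger batch.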

Now train the model:

ludwig train --config model.yaml --dataset "ludwig://alpaca"
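
The same run can also be started from Python instead of the command line. The following is a minimal sketch using Ludwig's `LudwigModel` class. It assumes that `model.yaml` is in the current working directory and that the prerequisites above are met. The wrapper function `train` is a hypothetical convenience, not part of Ludwig.

```python
import logging

CONFIG_PATH = "model.yaml"   # the config file created above
DATASET = "ludwig://alpaca"  # same dataset URI as in the CLI command


def train(config_path=CONFIG_PATH, dataset=DATASET):
    """Fine-tune the model described in config_path on dataset."""
    # Imported lazily so this file can be loaded without Ludwig installed.
    from ludwig.api import LudwigModel

    model = LudwigModel(config=config_path, logging_level=logging.INFO)
    # Returns training statistics, the preprocessed data, and the output directory.
    return model.train(dataset=dataset)
```

Calling `train()` downloads the base model and the dataset on first use, so it needs network access and the Hugging Face token in addition to the GPU.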