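Adding a Hyperopt Algorithm

Hyperparameter optimization in Ludwig is built on two abstract interfaces: HyperoptSampler and HyperoptExecutor.

For examples of how the sampler and executor are configured, see the Hyperopt configuration documentation.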

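HyperoptSampler

HyperoptSampler dictates how hyperparameter values are sampled.

The sampler is configured by the sampler section inside the hyperopt section of the Ludwig configuration.

Each hyperparameter to be sampled is declared in hyperopt.parameters, which also specifies any additional constraints the sampler should honor. For example: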

hyperopt:
    goal: minimize
    output_feature: combined
    metric: loss
    split: validation
    parameters:
        trainer.learning_rate:
            space: linear
            range:
                low: 0.001
                high: 0.1
            steps: 4
        text.fc_layers:
            space: choice
            categories:
                - [{"output_size": 512}, {"output_size": 256}]
                - [{"output_size": 512}]
                - [{"output_size": 256}]

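Here, trainer.learning_rate is sampled continuously, while text.fc_layers is sampled discretely.

Note

The available HyperoptSampler implementations are described in the Hyperopt configuration documentation.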

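HyperoptExecutor

HyperoptExecutor dictates how the hyperparameter optimization is executed, independently of how the hyperparameters are actually sampled.

A HyperoptExecutor uses a HyperoptSampler to sample hyperparameter values, usually initializes an execution context (for example, a multithreaded pool), and runs the hyperparameter optimization according to the sampler.

First, a new batch of parameter values is sampled from the HyperoptSampler. The sampled values are then merged with the seed Ludwig configuration, with the sampled values overriding those of the seed.

Training is then run, and the validation losses and metrics are collected. The (sampled_parameters, statistics) pairs are passed to the HyperoptSampler.update function to inform the next round of sampling.

The loop repeats until all samples have been drawn.

Finally, HyperoptExecutor.execute returns a list of dictionaries, each containing the sampled parameters, the metric score, and other training, validation, and test statistics.

The returned list is printed and saved to disk so that it can also be used as input to the hyperparameter optimization visualizations.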
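The loop described above can be sketched in a few lines of Python. This is an illustration only, not Ludwig's implementation: ListSampler, merge_sample, run_hyperopt, the train_fn callback, and the result keys are all hypothetical names, and the merge assumes that dotted parameter names address nested dictionaries in the seed configuration.

```python
import copy
from typing import Any, Callable, Dict, List


class ListSampler:
    """Hypothetical stand-in for a HyperoptSampler: replays a fixed list."""

    def __init__(self, samples: List[Dict[str, Any]]):
        self._pending = list(samples)
        self.history = []  # (parameters, metric_score) pairs received so far

    def finished(self) -> bool:
        return not self._pending

    def sample_batch(self, batch_size: int = 1) -> List[Dict[str, Any]]:
        batch = self._pending[:batch_size]
        self._pending = self._pending[batch_size:]
        return batch

    def update_batch(self, parameters_metric_tuples) -> None:
        self.history.extend(parameters_metric_tuples)


def merge_sample(seed_config: Dict[str, Any], sample: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the seed config with sampled values overriding it."""
    merged = copy.deepcopy(seed_config)
    for dotted_key, value in sample.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for key in parents:
            # Assumption: every path component is a nested dict key.
            node = node.setdefault(key, {})
        node[leaf] = value
    return merged


def run_hyperopt(sampler, seed_config: Dict[str, Any],
                 train_fn: Callable[[Dict[str, Any]], Dict[str, float]],
                 batch_size: int = 2) -> List[Dict[str, Any]]:
    """Sample, merge, train, and report back until the sampler is exhausted."""
    results = []
    while not sampler.finished():
        scored = []
        for parameters in sampler.sample_batch(batch_size):
            stats = train_fn(merge_sample(seed_config, parameters))
            scored.append((parameters, stats["loss"]))
            results.append({
                "parameters": parameters,
                "metric_score": stats["loss"],
                "training_stats": stats,
            })
        sampler.update_batch(scored)  # inform the next round of sampling
    return results
```

The seed configuration is deep-copied for every sample so that one trial's overrides never leak into the next.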

Note

The available HyperoptExecutor implementations are described in the Hyperopt configuration documentation.

Adding a HyperoptSampler

1. Add a new sampler class

The source code for the base HyperoptSampler class is in ludwig/hyperopt/sampling.py.

Classes extending the base class should be defined in this file.

`__init__`

def __init__(self, goal: str, parameters: Dict[str, Any]):

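The parameters of the base HyperoptSampler class constructor are:

goal indicates whether to minimize or maximize a metric or loss of any of the output features, on any of the splits, as defined in the hyperopt section.
parameters contains all the hyperparameters to optimize, together with their types and ranges or values.

Example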

goal = "minimize"
parameters = {
    "training.learning_rate": {
        "type": "float",
        "low": 0.001,
        "high": 0.1,
        "steps": 4,
        "scale": "linear"
    },
    "combiner.num_fc_layers": {
        "type": "int",
        "low": 2,
        "high": 6,
        "steps": 3
    }
}

sampler = GridSampler(goal, parameters)
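To make the parameters dictionary more concrete, here is a minimal sketch of how a grid could be enumerated from low, high, and steps. The helpers grid_values and grid_samples are hypothetical, and the linear spacing is one plausible reading of the example above, not necessarily what GridSampler does; the "scale" key is ignored and only linear spacing is handled.

```python
import itertools
from typing import Any, Dict, List


def grid_values(spec: Dict[str, Any]) -> List[Any]:
    """Return `steps` evenly spaced values between low and high, inclusive."""
    low, high, steps = spec["low"], spec["high"], spec["steps"]
    if steps < 2:
        return [low]
    stride = (high - low) / (steps - 1)
    values = [low + i * stride for i in range(steps)]
    if spec["type"] == "int":
        # Assumption: integer parameters are rounded to the nearest grid point.
        return [int(round(v)) for v in values]
    return values


def grid_samples(parameters: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Enumerate the Cartesian product of every parameter's grid values."""
    names = list(parameters)
    axes = [grid_values(parameters[name]) for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*axes)]
```

Under these assumptions, the two parameters in the example (4 steps and 3 steps) produce 12 combinations.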

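`sample`

def sample(self) -> Dict[str, Any]:

sample is a method that yields a new sample according to the sampler. It returns a dictionary mapping parameter names to their values. If finished() returns True, calling sample raises an IndexError.

Example return value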

{'training.learning_rate': 0.005, 'combiner.num_fc_layers': 2, 'utterance.cell_type': 'gru'}

`sample_batch`

def sample_batch(self, batch_size: int = 1) -> List[Dict[str, Any]]:

The sample_batch method returns a list of at most batch_size sampled parameter dictionaries. If finished() returns True, calling sample_batch raises an IndexError.

Example return value

[{'training.learning_rate': 0.005, 'combiner.num_fc_layers': 2, 'utterance.cell_type': 'gru'}, {'training.learning_rate': 0.015, 'combiner.num_fc_layers': 3, 'utterance.cell_type': 'lstm'}]
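The contract between sample, sample_batch, and finished can be illustrated with a small, self-contained class. ReplaySampler is a hypothetical name and does not inherit from Ludwig's base class; it only mirrors the behavior described above, including a final batch that is shorter than batch_size and an IndexError once the sampler is exhausted.

```python
from typing import Any, Dict, List


class ReplaySampler:
    """Hypothetical sampler that hands out a fixed list of samples in order."""

    def __init__(self, samples: List[Dict[str, Any]]):
        self._samples = list(samples)
        self._next = 0  # index of the next sample to hand out

    def finished(self) -> bool:
        return self._next >= len(self._samples)

    def sample(self) -> Dict[str, Any]:
        if self.finished():
            raise IndexError("all samples have already been drawn")
        drawn = self._samples[self._next]
        self._next += 1
        return drawn

    def sample_batch(self, batch_size: int = 1) -> List[Dict[str, Any]]:
        if self.finished():
            raise IndexError("all samples have already been drawn")
        batch = []
        # Stop early when the sampler runs out, so the last batch may be short.
        while len(batch) < batch_size and not self.finished():
            batch.append(self.sample())
        return batch
```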

`update`

def update(
    self,
    sampled_parameters: Dict[str, Any],
    metric_score: float
):

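update updates the sampler with the results of previous computations.

sampled_parameters is a dictionary of sampled parameters.
metric_score is the value of the optimization metric obtained for the specified sample.

It is not needed for stateless strategies such as grid and random, but it is needed for stateful strategies such as bayesian and evolutionary.

Example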

sampled_parameters = {
    'training.learning_rate': 0.005,
    'combiner.num_fc_layers': 2,
    'utterance.cell_type': 'gru'
}
metric_score = 2.53463

sampler.update(sampled_parameters, metric_score)

`update_batch`

def update_batch(
    self,
    parameters_metric_tuples: Iterable[Tuple[Dict[str, Any], float]]
):

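update_batch updates the sampler with the results of previous computations, in batch.

parameters_metric_tuples is an iterable of pairs of sampled parameters and their respective metric values.

It is not needed for stateless strategies such as grid and random, but it is needed for stateful strategies such as bayesian and evolutionary.

Example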

sampled_parameters = [
    {
        'training.learning_rate': 0.005,
        'combiner.num_fc_layers': 2,
        'utterance.cell_type': 'gru'
    },
    {
        'training.learning_rate': 0.015,
        'combiner.num_fc_layers': 5,
        'utterance.cell_type': 'lstm'
    }
]
metric_scores = [2.53463, 1.63869]

sampler.update_batch(zip(sampled_parameters, metric_scores))
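As a rough illustration of why stateful strategies need update and update_batch, the sketch below keeps track of the best result reported so far, respecting the goal. BestTrackingSampler is a hypothetical class written for this example; a real Bayesian or evolutionary sampler would use the reported scores to decide what to sample next rather than just remembering the best one.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


class BestTrackingSampler:
    """Hypothetical stateful sampler fragment: remembers the best score seen."""

    def __init__(self, goal: str):
        if goal not in ("minimize", "maximize"):
            raise ValueError(f"goal must be 'minimize' or 'maximize', got {goal!r}")
        self.goal = goal
        self.best_parameters: Optional[Dict[str, Any]] = None
        self.best_score: Optional[float] = None

    def _is_better(self, metric_score: float) -> bool:
        if self.best_score is None:
            return True
        if self.goal == "minimize":
            return metric_score < self.best_score
        return metric_score > self.best_score

    def update(self, sampled_parameters: Dict[str, Any], metric_score: float) -> None:
        if self._is_better(metric_score):
            self.best_parameters = dict(sampled_parameters)
            self.best_score = metric_score

    def update_batch(
        self, parameters_metric_tuples: Iterable[Tuple[Dict[str, Any], float]]
    ) -> None:
        # The batch form simply applies the single-sample update to each pair.
        for sampled_parameters, metric_score in parameters_metric_tuples:
            self.update(sampled_parameters, metric_score)
```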

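`finished`

def finished(self) -> bool:

The finished method returns True when all samples have been drawn, and False otherwise.

2. Add the new sampler class to the corresponding sampler registry

The sampler_registry contains a mapping between the sampler names used in the hyperopt section of the model definition and the HyperoptSampler subclasses.

To make a new sampler available, add it to the registry: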

sampler_registry = {
    "random": RandomSampler,
    "grid": GridSampler,
    ...,
    "new_sampler_name": NewSamplerClass
}
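A registry like this is typically consumed by looking up the configured name and instantiating the resulting class. The sketch below shows that pattern with placeholder classes; get_sampler_class is a hypothetical helper, and how Ludwig itself resolves names from the registry may differ.

```python
from typing import Dict, Type


class HyperoptSampler:
    """Placeholder for the base class, for illustration only."""


class RandomSampler(HyperoptSampler):
    pass


class GridSampler(HyperoptSampler):
    pass


class NewSamplerClass(HyperoptSampler):
    pass


sampler_registry: Dict[str, Type[HyperoptSampler]] = {
    "random": RandomSampler,
    "grid": GridSampler,
    "new_sampler_name": NewSamplerClass,
}


def get_sampler_class(name: str) -> Type[HyperoptSampler]:
    """Resolve a sampler name from the configuration to its class."""
    try:
        return sampler_registry[name]
    except KeyError:
        available = ", ".join(sorted(sampler_registry))
        raise ValueError(
            f"Unknown sampler {name!r}. Available samplers: {available}"
        ) from None
```

Failing with the list of registered names makes a typo in the configuration easy to diagnose.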

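Adding a HyperoptExecutor

1. Add a new executor class

The source code for the base HyperoptExecutor class is in the ludwig/utils/hyperopt_utils.py module. Classes extending the base class should be defined in this module.

`__init__`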

def __init__(
    self,
    hyperopt_sampler: HyperoptSampler,
    output_feature: str,
    metric: str,
    split: str
):

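The parameters of the base HyperoptExecutor class constructor are:

hyperopt_sampler is a HyperoptSampler object used to sample hyperparameter values.
output_feature is a str containing the name of the output feature whose metric or loss we want to optimize. Available values are combined (the default) or the name of any output feature provided in the model definition. combined is a special output feature that allows optimizing the aggregated loss and metrics of all output features.
metric is the metric we want to optimize. The default is loss, but depending on the type of the feature defined in output_feature, different metrics and losses are available. Check the metrics section of the specific output feature type to see which ones are available.
split is the data split on which we want to compute the metric. The default is the validation split, but the train or test split can be specified instead.

Example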

goal = "minimize"
parameters = {
    "training.learning_rate": {
        "type": "float",
        "low": 0.001,
        "high": 0.1,
        "steps": 4,
        "scale": "linear"
    },
    "combiner.num_fc_layers": {
        "type": "int",
        "low": 2,
        "high": 6,
        "steps": 3
    }
}
output_feature = "combined"
metric = "loss"
split = "validation"

grid_sampler = GridSampler(goal, parameters)
executor = SerialExecutor(grid_sampler, output_feature, metric, split)

`execute`

def execute(
    self,
    config,
    dataset=None,
    training_set=None,
    validation_set=None,
    test_set=None,
    training_set_metadata=None,
    data_format=None,
    experiment_name="hyperopt",
    model_name="run",
    model_load_path=None,
    model_resume_path=None,
    skip_save_training_description=False,
    skip_save_training_statistics=False,
    skip_save_model=False,
    skip_save_progress=False,
    skip_save_log=False,
    skip_save_processed_input=False,
    skip_save_unprocessed_output=False,
    skip_save_predictions=False,
    skip_save_eval_stats=False,
    output_directory="results",
    gpus=None,
    gpu_memory_limit=None,
    allow_parallel_threads=True,
    use_horovod=None,
    random_seed=default_random_seed,
    debug=False,
    **kwargs
):

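The execute method runs the hyperparameter optimization. It can use the run_experiment function to obtain training and evaluation statistics, and the self.get_metric_score function to extract the metric score from the evaluation results according to self.output_feature, self.metric, and self.split.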
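To show what extracting a score by split, output feature, and metric might look like, here is a minimal standalone sketch. It assumes the statistics are a nested dictionary keyed as stats[split][output_feature][metric]; that layout, and this free-function form of get_metric_score, are assumptions made for illustration and may not match the structure Ludwig actually returns.

```python
from typing import Any, Dict


def get_metric_score(
    stats: Dict[str, Dict[str, Dict[str, Any]]],
    split: str,
    output_feature: str,
    metric: str,
) -> float:
    """Look up one metric value, assuming stats[split][output_feature][metric]."""
    try:
        return float(stats[split][output_feature][metric])
    except KeyError as missing:
        raise KeyError(
            f"{missing.args[0]!r} not found while looking up "
            f"{split}/{output_feature}/{metric}"
        ) from None


# Hypothetical statistics for a single trial.
example_stats = {
    "validation": {"combined": {"loss": 0.42}},
    "test": {"combined": {"loss": 0.47}},
}
score = get_metric_score(example_stats, "validation", "combined", "loss")
```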

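2. Add the new executor class to the corresponding executor registry

The executor_registry contains a mapping between the executor names used in the hyperopt section of the model definition and the HyperoptExecutor subclasses. To make a new executor available, add it to the registry: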

executor_registry = {
    "serial": SerialExecutor,
    "parallel": ParallelExecutor,
    "fiber": FiberExecutor,
    "new_executor_name": NewExecutorClass
}
