Prediction and Evaluation

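Once the model has been trained, it can be used to predict the target output features on new data.

We have created a small test dataset containing input features for 10 movie reviews that you can use for testing. Download the test dataset here.

Let's make some predictions on the test dataset!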

CLI:

ludwig predict --model_path results/experiment_run/model --dataset rotten_tomatoes_test.csv

Python:

from ludwig.api import LudwigModel

# This step can be skipped if you are working in a notebook, and you can simply
# re-use the model created in the training section.
model = LudwigModel.load('results/experiment_run/model')

# predict() returns the predictions DataFrame and the output directory.
predictions, _ = model.predict(dataset='rotten_tomatoes_test.csv')
predictions.head()

Docker CLI:

# Copy the test dataset into the mounted directory so the container can read it.
cp rotten_tomatoes_test.csv ./rotten_tomatoes_data
docker run -t -i --mount type=bind,source={absolute/path/to/rotten_tomatoes_data},target=/rotten_tomatoes_data ludwigai/ludwig predict --model_path /rotten_tomatoes_data/results/experiment_run/model --dataset /rotten_tomatoes_data/rotten_tomatoes_test.csv

Running this command returns the model's predictions. Your results should look similar to this:

Index	recommended_probabilities	recommended_predictions	recommended_probabilities_False	recommended_probabilities_True	recommended_probability
0	[0.09741002321243286, 0.9025899767875671]	True	0.097410	0.902590	0.902590
1	[0.6842662990093231, 0.3157337009906769]	False	0.684266	0.315734	0.684266
2	[0.026504933834075928, 0.9734950661659241]	True	0.026505	0.973495	0.973495
3	[0.022977590560913086, 0.9770224094390869]	True	0.022978	0.977022	0.977022
4	[0.9472369104623795, 0.052763089537620544]	False	0.947237	0.052763	0.947237
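For a binary output, the columns in the table above are related in a simple way: the predicted label is the class with the higher probability, and the last column reports the probability of that predicted class. The sketch below reproduces this relationship in plain Python for the first two rows. The helper name `summarize_binary_prediction` is hypothetical and the 0.5 threshold is an assumption; this illustrates the table and is not Ludwig's internal code.

```python
def summarize_binary_prediction(probabilities):
    """Return (predicted_label, probability_of_predicted_label).

    `probabilities` is a [P(False), P(True)] pair, as in the
    probabilities column of the table above.
    """
    prob_false, prob_true = probabilities
    predicted = prob_true >= 0.5  # assumes a 0.5 decision threshold
    return predicted, (prob_true if predicted else prob_false)


# First two rows of the example output above.
rows = [
    [0.09741002321243286, 0.9025899767875671],
    [0.6842662990093231, 0.3157337009906769],
]
summaries = [summarize_binary_prediction(row) for row in rows]
for label, prob in summaries:
    print(label, round(prob, 6))
```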

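A convenient ludwig experiment CLI command is also available. This single command runs training and then makes predictions using the checkpoint with the best validation metric.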
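If you script your workflow from Python, the sketch below assembles a `ludwig experiment` call as an argument list. The file names `config.yaml` and `rotten_tomatoes.csv` are assumptions standing in for your own config and training data, and `build_experiment_command` and `run_experiment` are hypothetical helpers. Nothing is executed unless you call `run_experiment` yourself.

```python
import subprocess


def build_experiment_command(config_path, dataset_path):
    """Build the argument list for a `ludwig experiment` run."""
    return [
        "ludwig", "experiment",
        "--config", config_path,
        "--dataset", dataset_path,
    ]


def run_experiment(config_path, dataset_path):
    """Run the command; requires the `ludwig` CLI to be installed."""
    command = build_experiment_command(config_path, dataset_path)
    return subprocess.run(command, check=True)


# Assumed file names, for illustration only.
command = build_experiment_command("config.yaml", "rotten_tomatoes.csv")
print(" ".join(command))
```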

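In addition to predictions, Ludwig computes a suite of evaluation metrics that depends on the type of each output feature. The exact metrics computed for each output feature type can be found here.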

Note

Non-loss evaluation metrics, such as accuracy, require ground-truth values for the target outputs.
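To see why ground truth is needed, consider accuracy: it is the fraction of predictions that match the true labels, so it cannot be computed from predictions alone. A minimal sketch, using made-up labels purely for illustration (the `accuracy` helper is hypothetical and is not Ludwig's implementation):

```python
def accuracy(predictions, ground_truth):
    """Fraction of predictions equal to the corresponding true label."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have equal length")
    if not predictions:
        raise ValueError("at least one example is required")
    correct = sum(p == t for p, t in zip(predictions, ground_truth))
    return correct / len(predictions)


# Hypothetical labels, not taken from the Rotten Tomatoes data.
predicted = [True, False, True, True]
actual = [True, True, True, False]
score = accuracy(predicted, actual)
print(score)
```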

CLI:

ludwig evaluate --dataset path/to/data.csv --model_path /path/to/model

Python:

# Re-uses the `model` object loaded in the prediction step.
# evaluate() returns the evaluation statistics, the predictions, and the output directory.
eval_stats, _, _ = model.evaluate(dataset='rotten_tomatoes_test.csv')

Docker CLI:

cp rotten_tomatoes_test.csv ./rotten_tomatoes_data
docker run -t -i --mount type=bind,source={absolute/path/to/rotten_tomatoes_data},target=/rotten_tomatoes_data ludwigai/ludwig evaluate --dataset /rotten_tomatoes_data/rotten_tomatoes_test.csv --model_path /rotten_tomatoes_data/results/experiment_run/model

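Evaluation performance can be visualized with the ludwig visualize command. Plotting the metrics makes it easier to compare performance and predictions across models. For example, to compare the evaluation statistics of two models, you can use the following command: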

CLI:

ludwig visualize --visualization compare_performance --test_statistics path/to/test_statistics_model_1.json path/to/test_statistics_model_2.json

Python:

from ludwig.visualize import compare_performance

# eval_stats_model_1 and eval_stats_model_2 are the evaluation statistics
# returned by model.evaluate() for each of the two models.
compare_performance([eval_stats_model_1, eval_stats_model_2])

Docker CLI:

docker run -t -i --mount type=bind,source={absolute/path/to/rotten_tomatoes_data},target=/rotten_tomatoes_data ludwigai/ludwig visualize --visualization compare_performance --test_statistics /rotten_tomatoes_data/path/to/test_statistics_model_1.json /rotten_tomatoes_data/path/to/test_statistics_model_2.json

This returns a bar chart comparing the performance of each model on different metrics, as in the example below.

[Figure: Performance Comparison]
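If you prefer numbers to a chart, a similar comparison can be sketched in plain Python. The example assumes each statistics object is a nested dictionary of the form `{output_feature: {metric: value}}`. The feature name `recommended` and the metric values are placeholders, not real results, and `metric_differences` is a hypothetical helper rather than part of Ludwig.

```python
def metric_differences(stats_a, stats_b, feature):
    """Return {metric: value_b - value_a} for scalar metrics shared by both models."""
    metrics_a = stats_a.get(feature, {})
    metrics_b = stats_b.get(feature, {})
    differences = {}
    for name, value_a in metrics_a.items():
        value_b = metrics_b.get(name)
        # Skip non-scalar entries such as confusion matrices.
        if isinstance(value_a, (int, float)) and isinstance(value_b, (int, float)):
            differences[name] = value_b - value_a
    return differences


# Placeholder statistics for two hypothetical models (not real results).
eval_stats_model_1 = {"recommended": {"accuracy": 0.75, "loss": 0.5}}
eval_stats_model_2 = {"recommended": {"accuracy": 0.875, "loss": 0.25}}

diffs = metric_differences(eval_stats_model_1, eval_stats_model_2, "recommended")
print(diffs)
```

A positive difference means the second model scored higher on that metric; for loss-type metrics, lower is better, so a negative difference is the improvement.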