Visual Question Answering
| image_path | question | answer |
|---|---|---|
| imdata/image_000001.jpg | Is there snow on the mountains? | yes |
| imdata/image_000002.jpg | What color are the wheels? | blue |
| imdata/image_000003.jpg | What kind of utensil is in the glass bowl? | knife |
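The dataset is a plain CSV file with one column per feature. As a minimal sketch, the file could be written with Python's standard `csv` module. The helper name `write_vqa_csv` and the `ROWS` constant are hypothetical, introduced only for illustration, and the image files are assumed to already exist under `imdata/`.

```python
import csv
from pathlib import Path

# Example rows following the table above.
# Assumption: the referenced image files already exist under imdata/.
ROWS = [
    ("imdata/image_000001.jpg", "Is there snow on the mountains?", "yes"),
    ("imdata/image_000002.jpg", "What color are the wheels?", "blue"),
    ("imdata/image_000003.jpg", "What kind of utensil is in the glass bowl?", "knife"),
]


def write_vqa_csv(path, rows=ROWS):
    """Write a header line plus one line per (image_path, question, answer) row."""
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Column names must match the feature names used in the config.
        writer.writerow(["image_path", "question", "answer"])
        writer.writerows(rows)
    return path


if __name__ == "__main__":
    write_vqa_csv("vqa.csv")
```

The column names in the header line are what the `name` fields of the input and output features refer to, so they need to match exactly.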
ludwig experiment \
  --dataset vqa.csv \
  --config config.yaml
With config.yaml:
input_features:
    -
        name: image_path
        type: image
        encoder:
            type: stacked_cnn
    -
        name: question
        type: text
        encoder:
            type: parallel_cnn
output_features:
    -
        name: answer
        type: text
        decoder:
            type: generator
            cell_type: lstm
        loss:
            type: softmax_cross_entropy
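The same experiment can likely be driven from Python instead of the command line. The sketch below mirrors the configuration as a dictionary and passes it to `LudwigModel` from `ludwig.api`. It assumes Ludwig is installed and that `vqa.csv` and the images are in the working directory. The names `CONFIG` and `run_experiment` are hypothetical, and the accepted config keys can differ between Ludwig versions.

```python
# The configuration from config.yaml, expressed as a Python dictionary.
CONFIG = {
    "input_features": [
        {"name": "image_path", "type": "image", "encoder": {"type": "stacked_cnn"}},
        {"name": "question", "type": "text", "encoder": {"type": "parallel_cnn"}},
    ],
    "output_features": [
        {
            "name": "answer",
            "type": "text",
            "decoder": {"type": "generator", "cell_type": "lstm"},
            "loss": {"type": "softmax_cross_entropy"},
        }
    ],
}


def run_experiment(dataset_path="vqa.csv"):
    """Train and evaluate on the dataset, roughly what `ludwig experiment` does."""
    # Imported lazily so CONFIG can be inspected without Ludwig installed.
    from ludwig.api import LudwigModel

    model = LudwigModel(config=CONFIG)
    return model.experiment(dataset=dataset_path)
```

Calling `run_experiment()` starts training, so it is best run only once the dataset and images are in place.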