The goal of the YouMakeup VQA challenge is to provide a common benchmark for fine-grained action understanding in domain-specific videos. Makeup instructional videos are naturally more fine-grained than open-domain videos. Different action steps share similar backgrounds but contain subtle, critical differences in actions, tools, and applied facial areas, resulting in different effects on the face. Success therefore requires fine-grained discrimination abilities within temporal and spatial context.
We propose two VQA sub-challenges based on the YouMakeup dataset: the Facial Image Ordering Sub-Challenge and the Step Ordering Sub-Challenge. The Facial Image Ordering Sub-Challenge tests whether models understand how an action, described in flexible natural language, changes the appearance of a face, while the Step Ordering Sub-Challenge evaluates models' abilities in cross-modal semantic alignment between vision and text.
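Both sub-challenges can be framed as ordering tasks: given a set of shuffled items (facial images or step captions), the model predicts their true temporal order. A minimal sketch of how such predictions might be scored is shown below; the function name and the exact-match metric are illustrative assumptions, not the official challenge protocol.

```python
# Hypothetical scoring sketch for an ordering task.
# The exact-match metric here is an assumption, not the
# official YouMakeup VQA evaluation protocol.

def ordering_accuracy(predictions, ground_truths):
    """Fraction of questions whose predicted order exactly
    matches the ground-truth order."""
    correct = sum(
        1 for pred, gt in zip(predictions, ground_truths) if pred == gt
    )
    return correct / len(ground_truths)

# Each question asks the model to arrange four shuffled items
# (facial images or step captions) into their temporal order.
preds = [[2, 0, 1, 3], [1, 3, 0, 2]]
golds = [[2, 0, 1, 3], [3, 1, 0, 2]]
print(ordering_accuracy(preds, golds))  # 0.5
```

Under this framing, a prediction is counted correct only when the entire sequence matches; partial-credit metrics (e.g., pairwise ranking accuracy) would be an alternative design choice.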
· Jan 15, 2020: Dataset available for download (training, validation, and test sets)
· April 6, 2020: Website and call for participation ready
· April 12, 2020: Baseline code and models available for download
· June 1, 2020: Results submission deadline
· June 8, 2020: Paper submission deadline
@inproceedings{wang2019youmakeup,
title={YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension},
author={Wang, Weiying and Wang, Yongcheng and Chen, Shizhe and Jin, Qin},
booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
pages={5136--5146},
year={2019}
}