UNC-Ref
- Training set: 40,000 objects, corresponding to 113,762 natural-language expressions
- Validation set: 5,000 objects, corresponding to 14,246 natural-language expressions
Reference: Kazemzadeh S, Ordonez V, Matten M, et al. ReferItGame: Referring to objects in photographs of natural scenes[C]//Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014: 787-798.
G-Ref
- Training set: 44,882 objects, corresponding to 85,747 natural-language expressions
- Validation set: 5,000 objects, corresponding to 9,536 natural-language expressions
Reference: Mao J, Huang J, Toshev A, et al. Generation and comprehension of unambiguous object descriptions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 11-20.