Author: Marc Orós
Supervisor: Dimosthenis Karatzas
Presentation time: 10:30
Virtual Room: 5.3
Most machine learning methods are susceptible to biases in their training data, which can hinder their performance in certain situations. Image captioning is a research area where human-introduced biases are prevalent and have significant effects. We describe our implementation of a system that generates proxy training data with reduced bias by separating the foreground and background of an image, allowing different representations to be used and object replacements to be performed, as well as a model that uses this separated image data to perform image captioning. We evaluate multiple variants of our model on the MS-COCO dataset, measuring its performance both in the general case and in situations where dataset bias could degrade the output of typical captioning models.