Please use this identifier to cite or link to this item: https://zone.biblio.laurentian.ca/handle/10219/4100
Title: Emotion-centric image captioning using a self-critical mean teacher learning approach
Authors: Yousefi, Aryan
Keywords: Image captioning;computer vision;natural language processing;mean teacher learning;self-critical sequence training
Issue Date: 7-Nov-2022
Abstract: Image Captioning is the multi-modal task of automatically generating natural language descriptions of a visual input using Deep Learning techniques. This research area lies at the intersection of the Computer Vision and Natural Language Processing fields, and it has gained increasing popularity over the past few years. Image Captioning is an important part of scene understanding with a wide range of applications, such as helping visually impaired people, recommendations in editing applications, and usage in virtual assistants. However, most previous work on this topic has focused on purely objective, content-based descriptions of image scenes. The goal of this thesis is to generate more engaging captions by leveraging human-like emotional responses in the captioning process. To achieve this, a Mean Teacher Learning-based method has been applied to the recently introduced ArtEmis dataset. This method includes a self-distillation relationship between memory-augmented language models with meshed connectivity, which are first trained in a cross-entropy-based phase and then fine-tuned in a Self-Critical Sequence Training phase. In addition, we propose a novel classification module that decreases texture bias and encourages the model towards shape-based classification. We also propose a method to utilize extra emotional supervision signals in the caption generation process, leveraging the image-to-emotion classifier. Compared with the state-of-the-art results on the ArtEmis dataset, our proposed model significantly outperforms the current benchmark on multiple popular evaluation metrics, such as BLEU, METEOR, ROUGE-L, and CIDEr.
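Note: The two training ideas named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the dictionary-of-floats weight representation, the decay value, and the use of a greedy-decoding reward as the baseline are all assumptions made for illustration. It shows the usual form of a mean-teacher weight update (the teacher tracks an exponential moving average of the student) and of a self-critical loss (the sampled caption's reward is compared against a baseline reward).

```python
from typing import Dict, List


def ema_update(teacher: Dict[str, float],
               student: Dict[str, float],
               decay: float = 0.99) -> Dict[str, float]:
    """Mean-teacher step (illustrative): each teacher weight moves toward
    the matching student weight via an exponential moving average.
    The decay value is an assumption, not a setting from the thesis."""
    return {name: decay * teacher[name] + (1.0 - decay) * student[name]
            for name in teacher}


def scst_loss(sample_log_probs: List[float],
              sample_reward: float,
              baseline_reward: float) -> float:
    """Self-critical loss for one sampled caption (illustrative).
    The advantage is the sampled caption's reward (e.g. a CIDEr score)
    minus a baseline reward; here the baseline is assumed to come from
    a greedily decoded caption."""
    advantage = sample_reward - baseline_reward
    return -advantage * sum(sample_log_probs)


if __name__ == "__main__":
    teacher = ema_update({"w": 1.0}, {"w": 0.0}, decay=0.9)
    print(teacher)
    print(scst_loss([-1.0, -2.0], sample_reward=1.5, baseline_reward=1.0))
```

In this sketch, a sampled caption that scores above the baseline gets a positive advantage, so minimizing the loss raises its log-probability; a caption that scores below the baseline is pushed down.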
URI: https://zone.biblio.laurentian.ca/handle/10219/4100
Appears in Collections: Computational Sciences - Master's theses

Files in This Item:
File: Thesis FINAL-Aryan Yousefi_14_Nov-2022.pdf
Size: 1.94 MB
Format: Adobe PDF


Items in LU|ZONE|UL are protected by copyright, with all rights reserved, unless otherwise indicated.