Please use this identifier to cite or link to this item:
https://zone.biblio.laurentian.ca/handle/10219/3980
Title: | Fine-tuning a general transformer model on story-lines of IMDB movies database |
Authors: | Ghasemi, Hojat |
Keywords: | Fine-tuning;language model;text summarization;transfer learning;transformer;pre-training;abstractive summarization;IMDB;movie storyline;natural language processing;attention based models;deep learning |
Issue Date: | 13-Jan-2022 |
Abstract: | Recent transformer-based language models pre-trained on huge text corpora have shown great success in performing downstream Natural Language Processing (NLP) tasks such as text summarization when fine-tuned on smaller labelled datasets. However, the impact of fine-tuning on the performance of pre-trained language models in summarizing movie storylines has not been explored. Moreover, there is a lack of extensive labelled datasets of movie storylines that would allow pre-trained language models to delve deeper into this domain. In this work, we propose a novel labelled dataset containing IMDB movie storylines alongside their summaries for teaching pre-trained language models how to summarize movie storylines. Furthermore, we showcase the potential of this dataset by fine-tuning a T5-base model on it. Our results show that fine-tuning a T5-base model on this dataset can significantly improve its performance in summarizing movie storylines. |
URI: | https://zone.biblio.laurentian.ca/handle/10219/3980 |
Appears in Collections: | Computational Sciences - Master's theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Thesis FINAL_Hojat Ghasemi_ 16-Feb-2022.pdf | | 1.83 MB | Adobe PDF | View/Open |
Items in LU|ZONE|UL are protected by copyright, with all rights reserved, unless otherwise indicated.