In this paper, I investigate the controlled text generation capabilities of ruGPT3Large through fine-tuning, focusing specifically on generating movie reviews conditioned on a designated sentiment attribute. Controlled text generation is an active area of inquiry in Natural Language Processing, particularly for Russian. This study demonstrates a simple approach to controllable text generation: ruGPT3 is trained on a textual dataset containing sentiment-marked prompts, enabling the model to recognize the pattern and generate analogous texts. The research provides a comprehensive analysis of the limitations, shortcomings, and merits of fine-tuning a large language model using prompts embedded in the dataset. The generated texts exhibit coherence, logical structure, abundant coreferential links, and the narrative devices and vocabulary characteristic of film reviews. Nevertheless, the ruGPT3-generated reviews contain certain linguistic errors; I classify the most prevalent types, such as named-entity confusion, grammatical gender inconsistencies, and sentiment fluctuations. Since the primary objective is to evaluate the efficacy of basic fine-tuning with respect to the specified attribute, both automatic sentiment analysis and human evaluation are employed to assess the output. Comparing the outputs of the fine-tuned model and the baseline ruGPT3Large, I observe that positive sentiment is generated most successfully, while the models produce neutral and negative sentiments less accurately.
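To illustrate the general idea of fine-tuning on sentiment-marked prompts, the following is a minimal sketch using the Hugging Face Transformers library. The checkpoint name (ai-forever/rugpt3large_based_on_gpt2), the sentiment-tag format, the toy training examples, and all hyperparameters are assumptions made for this sketch only; it is not the paper's exact pipeline or dataset.

```python
# Minimal sketch of sentiment-prompted fine-tuning of ruGPT3Large.
# Assumptions: the public checkpoint below, "<positive>/<negative>/<neutral>"
# tags as plain-text prompts, and toy training data for illustration.
from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          Trainer, TrainingArguments)
from torch.utils.data import Dataset

MODEL_NAME = "ai-forever/rugpt3large_based_on_gpt2"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:  # fall back to EOS if no pad token is defined
    tokenizer.pad_token = tokenizer.eos_token


class ReviewDataset(Dataset):
    """Movie reviews with a sentiment tag prepended as a plain-text prompt."""

    def __init__(self, pairs, max_len=512):
        self.examples = []
        for sentiment, text in pairs:
            prompt = f"<{sentiment}> {text}{tokenizer.eos_token}"
            enc = tokenizer(prompt, truncation=True, max_length=max_len,
                            padding="max_length", return_tensors="pt")
            enc = {k: v.squeeze(0) for k, v in enc.items()}
            enc["labels"] = enc["input_ids"].clone()  # causal LM objective
            self.examples.append(enc)

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, i):
        return self.examples[i]


# Toy examples; the actual study uses a corpus of Russian film reviews.
train_data = ReviewDataset([
    ("positive", "Отличный фильм, актёрская игра великолепна."),
    ("negative", "Скучный сюжет, жаль потраченного времени."),
])

args = TrainingArguments(output_dir="rugpt3-reviews", num_train_epochs=1,
                         per_device_train_batch_size=1, logging_steps=10)
Trainer(model=model, args=args, train_dataset=train_data).train()

# Generation conditioned on the sentiment tag learned during fine-tuning.
prompt = tokenizer("<positive>", return_tensors="pt")
out = model.generate(**prompt, max_new_tokens=120, do_sample=True,
                     top_p=0.95, temperature=0.9,
                     pad_token_id=tokenizer.pad_token_id)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

After training, the sentiment tag placed at the start of the prompt steers generation toward the corresponding review polarity, which is the conditioning mechanism evaluated in this study.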
Identifiers and classifiers
- DOI: 10.33910/2687-0215-2022-4-1-15-25
- eLIBRARY ID: 54733034