Pre-trained Language Models are widely used to carry out a variety of Natural Language Processing tasks. While these models are extremely successful at these tasks, little is known about how they learn and comprehend human language. Current Language Models have complex architectures, which prevents us from fully understanding their behaviour. As a result, we do not fully know what they owe their success to, or why they perform the way they do. This research project aims to identify and address Language Models' weaknesses by analysing their word representations, architectures, and embedding spaces. More specifically, this research investigates the linguistic phenomena captured in model representations and addresses the issues these models face in downstream tasks.
I received my M.Sc. in Artificial Intelligence and Robotics from Iran University of Science and Technology in 2022. During my Master's, I worked mostly on Language Model interpretability, and my thesis examined how fine-tuning data size affects BERT's linguistic knowledge. This work resulted in a paper published in Findings of ACL 2022.