COSC 426A/LA: Midterm Project
Overview
For the midterm, you and a group will replicate one of four papers:
- Benjamin Newman, Kai-Siang Ang, Julia Gong, and John Hewitt. 2021. Refining Targeted Syntactic Evaluation of Language Models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3710–3723, Online. Association for Computational Linguistics.
- Alex Warstadt and Samuel R. Bowman. 2020. Linguistic Analysis of Pretrained Sentence Encoders with Acceptability Judgments. On arXiv.
- R. Thomas McCoy, Junghyun Min, and Tal Linzen. 2020. BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance. In Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 217–227, Online. Association for Computational Linguistics.
- Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia Talmina, and Tal Linzen. 2020. Cross-Linguistic Syntactic Evaluation of Word Prediction Models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5523–5539, Online. Association for Computational Linguistics.
Phase One: Understanding the Question and Method
First, you and a group of students will be assigned a paper from the list. You will read the introduction and methods sections, make a few slides, and present the paper's research question and method to the class. The relevant sections for each paper are listed in the following table:
Paper | Sections |
---|---|
Newman et al. (2021) | 1, 2, 3 |
Warstadt and Bowman (2020) | 1, 3 |
McCoy et al. (2020) | 1, 3 |
Mueller et al. (2020) | 1, 4, 5 |
Phase Two: Replicating the Paper
You and one other person will select one of the four papers to replicate. You will then:
- Create stimuli by Friday, Sep 26, 11 PM
- Get results by Friday, Oct 4, 11 PM
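
As a rough illustration of what "creating stimuli" might look like, the sketch below builds a small set of minimal pairs for subject-verb agreement and writes them to a tab-separated file. Everything in it is an assumption made for illustration: the vocabulary, the sentence template, the column names, and the helper names `make_minimal_pairs` and `write_tsv` are hypothetical and are not taken from any of the four papers. Your own stimuli should follow the design described in the paper you replicate.

```python
import csv
import io
from itertools import product

# Hypothetical vocabulary: (singular, plural) forms. Replace with items
# that match the design of the paper you are replicating.
SUBJECTS = [("author", "authors"), ("pilot", "pilots")]
VERBS = [("laughs", "laugh"), ("smiles", "smile")]


def make_minimal_pairs(subjects, verbs):
    """Return (grammatical, ungrammatical) sentence pairs for simple agreement."""
    pairs = []
    for (sg_noun, pl_noun), (sg_verb, pl_verb) in product(subjects, verbs):
        # Singular subject: the singular verb is grammatical.
        pairs.append((f"The {sg_noun} {sg_verb}.", f"The {sg_noun} {pl_verb}."))
        # Plural subject: the plural verb is grammatical.
        pairs.append((f"The {pl_noun} {pl_verb}.", f"The {pl_noun} {sg_verb}."))
    return pairs


def write_tsv(pairs, handle):
    """Write the pairs to an open text handle as tab-separated rows."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["pair_id", "grammatical", "ungrammatical"])
    for i, (good, bad) in enumerate(pairs):
        writer.writerow([i, good, bad])


pairs = make_minimal_pairs(SUBJECTS, VERBS)
buffer = io.StringIO()  # swap for open("stimuli.tsv", "w", newline="") to save a file
write_tsv(pairs, buffer)
print(buffer.getvalue())
```

Keeping each pair on one row with a stable `pair_id` makes it easier to line up model scores with stimuli later, when you compute results.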