Clara Na

Hello! I am a 4th year PhD student at Carnegie Mellon University’s Language Technologies Institute. I am fortunate to be advised by Emma Strubell and supported by an NSF Graduate Research Fellowship. In 2023, I spent a wonderful summer in Seattle as an intern on the AllenNLP team at AI2, working with Jesse Dodge and Pradeep Dasigi.

Before coming to CMU, I earned a BA in Computer Science and Mathematics at the University of Virginia. I began my research journey at UVA looking for “subtractive” design in patents with Katelyn Stenger and Leidy Klotz. My NLP origin story involves my half-baked bilingualism, a data science internship at the Washington Post, and some generous mentorship from Yangfeng Ji.

I study efficient methods and efficiency evaluation in NLP/ML. More broadly, I am interested in language and information, in the impacts and applications of language technologies, and in the communities of people who build and use them.


Misc: I was born and raised in northern Virginia (NoVA). My middle name is 선우 (Seon-Woo) – I am a second generation Korean American. I have a younger brother who also went to CMU. In my spare time, I like playing piano (especially with other people), running, climbing, and reading.


news

Sep 2024 Our paper, Scalable Data Ablation Approximations for Language Models through Modular Training and Merging, has been accepted to EMNLP 2024!
Apr 2024 Attending Midwest Speech and Language Days to present preliminary work on scalable data ablation approximations :)
Dec 2023 Excited to be attending EMNLP 2023 in Singapore! I will be giving a talk on our work (with co-authors Sireesh and Amanda) on Sunday.
Oct 2023 Three papers accepted to EMNLP 2023!
Aug 2023 We won a Best Paper award at the LTI Student Research Symposium!

selected publications

  1. EMNLP Main
    Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
    In The 2024 Conference on Empirical Methods in Natural Language Processing, 2024
  2. EMNLP Findings
    Energy and Carbon Considerations of Fine-Tuning BERT
    In Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
  3. EMNLP Main
    To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing
    In The 2023 Conference on Empirical Methods in Natural Language Processing, 2023
  4. EMNLP Main
    The Framework Tax: Disparities Between Inference Efficiency in Research and Deployment
    In The 2023 Conference on Empirical Methods in Natural Language Processing, 2023
  5. EMNLP Findings
    Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
    In Findings of the Association for Computational Linguistics: EMNLP 2022, Dec 2022