Clara Na

Hello! I am a 5th-year PhD student at Carnegie Mellon University’s Language Technologies Institute. I am fortunate to be advised by Emma Strubell and supported by an NSF Graduate Research Fellowship. I have spent time at AI2 and Microsoft Research in Seattle and New York, respectively; I am grateful to have worked with Jesse Dodge, Pradeep Dasigi, Alessandro Sordoni, Lucas Caccia-Page, Miro Dudik, Jordan Ash, and other wonderful collaborators.

Before coming to CMU, I earned a BA in Computer Science and Mathematics at the University of Virginia. I began my research journey at UVA looking for “subtractive” design in patents with Katelyn Stenger and Leidy Klotz. My NLP origin story involves my half-baked bilingualism, a data science internship at the Washington Post, and some generous mentorship from Yangfeng Ji.

I study efficient methods and efficiency evaluation in NLP/ML. I am broadly interested in language, information, the impacts and applications of language technologies, and the communities of people who build and use them. Recently, I have been thinking a lot about 1) AI infrastructure and energy*, and 2) modular paradigms in LLM development and deployment.


Misc: I was born and raised in Northern Virginia, the *data center capital of the world. My middle name is 선우 (Seon-Woo); I am a second-generation Korean American. I have a younger brother who also went to CMU. In my spare time, I like playing piano (especially with other people), running, climbing, and reading.


news

Feb 2026 SpreadsheetArena was released by meridian.ai – check out our paper here!
Oct 2025 Traveled to Montreal for COLM!
May 2025 Our paper, Energy Considerations of Large Language Model Inference and Efficiency Optimizations, was accepted to ACL 2025!
Apr 2025 Went back to Singapore for ICLR 2025! Also won a Best Proposal Award at the Tackling Climate Change with Machine Learning Workshop for my (ongoing/upcoming) work with Jared!
Feb 2025 Our paper, Holistically Evaluating the Environmental Impact of Creating Language Models, was accepted to ICLR 2025 as a Spotlight!

selected publications

  1. ICLR Spotlight
    Holistically Evaluating the Environmental Impact of Creating Language Models
    In The Thirteenth International Conference on Learning Representations, 2025
  2. ACL
    Energy Considerations of Large Language Model Inference and Efficiency Optimizations
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, 2025
  3. EMNLP Main
    Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
    In The 2024 Conference on Empirical Methods in Natural Language Processing
  4. EMNLP Findings
    Energy and Carbon Considerations of Fine-Tuning BERT
    In Findings of the Association for Computational Linguistics: EMNLP 2023
  5. EMNLP Main
    To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing
    In The 2023 Conference on Empirical Methods in Natural Language Processing
  6. EMNLP Findings
    Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
    In Findings of the Association for Computational Linguistics: EMNLP 2022