Clara Na

Hello! I am a 5th-year PhD student at Carnegie Mellon University’s Language Technologies Institute. I am fortunate to be advised by Emma Strubell and supported by an NSF Graduate Research Fellowship. I have spent time at AI2 and Microsoft Research in Seattle and New York, respectively; I am grateful to have worked with Jesse Dodge, Pradeep Dasigi, Alessandro Sordoni, Lucas Caccia-Page, Miro Dudik, Jordan Ash, and other wonderful collaborators.

Before coming to CMU, I earned a BA in Computer Science and Mathematics at the University of Virginia. I began my research journey at UVA looking for “subtractive” design in patents with Katelyn Stenger and Leidy Klotz. My NLP origin story involves my half-baked bilingualism, a data science internship at the Washington Post, and some generous mentorship from Yangfeng Ji.

I study efficient methods and efficiency evaluation in NLP/ML. I am broadly interested in language, information, the impacts and applications of language technologies, and the communities of people who build and use them. Recently, I have been thinking a lot about 1) AI infrastructure and energy*, and 2) modular paradigms in LLM development and deployment.


Misc: I was born and raised in Northern Virginia, the *data center capital of the world. My middle name is 선우 (Seon-Woo); I am a second-generation Korean American. I have a younger brother who also went to CMU. In my spare time, I like playing piano (especially with other people), running, climbing, and reading.


news

Feb 2026 SpreadsheetArena was released by meridian.ai – check out our paper here!
Oct 2025 Traveled to Montreal for COLM!
May 2025 Our paper, Energy Considerations of Large Language Model Inference and Efficiency Optimizations, was accepted to ACL 2025!
Apr 2025 Went back to Singapore for ICLR 2025! Also won a Best Proposal Award at the Tackling Climate Change with Machine Learning Workshop for my (ongoing/upcoming) work with Jared!
Feb 2025 Our paper, Holistically Evaluating the Environmental Impact of Creating Language Models, was accepted to ICLR 2025 as a Spotlight!

selected publications

  1. ICLR Spotlight
    Holistically Evaluating the Environmental Impact of Creating Language Models
    In The Thirteenth International Conference on Learning Representations, 2025
  2. ACL
    Energy Considerations of Large Language Model Inference and Efficiency Optimizations
    In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, 2025
  3. EMNLP Main
    Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
    In The 2024 Conference on Empirical Methods in Natural Language Processing
  4. EMNLP Findings
    Energy and Carbon Considerations of Fine-Tuning BERT
    In Findings of the Association for Computational Linguistics: EMNLP 2023
  5. EMNLP Main
    To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing
    In The 2023 Conference on Empirical Methods in Natural Language Processing
  6. EMNLP Findings
    Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
    In Findings of the Association for Computational Linguistics: EMNLP 2022