publications

My Google Scholar profile is more likely to be up to date.

2024

  1. Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
    Clara Na, Ian Magnusson, Ananya Harsh Jha, Tom Sherborne, Emma Strubell, Jesse Dodge, and Pradeep Dasigi
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
  2. Gradient Localization Improves Lifelong Pretraining of Language Models
    Jared Fernandez, Yonatan Bisk, and Emma Strubell
    In Findings of the Association for Computational Linguistics: EMNLP, 2024
  3. Light bulbs have energy ratings – so why can’t AI chatbots?
    Sasha Luccioni, Boris Gamazaychikov, Sara Hooker, Régis Pierrard, Emma Strubell, Yacine Jernite, and Carole-Jean Wu
    Nature, 2024
  4. Challenges in End-to-End Policy Extraction from Climate Action Plans
    Nupoor Gandhi, Tom Corringham, and Emma Strubell
    In Proceedings of the 1st Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2024), 2024
  5. Carbon Connect: An Ecosystem for Sustainable Computing
    Benjamin C. Lee, David Brooks, Arthur van Benthem, Udit Gupta, Gage Hills, Vincent Liu, Benjamin Pierce, Christopher Stewart, Emma Strubell, Gu-Yeon Wei, Adam Wierman, Yuan Yao, and Minlan Yu
    arXiv preprint, 2024
  6. What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions
    Sang Keun Choe, Hwijeen Ahn, Juhan Bae, Kewen Zhao, Minsoo Kang, Youngseog Chung, Adithya Pratapa, Willie Neiswanger, Emma Strubell, Teruko Mitamura, Jeff Schneider, Eduard Hovy, Roger Grosse, and Eric Xing
    arXiv preprint, 2024
  7. Source-Aware Training Enables Knowledge Attribution in Language Models
    Muhammad Khalifa, David Wadden, Emma Strubell, Honglak Lee, Lu Wang, Iz Beltagy, and Hao Peng
    In First Conference on Language Modeling (COLM), 2024
  8. AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters
    Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren Klein, and Jesse Dodge
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
  9. Best Resource Paper
    Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
    Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, and Kyle Lo
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
  10. Best Theme Paper
    OLMo: Accelerating the Science of Language Models
    Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, and Hannaneh Hajishirzi
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
  11. Power Hungry Processing: Watts Driving the Cost of AI Deployment?
    Sasha Luccioni, Yacine Jernite, and Emma Strubell
    In ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), 2024

2023

  1. Just CHOP: Embarrassingly Simple LLM Compression
    Ananya Harsh Jha, Tom Sherborne, Evan Pete Walsh, Dirk Groeneveld, Emma Strubell, and Iz Beltagy
    arXiv preprint, 2023
  2. To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing
    Sireesh Gururaja, Amanda Bertsch, Clara Na, David Gray Widder, and Emma Strubell
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  3. The Framework Tax: Disparities Between Inference Efficiency in Research and Deployment
    Jared Fernandez, Jacob Kahn, Clara Na, Yonatan Bisk, and Emma Strubell
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  4. DSI++: Updating Transformer Memory with New Documents
    Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, and Donald Metzler
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  5. Understanding the Effect of Model Compression on Social Bias in Large Language Models
    Gustavo Gonçalves and Emma Strubell
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  6. Pre-training and Fine-tuning BERT: Energy and Carbon Considerations
    Xiaorong Wang, Clara Na, Emma Strubell, Sorelle Friedler, and Sasha Luccioni
    In Findings of the Association for Computational Linguistics: EMNLP, 2023
  7. Data-efficient Active Learning for Structured Prediction with Partial Annotation and Self-Training
    Zhisong Zhang, Emma Strubell, and Eduard Hovy
    In Findings of the Association for Computational Linguistics: EMNLP, 2023
  8. Making Scalable Meta Learning Practical
    Sang Keun Choe, Sanket Vaibhav Mehta, Hwijeen Ahn, Willie Neiswanger, Pengtao Xie, Emma Strubell, and Eric Xing
    In Advances in Neural Information Processing Systems (NeurIPS), 2023
  9. An Empirical Investigation of the Role of Pre-training in Lifelong Learning
    Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, and Emma Strubell
    Journal of Machine Learning Research, 2023
  10. Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation
    Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, Iz Beltagy, Evan Pete Walsh, Noah A. Smith, and Hannaneh Hajishirzi
    arXiv preprint, 2023
  11. Dissecting Efficient Architectures for Wake-Word Detection
    Cody Berger, Juncheng B. Li, Yiyuan Li, Aaron Berger, Dmitri Berger, Karthik Ganesan, Emma Strubell, and Florian Metze
    In Workshop on Efficient Systems for Foundation Models @ ICML, Jul 2023
  12. Efficient Methods for Natural Language Processing: A Survey
    Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, and Roy Schwartz
    Transactions of the Association for Computational Linguistics, Jul 2023
  13. Queer People are People First: Deconstructing Sexual Identity Stereotypes in Large Language Models
    Harnoor Dhingra, Preetiha Jayashanker, Sayali Moghe, and Emma Strubell
    In Queer in AI Workshop @ ACL, Jul 2023
  14. On the Interactions of Structural Constraints and Data Resources for Structured Prediction
    Zhisong Zhang, Emma Strubell, and Eduard Hovy
    In Proceedings of The Fourth Workshop on Simple and Efficient Natural Language Processing (SustaiNLP), Jul 2023
  15. Annotating Mentions Alone Enables Efficient Domain Adaptation for Coreference Resolution
    Nupoor Gandhi, Anjalie Field, and Emma Strubell
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Jul 2023
  16. To Adapt or to Annotate: Challenges and Interventions for Domain Adaptation in Open-Domain Question Answering
    Dheeru Dua, Emma Strubell, Sameer Singh, and Pat Verga
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Jul 2023

2022

  1. Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
    Clara Na, Sanket Vaibhav Mehta, and Emma Strubell
    In Findings of the Association for Computational Linguistics: EMNLP 2022, Dec 2022
  2. A Survey of Active Learning for Natural Language Processing
    Zhisong Zhang, Emma Strubell, and Eduard Hovy
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Dec 2022
  3. Transfer Learning from Semantic Role Labeling to Event Argument Extraction with Template-based Slot Querying
    Zhisong Zhang, Emma Strubell, and Eduard Hovy
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Dec 2022
  4. Bridging Fairness and Environmental Sustainability in Natural Language Processing
    Marius Hessenthaler, Emma Strubell, Dirk Hovy, and Anne Lauscher
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Dec 2022
  5. Evaluating Gender Bias Transfer from Film Data
    Amanda Bertsch, Ashley Oh, Sanika Natu, Swetha Gangu, Alan W. Black, and Emma Strubell
    In Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), Jul 2022
  6. Measuring the Carbon Intensity of AI in Cloud Instances
    Jesse Dodge, Taylor Prewitt, Remi Tachet des Combes, Erika Odmark, Roy Schwartz, Emma Strubell, Alexandra Sasha Luccioni, Noah A. Smith, Nicole DeCario, and Will Buchanan
    In ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT), Jun 2022
  7. Improving Compositional Generalization with Self-Training for Data-to-Text Generation
    Sanket Vaibhav Mehta, Jinfeng Rao, Yi Tay, Mihir Kale, Ankur Parikh, and Emma Strubell
    In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, May 2022
  8. Aligning artificial intelligence with climate change mitigation
    Lynn H. Kaack, Priya L. Donti, Emma Strubell, George Kamiya, Felix Creutzig, and David Rolnick
    Nature Climate Change, May 2022

2021

  1. On the Benefit of Syntactic Supervision for Cross-lingual Transfer in Semantic Role Labeling
    Zhisong Zhang, Emma Strubell, and Eduard Hovy
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Nov 2021
  2. Comparing Span Extraction Methods for Semantic Role Labeling
    Zhisong Zhang, Emma Strubell, and Eduard Hovy
    In Proceedings of the 5th Workshop on Structured Prediction for NLP (SPNLP 2021), Aug 2021
  3. Spotlight talk
    An Empirical Investigation of the Role of Pre-training in Lifelong Learning
    Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, and Emma Strubell
    In ICML Theory and Foundation of Continual Learning Workshop, Aug 2021
  4. WiFiMod: Transformer-Based Indoor Human Mobility Modeling Using Passive Sensing
    Amee Trivedi, Kate Silverstein, Emma Strubell, Prashant Shenoy, and Mohit Iyyer
    In ACM SIGCAS Conference on Computing and Sustainable Societies (COMPASS), Aug 2021

2020

  1. Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks
    Edward Kim, Zach Jensen, Alexander van Grootel, Kevin Huang, Matthew Staib, Sheshera Mysore, Haw-Shiuan Chang, Emma Strubell, Andrew McCallum, Stefanie Jegelka, and Elsa Olivetti
    Journal of Chemical Information and Modeling, Aug 2020
  2. Artificial Intelligence and Climate Change: Opportunities, considerations, and policy levers to align AI with climate change goals
    Lynn H. Kaack, Priya Donti, Emma Strubell, and David Rolnick
    Invited policy brief, Heinrich Böll Foundation, Dec 2020
  3. Energy and Policy Considerations for Modern Deep Learning Research
    Emma Strubell, Ananya Ganesh, and Andrew McCallum
    Proceedings of the AAAI Conference on Artificial Intelligence, Apr 2020

2019

  1. The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures
    Sheshera Mysore, Zachary Jensen, Edward Kim, Kevin Huang, Haw-Shiuan Chang, Emma Strubell, Jeffrey Flanigan, Andrew McCallum, and Elsa Olivetti
    In Proceedings of the 13th Linguistic Annotation Workshop, Aug 2019
  2. AI Impact Prize
    Energy and Policy Considerations for Deep Learning in NLP
    Emma Strubell, Ananya Ganesh, and Andrew McCallum
    In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Jul 2019
  3. Machine Learning Models for Efficient and Robust Natural Language Processing
    Emma Strubell
    University of Massachusetts Amherst, Sep 2019
    PhD thesis

2018

  1. Syntax Helps ELMo Understand Semantics: Is Syntax Still Relevant in a Deep Neural Architecture for SRL?
    Emma Strubell and Andrew McCallum
    In Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP, Melbourne, Australia, Jul 2018
  2. Simultaneously Self-Attending to All Mentions for Full-Abstract Biological Relation Extraction
    Patrick Verga, Emma Strubell, and Andrew McCallum
    In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), Jun 2018
  3. Best Paper
    Linguistically-Informed Self-Attention for Semantic Role Labeling
    Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, and Andrew McCallum
    In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Oct 2018
  4. Multi-Task Learning For Parsing The Alexa Meaning Representation Language
    Vittorio Perera, Tagyoung Chung, Thomas Kollar, and Emma Strubell
    Proceedings of the AAAI Conference on Artificial Intelligence, Apr 2018

2017

  1. Dependency Parsing with Dilated Iterated Graph CNNs
    Emma Strubell and Andrew McCallum
    In Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing, Sep 2017
  2. Fast and Accurate Entity Recognition with Iterated Dilated Convolutions
    Emma Strubell, Patrick Verga, David Belanger, and Andrew McCallum
    In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Sep 2017
  3. An epidemiological model of internet worms with hierarchical dispersal and spatial clustering of hosts
    David E. Hiebeler, Andrew Audibert, Emma Strubell, and Isaac J. Michaud
    Journal of Theoretical Biology, Sep 2017
  4. Machine-learned and codified synthesis parameters of oxide materials
    Edward Kim, Kevin Huang, Alex Tomala, Sara Matthews, Emma Strubell, Adam Saunders, Andrew McCallum, and Elsa Olivetti
    Nature Scientific Data, Sep 2017
  5. Spotlight talk
    Automatically Extracting Action Graphs From Materials Science Synthesis Procedures
    Sheshera Mysore, Edward Kim, Emma Strubell, Ao Liu, Haw-Shiuan Chang, Srikrishna Kompella, Kevin Huang, Andrew McCallum, and Elsa Olivetti
    In NIPS Workshop on Machine Learning for Molecules and Materials, Dec 2017
  6. Attending to All Mention Pairs for Full Abstract Biological Relation Extraction
    Patrick Verga, Emma Strubell, Ofer Shai, and Andrew McCallum
    In 6th Workshop on Automated Knowledge Base Construction (AKBC), Dec 2017

2016

  1. Multilingual Relation Extraction using Compositional Universal Schema
    Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth, and Andrew McCallum
    In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jun 2016
  2. Extracting Multilingual Relations under Limited Resources: TAC 2016 Cold-Start KB construction and Slot-Filling using Compositional Universal Schema
    Haw-Shiuan Chang, Abdurrahman Munir, Ao Liu, Johnny Tian-Zheng Wei, Aaron Traylor, Ajay Nagesh, Nicholas Monath, Patrick Verga, Emma Strubell, and Andrew McCallum
    In Text Analysis Conference (Knowledge Base Population Track) ’16 Workshop (TAC KBP), Nov 2016

2015

  1. Outstanding Paper
    Learning Dynamic Feature Selection for Fast Sequential Prediction
    Emma Strubell, Luke Vilnis, Kate Silverstein, and Andrew McCallum
    In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Jul 2015
  2. Building Knowledge Bases with Universal Schema: Cold Start and Slot-Filling Approaches
    Benjamin Roth, Nicholas Monath, David Belanger, Emma Strubell, Patrick Verga, and Andrew McCallum
    In Text Analysis Conference (Knowledge Base Population Track) ’15 Workshop (TAC KBP), Nov 2015

2014

  1. Training for Fast Sequential Prediction Using Dynamic Feature Selection
    Emma Strubell, Luke Vilnis, and Andrew McCallum
    In NIPS Workshop on Modern Machine Learning and NLP (NIPS WS), Dec 2014
  2. Minimally Supervised Event Argument Extraction using Universal Schema
    Benjamin Roth, Emma Strubell, Katherine Silverstein, and Andrew McCallum
    In 4th Workshop on Automated Knowledge Base Construction (AKBC), Dec 2014
  3. Universal Schema for Slot-Filling, Cold-Start KBP and Event Argument Extraction: UMassIESL at TAC KBP 2014
    Benjamin Roth, Emma Strubell, John Sullivan, Lakshmi Vikraman, Katherine Silverstein, and Andrew McCallum
    In Text Analysis Conference (Knowledge Base Population Track) ’14 Workshop (TAC KBP), Nov 2014

2012

  1. Modeling the Spread of Biologically-Inspired Internet Worms
    Emma Strubell
    University of Maine, May 2012
    Undergraduate honors thesis