Comprehensive Database of Environmental Mitigations Extracted from FERC-Licensed Hydropower Projects using Artificial Intelligence Techniques, 1998-2023

Publication Date:
June 30, 2025
Authors:
Tom A. Ruggles, Hong-Jun Yoon, Arjun Bhattacharya and Debjani Singh
Organization:
HydroSource
Institution:
Oak Ridge National Laboratory

Citation

Tom A. Ruggles, Hong-Jun Yoon, Arjun Bhattacharya and Debjani Singh. 2025. Comprehensive Database of Environmental Mitigations Extracted from FERC-Licensed Hydropower Projects using Artificial Intelligence Techniques, 1998-2023. HydroSource. Oak Ridge National Laboratory. Oak Ridge, Tennessee, USA. DOI: 10.21951/Environmental_MitigationsAI/2570983

Overview

This dataset provides a comprehensive inventory of the environmental mitigation measures required of hydropower facilities licensed by the Federal Energy Regulatory Commission (FERC), drawn from 461 licenses issued from 1998 to 2023. These licenses cover 446 of the 1,015 FERC projects that were active at the end of 2023. In total, 17,612 mentions of environmental mitigations were identified and grouped into 128 unique categories. Mitigations were identified using a Natural Language Processing (NLP) approach, specifically a Bidirectional Encoder Representations from Transformers (BERT) model. A subject-matter expert then reviewed the model-derived results and updated them as needed. The dataset improves on previous inventories of environmental mitigations by including the associated license text for each mitigation, tracking the number of times a mitigation was identified within a license, and providing improved location information. These enhancements expand the dataset’s analytical utility and support reproducibility. The dataset is downloadable as a zip file containing the metadata and dataset files.

Methods

This dataset analyzes 461 FERC licenses issued between 1998 and 2023. Of these, 446 remained active at the end of 2023, while 15 had become inactive due to retirement, revocation, surrender, or termination. The dataset also incorporates 291 of the 308 licenses previously studied by Bevelhimer et al. (2015); the remaining 17 were excluded because digitized text was unavailable.

To identify environmental mitigation measures, a Natural Language Processing (NLP) approach was developed. The method employed a BERT model, chosen for its effectiveness at interpreting words in the context of the surrounding paragraph. A subject-matter expert first annotated a training set drawn from licenses issued in 2014–2016, labeling over 1,800 text segments across 93 mitigation categories. The BERT model then analyzed each paragraph of the FERC licenses and assigned one or more environmental mitigations, each with a corresponding confidence score. The model consisted of two sub-models: a detection model that identified regulatory language and a classification model that assigned mitigation types. The model was retrained with expanded data from 2014–2023 and applied to earlier licenses from 1998–2013. The final dataset compiles mitigation measures and the associated license text into a standardized, reproducible format, improving on previous efforts and supporting future research.
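
The two-stage flow described above (detect regulatory language, then assign one or more mitigation types with a confidence score) can be sketched roughly as follows. This is only an illustration and not the authors' implementation: the function names, the 0.5 thresholds, the category labels, and the keyword-based stand-ins used in place of the trained BERT sub-models are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interfaces: each "model" maps a paragraph to scores in [0, 1].
Detector = Callable[[str], float]               # score that the paragraph holds regulatory language
Classifier = Callable[[str], Dict[str, float]]  # per-category confidence (multi-label)


@dataclass
class Mitigation:
    paragraph: str
    category: str
    confidence: float


def extract_mitigations(
    paragraphs: List[str],
    detect: Detector,
    classify: Classifier,
    detect_threshold: float = 0.5,  # assumed cut-off, not from the source
    class_threshold: float = 0.5,   # assumed cut-off, not from the source
) -> List[Mitigation]:
    """Run detection first, then multi-label classification on the hits."""
    results: List[Mitigation] = []
    for text in paragraphs:
        # Stage 1: skip paragraphs the detector does not flag.
        if detect(text) < detect_threshold:
            continue
        # Stage 2: keep every category whose confidence clears the threshold.
        for category, score in classify(text).items():
            if score >= class_threshold:
                results.append(Mitigation(text, category, score))
    return results


# Keyword-based stand-ins for the two BERT sub-models (illustration only).
def toy_detect(text: str) -> float:
    return 0.9 if "shall" in text.lower() else 0.1


def toy_classify(text: str) -> Dict[str, float]:
    lowered = text.lower()
    return {
        "fish passage": 0.8 if "fish" in lowered else 0.05,  # hypothetical label
        "minimum flow": 0.7 if "flow" in lowered else 0.05,  # hypothetical label
    }


sample = [
    "The licensee shall maintain a minimum flow to protect fish habitat.",
    "The project is located on the Example River.",
]
found = extract_mitigations(sample, toy_detect, toy_classify)
```

In this sketch a single paragraph can yield several records, one per category, which mirrors the statement that the model assigns one or more mitigations to each paragraph. Keeping the confidence score on each record is what would let a reviewer prioritize low-confidence results for checking.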

Caveats and Limitations

The authors recognize that the data provided in this dataset come with some caveats. Detailed caveats and limitations are described in Environmental_mitigations_extracted_using_AI_1998_2023_readme.txt.

Applications

This dataset provides a synthesis of the environmental mitigation measures that FERC requires of licensed hydropower facilities. Users can explore spatial and temporal trends in mitigation requirements. These trends might inform hydropower developers and managers of the measures they may be required to employ as a result of the initial licensing or relicensing process. Researchers and policymakers might use these data to examine the relationship between mitigation measures and various environmental indicators and thereby assess the measures' efficacy.
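
As a rough illustration of the temporal exploration mentioned above, the sketch below totals mitigation mentions per license year and ranks categories within a range of years. The field names (license_year, category, mentions) and the sample rows are hypothetical; the actual column names are defined in the dataset's metadata files and may differ.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical rows: one record per (license, category) with a mention count.
records = [
    {"license_year": 2001, "category": "minimum flow", "mentions": 3},
    {"license_year": 2001, "category": "fish passage", "mentions": 1},
    {"license_year": 2015, "category": "minimum flow", "mentions": 2},
    {"license_year": 2015, "category": "recreation access", "mentions": 4},
    {"license_year": 2020, "category": "fish passage", "mentions": 5},
]


def mentions_by_year(rows: List[dict]) -> Dict[int, int]:
    """Total mitigation mentions for each license year, sorted by year."""
    totals: Counter = Counter()
    for row in rows:
        totals[row["license_year"]] += row["mentions"]
    return dict(sorted(totals.items()))


def top_categories(rows: List[dict], year_from: int, year_to: int) -> List[Tuple[str, int]]:
    """Categories ranked by total mentions for licenses issued in [year_from, year_to]."""
    totals: Counter = Counter()
    for row in rows:
        if year_from <= row["license_year"] <= year_to:
            totals[row["category"]] += row["mentions"]
    return totals.most_common()


yearly = mentions_by_year(records)
recent = top_categories(records, 2010, 2023)
```

The same grouping could be keyed on a location field instead of the year to look at spatial patterns, assuming the dataset's location columns are joined to each record first.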

References

Mark S. Bevelhimer, Michael P. Schramm and Christopher R. DeRolph. 2015. US Hydropower Mitigation Database. HydroSource. Oak Ridge National Laboratory. Oak Ridge, Tennessee, USA. Available at: https://hydrosource.ornl.gov/dataset/us-hydropower-mitigation-database
