Model Estimates of Chlorophyll a and CyanoHABs Occurrence for Select New York State Lakes

Published by U.S. Geological Survey | Department of the Interior | Metadata Last Checked: September 05, 2025 | Last Modified: August 08, 2025
This data release contains modeled estimates of chlorophyll a concentration and cyanoHAB occurrence for a subset of lakes in New York State. Estimates of chlorophyll a concentration were generated using a random forest model. Estimates of cyanoHAB occurrence were generated based on thresholds derived from bootstrapped logistic regression. All analyses were done in R 4.4.1 (R Core Team, 2024), and the full methods are described in Savoy and others (2025).

Items in this data release are listed below. Each data file is tab-delimited with one observation per row.

random forest inputs and output.txt: Contains observed chlorophyll a concentrations, modeled estimates of chlorophyll a from a random forest model, and all necessary inputs to the model.
chla_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a threshold of chlorophyll a concentration, observed cyanoHAB occurrence, and chlorophyll a concentrations.
dwl_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a threshold of dominant wavelength, observed cyanoHAB occurrence, and dominant wavelength.
kd_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a threshold of the irradiance attenuation coefficient, observed cyanoHAB occurrence, and the irradiance attenuation coefficient.
tn_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a threshold of total nitrogen, observed cyanoHAB occurrence, and total nitrogen.
tp_logistic.txt: Contains modeled estimates of cyanoHAB occurrence based on a threshold of total phosphorus, observed cyanoHAB occurrence, and total phosphorus.
The metadata file for each data file follows the same naming convention, with a .xml extension. The random forest dataset includes the inputs and output of a random forest model that predicts chlorophyll a concentration: the necessary model inputs, observed chlorophyll a concentrations, and modeled estimates of chlorophyll a for 110 lakes across New York State.
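
The data files are described as tab-delimited with one observation per row, and the cyanoHAB estimates as the result of applying a threshold to a predictor such as chlorophyll a. The following is a minimal Python sketch of reading such a file and applying a threshold; it is not the authors' R workflow. The column names (`lake_id`, `chla_ug_L`, `observed_hab`), the threshold value of 10.0, the "NA" missing-value marker, and the assumption that values at or above the threshold indicate a cyanoHAB are all hypothetical placeholders. The real column names are given in the .xml metadata, and the real thresholds in Savoy and others (2025).

```python
import csv
import io


def read_tab_delimited(handle):
    """Return a list of dict rows from a tab-delimited file with a header line."""
    return list(csv.DictReader(handle, delimiter="\t"))


def classify_by_threshold(rows, value_column, threshold):
    """Flag each row True when its value is at or above the threshold.

    Rows whose value is missing or non-numeric are flagged None.
    The "at or above" direction is an assumption made for illustration.
    """
    flags = []
    for row in rows:
        raw = (row.get(value_column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            flags.append(None)  # e.g. "NA" or an empty field
            continue
        flags.append(value >= threshold)
    return flags


# Made-up sample standing in for a file such as chla_logistic.txt;
# the column names and values are hypothetical, not taken from the release.
SAMPLE = (
    "lake_id\tchla_ug_L\tobserved_hab\n"
    "A\t2.5\t0\n"
    "B\t31.0\t1\n"
    "C\tNA\t0\n"
)

rows = read_tab_delimited(io.StringIO(SAMPLE))
flags = classify_by_threshold(rows, "chla_ug_L", threshold=10.0)  # hypothetical threshold
```

To run the same sketch against a downloaded file, replace the `io.StringIO(SAMPLE)` handle with `open(path, newline="")` and substitute the column name documented in the matching .xml metadata file.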
