Model Estimates of Chlorophyll a and CyanoHABs Occurrence for Select New York State Lakes
This data release contains modeled estimates of chlorophyll a concentration and cyanoHAB occurrence for a subset of lakes in New York State. Estimates of chlorophyll a concentration were generated using a random forest model. Estimates of cyanoHAB occurrence were generated based on thresholds derived from bootstrapped logistic regression. All analysis was done in R 4.4.1 (R Core Team, 2024) and the full methods are described in Savoy and others (2025).
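As a rough illustration of the threshold step only: once a threshold has been chosen, each observation can be labeled as a predicted cyanoHAB occurrence when its predictor value meets or exceeds that threshold, and the labels can be compared with observed occurrence. The Python sketch below assumes that direction (a higher value indicates occurrence). The function names, values, and threshold are hypothetical placeholders, not values from this data release, and the sketch does not reproduce the bootstrapped logistic regression described in Savoy and others (2025).

```python
def classify_by_threshold(values, threshold):
    """Return 1 (predicted occurrence) where value >= threshold, else 0."""
    return [1 if v >= threshold else 0 for v in values]


def agreement_counts(predicted, observed):
    """Tally agreement between paired lists of 0/1 predicted and observed flags."""
    counts = {"true_pos": 0, "false_pos": 0, "true_neg": 0, "false_neg": 0}
    for p, o in zip(predicted, observed):
        if p == 1 and o == 1:
            counts["true_pos"] += 1
        elif p == 1 and o == 0:
            counts["false_pos"] += 1
        elif p == 0 and o == 0:
            counts["true_neg"] += 1
        else:
            counts["false_neg"] += 1
    return counts


# Hypothetical predictor values and observed occurrence flags (assumptions).
example_values = [2.0, 8.5, 15.0, 30.0]
example_observed = [0, 0, 1, 1]
example_threshold = 10.0  # placeholder, not a threshold from the release

predicted = classify_by_threshold(example_values, example_threshold)
print(predicted, agreement_counts(predicted, example_observed))
```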
Items in this data release include:
random forest inputs and output.txt: Contains observed chlorophyll a concentrations, modeled estimates of chlorophyll a from a random forest model, and all necessary inputs to the model. The file format is tab-delimited with one observation per row.
chla_logistic.txt: Contains modeled estimates of cyanoHABs occurrence based on a threshold of chlorophyll a concentration, observed cyanoHAB occurrence, and chlorophyll a concentrations. The file format is tab-delimited with one observation per row.
dwl_logistic.txt: Contains modeled estimates of cyanoHABs occurrence based on a threshold of dominant wavelength, observed cyanoHAB occurrence, and dominant wavelength. The file format is tab-delimited with one observation per row.
kd_logistic.txt: Contains modeled estimates of cyanoHABs occurrence based on a threshold of the irradiance attenuation coefficient, observed cyanoHAB occurrence, and irradiance attenuation coefficient. The file format is tab-delimited with one observation per row.
tn_logistic.txt: Contains modeled estimates of cyanoHABs occurrence based on a threshold of total nitrogen, observed cyanoHAB occurrence, and total nitrogen. The file format is tab-delimited with one observation per row.
tp_logistic.txt: Contains modeled estimates of cyanoHABs occurrence based on a threshold of total phosphorus, observed cyanoHAB occurrence, and total phosphorus. The file format is tab-delimited with one observation per row.
Associated metadata for each data file follow the same naming convention with a .xml extension.
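Because each data file is tab-delimited with one observation per row, it can be read with a standard delimited-text parser. The following Python sketch assumes the first row is a header. The column names in the sample are hypothetical stand-ins, so consult the accompanying .xml metadata for the actual field names.

```python
import csv
import io


def read_tab_delimited(handle):
    """Read a tab-delimited text stream into a list of dicts, one per row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Hypothetical stand-in for a file such as chla_logistic.txt; these column
# names are placeholders, not the release's actual headers.
sample = io.StringIO(
    "lake_id\tchla\tobserved_hab\n"
    "A\t4.2\t0\n"
    "B\t22.7\t1\n"
)
rows = read_tab_delimited(sample)
print(len(rows), rows[0]["chla"])
```

For a downloaded file, the same function could be called on a handle opened with `open("chla_logistic.txt", newline="")`. All values are read as strings and would need converting to numbers before analysis.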
This dataset includes the inputs and output of a random forest model used to predict chlorophyll a concentration: the necessary model inputs, observed chlorophyll a concentrations, and modeled estimates of chlorophyll a for 110 lakes across New York State.
Complete Metadata
| @id | http://datainventory.doi.gov/id/dataset/e99c527b83f0113245349fc3ebd73e07 |
|---|---|
| bureauCode | [ "010:12" ] |
| identifier | USGS:67897fefd34e555111215703 |
| spatial | -79.76,40.5,-71.85,45.01 |
| theme | [ "geospatial" ] |
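
The `spatial` value above can be read as a bounding box. The Python sketch below assumes the order is west, south, east, north in decimal degrees, which is consistent with the extent of New York State. The `BoundingBox`, `parse_bbox`, and `contains` names are illustrative, not part of any catalog API.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    west: float
    south: float
    east: float
    north: float


def parse_bbox(text):
    """Parse 'west,south,east,north' (assumed order) into a BoundingBox."""
    west, south, east, north = (float(part) for part in text.split(","))
    return BoundingBox(west, south, east, north)


def contains(bbox, lon, lat):
    """Return True if the point (lon, lat) lies inside or on the box edge."""
    return bbox.west <= lon <= bbox.east and bbox.south <= lat <= bbox.north


bbox = parse_bbox("-79.76,40.5,-71.85,45.01")
print(bbox)
print(contains(bbox, -75.0, 43.0))  # a point in the interior of the box
```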