Observed, predicted, and misclassification error data for observations in the training dataset for nitrate and arsenic concentrations in basin-fill aquifers in the Southwest Principal Aquifers study.
This product "Observed, predicted, and misclassification error data for observations in the
training dataset for nitrate and arsenic concentrations in basin-fill aquifers in the Southwest
Principal Aquifers study" is a 1:250,000-scale point dataset and was developed as part of a
regional Southwest Principal Aquifers (SWPA) study. The study examined the vulnerability
of basin-fill aquifers in the southwestern United States to nitrate contamination and arsenic
enrichment. Statistical models were developed by using the random forest classifier algorithm
to predict concentrations of nitrate and arsenic across a model grid that represents local- and
basin-scale measures of source, aquifer susceptibility, and geochemical conditions.
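As a rough illustration of how observed and predicted classes relate to a misclassification error, the sketch below compares two lists of class labels. This description does not state how the dataset defines its error field, so the signed difference (predicted minus observed) used here is an assumption, as are the function name and the integer class numbering.

```python
def misclassification_summary(observed, predicted):
    """Compare observed and predicted concentration classes.

    Assumption: classes are ordered integers and the per-observation
    error is the signed difference (predicted - observed). The dataset
    itself may define its error field differently.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    # Signed error per observation: 0 means correctly classified,
    # positive means over-predicted, negative means under-predicted.
    errors = [p - o for o, p in zip(observed, predicted)]

    # Share of observations assigned to the wrong class.
    n_wrong = sum(1 for e in errors if e != 0)
    rate = n_wrong / len(errors) if errors else 0.0
    return errors, rate
```

Keeping the sign of the error, rather than only a right/wrong flag, makes it possible to see whether a classifier tends to over-predict or under-predict concentration classes.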
Separate classifiers were developed for nitrate and arsenic because each constituent was
expected to be affected by a different set of factors, and each factor could have a different
magnitude or directional influence (increase/decrease) on concentration. For each constituent,
two different classifiers were developed: a prediction classifier and a confirmatory classifier.
The prediction classifiers were developed specifically to predict nitrate and arsenic
concentrations in basin-fill aquifers across the SWPA study area and were based on
explanatory variables representing source and susceptibility conditions. These explanatory
variables were available throughout the entire SWPA study area and, therefore, did not pose
a limitation for using the classifiers to predict concentrations.
The confirmatory classifiers were developed to supplement the prediction classifiers in the
evaluation of the conceptual model. The name "confirmatory" reflects the classifier's purpose of
evaluating a priori hypotheses, in contrast with other general types of statistical models,
such as those used for prediction or exploratory purposes. The confirmatory classifiers
included the explanatory variables used in the prediction classifiers, as well as additional
variables representing geochemical conditions and basin groundwater budget components.
The inclusion of the geochemical and basin groundwater budget variables in the confirmatory
classifiers allowed for further evaluation of the conceptual models, which was not possible
with the prediction classifiers alone. The geochemical data, however, were only available at
specific well locations, and consistent water-budget data were not available for every basin
in the study area. The limited availability of the data for these variables constrained the
confirmatory classifiers to observations from 16 case-study basins and precluded use of
the confirmatory classifier for predicting concentrations across the SWPA study area. To
contrast the scope of the two classifiers, the confirmatory classifiers were developed by
using all available explanatory variables but with observations restricted to the 16 case-study
basins, whereas the prediction classifiers were unrestricted with respect to spatial extent
because these were developed by using a subset of the explanatory variables that were
available throughout the study area.
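The difference in scope between the two classifiers can be sketched as two ways of assembling a training set: the prediction classifier keeps every observation but only the variables available study-wide, whereas the confirmatory classifier keeps every variable but only observations from the case-study basins. The variable names, the `Observation` record, and both helper functions below are hypothetical and are not taken from the study; the sketch only prepares the inputs and does not fit a random forest.

```python
from dataclasses import dataclass

# Hypothetical variable names, for illustration only.
STUDY_WIDE_VARS = ["source_index", "susceptibility_index"]
CASE_STUDY_ONLY_VARS = ["geochemical_condition", "basin_recharge"]


@dataclass
class Observation:
    basin: str
    in_case_study_basin: bool
    variables: dict       # explanatory variable name -> value
    observed_class: int   # observed concentration class


def prediction_training_set(observations):
    """All observations, restricted to variables available study-wide."""
    return [
        ([obs.variables[name] for name in STUDY_WIDE_VARS], obs.observed_class)
        for obs in observations
    ]


def confirmatory_training_set(observations):
    """All variables, restricted to observations in case-study basins."""
    all_vars = STUDY_WIDE_VARS + CASE_STUDY_ONLY_VARS
    return [
        ([obs.variables[name] for name in all_vars], obs.observed_class)
        for obs in observations
        if obs.in_case_study_basin
    ]
```

Either list of (feature vector, class) pairs could then be passed to a random forest implementation of choice.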
Complete Metadata
| @id | http://datainventory.doi.gov/id/dataset/2b56bce81f58251eb721ee627a188f74 |
|---|---|
| bureauCode | [ "010:12" ] |
| identifier | USGS:1d589b73-af80-4229-bd58-c62dd4192bc4 |
| spatial | -124.889549,29.300033,-104.566268,44.627454 |
| theme | [ "geospatial" ] |
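The `spatial` value above can be read as a bounding box. The sketch below parses it on the assumption that the four numbers are ordered west, south, east, north (minimum longitude, minimum latitude, maximum longitude, maximum latitude), which is the usual convention for this kind of catalog metadata but is not stated on this page. The `BoundingBox` type and `parse_spatial` function are illustrative names.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    west: float
    south: float
    east: float
    north: float

    def contains(self, lon: float, lat: float) -> bool:
        """True if the point lies inside or on the edge of the box."""
        return self.west <= lon <= self.east and self.south <= lat <= self.north


def parse_spatial(value: str) -> BoundingBox:
    """Parse 'west,south,east,north' (assumed order) into a BoundingBox."""
    parts = [float(p) for p in value.split(",")]
    if len(parts) != 4:
        raise ValueError("expected four comma-separated numbers")
    return BoundingBox(*parts)


box = parse_spatial("-124.889549,29.300033,-104.566268,44.627454")
```

A call such as `box.contains(-112.07, 33.45)` then gives a quick check of whether a well location falls within the dataset's stated extent.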