Supervised harvesting of expression trees

Published by National Institutes of Health | U.S. Department of Health & Human Services | Metadata Last Checked: September 06, 2025 | Last Modified: 2025-09-06
Background: We propose a new method for supervised learning from gene expression data. We call it 'tree harvesting'. This technique starts with a hierarchical clustering of genes, then models the outcome variable as a sum of the average expression profiles of chosen clusters and their products. It can be applied to many different kinds of outcome measures such as censored survival times, or a response falling in two or more classes (for example, cancer classes). The method can discover genes that have strong effects on their own, and genes that interact with other genes.

Results: We illustrate the method on data from a lymphoma study, and on a dataset containing samples from eight different cancers. It identified some potentially interesting gene clusters. In simulation studies we found that the procedure may require a large number of experimental samples to successfully discover interactions.

Conclusions: Tree harvesting is a potentially useful tool for exploration of gene expression data and identification of interesting clusters of genes worthy of further investigation.
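The procedure outlined in the Background (cluster the genes, then model the outcome as a sum of cluster-average profiles and their products) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a quantitative outcome scored by squared error, uses a simple greedy fit to the current residual rather than a full refit, and runs on synthetic data. The function names (`hierarchical_clusters`, `cluster_average`, `harvest`), the linkage choice, and the toy genes are all hypothetical. Survival and class outcomes mentioned in the abstract are not covered.

```python
import math
from itertools import combinations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(v):
    mean = sum(v) / len(v)
    return [x - mean for x in v]

def hierarchical_clusters(genes):
    """Average-linkage clustering of genes (assumed linkage choice).

    `genes` is a list of per-gene expression vectors (one value per sample).
    Returns every node of the tree, leaves included, as a tuple of gene indices.
    """
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    def linkage(a, b):
        pairs = sum(sq_dist(genes[i], genes[j]) for i in a for j in b)
        return pairs / (len(a) * len(b))

    active = [(i,) for i in range(len(genes))]
    nodes = list(active)
    while len(active) > 1:
        a, b = min(combinations(active, 2), key=lambda pair: linkage(*pair))
        merged = tuple(sorted(a + b))
        active = [c for c in active if c not in (a, b)] + [merged]
        nodes.append(merged)
    return nodes

def cluster_average(genes, cluster):
    """Per-sample mean expression over the genes in one cluster."""
    n_samples = len(genes[0])
    return [sum(genes[g][s] for g in cluster) / len(cluster)
            for s in range(n_samples)]

def harvest(genes, y, n_terms=2):
    """Greedily add cluster averages, or products with already chosen clusters.

    Returns (terms, residual). Each term is (clusters, coefficient), where
    `clusters` holds one cluster (main effect) or two (product term).
    Simplification: each new term is fitted to the current residual only.
    """
    averages = {c: cluster_average(genes, c) for c in hierarchical_clusters(genes)}
    residual = center(y)
    terms, selected = [], []
    for _ in range(n_terms):
        candidates = [((c,), avg) for c, avg in averages.items()]
        for s in selected:  # products with clusters already in the model
            for c, avg in averages.items():
                if c != s:
                    product = [p * q for p, q in zip(averages[s], avg)]
                    candidates.append(((s, c), product))
        best = None
        for label, profile in candidates:
            z = center(profile)
            zz = dot(z, z)
            if zz < 1e-12:
                continue  # constant profile carries no information
            gain = dot(residual, z) ** 2 / zz  # drop in residual sum of squares
            if best is None or gain > best[0]:
                best = (gain, label, z, dot(residual, z) / zz)
        if best is None:
            break
        _, label, z, coef = best
        residual = [r - coef * v for r, v in zip(residual, z)]
        terms.append((label, coef))
        selected.extend(c for c in label if c not in selected)
    return terms, residual

# Synthetic data: genes 0-1 and genes 2-3 form two tight groups, 4-5 are unrelated.
samples = range(20)
genes = [
    [math.sin(s) + 0.1 * math.sin(5 * s) for s in samples],
    [math.sin(s) - 0.1 * math.sin(5 * s) for s in samples],
    [math.cos(1.7 * s) + 0.1 * math.cos(3 * s) for s in samples],
    [math.cos(1.7 * s) - 0.1 * math.cos(3 * s) for s in samples],
    [math.sin(2.3 * s + 1) for s in samples],
    [math.cos(0.7 * s) for s in samples],
]
group_a = cluster_average(genes, (0, 1))
group_b = cluster_average(genes, (2, 3))
outcome = [2.0 * a + 1.5 * a * b for a, b in zip(group_a, group_b)]

for clusters, coef in harvest(genes, outcome, n_terms=3)[0]:
    print(clusters, round(coef, 3))
```

Because each step scores many candidate clusters and products against a limited number of samples, a sketch like this can easily pick up spurious terms, which is consistent with the abstract's remark that discovering interactions may require a large number of samples.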
