Data from: Genotypic characterization of the U.S. peanut core collection

Published by Agricultural Research Service | Department of Agriculture | Metadata Last Checked: June 24, 2025 | Last Modified: 2024-02-15
<p>This collection contains supplementary data for the manuscript "Genotypic characterization of the U.S. Peanut Core Collection", which describes genotyping results for the USDA peanut core collection. Each accession was genotyped with the Arachis_Axiom2 SNP array, yielding 14,430 high-quality, informative SNPs across the collection. Additionally, a subset of the core collection was genotyped in replicate, using between two and five seeds per accession to assess heterogeneity within an accession. Supplementary files include: descriptive information about the genotyped accessions, SNP genotype calls in several formats, a phylogenetic tree calculated from the genotype data, Structure analysis, PCA, and comparisons with the diploid progenitors.</p> <p>This research was co-funded by the National Institute of Food and Agriculture and the National Peanut Board. </p><div><br>Resources in this dataset:</div><br><ul><li><p>Resource Title: Structure membership breakdown.</p> <p>File Name: SF10_K5_membership.pdf</p><p>Resource Description: The proportion of accessions assigned to clusters 1-5 in a Structure analysis (manuscript Figure 3), for K=5 clusters. </p></li><br><li><p>Resource Title: Structure membership assignments for accessions.</p> <p>File Name: SF11_K5_cluster_assignment.xlsx</p><p>Resource Description: The proportional assignments of each cluster to all accessions (relative to the Structure diagram shown in manuscript Figure 3). </p></li><br><li><p>Resource Title: Principal components analysis.</p> <p>File Name: SF12_pca_34.pdf</p><p>Resource Description: Principal Component Analysis of 1120 samples based on 2063 unlinked SNP markers. The X-axis represents PC 3 and the Y-axis represents PC 4. Samples are colored and grouped according to: A. clade membership as defined in the phylogenetic and network analyses, B. botanical varieties, C. market type, D. growth habit, E. pod shape, and F. 
collection type </p></li><br><li><p>Resource Title: Pod images for PI 497426.</p> <p>File Name: SF14_PI497426_pods.jpg</p><p>Resource Description: Pods from accession PI 497426 (clade 4), illustrating the distinctive reticulation pattern seen in some accessions in this clade. </p></li><br><li><p>Resource Title: Data dictionary.</p> <p>File Name: data_dictionary_KNWV.txt</p><p>Resource Description: Description of all files in this Dataset. Changes were made to this file on 4/15/2020 to update some file names to indicate new versions.</p></li><br><li><p>Resource Title: Main descriptive information about genotyped accessions.</p> <p>File Name: SF01_peanut_core_v14.xlsx</p><p>Resource Description: The main descriptive information about the genotyped accessions, including: information about replicate similarity; phylogenetic clades, geographic origin, and phenotype; and summaries of phenotypic and country information relative to clade assignments. Changes were made to this file on 4/15/2020: Added INDEX worksheet and corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: SNPs as called by the Axiom suite.</p> <p>File Name: SF02_SNPs_whole_Axiom_Arachis2_txt.gz</p><p>Resource Description: The original genotype calls for the Axiom array (for poly-high resolution SNPs). Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: Genotyping calls in VCF format.</p> <p>File Name: SF03_SNPs_whole_Axiom_Arachis2_vcf.gz</p><p>Resource Description: The Axiom array genotype calls, in VCF format. 
Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: DNA variants for all accessions, including from genome assemblies, in TSV format.</p> <p>File Name: SF04_SNPs_w_4_genomes_tsv.gz</p><p>Resource Description: The predominant DNA variants at each SNP location, for all accessions, including variants inferred from four available genome assemblies: A. duranensis and A. ipaensis together, and A. hypogaea accessions Tifrunner, Shitouqi, and Fuhuasheng. The format is a simple tab-separated table with 14431 columns (SNP positions). Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: DNA variants for all accessions, including from genome assemblies, in fasta format.</p> <p>File Name: SF05_SNPs_w_4_gnm_mrgd_fas.gz</p><p>Resource Description: The predominant DNA variants at each SNP location, for all accessions, including variants inferred from four available genome assemblies: A. duranensis and A. ipaensis together, and A. hypogaea accessions Tifrunner, Shitouqi, and Fuhuasheng. In fasta format. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: Base-calls for selected accessions, relative to A- and B-genome progenitors.</p> <p>File Name: SF06_chip_and_genome_samples_v05.xlsx</p><p>Resource Description: DNA base-calls for 16 selected, diverse accessions, with comparisons to the variants observed in the A. duranensis and A. ipaensis genomes, and inferences regarding the likely progenitor for the DNA, i.e. A-genome (A. duranensis) or B-genome (A. ipaensis). 
Changes were made to this file on 4/15/2020: Added INDEX worksheet and corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: Reduced fasta alignments, at 98% identity.</p> <p>File Name: SF07_SNPs_w_4_gnm_mrgd_cen98_fas.gz</p><p>Resource Description: Reduced fasta alignments (relative to the complete alignment file, S5). File S7 has the centroid representatives at 98% identity. This file has 518 sequences. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: Reduced fasta alignments, at 99% identity.</p> <p>File Name: SF08_SNPs_w_4_gnm_mrgd_cen99_fas.gz</p><p>Resource Description: Reduced fasta alignments (relative to the complete alignment file, S5). File S8 has the centroid representatives at 99% identity. This file has 680 sequences. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: Phylogenetic tree of genotype data.</p> <p>File Name: SF09_SNPs_w_4_gnm_mrgd_rt3_nh_txt.gz</p><p>Resource Description: Phylogenetic tree (Newick format) calculated from the alignment in S5, and corresponding with the phylogenetic tree shown in manuscript Figure 1. Changes were made to this file on 4/15/2020: Corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: Subgenome origins of SNPs relative to the A-genome and B-genome progenitors.</p> <p>File Name: SF13_chip_and_genome_GFFs.xlsx</p><p>Resource Description: Inferred subgenome origins of SNPs relative to the A-genome and B-genome progenitors (A. duranensis and A. ipaensis). 
This data is in GFF format, derived from S6, and used as the basis for the plots in Figure 7 (showing regions of possible subgenome invasions). Changes were made to this file on 4/15/2020: Added INDEX worksheet and corrected three peanut variety identifiers: ROL11 --> TamrunOL-11; NCL06 --> TamnutOL-06; NM309N2 --> NM309-2</p></li><br><li><p>Resource Title: Peruvian Moche-era peanut necklace.</p> <p>File Name: SF15_Sipan_neclkace_Donnan_Einstein.jpg</p><p>Resource Description: Picture of necklace of peanuts, sculpted in gold and silver, from the Moche-era tomb at Sipán (c.AD 250) in coastal Peru. Photograph by Susan Einstein, courtesy of Christopher Donnan. Changes were made to this file on 4/15/2020: Replaced black-and-white derived image with original color image</p></li></ul><p></p>
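<p>The fasta resources listed above (for example SF05_SNPs_w_4_gnm_mrgd_fas.gz) are described as gzip-compressed alignments of the predominant variant at each SNP location. The following is a minimal Python sketch of how such a file might be read and how percent identity between two aligned records could be computed. It assumes one record per accession and equal-length sequences, which this page does not confirm. The function names <code>read_fasta_gz</code> and <code>percent_identity</code> are hypothetical, and the identity measure is only an illustration: the tool used to pick the 98% and 99% centroid representatives is not stated here.</p>

```python
import gzip
from typing import Dict, List


def read_fasta_gz(path: str) -> Dict[str, str]:
    """Read a gzip-compressed FASTA file into {record_id: sequence}.

    Assumption: record IDs are the first whitespace-delimited token
    on each header line; sequences may span several lines.
    """
    chunks: Dict[str, List[str]] = {}
    name = None
    with gzip.open(path, "rt") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            if line.startswith(">"):
                parts = line[1:].split()
                name = parts[0] if parts else ""
                chunks[name] = []
            elif name is not None:
                chunks[name].append(line)
    return {rec_id: "".join(parts) for rec_id, parts in chunks.items()}


def percent_identity(a: str, b: str) -> float:
    """Percent of compared positions where two aligned sequences agree.

    Positions where either sequence has a gap or unknown base
    ('-', 'N', '?') are skipped. This is an illustrative measure,
    not necessarily the one used to build the reduced alignments.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same aligned length")
    skip = {"-", "N", "?"}
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in skip or y in skip:
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

<p>With a downloaded copy of the file, <code>read_fasta_gz("SF05_SNPs_w_4_gnm_mrgd_fas.gz")</code> would return a dictionary keyed by record ID, and any two of its values could be passed to <code>percent_identity</code>.</p>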
