Hepatitis C virus whole genome position weight matrix and robust primer design

Published by National Institutes of Health | U.S. Department of Health & Human Services | Metadata Last Checked: September 06, 2025 | Last Modified: 2025-09-06
Background: The high degree of sequence heterogeneity found in hepatitis C virus (HCV) isolates makes robust nucleic acid-based assays difficult to generate. Polymerase chain reaction-based techniques require efficient and specific sequence recognition, so generating robust primers capable of recognizing a wide range of isolates is a difficult task.

Results: A position weight matrix (PWM) and a consensus sequence were built for each region of HCV and subsequently assembled into a whole genome consensus sequence and PWM. For each of the 10 regions, the number of occurrences of each base at each position was compiled. These counts were converted to frequencies, which were then used to calculate log odds scores. Using over 100 complete and 14,000 partial HCV genome sequences from GenBank, a consensus HCV genome sequence was generated along with a PWM reflecting the heterogeneity at each position. The PWM was used to identify the most conserved regions for primer design.

Conclusions: This approach allows rapid identification of conserved regions for robust primer design and is broadly applicable to sets of genomes with all levels of genetic heterogeneity.
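
The counting, frequency, and log odds steps described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy sequences, the pseudocount of 1, the uniform background frequency of 0.25, the base-2 logarithm, and the window-scoring rule are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter

BASES = "ACGT"

def count_matrix(aligned_seqs):
    """Count occurrences of each base at each column of an alignment."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    counts = []
    for col in range(length):
        column = Counter(s[col].upper() for s in aligned_seqs)
        # Gaps and ambiguity codes are simply ignored in this sketch.
        counts.append({b: column.get(b, 0) for b in BASES})
    return counts

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert counts to frequencies, then to log2 odds against a background.

    The pseudocount and the uniform background are assumed values.
    """
    pwm = []
    for col in counts:
        total = sum(col.values()) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((col[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def consensus(counts):
    """Take the most frequent base at each position."""
    return "".join(max(BASES, key=lambda b: col[b]) for col in counts)

def most_conserved_window(pwm, width):
    """Return (start, score) of the window with the highest summed top score.

    Scoring a window by the sum of its per-position maximum log odds is one
    simple, assumed way to rank candidate primer sites.
    """
    best = [max(col.values()) for col in pwm]
    scores = [sum(best[i:i + width]) for i in range(len(pwm) - width + 1)]
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]

# Toy alignment (hypothetical data, not HCV sequences).
seqs = ["ACGTAC", "ACGTTC", "ACGAAC", "ACGTGC"]
counts = count_matrix(seqs)
pwm = log_odds_matrix(counts)
print(consensus(counts))
print(most_conserved_window(pwm, width=3))
```

In this toy alignment the first three columns are invariant, so the highest-scoring window of width 3 starts at position 0. A real primer search would also need to account for melting temperature, primer length, and 3' end stability, none of which this sketch models.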
