1. Why is a Genome Survey Analysis Essential?
When the genome size of a species is unknown, a survey analysis is the standard first step. Sequencing reads are split into k-mers, and the k-mer frequency distribution is used to estimate the genome's basic characteristics: size, heterozygosity, and repeat content. A preliminary assembly is also performed, and the GC distribution of the resulting contigs is examined to detect potential contamination. Together, these results provide a sound basis for choosing the final genome assembly strategy.
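The k-mer-based size estimate can be illustrated with a minimal Python sketch. It uses the common approximation that genome size is roughly the total number of k-mers divided by the depth of the main peak in the k-mer frequency histogram. The function names (`kmer_histogram`, `estimate_genome_size`) and the `min_depth` cutoff for discarding likely sequencing errors are assumptions made for illustration. The sketch also skips steps that dedicated tools perform, such as counting canonical (strand-independent) k-mers and modeling heterozygosity and repeats.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Return {depth: number of distinct k-mers observed exactly `depth` times}."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return Counter(counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Estimate genome size as total k-mers / depth of the main peak.

    `min_depth` is an illustrative cutoff: k-mers seen fewer times are
    treated as sequencing errors and ignored.
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Main peak: the depth shared by the largest number of distinct k-mers.
    peak_depth = max(usable, key=lambda d: usable[d])
    # Total k-mer occurrences contributed by the retained depths.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth


if __name__ == "__main__":
    # Toy histogram with made-up numbers: an error tail at depth 1
    # and a main peak at depth 10.
    toy = {1: 500, 2: 50, 9: 10, 10: 100, 11: 12}
    size, peak = estimate_genome_size(toy)
    print(f"peak depth = {peak}, estimated size = {size:.1f} bp")
```

In a highly heterozygous genome the tallest peak may be the heterozygous one, at about half the homozygous depth, which would roughly double this estimate. That is one reason real survey analyses fit a model to the whole histogram rather than reading off a single peak.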
If no sequencing data is available yet, researchers can use the following methods to estimate genome size:
Database Search:
Plants: Royal Botanic Gardens, Kew - C-values Database
Animals: Animal Genome Size Database
Flow Cytometry: This is a widely used experimental method for estimating genome size. It measures the fluorescence intensity of DNA-stained nuclei and compares it with that of a reference standard of known genome size to determine the sample's DNA content.
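Both approaches come down to simple arithmetic. C-value databases typically report DNA content in picograms, and flow cytometry gives a fluorescence ratio against a standard. The sketch below, offered as an illustration rather than a lab protocol, converts between the two using the widely used factor 1 pg ≈ 978 Mbp. The function names and the numbers in the usage example (peak positions, a standard with 2C = 2.5 pg) are hypothetical. It assumes the sample and the standard are stained and measured together and that fluorescence scales linearly with DNA content.

```python
MBP_PER_PG = 978.0  # 1 pg of DNA is approximately 978 Mbp


def pg_to_mbp(picograms):
    """Convert a DNA amount in picograms to megabase pairs."""
    return picograms * MBP_PER_PG


def sample_2c_pg(sample_peak, standard_peak, standard_2c_pg):
    """Estimate the sample's 2C DNA content (pg) from G1 peak positions.

    sample 2C = standard 2C * (sample fluorescence / standard fluorescence)
    """
    if sample_peak <= 0 or standard_peak <= 0:
        raise ValueError("peak fluorescence values must be positive")
    return standard_2c_pg * sample_peak / standard_peak


def one_c_mbp(two_c_pg):
    """Convert a 2C value (pg) to a 1C genome size in Mbp."""
    return pg_to_mbp(two_c_pg / 2.0)


if __name__ == "__main__":
    # Hypothetical measurement: the sample peak sits at twice the
    # fluorescence of a standard whose 2C content is 2.5 pg.
    two_c = sample_2c_pg(sample_peak=200.0, standard_peak=100.0,
                         standard_2c_pg=2.5)
    print(f"2C = {two_c:.2f} pg, 1C = {one_c_mbp(two_c):.0f} Mbp")
```

The same `pg_to_mbp` conversion applies to a C-value looked up in a database, which gives a quick cross-check against a k-mer-based estimate.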