
Ensembl Sequence Statistics

Each species in Ensembl has a number of statistics for sequence length that are displayed on the home page and, where the sequence is assembled into chromosomes, on MapView pages. Since it may not be clear to users how these numbers are calculated, we offer the following definitions:

Base Pairs per chromosome
These are pre-calculated to speed up page display and are stored in the seq_region table of the core database. The length is the assembled end position of the last seq_region in each chromosome (taken from the AGP) or, if the chromosome ends in a gap, the assembled end position of that terminal gap.
For the haplotype chromosomes (c6_COX etc.), haplotype-specific sequence exists for only a small region of the chromosome, but the length of the seq_region is set to the full length of the chromosome including the specific haplotype (e.g. c6_COX is 170899992 bp long).
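The per-chromosome rule above can be illustrated with a minimal sketch. It is not the Ensembl implementation: the `AgpRow` record, the `chromosome_length` function and the toy coordinates are all hypothetical, and the rows only loosely mirror the component and gap lines of an AGP. Because a terminal gap has the largest assembled end position, taking the maximum end position covers both cases in the definition.

```python
from typing import Iterable, NamedTuple


class AgpRow(NamedTuple):
    """Hypothetical, simplified stand-in for one AGP line."""
    chrom: str
    start: int    # assembled start, 1-based
    end: int      # assembled end, 1-based and inclusive
    is_gap: bool  # True for gap lines, False for sequence components


def chromosome_length(rows: Iterable[AgpRow], chrom: str) -> int:
    """Return the largest assembled end position on the given chromosome.

    Gap rows are included on purpose, so a terminal gap extends the length.
    """
    ends = [row.end for row in rows if row.chrom == chrom]
    if not ends:
        raise ValueError(f"no rows found for chromosome {chrom!r}")
    return max(ends)


# Toy data: one component followed by a terminal gap.
rows = [
    AgpRow("chr_demo", 1, 1000, False),
    AgpRow("chr_demo", 1001, 1500, True),
]

print(chromosome_length(rows, "chr_demo"))  # prints 1500
```

In this toy case the last component ends at 1000, but the reported length is 1500 because the chromosome ends in a gap.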
Base Pairs (whole assembly)
The total number of base pairs for the entire assembly is the summed length of all sequences in the dna table of the core database. This includes redundant regions, such as haplotypic sequences and the pseudo-autosomal region (PAR) of the Y chromosome in human, as well as gaps in Drosophila melanogaster. See the assembly details of each species for more information.
Golden Path
The "golden path" is the length of the reference assembly. It is the sum of the lengths of all top-level sequences in the seq_region table, omitting any redundant regions such as haplotypes and PARs.
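To make the difference between the whole-assembly figure and the golden path concrete, here is a small sketch on toy data. Everything in it is an assumption for illustration: the `SeqRegion` record, its `top_level` and `redundant` flags, the function names and the sequences are hypothetical, and the real core database does not necessarily mark redundancy with a single flag (a PAR, for instance, is part of a chromosome rather than a whole seq_region).

```python
from typing import Dict, Iterable, NamedTuple


class SeqRegion(NamedTuple):
    """Hypothetical, simplified view of a row in the seq_region table."""
    name: str
    length: int
    top_level: bool  # True for top-level sequences such as chromosomes
    redundant: bool  # assumed flag for haplotypes and similar regions


def whole_assembly_bp(dna: Dict[str, str]) -> int:
    """Sum the lengths of every stored sequence, redundant ones included."""
    return sum(len(sequence) for sequence in dna.values())


def golden_path_bp(regions: Iterable[SeqRegion]) -> int:
    """Sum the lengths of top-level regions that are not redundant."""
    return sum(r.length for r in regions if r.top_level and not r.redundant)


# Toy stand-in for the dna table: name -> sequence.
dna = {
    "contig_a": "ACGT" * 100,   # 400 bp
    "contig_b": "ACGT" * 50,    # 200 bp
    "hap_contig": "ACGT" * 25,  # 100 bp of haplotype-specific sequence
}

# Toy stand-in for the seq_region table.
regions = [
    SeqRegion("chr_demo", 650, True, False),
    SeqRegion("chr_demo_HAP", 650, True, True),  # full-length haplotype copy
    SeqRegion("contig_a", 400, False, False),    # not top-level
]

print(whole_assembly_bp(dna))   # prints 700
print(golden_path_bp(regions))  # prints 650
```

The two totals differ here for the reasons given in the definitions: the whole-assembly figure counts the haplotype-specific sequence, while the golden path counts only the non-redundant top-level region.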

For information on gene counts, see the MapView help page.


 

© 2024 Inserm. Hosted by genouest.org. This product includes software developed by Ensembl.

                
GermOnline based on Ensembl release 50 - Jul 2008