This site presents the whole genome shotgun assembly of a female Sumatran orangutan named Susie, housed at the Gladys Porter Zoo (Brownsville, TX). The primary donor-derived reads were assembled with PCAP (Huang, 2006) under stringent parameters. Aligning the orangutan genome against the human genome made it possible to identify interchromosomal cross-overs and thus eliminate global mis-assemblies larger than 50kb.
Of the 3.09Gb of total sequence, 3.08Gb are ordered and oriented along the chromosomes. Gap sizes between supercontigs were estimated based on their size in human, with a maximum allowed gap size of 30kb.
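The gap rule described above can be illustrated with a minimal sketch. It assumes the human-derived estimate is simply the distance between the human coordinates of the two flanking supercontigs. The function name and the handling of negative distances are hypothetical and not taken from the assembly pipeline; only the 30kb cap comes from the text.

```python
MAX_GAP_BP = 30_000  # maximum allowed gap size stated in the text (30kb)


def estimate_gap(prev_end_in_human: int, next_start_in_human: int,
                 max_gap_bp: int = MAX_GAP_BP) -> int:
    """Estimate the gap between two adjacent supercontigs.

    The estimate is the distance between the supercontigs' aligned
    positions in human, capped at max_gap_bp.
    """
    distance = next_start_in_human - prev_end_in_human
    if distance < 0:
        # Assumption for this sketch: supercontigs that overlap in human
        # coordinates get a zero-length gap.
        return 0
    return min(distance, max_gap_bp)


print(estimate_gap(1_000_000, 1_012_500))  # below the cap, kept as is
print(estimate_gap(1_000_000, 1_095_000))  # above the cap, limited to 30kb
```

Capping the estimate keeps a single large human-specific insertion or an alignment artefact from inflating the orangutan chromosome coordinates.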
The Orangutan genome has been released in pre-publication status by the Genome Sequencing Center at Washington University, St Louis. The data are freely available for anyone to use, but the center has requested that users respect the scientific ethics governing publication on pre-publication data. These are outlined in detail in the Fort Lauderdale agreement. In brief, small-scale analysis, e.g. the analysis of a single locus, is an expected use of the data and can be published without any expectation of coordination. In contrast, large-scale, genome-wide analysis is expected to be either coordinated with the Orangutan Analysis group in some manner or published after the initial paper. The reasoning behind this policy and further details are given in the Fort Lauderdale document.
Due to the high sequence similarity to the human genome, the Orangutan genebuild was based on a projection of human gene structures (Ensembl Human build 36i). The projections were made through chained whole genome BLASTZ alignments. These projected genes were combined with orangutan-specific proteins, and additional human genes were added using Exonerate where the projection was unable to produce satisfactory gene models. UTRs were added using orangutan-specific ESTs and cDNAs as well as human cDNAs.
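The idea of projecting gene structures through an alignment can be sketched as a coordinate mapping. This is a simplified illustration and not the Ensembl projection code. It assumes ungapped, same-strand alignment blocks and half-open coordinates, and the names `Block`, `project_interval` and `project_gene` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One ungapped, same-strand alignment block (assumption of this sketch)."""
    human_start: int  # 0-based start in human
    orang_start: int  # 0-based start in orangutan
    length: int       # block length in bases


def project_interval(start: int, end: int,
                     blocks: List[Block]) -> Optional[Tuple[int, int]]:
    """Map a half-open human interval [start, end) onto orangutan.

    Returns None unless the interval lies entirely inside one block.
    """
    for block in blocks:
        block_end = block.human_start + block.length
        if block.human_start <= start and end <= block_end:
            offset = block.orang_start - block.human_start
            return (start + offset, end + offset)
    return None


def project_gene(exons: List[Tuple[int, int]],
                 blocks: List[Block]) -> Optional[List[Tuple[int, int]]]:
    """Project every exon; give up on the gene if any exon fails to map."""
    projected = [project_interval(s, e, blocks) for s, e in exons]
    if any(p is None for p in projected):
        return None  # a real pipeline would try another method here
    return projected


blocks = [Block(100, 1000, 50), Block(200, 1100, 80)]
print(project_gene([(110, 140), (210, 260)], blocks))
print(project_gene([(140, 160)], blocks))  # exon runs past the block end
```

A `None` result corresponds to the case in the text where projection cannot produce a satisfactory model and the gene is instead added by direct alignment.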
The multiple alignments are being extended with new species and 2X genomes.
The Blastz-net alignments have been updated for the following species.
Assembly: PPYG2, Sep 2007
Genebuild: Ensembl, Oct 2007
Database version: 50.1a
Known protein-coding genes: 3,789
Projected protein-coding genes: 13,899
Novel protein-coding genes: 2,256
Pseudogenes: 1,008
RNA genes: 4,686
Genscan gene predictions: 53,999
Gene exons: 207,621
Gene transcripts: 24,431
SNPs: 1,384,342
Base Pairs: 3,109,347,532
Golden Path Length: 3,446,771,396
© 2025 Inserm. Hosted by genouest.org. This product includes software developed by Ensembl.