These tables reflect the Ensembl gene model, in which a gene contains one or more transcripts, each of which contains one or more exons. A transcript may have a translation (protein-coding genes) or none (pseudogenes and RNA genes). All of these objects can have a stable_id, which Ensembl tries to map between different genebuilds. Stable IDs are stored in separate tables because they are not known during the gene prediction process and are loaded later (inserting new rows is easier than issuing update statements).
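The containment described above can be sketched in a few lines of Python. This is only an illustration of the model, not the Ensembl API: the class names, fields and the is_protein_coding helper are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Exon:
    start: int
    end: int
    # Stable IDs are assigned after gene prediction, so they may be missing.
    stable_id: Optional[str] = None


@dataclass
class Translation:
    stable_id: Optional[str] = None


@dataclass
class Transcript:
    exons: List[Exon]
    # None when the transcript has no translation (pseudogene or RNA gene).
    translation: Optional[Translation] = None
    stable_id: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.exons:
            raise ValueError("a transcript needs at least one exon")


@dataclass
class Gene:
    transcripts: List[Transcript]
    stable_id: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.transcripts:
            raise ValueError("a gene needs at least one transcript")

    def is_protein_coding(self) -> bool:
        # Hypothetical helper: true if any transcript carries a translation.
        return any(t.translation is not None for t in self.transcripts)


# Two transcripts of one gene sharing their first exon.
shared = Exon(start=100, end=200)
coding = Transcript(exons=[shared, Exon(start=300, end=400)],
                    translation=Translation())
noncoding = Transcript(exons=[shared])
gene = Gene(transcripts=[coding, noncoding])
```

Keeping stable_id optional on every object mirrors the point that the IDs arrive after the objects themselves have been created.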
Some table columns need further explanation:
gene.type currently contains either ensembl or pseudogene and is in need of standardisation (as is the analysis link).
gene.display_xref_id links to the xref entry that gives this gene an external name. This also defines the gene as known.
transcript.display_xref_id serves the same purpose as gene.display_xref_id, but for the transcript.
The exon_transcript table builds a many-to-many relationship between transcripts and exons. The same exon can be shared by many transcripts (in the same gene). The rank gives the position of the exon within the transcript; it starts at 1 and counts from the 5' end of the transcript.
exon.phase (values 0, 1, 2 or -1) is the phase of the exon at its start. If the first base in the exon is not coding, its phase is -1.
exon.end_phase (values 0, 1, 2 or -1) is the phase at the end of the exon, using the same values as exon.phase.
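As a rough illustration of the exon_transcript ranks and the two phase columns, the sketch below builds a cut-down copy of the tables in an in-memory SQLite database. The column set is simplified (seq_start and seq_end are assumed names, and the real tables hold more columns), the rows are made-up example data, and coding_end_phase is a hypothetical helper that assumes the exon is coding along its whole length.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed columns; not the full Ensembl schema.
    CREATE TABLE exon (
        exon_id   INTEGER PRIMARY KEY,
        seq_start INTEGER NOT NULL,
        seq_end   INTEGER NOT NULL,
        phase     INTEGER NOT NULL,
        end_phase INTEGER NOT NULL
    );
    CREATE TABLE exon_transcript (
        exon_id       INTEGER NOT NULL,
        transcript_id INTEGER NOT NULL,
        rank          INTEGER NOT NULL,
        PRIMARY KEY (transcript_id, rank)
    );
""")

# Made-up rows: exon 10 starts in non-coding sequence, so its phase is -1.
conn.executemany(
    "INSERT INTO exon VALUES (?, ?, ?, ?, ?)",
    [(10, 100, 189, -1, 0), (11, 300, 400, 0, 2), (12, 500, 600, 2, 1)],
)
# Transcript 1 uses exons 10, 11, 12; transcript 2 shares exons 10 and 11.
conn.executemany(
    "INSERT INTO exon_transcript VALUES (?, ?, ?)",
    [(10, 1, 1), (11, 1, 2), (12, 1, 3), (10, 2, 1), (11, 2, 2)],
)


def exons_in_rank_order(conn, transcript_id):
    """Return (rank, exon_id, phase, end_phase) rows from the 5' end onwards."""
    rows = conn.execute(
        """
        SELECT et.rank, e.exon_id, e.phase, e.end_phase
        FROM exon_transcript AS et
        JOIN exon AS e ON e.exon_id = et.exon_id
        WHERE et.transcript_id = ?
        ORDER BY et.rank
        """,
        (transcript_id,),
    )
    return rows.fetchall()


def coding_end_phase(phase, length):
    """End phase of an exon assumed to be coding over its whole length."""
    if phase not in (0, 1, 2):
        raise ValueError("helper only covers fully coding exons")
    return (phase + length) % 3


for row in exons_in_rank_order(conn, 1):
    print(row)
```

Ordering by rank returns the exons in transcript order regardless of their exon_id values, and the same exon row can appear under several transcripts without being duplicated. The end-phase arithmetic applies only under the stated assumption; an exon that begins or ends in non-coding sequence needs separate handling.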
© 2024 Inserm. Hosted by genouest.org. This product includes software developed by Ensembl.