The Ensembl FTP site provides biological sequence databases suitable for large-scale local sequence similarity search approaches, as well as MySQL table dumps of all underlying Ensembl databases. These table dumps are suitable for import into relational database management systems and allow installation of complete Ensembl mirror sites.
Please note: Ensembl supports downloading of many correlation tables via the highly customisable BioMart data mining tool. You may find exploring this web-based data mining tool easier than extracting information from our normalised database dumps.
The URL ftp://ftp.ensembl.org/pub/ is the root of the directory structure outlined below. The structure is also described in the FTP site README. The latest data sets are available via directories prefixed 'current_'. For example, 'current_embl' always points to the latest data release files in EMBL format.
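The 'current_' naming rule can be sketched in a few lines of Python. The helper name `current_url` is hypothetical, and the trailing slash on the result is an assumption; only the base URL and the 'current_embl' example come from the text above.

```python
BASE_URL = "ftp://ftp.ensembl.org/pub/"  # base URL given in the text above


def current_url(data_format: str) -> str:
    """Build the URL of the 'current_' directory for one data format.

    Hypothetical helper: it only applies the naming rule described above
    and does not check that the directory exists on the server.
    """
    name = data_format.strip().lower()
    if not name or "/" in name:
        raise ValueError(f"not a plain directory name: {data_format!r}")
    return f"{BASE_URL}current_{name}/"


print(current_url("embl"))  # ftp://ftp.ensembl.org/pub/current_embl/
```

Because the helper only concatenates strings, it cannot tell whether a given 'current_' directory is actually published; checking that requires a request to the server.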
The FTP directory has the following basic structure, although not all information is necessarily available for every species.
|-- embl                Gene predictions annotated on genomic DNA slices of 1 Mb in EMBL format.
|   |
|   |-- species
|
|-- emf                 Alignment dumps in EMF format
|   |
|   |-- pecan               * Pecan whole genome multiple alignments
|   |                         with conservation scores for selected sets
|   |-- ensembl_compara     * protein trees and protein multiple alignments
|   |                         underlying orthologue/paralogue predictions
|   |-- species_variation   * resequencing data
|
|-- fasta               Gene predictions in FASTA database format
|   |
|   |-- species
|       |
|       |-- cdna    * Transcript (cDNA) predictions
|       |-- dna     * Genomic DNA in assembled entities
|       |-- pep     * Translation (peptide) predictions
|       |-- rna     * Non-coding RNA predictions
|
|-- genbank             Gene predictions annotated on genomic DNA slices of 1 Mb in GenBank format.
|   |
|   |-- species
|
|-- gtf                 Gene annotation in GTF format
|
|-- mysql               MySQL database table text dumps
    |
    |-- core                    General genome annotation information
    |                           * Genome sequence assembly
    |                           * Ensembl gene predictions
    |                           * Ab initio gene predictions
    |                           * Marker information
    |                           * ...
    |
    |-- otherfeatures           Additional genome annotation
    |                           * Gene predictions based on EST information
    |                           * ...
    |
    |-- variation               Genetic variation information
    |-- vega                    Manually curated gene sets
    |-- cdna                    cDNA to genome alignments based on the latest EMBL database
    |
    |-- ensembl_compara         Cross-species comparative genomics data:
    |                           * Orthologue/paralogue predictions
    |                           * Protein families
    |                           * Whole genome alignments
    |                           * Synteny information
    |
    |-- ensembl_go              Gene Ontology database
    |
    |-- ensembl_web_user_db     SQL table definition for server-side user config database
    |
    |-- ensembl_website         Ensembl web site database:
    |                           * Context-sensitive help articles
    |                           * News articles
    |                           * Mini-ads
    |
    |-- ensembl_mart            Cross-species data mining tables
    |
    |-- genomic_features_mart   Clone data sets
    |
    |-- ontology_mart
    |
    |-- sequence_mart           Genome sequences
    |
    |-- snp_mart                Genetic variation information
    |
    |-- vega_mart               Manually curated gene sets
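As a rough illustration of navigating this layout, the sketch below builds the path to one FASTA subdirectory and lists its contents with Python's standard `ftplib`. Several things here are assumptions rather than facts from this page: the function names `fasta_path` and `list_fasta_files` are hypothetical, a 'current_fasta' directory is assumed to exist by analogy with 'current_embl', the species directory name `homo_sapiens` is only an example, and the server is assumed to accept anonymous FTP logins.

```python
from ftplib import FTP

# The four FASTA subdirectories named in the directory structure.
FASTA_KINDS = {"cdna", "dna", "pep", "rna"}


def fasta_path(species: str, kind: str) -> str:
    """Return the assumed server path for one species and FASTA kind."""
    if kind not in FASTA_KINDS:
        raise ValueError(f"unknown FASTA kind: {kind!r}")
    # Assumption: 'current_fasta' follows the same rule as 'current_embl'.
    return f"/pub/current_fasta/{species}/{kind}/"


def list_fasta_files(species: str, kind: str, host: str = "ftp.ensembl.org") -> list:
    """List entry names in one FASTA subdirectory (anonymous login assumed)."""
    with FTP(host, timeout=30) as ftp:
        ftp.login()  # no arguments means an anonymous login
        return sorted(ftp.nlst(fasta_path(species, kind)))


# Example call (needs network access; the species name is an assumption):
#     for name in list_fasta_files("homo_sapiens", "pep"):
#         print(name)
```

Keeping path construction separate from the network call means the layout logic can be checked without contacting the server.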
© 2025 Inserm. Hosted by genouest.org. This product includes software developed by Ensembl.