AlphaFold Protein Structure Database
The AlphaFold Protein Structure Database, developed by DeepMind and EMBL-EBI, is now available online.
AlphaFold DB provides open access to protein structure predictions for the human proteome and 20 other key organisms to accelerate scientific research.
AlphaFold DB currently provides predicted structures for the organisms listed below, which include human, laboratory species, and key pathogens. Predictions for all species can be downloaded from the EBI FTP site: ftp://ftp.ebi.ac.uk/pub/databases/alphafold.
Species | Common Name | Reference Proteome | Predicted Structures | Download |
---|---|---|---|---|
Arabidopsis thaliana | Arabidopsis | UP000006548 | 27,434 | Download (3642 MB) |
Caenorhabditis elegans | Nematode worm | UP000001940 | 19,694 | Download (2601 MB) |
Candida albicans | C. albicans | UP000000559 | 5,974 | Download (965 MB) |
Danio rerio | Zebrafish | UP000000437 | 24,664 | Download (4141 MB) |
Dictyostelium discoideum | Dictyostelium | UP000002195 | 12,622 | Download (2150 MB) |
Drosophila melanogaster | Fruit fly | UP000000803 | 13,458 | Download (2174 MB) |
Escherichia coli | E. coli | UP000000625 | 4,363 | Download (448 MB) |
Glycine max | Soybean | UP000008827 | 55,799 | Download (7142 MB) |
Homo sapiens | Human | UP000005640 | 23,391 | Download (4784 MB) |
Leishmania infantum | L. infantum | UP000008153 | 7,924 | Download (1481 MB) |
Methanocaldococcus jannaschii | M. jannaschii | UP000000805 | 1,773 | Download (171 MB) |
Mus musculus | Mouse | UP000000589 | 21,615 | Download (3547 MB) |
Mycobacterium tuberculosis | M. tuberculosis | UP000001584 | 3,988 | Download (421 MB) |
Oryza sativa | Asian rice | UP000059680 | 43,649 | Download (4416 MB) |
Plasmodium falciparum | P. falciparum | UP000001450 | 5,187 | Download (1132 MB) |
Rattus norvegicus | Rat | UP000002494 | 21,272 | Download (3404 MB) |
Saccharomyces cerevisiae | Budding yeast | UP000002311 | 6,040 | Download (960 MB) |
Schizosaccharomyces pombe | Fission yeast | UP000002485 | 5,128 | Download (776 MB) |
Staphylococcus aureus | S. aureus | UP000008816 | 2,888 | Download (268 MB) |
Trypanosoma cruzi | T. cruzi | UP000002296 | 19,036 | Download (2905 MB) |
Zea mays | Maize | UP000007305 | 39,299 | Download (5014 MB) |
The search bar at the top of the query page accepts queries based on protein name, gene name, UniProt identifier, or organism name. At present you can't search with a sequence to find similar proteins; you would first need to run a BLAST search and use the resulting hits as queries.
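As a rough sketch of that BLAST-first workflow, the snippet below pulls UniProt accessions out of BLAST tabular output (the standard 12-column `-outfmt 6` layout) so they can be pasted into the AlphaFold DB search bar. The function name and the E-value cutoff are my own assumptions for illustration, and it assumes the subject IDs use the UniProt `db|accession|entry_name` style.

```python
def accessions_from_blast_tabular(text, max_evalue=1e-5):
    """Return unique UniProt accessions from BLAST tabular hits, in hit order.

    Assumes the default 12-column layout of `-outfmt 6`:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    The max_evalue cutoff is an illustrative choice, not a recommendation.
    """
    accessions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a standard tabular hit line
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue > max_evalue:
            continue  # hit too weak to be worth querying
        # UniProt-style IDs look like "sp|ACCESSION|ENTRY_NAME"; otherwise keep the ID as-is
        parts = subject_id.split("|")
        accession = parts[1] if len(parts) >= 3 else subject_id
        if accession not in accessions:
            accessions.append(accession)
    return accessions
```

Each accession returned can then be used as a UniProt identifier query in the search bar.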
Here I searched for Plasmodium falciparum carbonic anhydrase (Q8IHW5), a potential malaria target. As you can see, there is no crystal structure in the PDB. Whilst the active site is predicted with high confidence, there are clearly regions for which the confidence is very low.
You can then download the structure in PDB or mmCIF format.
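Once you have the PDB file, the per-residue confidence (pLDDT) can be read back out, since AlphaFold models store it in the B-factor column. Here is a minimal sketch that reads the value from each C-alpha atom and lists the residues in the very low confidence band. The function names are my own, and the band boundaries follow the colour key shown on the AlphaFold DB pages (above 90, 70 to 90, 50 to 70, below 50).

```python
def plddt_per_residue(pdb_text):
    """Map residue number -> pLDDT, read from the B-factor field of CA atoms.

    Uses the fixed-column PDB layout: atom name in columns 13-16,
    residue number in 23-26, B-factor in 61-66.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Name the AlphaFold DB confidence band for a pLDDT score."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def very_low_confidence_residues(pdb_text):
    """Residue numbers falling in the 'very low' band, in sequence order."""
    scores = plddt_per_residue(pdb_text)
    return [n for n in sorted(scores) if confidence_band(scores[n]) == "very low"]
```

Reading the downloaded file with `open(path).read()` and passing the text to `very_low_confidence_residues` gives a quick list of the regions to treat with caution.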
I made a homology model (in purple below) of this protein a while back; it has little sequence similarity to any protein in the PDB. Although the prediction does not include a zinc, the AlphaFold predicted structure places histidines in positions that could coordinate a zinc ion. I'd be interested to find out whether it is possible to include the zinc in the structure prediction.
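One rough way to look for such histidine arrangements in a downloaded model is to check which histidine side-chain nitrogens (ND1/NE2) sit close to each other. The sketch below is only an illustration: the function names, the 5 Å cutoff, and the choice of looking for groups of three are my own assumptions rather than a validated criterion for metal binding, and it ignores chain IDs since AlphaFold DB models contain a single chain.

```python
import math
from itertools import combinations


def histidine_nitrogens(pdb_text):
    """Map residue number -> coordinates of that histidine's ND1/NE2 atoms."""
    atoms = {}
    for line in pdb_text.splitlines():
        if (line.startswith("ATOM") and line[17:20] == "HIS"
                and line[12:16].strip() in ("ND1", "NE2")):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.setdefault(int(line[22:26]), []).append(xyz)
    return atoms


def closest_nitrogen_distance(coords_a, coords_b):
    """Shortest N-N distance (in angstroms) between two histidines."""
    return min(math.dist(p, q) for p in coords_a for q in coords_b)


def candidate_his_clusters(pdb_text, cutoff=5.0, size=3):
    """Groups of `size` histidines whose nitrogens are all pairwise within `cutoff`.

    Both the cutoff and the group size are illustrative assumptions.
    """
    atoms = histidine_nitrogens(pdb_text)
    return [group for group in combinations(sorted(atoms), size)
            if all(closest_nitrogen_distance(atoms[i], atoms[j]) <= cutoff
                   for i, j in combinations(group, 2))]
```

Any group it returns is just a candidate to inspect visually, not evidence of a metal site.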
Overall I'd say this is a very useful starting point.