Downloads

Matt Oates is the corresponding author for the database's website and dump. For questions about a specific predictor, please contact the corresponding author listed on our about predictors page. To download specific genome content or sequences of interest, use the Browse feature to find your sequences, then click the download button at the top of the results table.

All of the dump files below can be loaded into your own MySQL database using this schema.

Web/database code available from GitHub:

Click here for the full SQL database dump

Click here for information on web API access

predictor_compared.tsv.gz All-vs-all pairwise comparison of the predictors for each protein.

consensus_ranges.tsv.gz All of the disordered regions per protein where a percentage consensus across all predictors has been calculated.

coverage_conflict.tsv.gz Per protein coverage for each disorder predictor including its coverage that is contained within SCOP domains.

consensus_conflict_number.tsv.gz Per protein locus: (a) how many disorder predictors suggest disorder, and (b) how many are predicting in a region that has been assigned as containing a SCOP domain by SUPERFAMILY.

Genome Info .tab Get all of the genome meta-data as a tab-delimited file, loadable by MySQL or a similar database engine.

genomes.fa.bz2 Get the reference genome sequences in FASTA format. The internal protein ID is used as the sequence ID. To map sequences to their genomes you will need the genomes.protein file.

genomes.protein.gz Get the genome-to-protein ID mapping for the reference genome FASTA file. This is in a tab-delimited format for easy database loading.
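As a rough illustration of how the FASTA file and its mapping file might be combined, here is a minimal Python sketch. It assumes the mapping file has the genome in the first tab-delimited column and the internal protein ID in the second; that column order is an assumption, not something stated on this page, so check it against the real file. The function names are hypothetical.

```python
import bz2
import csv
import gzip


def read_fasta(path):
    """Yield (sequence_id, sequence) pairs from a bz2-compressed FASTA file."""
    seq_id, chunks = None, []
    with bz2.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if seq_id is not None:
                    yield seq_id, "".join(chunks)
                # Keep only the first token of the header as the sequence ID.
                header = line[1:].split()
                seq_id, chunks = (header[0] if header else ""), []
            elif line:
                chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def load_protein_to_genomes(path, genome_col=0, protein_col=1):
    """Map each protein ID to the list of genomes it appears in.

    ASSUMPTION: genome_col and protein_col are guesses at the column
    order of the tab-delimited mapping file; adjust them as needed.
    """
    mapping = {}
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) > max(genome_col, protein_col):
                mapping.setdefault(row[protein_col], []).append(row[genome_col])
    return mapping
```

With both files downloaded, something like `genomes = load_protein_to_genomes("genomes.protein.gz")` followed by a loop over `read_fasta("genomes.fa.bz2")` would let you look up the genomes for each sequence ID. Both readers stream the compressed files, so nothing needs to be decompressed on disk first.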

strains.fa.bz2 Get the genome sequences of all species strains in FASTA format. The internal protein ID is used as the sequence ID. To map sequences to their genomes you will need the strains.protein file.

strains.protein.gz Get the genome-to-protein ID mapping for the species strains FASTA file. This is in a tab-delimited format for easy database loading.

all.disrange.gz Get all disordered regions predicted by every predictor for all protein sequences. Both .protein files from above will be required to properly map these results to sequences.
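The sketch below shows one way a .disrange file could be grouped by protein and predictor in Python. The column layout is not documented on this page, so the order used here (protein ID, predictor, start, end, tab-delimited) is purely an assumption for illustration; inspect the file and adjust `ASSUMED_COLUMNS` before relying on it. All names in the code are hypothetical.

```python
import gzip
from collections import defaultdict

# ASSUMPTION: hypothetical column order; verify against the real file.
ASSUMED_COLUMNS = ("protein", "predictor", "start", "end")


def iter_ranges(path, columns=ASSUMED_COLUMNS):
    """Yield one dict per disordered range from a gzipped tab-delimited file."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < len(columns):
                continue  # skip short or blank lines
            row = dict(zip(columns, fields))
            try:
                row["start"], row["end"] = int(row["start"]), int(row["end"])
            except ValueError:
                continue  # skip a header row or other non-numeric coordinates
            yield row


def ranges_by_protein(path, columns=ASSUMED_COLUMNS):
    """Group (start, end) ranges under each (protein, predictor) pair."""
    grouped = defaultdict(list)
    for row in iter_ranges(path, columns):
        grouped[(row["protein"], row["predictor"])].append((row["start"], row["end"]))
    return dict(grouped)
```

Because `iter_ranges` is a generator, the full file is never held in memory unless you call `ranges_by_protein`; for the complete all.disrange.gz it may be better to filter rows as they stream past.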

vlxt.disrange.gz Just the VL-XT predictions.

vsl2b.disrange.gz Just the VSL2b predictions.

prdos.disrange.gz Just the PrDOS predictions.

pv2.disrange.gz Just the PV2 predictions.

iupred-s.disrange.gz Just the IUPred-S predictions.

iupred-l.disrange.gz Just the IUPred-L predictions.

espritz-x.disrange.gz Just the Espritz-X predictions.

espritz-n.disrange.gz Just the Espritz-N predictions.

espritz-d.disrange.gz Just the Espritz-D predictions.

prdos.score.bz2 Per amino acid disorder scores for all proteins from PrDOS.

iupred-s.score.bz2 Per amino acid disorder scores for all proteins from IUPred-S.

iupred-l.score.bz2 Per amino acid disorder scores for all proteins from IUPred-L.

espritz-x.score.bz2 Per amino acid disorder scores for all proteins from Espritz-X.

espritz-n.score.bz2 Per amino acid disorder scores for all proteins from Espritz-N.

espritz-d.score.bz2 Per amino acid disorder scores for all proteins from Espritz-D.

anchor.disrange.gz ANCHOR results as a part of the IUPred prediction. This file contains the amino acid ranges per protein that are predicted to spontaneously bind with the surface of other proteins, via linear motif interaction.

anchor.score.gz Per residue scores from ANCHOR for every protein in the database collection.
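Since the files above use both .gz and .bz2 compression and their exact column layouts are not described here, it can help to look at the first few lines of a download before writing a parser. The following Python sketch does that by choosing a decompressor from the file extension; the helper names are hypothetical.

```python
import bz2
import gzip
from itertools import islice


def open_compressed(path):
    """Open a .gz, .bz2, or plain file in text mode based on its extension."""
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def peek(path, n=5):
    """Return the first n lines of the file without reading the rest."""
    with open_compressed(path) as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]
```

For example, `peek("anchor.score.gz")` or `peek("prdos.score.bz2")` would return the opening lines of a downloaded file, which is enough to see how the fields are delimited.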

