Downloading Sequence Databases
Protein and DNA sequence library files can be downloaded from many different sources, including the NCBI and EMBL-EBI.
Database formats
The FASTA programs read many different library formats directly,
so you do not need to run file conversion or formatting programs
before searching a sequence library. By default, however, the
programs assume that a library is in FASTA format; to search a
library in another format, specify the format number after the
file name, e.g.
fasta34 -q mgstm1.aa "/slib/ncbi/refseq_protein 12"
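would search the NCBI refseq_protein library, which is in
NCBI BLAST formatdb format (format 12). The quotes keep the library
name and the format number together as a single argument.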
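The quoting in the example above is easy to get wrong in scripts, so a minimal shell sketch may help. The helper name `lib_spec` is hypothetical (it is not part of the FASTA package), and the script only prints the command rather than running it, so it makes no assumption about where fasta34 or the library is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of FASTA): join a library path and an
# optional format number into the single "path format" argument.
lib_spec() {
    # $1 = library path, $2 = format number (omit for plain FASTA format)
    if [ -n "$2" ]; then
        printf '%s %s' "$1" "$2"
    else
        printf '%s' "$1"
    fi
}

query="mgstm1.aa"
library=$(lib_spec /slib/ncbi/refseq_protein 12)

# Print the command instead of running it. The double quotes keep the
# path and the format number together as one argument.
printf 'fasta34 -q %s "%s"\n' "$query" "$library"
```

When actually running the search from a script, pass `"$library"` in double quotes so the shell does not split the path and the format number into two arguments.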
Commonly used library formats include:
Format | Description |
0 (or none) | FASTA format |
1 | GenBank flatfile |
3 | EMBL-EBI/Swiss-Prot flatfile |
5 | GCG/PIR flatfile |
6 | GCG compressed binary |
12 | NCBI BLAST formatdb version 2 (current version) |
16 | MySQL SQL query |
17 | PostgreSQL SQL query |
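For scripts that accept a format number from a user, the table can be mirrored in a small lookup. This is only an illustrative sketch: the function name `format_name` is hypothetical, and it covers only the formats listed here, not every format the FASTA programs may support.

```shell
#!/bin/sh
# Hypothetical lookup that mirrors the format table; not part of FASTA.
# An empty or missing argument is treated as format 0 (plain FASTA).
format_name() {
    case "${1:-0}" in
        0)  echo "FASTA format" ;;
        1)  echo "GenBank flatfile" ;;
        3)  echo "EMBL-EBI/Swiss-Prot flatfile" ;;
        5)  echo "GCG/PIR flatfile" ;;
        6)  echo "GCG compressed binary" ;;
        12) echo "NCBI BLAST formatdb version 2 (current version)" ;;
        16) echo "MySQL SQL query" ;;
        17) echo "PostgreSQL SQL query" ;;
        *)  echo "format $1 is not in the table" >&2; return 1 ;;
    esac
}

# Describe the format used in the refseq_protein example.
format_name 12
```

A wrapper script could call `format_name` before building the library argument and stop early when the number is not one it recognizes.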