It turns out that automatically finding NM_ accessions for NP_ accessions is much more difficult than it should be, so I have put up a web script to do the conversion.
Try the URL:

    https://fastademo.bioch.virginia.edu/fasta_www2/NP_to_NM.cgi?acc=NP_000552,P09488

and you should get the result:

    NP_000552 NP_000552 NM_000561
    P09488 NP_000552 NM_000561

where the first field is the accession you provided, and the second and third are the RefSeq NP_ and RefSeq NM_ accessions that match the sequence exactly. Here you can see it done on a larger scale for a file that includes several NP_ accessions:

    for n in `cat gst_np.acc`; do
        curl "https://fastademo.bioch.virginia.edu/fasta_www2/NP_to_NM.cgi?acc=$n"
    done

produces:

    NP_666533 NP_666533 NM_146421
    NP_000552 NP_000552 NM_000561
    NP_001135840 NP_001135840 NM_001142368
    NP_000839 NP_000839 NM_000848
    NP_000840 NP_000840 NM_000849
    NP_000841 NP_000841 NM_000850
    NP_000842 NP_000842 NM_000851
    XP_005270842 XP_005270842 XM_005270785
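The first example URL passes two accessions separated by a comma, so the per-accession loop could probably be collapsed into a single request. The sketch below builds such a URL from an accession file. The function name `np_to_nm_url` is hypothetical, the file is assumed to hold one accession per line, and whether the CGI accepts an arbitrarily long comma-separated list is an assumption that has not been verified here.

```shell
# Hypothetical helper: build one NP_to_NM.cgi URL from a file of accessions.
# Assumes one accession per line; blank lines are skipped.
np_to_nm_url() {
    local acc_list
    # Join the non-blank lines with commas, e.g. NP_000552,P09488
    acc_list=$(grep -v '^[[:space:]]*$' "$1" | paste -sd, -)
    printf 'https://fastademo.bioch.virginia.edu/fasta_www2/NP_to_NM.cgi?acc=%s\n' "$acc_list"
}

# Usage (makes a network request):
#   curl "$(np_to_nm_url gst_np.acc)"
```

Quoting the URL in the `curl` call keeps the shell from treating the `?` as a filename wildcard.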
The script should always work with RefSeq NP_'s and XP_'s. It will sometimes work with SwissProt accessions (P09488), but this is not reliable. For more reliable mapping of UniProt accessions to RefSeq NP_'s, you need to use the UniProt mapping service.
This script uses a database of proteins that I download from the NCBI, which may not be completely up to date, so some of the proteins you try to map may not be found. The mappings it does return should be correct.
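Because some accessions may not be found, it can be useful to check which queries came back without a mapping. The sketch below compares a file of query accessions against saved output. The function name `unmapped_accs` is hypothetical, and it assumes that the output has one mapping per line with whitespace-separated fields and that an accession that is not found either is absent from the output or has fewer than three fields. How the script actually reports a miss is not documented here.

```shell
# Hypothetical helper: list query accessions that have no complete mapping line.
#   $1 = file of query accessions (one per line)
#   $2 = file of saved output (query, RefSeq protein, RefSeq mRNA per line)
unmapped_accs() {
    awk '
        # First file on the command line: the saved mapping output.
        FILENAME == ARGV[1] { if (NF >= 3) found[$1] = 1; next }
        # Second file: the queries. Print those never seen as a first field.
        NF > 0 && !($1 in found) { print $1 }
    ' "$2" "$1"
}

# Usage:
#   unmapped_accs gst_np.acc results.txt
```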
A command-line version of the script is also available at ~wrp/biol4230/proj1/NP_to_NM.py:

    $ ~wrp/biol4230/proj1/NP_to_NM.py NP_000552 P09488
    NP_000552 NP_000552 NM_000561
    P09488 NP_000552 NM_000561

or

    for n in `cat gst_np.acc`; do
        ~wrp/biol4230/proj1/NP_to_NM.py $n
    done
It has the same "issues" as the web site, since it uses the same database.
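Since the third field of each output line is the matching NM_ (or XM_) accession, the output of either version can be reduced to a unique list of transcript accessions. This is a minimal sketch, assuming one mapping per line with whitespace-separated fields; the function name `nm_accessions` is hypothetical.

```shell
# Hypothetical helper: read mapping lines on stdin and print the unique
# third-field (NM_/XM_) accessions, sorted. Incomplete lines are skipped.
nm_accessions() {
    awk 'NF >= 3 { print $3 }' | sort -u
}

# Usage:
#   ~wrp/biol4230/proj1/NP_to_NM.py NP_000552 P09488 | nm_accessions
```

In the example above, both NP_000552 and P09488 map to NM_000561, so the list would contain that accession only once.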
Last modified: Friday, 30-Mar-2018 08:36:26 EDT