Aug 01
A SEquence Globally Unique IDentifier (SEGUID) Proteome Database
Each protein sequence stored in UniProt has a CRC64 checksum. This means that it is possible to take an arbitrary sequence (with or without a known identifier), calculate the CRC64 checksum for it, and then search UniProt for that specific sequence. There are two problems with this:
- As far as I can tell, UniProt does not provide any web services for querying their database. I’ll check again, but I don’t see one.
- There is a small chance of collision; that is, two different sequences may produce the same CRC64 checksum.
On Nature Precedings I came across A SEquence Globally Unique IDentifier (SEGUID) Proteome Database that solves the two problems above.
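
To give a feel for what a SEGUID is: as I understand it, it is the SHA-1 digest of the uppercased sequence, Base64-encoded, with the trailing `=` padding stripped. Here is a minimal Python sketch under that assumption. The function name `seguid` and the example sequence are my own for illustration, not taken from the paper or the database.

```python
import base64
import hashlib


def seguid(sequence: str) -> str:
    """Return a SEGUID-style identifier for a protein sequence.

    Assumption: SEGUID = Base64(SHA-1(uppercase sequence)) with the
    trailing '=' padding removed. Whitespace is stripped first.
    """
    # Normalise: drop whitespace and uppercase, so that formatting
    # differences do not change the identifier.
    normalised = "".join(sequence.split()).upper()
    digest = hashlib.sha1(normalised.encode("ascii")).digest()
    # A 20-byte SHA-1 digest encodes to 28 Base64 characters, the last
    # of which is '=' padding; stripping it leaves 27 characters.
    return base64.b64encode(digest).decode("ascii").rstrip("=")


if __name__ == "__main__":
    # Hypothetical example sequence, for illustration only.
    print(seguid("MKTAYIAKQRQISFVKSHFSRQ"))
```

Because SHA-1 produces a 160-bit digest rather than a 64-bit one, an accidental collision between two different sequences is far less likely than with CRC64.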