DNA to protein translation tool

Start using this tool

This tool works much like other online services and programs that translate DNA into protein. The genetic codes available in this service are those compiled by Andrzej (Anjay) Elzanowski and Jim Ostell.

A DNA sequence may be entered as shown in the example sequence or in any other format (numbers, spaces and line feeds are removed). A JavaScript-enabled browser can also perform small tasks, such as tidying up the sequence and obtaining its reverse or complement sequence.
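The clean-up steps described above can be sketched in a few lines of Python. This is only an illustration, not the tool's own JavaScript: the function names are hypothetical, and converting the sequence to upper case is an assumption.

```python
import re

# Complement pairs for unambiguous DNA bases (assumption: no IUPAC ambiguity codes).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def tidy_sequence(raw):
    """Remove digits and whitespace (spaces, tabs, line feeds), then upper-case."""
    return re.sub(r"[\d\s]+", "", raw).upper()


def reverse(seq):
    """Return the sequence read backwards."""
    return seq[::-1]


def complement(seq):
    """Return the base-by-base complement of an upper-case sequence."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Return the complement read backwards (the opposite strand, 5' to 3')."""
    return complement(seq)[::-1]


raw = "1 atggcc ttaa\n13 gggtga"
seq = tidy_sequence(raw)
print(seq)                      # ATGGCCTTAAGGGTGA
print(reverse(seq))             # AGTGGGAATTCCGGTA
print(complement(seq))          # TACCGGAATTCCCACT
print(reverse_complement(seq))  # TCACCCTTAAGGCCAT
```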

Translation to protein is performed using one of the predefined genetic codes or a custom genetic code. The minimum size of the protein sequence for Open Reading Frames (ORFs) is customizable, and ORFs can be trimmed to MET-to-Stop. Showing the translation alignment is optional, and amino acids are displayed using the 1-letter amino acid code.

After translation, ORFs are shown as arrows in the response page. To inspect the ORF represented by an arrow, click on it: a new browser window opens showing, in red letters, the DNA sequence corresponding to that specific ORF, together with the translated protein. This feature requires a JavaScript-enabled browser.


How to use custom genetic codes

The genetic code used to translate a sequence into protein may be customized.

This service accepts the genetic code as a string in which each character corresponds to one amino acid and asterisks represent termination codons. The example below shows the standard genetic code and the corresponding triplets.

Standard genetic code

Amino acid/Termination  FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG

Base1                   TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
Base2                   TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
Base3                   TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG


Explanation

In the first line, the first character ("F") represents phenylalanine, which is encoded by the triplet TTT (the first character of "Base1", the first character of "Base2" and the first character of "Base3").

The eleventh character ("*") represents a termination codon, which is encoded by the triplet TAA.
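The position-to-triplet correspondence explained above can be reproduced with a short Python sketch. It assumes the codon order implied by the Base1/Base2/Base3 rows (bases ordered T, C, A, G, with Base3 changing fastest). The names build_codon_table and translate are hypothetical, and marking unrecognized triplets as "X" is an assumption, not documented behavior of this service.

```python
from itertools import product

BASES = "TCAG"  # base order used by the Base1/Base2/Base3 rows
STANDARD_CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_codon_table(code):
    """Map each of the 64 triplets to the character at the same position in `code`."""
    # product() varies the last base fastest: TTT, TTC, TTA, TTG, TCT, ...
    codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
    return dict(zip(codons, code))


def translate(dna, table):
    """Translate complete triplets in frame 1; unknown triplets become 'X' (assumption)."""
    dna = dna.upper()
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


table = build_codon_table(STANDARD_CODE)
print(table["TTT"])                   # F  (1st character of the code string)
print(table["TAA"])                   # *  (11th character of the code string)
print(translate("ATGTTTTAA", table))  # MF*
```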



The custom genetic code provided must be 64 characters long. The correspondence between characters and amino acids may follow the system used in this service or a different one, but it is always case insensitive.
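As a rough illustration of these two rules, the Python sketch below checks the length of a custom code and computes which position of the string a given triplet reads. The helper names are hypothetical, and handling case insensitivity by upper-casing the input is an assumption about the service. The custom code in the example is invented for demonstration: it is the standard code with the character for TGA changed from "*" to "W".

```python
BASES = "TCAG"  # base order used by the Base1/Base2/Base3 rows
STANDARD_CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def normalize_custom_code(code):
    """Upper-case the code (assumed handling of case) and require 64 characters."""
    code = code.upper()
    if len(code) != 64:
        raise ValueError(f"genetic code must be 64 characters long, got {len(code)}")
    return code


def codon_index(codon):
    """Return the 0-based position in the code string that a triplet reads."""
    i1, i2, i3 = (BASES.index(base) for base in codon.upper())
    return 16 * i1 + 4 * i2 + i3


# Invented custom code, typed in lower case: TGA reads "w" instead of "*".
position = codon_index("TGA")
custom = (STANDARD_CODE[:position] + "W" + STANDARD_CODE[position + 1:]).lower()

code = normalize_custom_code(custom)
print(position)        # 14, i.e. the 15th character of the string
print(code[position])  # W
```

Because each triplet maps to a fixed position, changing a single character of the string reassigns exactly one codon.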

Methionine as an initiation codon

When "ORFs trimmed to MET-to-Stop" is selected, the longest available ORFs are shown (from the first methionine to the Stop), so an ORF may contain several methionines, as in the amino acid sequence below:

MQVVLITLSDVNSTTWGSRISLGYMAACFRVREVELVKNLMMTGVVLQFTVDFPPSNSEFPHMLGNSNTISPFIPISAT
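The trimming rule can be sketched in Python as follows. This is a hedged illustration rather than the service's implementation: the function name met_to_stop_orfs is hypothetical, the input is assumed to be one translated reading frame with "*" marking stops, and the minimum length is assumed to count amino acids without the stop.

```python
def met_to_stop_orfs(frame_translation, min_length=10):
    """Return the longest MET-to-Stop ORF of each stop-delimited stretch."""
    pieces = frame_translation.upper().split("*")
    orfs = []
    # The last piece is not followed by a stop, so it is not a MET-to-Stop ORF.
    for piece in pieces[:-1]:
        start = piece.find("M")  # the first methionine gives the longest ORF
        if start == -1:
            continue
        orf = piece[start:]
        if len(orf) >= min_length:
            orfs.append(orf)
    return orfs


example = ("MQVVLITLSDVNSTTWGSRISLGYMAACFRVREVELVKNLMMTGVVLQ"
           "FTVDFPPSNSEFPHMLGNSNTISPFIPISAT")

# Hypothetical frame: a short stretch, a stop, then the example ORF and its stop.
frame = "LKQ*" + "PS" + example + "*"
orfs = met_to_stop_orfs(frame, min_length=10)
print(orfs[0] == example)  # True: the leading "PS" is trimmed away
print(orfs[0].count("M"))  # number of methionines inside this single ORF
```

Only the first methionine is used as the start, so the internal methionines do not produce additional, shorter ORFs.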



1-letter amino acid codes

A  alanine                    P  proline
B  aspartate or asparagine    Q  glutamine
C  cysteine                   R  arginine
D  aspartate                  S  serine
E  glutamate                  T  threonine
F  phenylalanine              U  selenocysteine
G  glycine                    V  valine
H  histidine                  W  tryptophan
I  isoleucine                 Y  tyrosine
K  lysine                     Z  glutamate or glutamine
L  leucine                    X  any
M  methionine                 *  translation stop
N  asparagine                 -  gap of indeterminate length




2003-2015@ University of the Basque Country. All rights reserved.