Vect: DNA to Protein Tutorial

Final Part: Converting DNA to Protein

Now that the gene sequences are in upper case, we can turn them into protein sequences! :-D Insert a new rule “To Translate data that may also change length.” Name it “Quoted Proteins” and change it to the following:

Generic DNA to protein translation from Upper Case Genes.

The rule is named “Quoted” because the string it returns has a * after every protein sequence to mark where that sequence ends.
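For readers who want to see what such a translation rule does outside of Vect, here is a minimal Python sketch. It is not the Perl code that Vect generates. It assumes the standard genetic code (NCBI translation table 1), and the names CODON_TABLE and translate are made up for this illustration. Stop codons come out as *, which is where the trailing * on each protein comes from.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with bases in TCAG order.
# Assumption: the tutorial's "generic" translation uses this table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate upper-case DNA codon by codon; stop codons become '*'."""
    protein = []
    # Step through complete codons only; leftover bases at the end are ignored.
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        # 'X' stands in for any codon that is not plain A/C/G/T.
        protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


print(translate("ATGGCCTAA"))  # a gene ending in the stop codon TAA
```

The lookup only knows upper-case codons, which is why the genes had to be converted to upper case in the previous step.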



Our final step will be to get rid of the *. Insert a new rule “Extract quoted data from other rules.” Name it “Proteins” and change the rule to look like the following:

Multiple quoted data from Quoted Proteins on the left with nothing and on the right with " * ".

So we finally got the protein sequences we want! :-)
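The same extraction can be sketched in a few lines of Python: keep everything to the left of the * and drop the marker itself. The function name strip_terminator is hypothetical and is not part of Vect.

```python
def strip_terminator(quoted_protein: str) -> str:
    """Return the text to the left of the first '*' (the end-of-protein marker)."""
    # partition splits at the first '*'; if there is none, the whole string is kept.
    protein, _marker, _rest = quoted_protein.partition("*")
    return protein


quoted_proteins = ["MA*", "MKVL*"]
proteins = [strip_terminator(q) for q in quoted_proteins]
print(proteins)
```

Nothing is required on the left side, so the extraction starts at the beginning of each string, just as the rule specifies "nothing" for the left delimiter.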




For the output, select the rule "Gene Names" and copy it over to the output panel.
Then select the last rule, "Proteins", and copy it over to the output panel as well. To word wrap, set the Proteins tag to <Proteins:60>.

Your template should look like the following:

><Gene Names>
<Proteins:60>

The :60 means to word wrap every 60 characters. Notice the > symbol before the gene name.
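As a rough illustration of what this template produces, the Python sketch below writes a > header line followed by the protein broken into lines of at most 60 characters. The function name render_record and the sample gene name are invented for the example. How Vect itself implements the wrapping is not shown in this tutorial.

```python
def render_record(gene_name: str, protein: str, width: int = 60) -> str:
    """Format one record: a '>' header line, then the protein wrapped at `width`."""
    # Cut the sequence into fixed-size chunks; the last chunk may be shorter.
    chunks = [protein[i:i + width] for i in range(0, len(protein), width)]
    return "\n".join([">" + gene_name] + chunks)


# A made-up 130-residue protein: wraps into lines of 60, 60 and 10 characters.
print(render_record("exampleGene", "M" * 130))
```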

Your output should look like the following:


Here is a summary of what we did:

1) Extract the gene names from the GenBank report.
2) Extract the entire DNA sequence from the GenBank report.
3) Extract the coordinates of the protein-coding sequences from the GenBank report.
4) Get the start and end coordinates.
5) Use the coordinates to extract the DNA sequences that are to be translated into protein.
6) Reverse the DNA sequences that need to be reversed.
7) Merge the reversed and non-reversed DNA sequences.
8) Turn the combined DNA sequences into upper-case letters.
9) Translate the DNA sequences into protein sequences.
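Steps 3 through 8 can be sketched compactly in Python. This is only an illustration under stated assumptions: coordinates look like 190..255 or complement(300..420), they are 1-based and inclusive, and "reversing" a sequence means taking its reverse complement, as is usual for genes on the complement strand. The earlier parts of this tutorial define the actual Vect rules. The function name extract_coding_dna is hypothetical, and joined locations such as join(...) are not handled.

```python
import re

# Maps each base to its complement, in both lower and upper case.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def extract_coding_dna(sequence: str, location: str) -> str:
    """Cut out the DNA named by a simple location string and return it in upper case."""
    match = re.search(r"(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError("unsupported location: " + location)
    start, end = int(match.group(1)), int(match.group(2))
    # Coordinates are 1-based and inclusive; Python slices are 0-based, end-exclusive.
    fragment = sequence[start - 1:end]
    if location.startswith("complement"):
        # Assumption: a gene on the complement strand needs the reverse complement.
        fragment = fragment.translate(COMPLEMENT)[::-1]
    return fragment.upper()


print(extract_coding_dna("ccatggcctaagg", "3..11"))
print(extract_coding_dna("ttaggccat", "complement(1..9)"))
```

Both calls yield the same upper-case gene, ready for the translation step.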


This tutorial written by Jia Zhen Lee, October 2004.
Modified by Ye Lin, December 2005.

Last modified June 13, 2008. All rights reserved.
