Vect   DNA to Protein Tutorial

Part 1: Extracting Gene Names & DNA (2)

Now that you are in the Convert Data panel, click the Insert button and select the rule “Extract quoted data from other rules” from the list. Click the first ???? and select the rule “RawGeneNames.” Then click the second and the last ???? in turn and type a double quote character (") into each. This tells Vect to extract anything contained within double quotes. The result should be all the gene names with the /gene= prefix and the surrounding quotes stripped, and should look like the following:
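
To check the result outside Vect, the same extraction can be sketched in Python. This is only an illustration and not the program Vect generates; the function name extract_quoted and the sample qualifier lines are hypothetical.

```python
import re

def extract_quoted(lines):
    """Return the text found between double quotes on each line."""
    names = []
    for line in lines:
        # Capture everything between the first pair of double quotes.
        match = re.search(r'"([^"]*)"', line)
        if match:
            names.append(match.group(1))
    return names

# Hypothetical raw gene-name lines, shaped like GenBank /gene qualifiers.
raw_gene_names = ['/gene="GENE_A"', '/gene="GENE_B"']
print(extract_quoted(raw_gene_names))
```

Lines without a quoted value are skipped, so the output list contains only the names themselves.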

 

Return to the Input Data panel. Now that we have the gene names, we need the DNA sequences. First we extract the large BAC DNA sequence, which sits at the end of the GenBank report in the Arabidopsis file, as pictured below:

 

Right-click the word ORIGIN and select “New Block Open Condition”. Everything above ORIGIN is now highlighted in pink. Now scroll down to the very end of the GenBank report, where there should be a // line. Right-click the // and select “New Block Close Condition.”
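
The open and close conditions amount to keeping only the lines between ORIGIN and //. A minimal Python sketch of that idea follows; the function name origin_block and the sample report are hypothetical. It assumes the block starts on the line after ORIGIN, which may differ from how Vect treats the ORIGIN line itself.

```python
def origin_block(report_lines):
    """Return the lines after the ORIGIN line, up to and including the // line."""
    block = []
    inside = False
    for line in report_lines:
        if not inside:
            # Block open condition: a line starting with the word ORIGIN.
            if line.startswith("ORIGIN"):
                inside = True
            continue
        block.append(line)
        # Block close condition: the // terminator line.
        if line.strip() == "//":
            break
    return block

# Hypothetical, heavily shortened GenBank-style report.
sample = [
    "LOCUS       EXAMPLE",
    "ORIGIN",
    "        1 acgtacgtac gtacgtacgt",
    "       21 ttgacc",
    "//",
]
print(origin_block(sample))
```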

Now left-click, one by one, all the columns containing the DNA sequences, and also the closing // marker, which will be used to indicate the end of concatenation on the next page. They should all be highlighted grey, as in the following screenshot. Make sure that only the DNA sequences and the closing // marker are selected; if anything else is selected, you goofed up in an earlier step!

Click on Move and name the rule something meaningful, such as “Raw DNA sequence”.
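
To see what the selected columns contain, here is a rough Python sketch that joins the sequence columns of an ORIGIN block into one string and stops at the // marker. In a GenBank report, each sequence line begins with a base position number followed by groups of bases, so the sketch skips the first column. The function name concatenate_sequence and the sample lines are hypothetical, and this is not the program Vect produces.

```python
def concatenate_sequence(block_lines):
    """Join the sequence columns of an ORIGIN block into one string."""
    pieces = []
    for line in block_lines:
        fields = line.split()
        if fields and fields[0] == "//":
            # The // marker signals the end of concatenation.
            break
        # The first column is the base position number; skip it.
        pieces.extend(fields[1:])
    return "".join(pieces)

# Hypothetical sequence lines in GenBank ORIGIN layout.
block = [
    "        1 acgtacgtac gtacgtacgt",
    "       21 ttgacc",
    "//",
]
print(concatenate_sequence(block))
```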

Last modified June 13, 2008. All rights reserved.
