Lab#1.

Data manipulation: molecular online
and server tools & Bioextract Server


   Theme: FXN gene and pancreatic cancer.
                     Etienne Z. Gnimpieba
                         BRIN WS 2012
                    Sioux Falls, May 30 2012
                 Etienne.gnimpieba@usd.edu
Data manipulation                  Molecular online tools and Bioextract server
                                                                            Plan

                • Review
                T1. Genome exploration
                     – Databank: Ensembl
                     – Tools: web interface, logic connector

                T2. Sequences manipulation
                     – Databank: EBI, GenBank, NCBI
                     – Tools: query tools, Blastp, ClustalW2, Jalview, FASTA

                T3. Bioextract server
                     – Data queries
                     – Tools: Blastp, Workflow, ClustalW2, FASTA

                    • Lab’s template


                                                                                    Etienne Z. Gnimpieba
                                                                                        BRIN WS 2012
                                                                                   Sioux Falls, May 31 2012
T1. Genome Exploration                                    Theme: Frataxin (FXN) implication in the pancreatic cancer genesis

 Objective: use Ensembl online tools to locate the FXN gene on the human genome and identify the genes implicated in pancreatic
 cancer. Next, retrieve the appropriate sequence data in FASTA format for use with BLAST.

 T1.1. Locate a given gene on the human genome
 On the Ensembl web site http://uswest.ensembl.org/index.html
 o Select our species "human"
 o Do a keyword search using the term "FXN"
 o Follow the link of the gene drop-down feature
 o How many transcript variations of this gene are in our
   genome (Variations Table)?
 o Note the region of the FXN gene by clicking Location
 o Export this gene (left side bar) to an HTML file as a FASTA
   sequence.
 o Do the same process by searching for "pancreatic cancer".
   When you find the list of genes, use the last link of the page

 T1.2. Get a genomic sequence from NCBI
 o Go to the NCBI home page http://www.ncbi.nlm.nih.gov/guide/
 o Do a keyword search using the term FXN
 o Look at the gene database. How many results are there?
   Choose the gene database and click the corresponding
   Homo sapiens FXN gene
 o Look for the NCBI RefSeq section to find the mRNA sequence.
   Click on the accession number of the first
   transcript variant (next to the number 1)
 o Get the same sequence in FASTA format by clicking on
   FASTA
 o Click Send (top right, in blue), then choose complete record,
   file, FASTA, Create File (finished with the file for now)
 o Repeat the process for pancreatic cancer by searching
   CDKN2A

 T1.3. Get the protein information and sequence from EBI
 The common protein name for FXN is Frataxin
 o Go to the EBI home page http://www.ebi.ac.uk/
 o Type "fxn" in the search box and click on "find"
 o Select the Homo sapiens Frataxin to get all the information
   about the protein (function, domains, structure, gene
   expression, etc.)

 T1.4. Save the exported data sequences from T1.2 in the data folder
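
 The exports in T1.1 and T1.2 can also be repeated or checked from a script. The sketch below is not part of the original lab: it assumes the public NCBI E-utilities efetch endpoint for downloading a nucleotide record as FASTA, and parse_fasta is a hypothetical helper for reading a saved file. The accession NM_000144 is given on the assumption that it is the RefSeq record for FXN transcript variant 1; confirm it on the NCBI gene page before relying on it.

```python
from urllib.parse import urlencode

# NCBI E-utilities endpoint for fetching records (assumed reachable from your network).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="nuccore"):
    """Build the URL that returns one record as plain FASTA text."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Accession is an assumption (see the note above); the toy record is illustrative only.
print(efetch_fasta_url("NM_000144"))
toy = ">toy_record illustrative only\nATGTGG\nACTCTC\n"
print(parse_fasta(toy))
```

 Opening the printed URL in a browser should give the same FASTA text as the Send > File > FASTA export, and parse_fasta can then be used to confirm that each file saved in the data folder holds the expected number of records.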


T2. Sequences manipulation                   Theme: Frataxin (FXN) implication in the pancreatic cancer genesis

Objective: find similar sequences using BLAST tools and align the given sequences.
    T2.1. Find similar sequences using the BLAST tool
      o Continuing from Task T1.3, select the protein tab and select “view sequence in UniProt” under the
        sequence category. You can get the FASTA format of the protein by clicking on “FASTA”. Go back, then
        check the box next to one of the sequences. Select the “Blast” tool in the drop-down menu, then click
        “Go”.
      o The best-matching sequences appear on the first page (green marks the best match). To see
        other sequences, click “next”. BLAST parameters can be modified by clicking “options” and
        then “Blast”.
    T2.2. Align generated sequences with the ClustalW tool
      o Select about 10 sequences from different species, then click “Align” at the bottom of the screen. The selected
        sequences are inserted directly into the ClustalW tool, which runs automatically.
      o From the right menu, you can highlight similarities, polar residues, aromatic residues, etc., if
        interested.
      o On the same page you can add further sequences to the alignment if needed. You can also
        access the phylogenetic tree. More details about the residues and the distances can be obtained by
        clicking “Jalview” (top right, in orange). Click “Keep” at the bottom left of the screen, then click
        the download. Check the box to agree to the terms and conditions, then click “Run”.
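
    Picking one sequence per species, as T2.2 asks, can be sketched in code. This is an illustration, not part of the lab: it assumes the BLAST hits were saved with UniProt-style FASTA headers, where the organism follows "OS=", and the helper names, identifiers, and sequences below are hypothetical toy values.

```python
import re


def species_of(header):
    """Extract the organism name from a UniProt-style header (the OS= field)."""
    match = re.search(r"OS=(.+?)(?= OX=| GN=| PE=| SV=|$)", header)
    return match.group(1) if match else None


def one_per_species(records, limit=10):
    """Keep the first record seen for each species, up to `limit` species."""
    chosen = {}
    for header, sequence in records:
        species = species_of(header)
        if species is None or species in chosen:
            continue
        chosen[species] = (header, sequence)
        if len(chosen) == limit:
            break
    return list(chosen.values())


# Toy hits: identifiers and sequences are made up for illustration.
hits = [
    ("sp|TOY1|FXN_HUMAN Frataxin OS=Homo sapiens OX=9606 GN=FXN PE=1 SV=2", "MWTLGRR"),
    ("tr|TOY2|TOY2_HUMAN Frataxin fragment OS=Homo sapiens OX=9606 GN=FXN PE=1 SV=1", "MWTLG"),
    ("sp|TOY3|FXN_MOUSE Frataxin OS=Mus musculus OX=10090 GN=Fxn PE=1 SV=1", "MWAFGGR"),
]
for header, _ in one_per_species(hits):
    print(species_of(header))
```

    Because BLAST lists hits from best to worst, keeping the first record per species keeps the best-scoring hit for each one.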
    T2.3. Visualize the result as a phylogenetic tree in Jalview
      o In Jalview, click “file”, “add sequences”, “from file”, then go to the downloads folder under your user profile
        (if you saved the files in a folder you created, go to that folder instead).
      o Open the first sequence file from the folder containing the previous FASTA files (task T1.X)
      o Repeat to add the second saved sequence
      o Make the alignment and show the consensus
      o Calculate the tree using “calculate”, “calculate tree”, then “average distance % identity”
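
    To see what an "average distance % identity" tree computes, here is a minimal sketch under stated assumptions. It treats the distance between two aligned sequences as 100 minus their percent identity and joins clusters by average linkage (UPGMA). Jalview's own gap handling and tie-breaking may differ, and the function names and toy sequences are illustrative only.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of aligned columns where a and b match (gap-gap columns ignored)."""
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    same = sum(1 for x, y in pairs if x == y)
    return 100.0 * same / len(pairs)


def average_distance_tree(alignment):
    """UPGMA on (100 - % identity); returns the tree as nested tuples of names."""
    sizes = {name: 1 for name in alignment}          # cluster -> number of leaves
    dist = {
        frozenset((a, b)): 100.0 - percent_identity(alignment[a], alignment[b])
        for a, b in combinations(alignment, 2)
    }
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: dist[frozenset(p)])
        merged = (a, b)
        for other in sizes:
            if other == a or other == b:
                continue
            # Size-weighted average of the distances to the two merged clusters.
            d = (dist[frozenset((a, other))] * sizes[a]
                 + dist[frozenset((b, other))] * sizes[b]) / (sizes[a] + sizes[b])
            dist[frozenset((merged, other))] = d
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Toy aligned sequences, made up for illustration (equal length, "-" marks a gap).
toy_alignment = {
    "human": "MWTLGRRAV",
    "mouse": "MWAFGRRAV",
    "yeast": "MIK-GKRAL",
}
print(average_distance_tree(toy_alignment))
```

    The nested tuples mirror the tree shape: sequences grouped in the innermost tuple are the most similar pair.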

T3. Bioextract server                                        Theme: Frataxin (FXN) implication in the pancreatic cancer genesis

 Objective: use server tools to streamline data manipulation, applied here on the BioExtract Server.
T3.1. Server Initialization                          http://bioextract.org
 o Register on BioExtract Server to be able to create and save your own workflows.
 o Click on the “workflows” tab, then click “create and import workflows”. Now click “record workflow”, then “close”.
 o To obtain the workflow at the end of the lab: from the “workflows” tab, click “create and import workflows”, then click “save
   records”.
T3.2. Pancreatic cancer & Frataxin (FXN) data
o Select the “query” tab. Then select protein sequences and check the box next to the NCBI protein database. Select “gene” as the search field and type “FXN”;
  add a search line, select “Species” and type “Human”. Submit the query.
o Results appear on the “extract” page. You can get the GenBank view of each sequence by clicking “View record”. We need only the Homo sapiens
  Frataxin: click “select records”, check the corresponding box, then click “keep only selected records”. The results can
  be saved or extracted in FASTA or txt format (export the records in FASTA format).
o Click the “tools” tab, then click protein tools, then tmap. Select “Use records on extract page formatted in Fasta”.
o Click “execute” to run the tool. When execution is complete, retrieve the results by selecting the desired format and clicking “view results”.
o Repeat the search process with “pancreatic cancer”. Make sure you change the first search field to “all text”.
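
The two-line query built in T3.2 (gene = FXN, species = Human) can be expressed directly against NCBI for comparison. This is a sketch outside the original lab and makes no claim about how the BioExtract Server forwards its queries: it assumes the public E-utilities esearch endpoint and the Entrez field tags [Gene Name], [Organism] and [All Fields], and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed; not part of the BioExtract Server).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def entrez_term(clauses):
    """Join (value, field) pairs with AND, using Entrez field tags."""
    return " AND ".join(f"{value}[{field}]" for value, field in clauses)


def esearch_url(db, clauses, retmax=20):
    """Build a search URL for one Entrez database."""
    params = {"db": db, "term": entrez_term(clauses), "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


# First search: gene field plus species, as in the first bullet of T3.2.
fxn_query = [("FXN", "Gene Name"), ("Homo sapiens", "Organism")]
# Repeat search: free text instead of the gene field, as in the last bullet.
cancer_query = [("pancreatic cancer", "All Fields"), ("Homo sapiens", "Organism")]

print(entrez_term(fxn_query))
print(esearch_url("protein", fxn_query))
print(esearch_url("protein", cancer_query))
```

Switching the first clause from the gene field to [All Fields] mirrors the instruction to change the first search field to “all text” when searching for “pancreatic cancer”.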

 T3.3. Mapping, Alignment
o Go to the query tab again and search for “FXN”. Select a few listings and export them as in T3.2.
o Go to the tools tab. Select similarity search tools, then blastp. Select “use records on extract page formatted as Fasta”. Under "choose
  search set", select the database "swissprot".
o When execution is complete, go to the extract page and select 10 sequences from 10 different species, including human, then click “keep
  only selected records”. Export the records again.
o Go to the tools tab again and select alignment tools, then clustal w2. Select “use records on extract page formatted as Fasta”. Your 10 protein
  sequences are automatically passed to the clustalw2 tool as input. Verify that the sequence type is “protein” in the general parameters
  settings, then execute the tool. When execution completes, verify the alignment in the “.aln” file.
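
Verifying the “.aln” file can also be done with a few lines of code. The sketch below assumes the standard Clustal text layout: a header line starting with CLUSTAL, then blocks of "name  sequence" lines, each followed by a conservation line that begins with spaces. The function names and the toy alignment are illustrative and not output from the lab.

```python
def parse_clustal(text):
    """Return {name: aligned_sequence} from Clustal-format (.aln) text."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("not a Clustal alignment: missing CLUSTAL header")
    alignment = {}
    for line in lines[1:]:
        # Skip blank lines and conservation lines (which start with whitespace).
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]   # an optional residue count is ignored
        alignment[name] = alignment.get(name, "") + chunk
    return alignment


def is_consistent(alignment):
    """True when every aligned sequence has the same length."""
    return len({len(seq) for seq in alignment.values()}) == 1


# Toy alignment in two blocks, made up for illustration.
toy_aln = (
    "CLUSTAL 2.1 multiple sequence alignment\n"
    "\n"
    "human      MWTLG\n"
    "mouse      MWAFG\n"
    "           **  *\n"
    "\n"
    "human      RRAV\n"
    "mouse      RRAV\n"
    "           ****\n"
)
aln = parse_clustal(toy_aln)
print(len(aln), "sequences; consistent lengths:", is_consistent(aln))
```

For the lab, the two quick checks are that the file contains the 10 selected sequences and that they all have the same aligned length.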
T3.4. Workflow save & reuse
o Go back to the “workflow” tab and click “create and import workflows”. Enter a name and a
  description for your workflow, then click Save. All the previous steps are saved in this
  workflow.
o Once the workflow is saved, you will find it at the bottom of the workflow list. Click on the name of
  the workflow to see a schematic view of it. Run the workflow by clicking “start”.
o Get and verify all the results by clicking “provenance”. The general report can be saved for later
  analysis. Results of each tool can be viewed or saved by clicking “view file”.
o The same workflow can be executed for another query by simply modifying the accession number of
  the protein. (Click save in the “create and import workflows” section to temporarily save the new
  query)
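
The idea behind recording and reusing a workflow can be pictured as an ordered list of steps that is replayed on a new input. The sketch below is a conceptual illustration only and says nothing about how the BioExtract Server stores workflows: the Workflow class, the step functions, and the accession strings are all hypothetical placeholders, and no real query, BLAST, or ClustalW call is made.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    name: str
    description: str
    steps: list = field(default_factory=list)   # (label, function) pairs, in recording order

    def record(self, label, func):
        """Append one step; steps run later in the order they were recorded."""
        self.steps.append((label, func))
        return self

    def run(self, query):
        """Feed `query` through every step; return the final result and a provenance log."""
        provenance = []
        data = query
        for label, func in self.steps:
            data = func(data)
            provenance.append((label, data))
        return data, provenance


# Placeholder steps standing in for the query, blastp and clustalw2 tools.
wf = Workflow("fxn-lab", "query, similarity search, alignment")
wf.record("query", lambda accession: [accession])
wf.record("blastp", lambda hits: hits + [h + "_homolog" for h in hits])
wf.record("clustalw2", lambda seqs: sorted(seqs))

result, log = wf.run("ACC_1")        # first run
result2, log2 = wf.run("ACC_2")      # same steps, different accession
print([label for label, _ in log], result, result2)
```

Only the input changes between the two runs, which is the point of the last bullet above: the recorded steps stay fixed and the accession number is the single thing to edit.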
Molecular online tools and server

Context
Statement of problem / Case study:
  The FXN gene provides instructions for making a protein called frataxin. This protein is found in cells throughout the body, with the highest levels in the heart, spinal cord, liver, pancreas, and muscles used for voluntary movement (skeletal muscles). Within cells, frataxin is found in energy-producing structures called mitochondria. Although its function is not fully understood, frataxin appears to help assemble clusters of iron and sulfur molecules that are critical for the function of many proteins, including those needed for energy production. Mutations in the FXN gene cause Friedreich ataxia. Friedreich ataxia is a genetic condition that affects the nervous system and causes movement problems. Most people with Friedreich ataxia begin to experience the signs and symptoms of the disorder around puberty.

Biological Hypothesis
  Reduced expression of frataxin is the cause of Friedreich's ataxia (FRDA), a lethal neurodegenerative disease. How about liver cancer?

0. Specification & aims
  Aim:
  The purpose of this experiment is to introduce online tools for biological exploration of the human genome. We simulated an application (FXN gene and pancreatic cancer) to understand how a researcher can identify and cross-reference the biological knowledge available in data banks.
  Keywords:
  Bio: FXN, Frataxin, pancreatic cancer, CDKN2A
  Math: HMM
  Informatics: programming, bioinformatics tools, getting and exporting data

  [Figures: Frataxin molecule structure (PyMOL); FXN on chromosome 9; Pancreas anatomy; Pancreatic cancer; diagram linking Biological DB and Tools]

Resolution process
  T1. Genome exploration
  Objective: use Ensembl online tools to locate the FXN gene on the human genome and identify the genes implicated in pancreatic cancer. Then retrieve the appropriate sequence data in FASTA format for use with BLAST.
    T1.1. Locate a given gene on the human genome
    T1.2. Get a genomic sequence from NCBI
    T1.3. Get the protein information and sequence from EBI
    T1.4. Save the exported sequence data in the data folder

  T2. Sequences manipulation
  Objective: find similar sequences using BLAST tools and align the given sequences.
    T2.1. Find similar sequences using the BLAST tool
    T2.2. Align generated sequences with the ClustalW tool
    T2.3. Visualize the result as a phylogenetic tree in Jalview

  T3. Bioextract server
  Objective: use server tools to streamline data manipulation, applied on the BioExtract Server.
    T3.1. Server Initialization
    T3.2. Pancreatic cancer & Frataxin (FXN)
    T3.3. Mapping, Alignment
    T3.4. Workflow save & reuse

Acquired skills
  Online and server tools:
  - Query biological DB (FASTA, HTML, txt, figure formats)
  - Sequence tools (protein and gene)
      Mapping (tmap)
      Alignment (clustalw2)
  - Manage data results (select, keep, map, export)
  - Build and reuse workflows

Conclusion: ?

Korean Bioinformation Center, 2010
END.

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 

Lab Online Molecular Tools and BioExtract Server

  • 1. Lab#1. Data manipulation: molecular online and server tools & BioExtract Server. Theme: FXN gene and pancreatic cancer. Etienne Z. Gnimpieba, BRIN WS 2012, Sioux Falls, May 30 2012. Etienne.gnimpieba@usd.edu
  • 2. Data manipulation: molecular online tools and BioExtract Server. Plan
       • Review
       T1. Genome exploration
          – Databank: Ensembl
          – Tools: web interface, logic connector
       T2. Sequences manipulation
          – Databank: EBI, GenBank, NCBI
          – Tools: query tools, Blastp, ClustalW2, Jalview, FASTA
       T3. BioExtract Server
          – Data queries
          – Tools: Blastp, Workflow, ClustalW2, FASTA
       • Lab's template
       Etienne Z. Gnimpieba, BRIN WS 2012, Sioux Falls, May 31 2012
  • 3. Data manipulation: molecular online tools and BioExtract Server. T1. Genome exploration. Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
       Objective: use Ensembl online tools to locate FXN on the human genome and identify the genes implicated in pancreatic cancer. Next, find the appropriate sequence data in FASTA and BLAST formats.
       T1.1. Locate a given gene on the human genome
       On the Ensembl web site http://uswest.ensembl.org/index.html
          o Select our species, "human".
          o Do a keyword search using the term "FXN".
          o Follow the link in the gene drop-down feature.
          o How many transcript variations of this gene are in our genome (Variations Table)?
          o Note the region of the FXN gene by clicking on location.
          o Export this gene (left side bar) to an HTML file as a FASTA sequence.
          o Repeat the process by searching for "pancreatic cancer". When you find the list of genes, use the last link on the page.
       T1.2. Get a genomic sequence from NCBI
          o Go to the NCBI home page http://www.ncbi.nlm.nih.gov/guide/
          o Do a keyword search using the term FXN.
          o Look at the gene database. How many results are there? Choose the gene database and click the corresponding Homo sapiens FXN gene.
          o Look for the NCBI RefSeq section to find the mRNA sequence. Click on the accession number of the first transcript variant (next to the number 1).
          o Get the same sequence in FASTA format by clicking on FASTA.
          o Click Send (top right, in blue), then choose complete record, file, FASTA, Create File. You are finished with the file for now.
          o Repeat the process for pancreatic cancer by searching for CDKN2A.
       T1.3. Get the protein information and sequence from EBI
       The common protein name for FXN is Frataxin.
          o Go to the EBI home page http://www.ebi.ac.uk/
          o Type "fxn" in the search box and click on "find".
          o Select the Homo sapiens Frataxin to get all the information about the protein (function, domains, structure, gene expression, ...).
       T1.4. Save the exported sequence data from T1.2 in the data folder.
  • 4. Data manipulation: molecular online tools and BioExtract Server. T2. Sequences manipulation. Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
       Objective: find similar sequences using BLAST tools and align the given sequences.
       T2.1. Find similar sequences using the BLAST tool
          o Continuing from task T1.3, select the protein tab and select "view sequence in uniprot" under the sequence category. You can get the FASTA format of the protein by clicking on "FASTA". Go back and check the box next to one of the sequences. Select the "Blast" tool in the drop-down menu, then click on "Go".
          o The best-matching sequences appear on the first page (green marks the best match). To see other sequences, click on next. BLAST parameters can be modified by clicking on "options" and then Blast.
       T2.2. Align the generated sequences with the ClustalW tool
          o Select about 10 different species, then click on "Align" at the bottom of the screen. The selected sequences are inserted directly into the ClustalW tool, which runs automatically.
          o From the right menu you can highlight similarities, polar residues, aromatic residues, etc., if interested.
          o From the same page you can add further sequences to the alignment if needed. You can also access the phylogenetic tree. More details about the residues and the distances can be obtained by clicking on "Jalview" (top right, in orange). Click "Keep" at the bottom left of the screen, then click the download. Check the box agreeing to the terms and conditions, then click "Run".
       T2.3. Visualize the result as a phylogenetic tree in Jalview
          o In Jalview, click "file", "add sequences", "from file", and go to the downloads folder under the user profile (or, if you saved the files in a folder you created, go to that folder).
          o Open the first sequence file from the folder containing the previous FASTA files (task T1.X).
          o Repeat to add the second saved sequence.
          o Make the alignment and show the consensus.
          o Calculate the tree using "calculate", "calculate tree", then "average distance % identity".
  • 5. Data manipulation: molecular online tools and BioExtract Server. T3. BioExtract Server. Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
       Objective: use server tools to optimize data manipulation processes, applied on the BioExtract Server.
       T3.1. Server initialization http://bioextract.org
          o Register on the BioExtract Server to be able to create and save your own workflows.
          o Click on the "workflows" tab, then click "create and import workflows". Now click "record workflow", then "close".
          o To obtain the workflow at the end of the lab: from the "workflows" tab click on "create and import workflows", then click on "save records".
       T3.2. Pancreatic cancer & Frataxin (FXN) data
          o Select the query tab. Select the protein sequences and check the box next to the NCBI protein database. Select "gene" as the search field and type "FXN"; add a search line, select "Species" and type "Human". Submit the query.
          o Results appear on the "extract page". You can get the GenBank view of each sequence by clicking on "View record". We need only the Homo sapiens Frataxin: click "select records", check the corresponding box, then click on "keep only selected records". The results can be saved or extracted in FASTA or txt format (export the records in FASTA format).
          o Click on the "tools" tab, then click on protein tools, then tmap. Select "Use records on extract page formatted in Fasta".
          o Click on "execute" to run the tool. When execution is complete, results can be retrieved by selecting the desired format and clicking on "view results".
          o Repeat the search process with "pancreatic cancer". Make sure you change the first search field to "all text".
       T3.3. Mapping, alignment
          o Go to the query tab again and search for "FXN". Select a few listings and export them as in T3.2.
          o Go to the tools tab. Select similarity search tools, then blastp. Select "use records on extract page formatted as Fasta". Under "choose search set", select the database "swissprot".
          o When execution is complete, go to the extract page and select 10 different sequences belonging to 10 different species, including human, then click "keep only selected records". Export the records again.
          o Go to the tools tab again, select alignment tools, then clustal w2. Select "use records on extract page formatted as Fasta". Your 10 protein sequences are automatically used as input to the clustalw2 tool. Verify that the sequence type is "protein" in the general parameters settings, then execute the tool. When execution completes, verify the alignment in the ".aln" file.
       T3.4. Workflow save & reuse
          o Go back to the "workflow" tab and click "create and import workflows". Enter a name and a description for your workflow, then click on Save. All the previous steps are saved in this workflow.
          o Once the workflow is saved, you will find it at the bottom of the workflow list. Click on the name of the workflow to see a schematic view of it. Run the workflow by clicking on "start".
          o Get and verify all the results by clicking on "provenance". The general report can be saved for later analysis. Results of each tool can be viewed or saved by clicking on "view file".
          o The same workflow can be executed for another query by simply modifying the accession number of the protein. (Click save in the "create and import workflows" section to temporarily save the new query.)
  • 6. Molecular online tools and server
       Context
       Statement of problem / Case study: The FXN gene provides instructions for making a protein called frataxin. This protein is found in cells throughout the body, with the highest levels in the heart, spinal cord, liver, pancreas, and muscles used for voluntary movement (skeletal muscles). Within cells, frataxin is found in energy-producing structures called mitochondria. Although its function is not fully understood, frataxin appears to help assemble clusters of iron and sulfur molecules that are critical for the function of many proteins, including those needed for energy production. Mutations in the FXN gene cause Friedreich ataxia. Friedreich ataxia is a genetic condition that affects the nervous system and causes movement problems. Most people with Friedreich ataxia begin to experience the signs and symptoms of the disorder around puberty.
       Biological hypothesis
       Reduced expression of frataxin is the cause of Friedreich's ataxia (FRDA), a lethal neurodegenerative disease. How about liver cancer?
       0. Specification & aims
       Aim: The purpose of this experiment is to introduce online tools for biological exploration of the human genome. We simulated an application (FXN gene and pancreatic cancer). Now we can understand how a researcher can come to identify cross biological knowledge available in data banks.
       Keywords:
          – Bio: FXN, Frataxin, pancreatic cancer, CDKN4
          – Math: HMM
          – Informatics: programming, bioinformatics tools, getting and exporting data
       Figures: Frataxin molecule structure (PyMOL); FXN on chromosome 9; pancreas anatomy; pancreatic cancer
       Resolution process
       T1. Genome exploration
       Objective: use Ensembl online tools to locate FXN on the human genome and identify the genes implicated in pancreatic cancer. Then obtain the appropriate sequence data in FASTA and BLAST formats.
          T1.1. Locate a given gene on the human genome
          T1.2. Get a genomic sequence from NCBI
          T1.3. Get the protein information and sequence from EBI
          T1.4. Save the exported sequence data in the data folder
       T2. Sequences manipulation
       Objective: find similar sequences using BLAST tools and align the given sequences.
          T2.1. Find similar sequences using the BLAST tool
          T2.2. Align the generated sequences with the ClustalW tool
          T2.3. Visualize the result as a phylogenetic tree in Jalview
       T3. BioExtract Server
       Objective: use server tools to optimize data manipulation processes, applied on the BioExtract Server.
          T3.1. Server initialization
          T3.2. Pancreatic cancer & Frataxin (FXN)
          T3.3. Mapping, alignment
          T3.4. Workflow save & reuse
       Biological DB: ?
       Tools
       Acquired skills
       Online and server tools:
          – Query biological DB (FASTA, HTML, txt, figure formats)
          – Sequence tools (protein and gene): mapping (tmap), alignment (clustalw2)
          – Manage data results (select, keep, map, export)
          – Build and reuse a workflow
       Conclusion: ?
       Korean Bioinformation Center, 2010
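
The lab works through web interfaces, but it can help to see what the exported FASTA files contain and what a "% identity" figure means. The sketch below is only an illustration under stated assumptions: the two sequences are made-up toy strings (not real frataxin or CDKN2A sequences), the helper names parse_fasta and percent_identity are hypothetical, and the identity definition used here (matches over columns where neither sequence has a gap) is one common convention that may differ from the exact calculation Jalview or ClustalW2 performs.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A '>' line starts a new record; the rest of the line is its header.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped; concatenate them.
            records[header] += line
    return records


def percent_identity(a: str, b: str) -> float:
    """Percent of compared columns holding the same residue.

    Assumption: both sequences are already aligned (equal length, '-' for
    gaps), and columns with a gap in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned sequences, invented for illustration only.
TOY_FASTA = """>toy_seq_A made-up example
MKTAYIAK
QR-LG
>toy_seq_B made-up example
MKTAYLAK
QRELG
"""

records = parse_fasta(TOY_FASTA)
seq_a = records["toy_seq_A made-up example"]
seq_b = records["toy_seq_B made-up example"]
identity = percent_identity(seq_a, seq_b)
print(f"{len(records)} records, identity = {identity:.1f}%")
```

In a real run, the text passed to parse_fasta would be the contents of an alignment exported in FASTA format, not the raw unaligned files saved in T1.4, because percent_identity expects sequences of equal length.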

Editor's Notes

  1. Welcome to this bioinformatics lab on data manipulation using online and server tools. As the theme, we have chosen to study the interaction between Frataxin and pancreatic cancer.
  2. During this lab, we have: a brief review, the lab's template, genome exploration practice…
  3. This is the lab template. The context is a biological context based on a real biological problem and a given hypothesis. I don't use "computer science"; that is a strong word. When you read this template, you have a different view than an informatician. You want to understand the process used to build the tools: the architecture of the system, the algorithm implementation, the quality of the resulting data, and so on.