
Bioinformatics Presentation


Published in: Business, Technology

  1. Stem Cells and Bioinformatics. Michelle Previtera, Zhenhong Bao
  2. Induced Pluripotent Stem Cell Lines Derived from Human Somatic Cells <ul><li>OCT4 </li></ul><ul><li>SOX2 </li></ul><ul><li>NANOG </li></ul><ul><li>LIN28 </li></ul>
  3. OCT4 <ul><li>Exercise 1: </li></ul><ul><ul><li>Do a PubMed search for OCT4 </li></ul></ul><ul><ul><li>Refine your search using the Limits tab to include only studies involving humans. </li></ul></ul><ul><ul><ul><li>What other tools do you see that are useful for refining your search? </li></ul></ul></ul><ul><ul><ul><li>What is the function of OCT4 in humans according to the published literature? </li></ul></ul></ul><ul><ul><ul><li>How does the number of articles found differ with and without the limit? </li></ul></ul></ul><ul><ul><li>Do a Google Scholar search for OCT4 </li></ul></ul><ul><ul><ul><li>Do the advanced search options match the options on the Limits tab in PubMed? </li></ul></ul></ul><ul><ul><ul><li>What can you do in Google Scholar to refine your results? </li></ul></ul></ul>
  4. Answer <ul><li>OCT4 is a transcription factor involved in maintaining pluripotency in ES cells </li></ul><ul><li>Number of articles found: </li></ul><ul><ul><li>Without the limit: 358 </li></ul></ul><ul><ul><li>With the limit: 154 </li></ul></ul><ul><li>Google Scholar and PubMed do not have the same refinement tools </li></ul><ul><li>In Google Scholar, use: </li></ul><ul><ul><li>the &quot;+&quot; operator, which makes sure your results include common words, letters or numbers that Google's search technology generally ignores </li></ul></ul><ul><ul><li>the &quot;-&quot; operator, which excludes all results that include this search term </li></ul></ul><ul><ul><li>phrase search, which only returns results that include this exact phrase </li></ul></ul><ul><ul><li>the &quot;OR&quot; operator, which returns results that include either of your search terms </li></ul></ul><ul><ul><li>the &quot;intitle:&quot; operator, which only returns results that include your search term in the document's title </li></ul></ul><ul><ul><li>From http://scholar.google.com/intl/en/scholar/refinesearch.html </li></ul></ul>
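The limited and unlimited PubMed searches from the OCT4 exercise can also be expressed as query URLs. The following is a minimal sketch, assuming the NCBI E-utilities `esearch` endpoint and assuming that the Limits tab's "Humans" option corresponds to the `humans[MeSH Terms]` filter. The helper name `pubmed_count_url` is hypothetical, and the sketch only builds the URLs; it does not contact NCBI, so it says nothing about the counts a search would return today.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities search service (assumed endpoint).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_count_url(term, humans_only=False):
    """Build an esearch URL that asks PubMed for the number of hits for `term`.

    Hypothetical helper: `humans_only` appends a MeSH filter that is assumed
    to mirror the "Humans" option of the Limits tab.
    """
    if humans_only:
        term = f"({term}) AND humans[MeSH Terms]"
    params = {"db": "pubmed", "term": term, "rettype": "count", "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


print(pubmed_count_url("OCT4"))
print(pubmed_count_url("OCT4", humans_only=True))
```

Opening the two URLs in a browser and comparing the reported counts is one way to repeat the with/without comparison from the slide.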
  5. SOX2 <ul><li>Using NCBI Entrez Gene </li></ul><ul><ul><li>What are 2 features that make SOX2 unique? </li></ul></ul><ul><ul><ul><li>HINT: Look in the summary paragraph in Entrez Gene </li></ul></ul></ul><ul><ul><li>What does SOX2 interact with, and what is the function of this interaction? </li></ul></ul><ul><ul><li>Is there a structure for SOX2 in CDD? </li></ul></ul><ul><ul><ul><li>If so, what is interesting about the structure when you open it in Cn3D? </li></ul></ul></ul><ul><ul><ul><ul><li>What is the structure aligned to? </li></ul></ul></ul></ul><ul><ul><ul><ul><li>What do the # marks mean on the feature 1 row? </li></ul></ul></ul></ul>
  6. Answer <ul><li>SOX2 is an intronless gene, and it lies within an intron of another gene called SOX2 overlapping transcript (SOX2OT) </li></ul><ul><li>SOX2 interacts with OCT4 to establish the first 3 lineages in ES cells. </li></ul><ul><li>The structure shows the DNA-binding site; its residues are the conserved ones, colored blue </li></ul><ul><li>The structure is aligned to Chain D, Crystal Structure of a POU/HMG/DNA Ternary Complex. </li></ul><ul><li>The '#' marks pinpoint features </li></ul>
  7. NANOG <ul><li>Illustration using the EMBOSS suite </li></ul><ul><li>Exercise: </li></ul><ul><ul><li>Obtain the mRNA and genomic sequences of human NANOG in FASTA format from NCBI Entrez. </li></ul></ul><ul><ul><li>According to Infoseq, what are the length, GC content, and accession number of the human NANOG mRNA? </li></ul></ul><ul><ul><li>Align the mRNA sequence to the genomic sequence using dottup and show the results. </li></ul></ul><ul><ul><li>Do a local alignment of the human mRNA sequence to the mouse sequence using Water. </li></ul></ul><ul><ul><li>Use getorf/plotorf to obtain the ORF of the human NANOG mRNA sequence. </li></ul></ul><ul><ul><li>Translate the mRNA sequence to a protein sequence with Transeq. </li></ul></ul><ul><ul><li>Show the hydropathy plot of the above protein sequence using Pepinfo. </li></ul></ul>
  8. NCBI Entrez <ul><li>Obtain sequence files from NCBI </li></ul>
  9. Infoseq <ul><li>Reports name, accession, type, GI, length, GC %, etc. </li></ul><ul><li>Accession No: NM_024865.2 </li></ul><ul><li>Length: 2098 nt </li></ul><ul><li>GC content: 45.28% </li></ul><ul><li>Description: Homo sapiens Nanog homeobox (NANOG), mRNA </li></ul>
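The length and GC % that slide 9 reads off Infoseq are simple to recompute by hand. The following is a minimal sketch, not Infoseq itself: the function names `parse_fasta` and `gc_percent` are hypothetical, and the record is a toy sequence rather than the NANOG mRNA.

```python
def parse_fasta(text):
    """Return (header, sequence) for the first record in a FASTA-formatted string."""
    header = None
    chunks = []
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                break  # stop at the start of a second record
            header = line[1:]
        elif line:
            chunks.append(line.upper())
    return header, "".join(chunks)


def gc_percent(seq):
    """Percentage of bases in `seq` that are G or C."""
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Toy record (hypothetical), standing in for a file downloaded from NCBI Entrez.
record = ">toy_record hypothetical example\nATGCGC\nATAT\n"
header, seq = parse_fasta(record)
print(header, len(seq), round(gc_percent(seq), 2))
```

Feeding the downloaded NANOG FASTA text to `parse_fasta` instead of the toy record should give values to compare against the slide.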
  10. dottup <ul><li>Dottup looks for exact word matches between two sequences and displays them as a dot plot </li></ul>[Dot plots: Word Size = 10; Word Size = 20]
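The effect of the word size on slide 10 can be illustrated with a small exact-match search. This is a sketch of the general idea only, not dottup's implementation; `word_matches` is a hypothetical name and the two sequences are toy examples.

```python
from collections import defaultdict


def word_matches(seq_a, seq_b, word_size):
    """Return (i, j) pairs where the word of length `word_size` starting at
    seq_a[i] is identical to the word starting at seq_b[j] (0-based)."""
    # Index every word of seq_b by its start position.
    index = defaultdict(list)
    for j in range(len(seq_b) - word_size + 1):
        index[seq_b[j:j + word_size]].append(j)

    # Look up each word of seq_a; each hit would be one dot in the plot.
    hits = []
    for i in range(len(seq_a) - word_size + 1):
        for j in index.get(seq_a[i:i + word_size], []):
            hits.append((i, j))
    return hits


a = "ACGTACGT"
b = "TTACGTAA"
print(word_matches(a, b, 4))  # shorter words: more dots
print(word_matches(a, b, 6))  # longer words: fewer dots
```

A larger word size demands a longer exact match before a dot is drawn, which is why the second plot on the slide is the sparser of the two.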
  11. Water <ul><li>Water performs a local alignment: it searches for regions of local similarity, and the alignment need not include the entire length of the sequences </li></ul><ul><li>Results for NANOG mRNA between Homo sapiens and Mus musculus </li></ul>
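Water is based on the Smith-Waterman local alignment algorithm. The following is a minimal sketch of that algorithm, assuming a simple linear gap penalty and made-up scoring values (`match`, `mismatch`, `gap`); it is a simplification and does not reproduce Water's scoring matrix or gap model.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, with a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; cells never drop below 0, which makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


print(smith_waterman("TTTACGTGGG", "CCACGTAA"))
```

Only the shared stretch is reported; the flanking bases that do not match are left out, which is the behaviour the slide describes.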
  12. Plotorf/getorf <ul><li>Find and plot potential open reading frames (ORFs). </li></ul><ul><li>An ORF in plotorf is defined as a region between a START codon and a STOP codon. </li></ul>
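The START-to-STOP definition on slide 12 can be sketched as a scan over the three forward reading frames. This is an illustration under stated assumptions, not the plotorf/getorf implementation: `find_orfs` is a hypothetical name, only the forward strand is scanned, ATG is the only start codon considered, and coordinates are reported 1-based with the stop codon excluded.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    start/end are 1-based and inclusive; the stop codon is not included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # 0-based position of the current ATG, if any
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start + 1, pos, frame + 1))
                start = None
    return orfs


# Toy sequence (hypothetical): ATG AAA TTT TAG sits in the third frame.
print(find_orfs("CCATGAAATTTTAGGG"))
```

Under this coordinate convention, a reported range of 217 to 1131 would span 915 nt, or 305 codons.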
  13. Transeq <ul><li>Transeq translates an mRNA sequence to a protein sequence. </li></ul><ul><li>The ORF is obtained from Plotorf/getorf. </li></ul><ul><li>Human NANOG mRNA ORF: 217 to 1131 </li></ul><ul><li>Translated protein sequence </li></ul>
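The translation step on slide 13 amounts to reading the ORF three bases at a time and looking each codon up in the genetic code. The following is a minimal sketch using the standard genetic code; `translate` is a hypothetical name, and rendering stop codons as `*` and unknown codons as `X` is a convention chosen for this example.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate a nucleotide sequence in frame 1; trailing partial codons are dropped."""
    seq = seq.upper().replace("U", "T")  # accept RNA as well as DNA letters
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[pos:pos + 3], "X"))
    return "".join(protein)


print(translate("ATGAAATTTTAG"))
```

To translate only the ORF from the slide, the mRNA would first be sliced to that range, for example `mrna[216:1131]` for the 1-based range 217 to 1131.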
  14. Pepinfo <ul><li>Pepinfo produces information on amino acid properties (size, polarity, aromaticity, charge, etc.), including hydropathy plots. </li></ul>
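A hydropathy plot like the one requested in the NANOG exercise is a sliding-window average of per-residue hydropathy values. The following is a minimal sketch, assuming the Kyte-Doolittle scale and a caller-chosen window size; `hydropathy_profile` is a hypothetical name, and the sketch is not claimed to match Pepinfo's exact scales or defaults.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Return the mean hydropathy of every full window along the protein.

    The window size is an assumption; residues outside the 20 standard
    amino acids raise a KeyError.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Toy peptide (hypothetical): a hydrophobic stretch followed by a charged one.
print([round(v, 2) for v in hydropathy_profile("IIIKKK", window=3)])
```

Plotting the returned list against window position gives the familiar curve, with peaks over hydrophobic stretches and troughs over charged ones.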
  15. LIN28 <ul><li>Illustration using BLAST </li></ul><ul><li>Exercise </li></ul><ul><ul><li>Obtain the human protein sequence of LIN28 from NCBI </li></ul></ul><ul><ul><li>Search a nucleotide database using a protein query (tblastn) </li></ul></ul><ul><ul><li>Search the Conserved Domain Database (CDD) for conserved domains </li></ul></ul><ul><ul><li>From the taxonomy report, find the best match for the mouse homolog </li></ul></ul><ul><ul><li>Compare the conserved domains from the human and mouse proteins </li></ul></ul>
  16. BLAST <ul><li>tblastn: searches a translated nucleotide database using a protein query </li></ul><ul><li>Search result: homologs of human LIN28 </li></ul>
  17. Conserved Domain Database (CDD) <ul><li>Result for Homo sapiens LIN28 </li></ul><ul><li>CSD, Cold-shock DNA-binding domain, 67 aa, 95.5% aligned </li></ul><ul><li>AIR1, Arginine methyltransferase-interacting protein, 190 aa, 34.7% aligned </li></ul>
  18. Taxonomy Report <ul><li>BLAST results are categorized by species </li></ul><ul><li>The best match in Mus musculus: </li></ul><ul><li>protein accession number: NP_665832, E value 2e-103 </li></ul>
  19. Conserved domains from the mouse protein <ul><li>Do the same CDD search with the mouse LIN28 protein </li></ul><ul><li>Mouse LIN28: CSD, 67 aa, 95.5% aligned; AIR1, 190 aa, 26.84% aligned </li></ul><ul><li>Human LIN28, for comparison: CSD, 67 aa, 95.5% aligned; AIR1, 190 aa, 34.7% aligned </li></ul>
  20. Literature <ul><li>Gene functions related to pluripotency. </li></ul><ul><li>OCT4 is required to maintain the undifferentiated stem cell state, and differentiation to trophectoderm occurs in its absence. </li></ul><ul><li>NANOG plays a crucial role in maintaining the pluripotent state of primate embryonic stem cells. </li></ul><ul><li>… </li></ul>
