Presentation materials from the October 2008 regular meeting of the open community "要求開発アライアンス" (Requirements Development Alliance, http://www.openthology.org).
1. The document analyzes differences between two regions, region B and region C, of the Makorin1-p1 gene in different mouse species.
2. A chi-square test finds significant differences in three pairs of species for the number of differences between region B and region C.
3. The Makorin1-p1 gene is likely a processed pseudogene containing a CpG island, and its region B may have evolved faster in the M. caroli lineage.
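The summary does not say how the chi-square test was set up. As a minimal sketch, assume each pair of species gives a 2x2 table: differing and identical sites in region B against differing and identical sites in region C. The function name and the counts below are hypothetical, and no continuity correction is applied.

```python
def chi_square_2x2(diff_b, same_b, diff_c, same_c):
    """Pearson chi-square statistic for a 2x2 table (no Yates correction).

    Rows: region B, region C. Columns: differing sites, identical sites.
    """
    n = diff_b + same_b + diff_c + same_c
    row_b = diff_b + same_b
    row_c = diff_c + same_c
    col_diff = diff_b + diff_c
    col_same = same_b + same_c
    if 0 in (row_b, row_c, col_diff, col_same):
        raise ValueError("every row and column total must be non-zero")
    cross = diff_b * same_c - same_b * diff_c
    return n * cross ** 2 / (row_b * row_c * col_diff * col_same)


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_005 = 3.841

if __name__ == "__main__":
    # Hypothetical counts, not taken from the presentation.
    statistic = chi_square_2x2(10, 90, 30, 70)
    print(statistic, statistic > CRITICAL_VALUE_DF1_ALPHA_005)
```

A statistic above 3.841 would indicate that the proportion of differing sites is not the same in the two regions at the 5% level, under the assumptions stated above.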
1) The document discusses using the Biostrings package in R to analyze DNA sequences of the Makorin1-p1 pseudogene from different Mus musculus subspecies, calculating the p-distance between sequences in regions B and C.
2) A phylogenetic tree shows the evolutionary relationships between different M. musculus subspecies and related species. Makorin1-p1 is found to have an ortholog in rats.
3) The methodology describes plans to use the Biostrings package to extract region B and region C sequences from a Makorin1-p1.fasta file for different M. musculus subspecies, and calculate p
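The p-distance mentioned above is the proportion of aligned sites at which two sequences differ. The summarized document uses the Biostrings package in R. The sketch below is a plain Python stand-in, not that code, and it assumes the two sequences are already aligned to the same length. Skipping sites with gaps or ambiguous bases is a choice made here for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned DNA sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base: not counted
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(p_distance("ACGTACGT", "ACGTTCGA"))
```

Applied separately to the region B and region C slices of each pair of sequences, this gives the two values that the summaries compare.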
The document describes analyzing nucleotide sequences of the rhodopsin gene from human, chimpanzee and macaque. Key steps include:
1) Obtaining rhodopsin coding sequences from NCBI and writing them to a FASTA file
2) Performing a multiple sequence alignment using ClustalW
3) Calculating the transition/transversion ratio and genetic distance between species based on the alignment
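Step 3 can be illustrated with a small sketch. A transition is a substitution between two purines (A, G) or between two pyrimidines (C, T); every other substitution is a transversion. The function name is hypothetical, the input is assumed to be two rows of an existing alignment, and the raw count ratio is used without any correction for multiple substitutions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguous base
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


if __name__ == "__main__":
    # Toy sequences for illustration only.
    ts, tv = count_ts_tv("AAGCTT", "GAGTTA")
    print(ts, tv, ts / tv if tv else float("inf"))
```

The ratio is undefined when there are no transversions, so the usage lines guard against division by zero.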
This document describes steps in a bioinformatics analysis pipeline that uses BLAST and CLUSTALW. The pipeline takes a protein sequence from KEGG identified by its ID (hsa:6469), runs BLAST to find similar sequences, parses the BLAST output, runs CLUSTALW for multiple sequence alignment, and outputs the results. Ruby scripts are used to implement each step via calling REST and SOAP services for BLAST and CLUSTALW.
1) The document discusses various bioinformatics databases and tools that can be accessed through REST and SOAP web services, including KEGG, Ensembl, and UniProt.
2) It provides examples of querying these resources to retrieve gene and protein sequence data, perform BLAST searches, and generate multiple sequence alignments using ClustalW.
3) The document also outlines examples of scripts that can be written using Ruby to programmatically access these web services to analyze gene sequences like Sonic Hedgehog.
1) The document discusses various web APIs including Google Chart API and TogoWS. Google Chart API can be used to generate charts and graphs from data. TogoWS is a RESTful web service that allows retrieval of biological data from databases like NCBI, Ensembl through HTTP requests.
2) Methods for using Google Chart API and TogoWS are described. Google Chart API uses URL parameters to specify chart properties, data, and format. TogoWS allows retrieval of database entries and searching through RESTful HTTP requests to its endpoints.
3) Examples show how to generate line charts and scatter plots using Google Chart API and retrieve sequence and annotation data for genes using TogoWS REST requests. Dotplot
1) The document discusses using Ensembl and BLAT to design PCR primers for the rat olfactory receptor gene Olr1082.
2) Ensembl provides the coding sequence and annotation for Olr1082, which has one exon and a 954bp coding sequence.
3) BLAT is used to align the Olr1082 coding sequence against the rat genome, confirming it is located on a single exon on the reverse DNA strand.
The document discusses various bioinformatics tools for sequence analysis, including BLAST, EMBOSS, and Dotplot. It provides instructions for downloading and installing BLAST and EMBOSS on Mac OS X, and describes how to use tools like BLASTN, BLASTP, and Dotmatcher in EMBOSS to compare DNA and protein sequences and generate dotplot alignments. Examples are given of using these tools to analyze makorin gene sequences from different species.
(1) The document provides information about NCBI and Ensembl databases and how to use their viewers to search for genes and sequences. It explains how to search by keywords, accession numbers, and organism names.
(2) It describes the main features of the viewers including accessing gene summaries, alignments, orthologs, paralogs, synteny, and gene trees. Screenshots are included to show example outputs.
(3) Details are given about the file formats used for sequences (FASTA) and alignments (MSF, NEXUS) as well as how to export and configure the data displayed in the viewers. Orthologs are distinguished from paralogs based on divergence time
This document discusses the history and development of DNA sequencing technologies from 1953 to the present. Early methods like Sanger sequencing allowed sequencing of around 6,000 base pairs per day in 1985, increasing to 600,000 bp per day by 2000. The goal is now 100 gigabases per hour by 2010. DNA sequence data is stored in databases like GenBank, EMBL, and DDBJ. Emerging fields like genomics, transcriptomics, proteomics, and metabolomics are using high-throughput sequencing and associated '-omics' approaches to study biological systems.
1) The document describes steps in a bioinformatics analysis pipeline that uses BLAST and CLUSTALW to perform sequence alignment and clustering on a query protein sequence from Homo sapiens (hsa:6469).
2) Each step is contained in a separate Ruby script (step10.rb through step60.rb) that retrieves data through REST/SOAP calls and passes output through text files.
3) The final step performs a multiple sequence alignment of BLAST hits passing the E-value threshold using CLUSTALW through a SOAP call and outputs the result.
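The E-value filtering between the BLAST step and the CLUSTALW step can be sketched as follows. The summarized pipeline uses Ruby scripts and does not state the BLAST output format. This Python sketch assumes the common 12-column tab-separated format (subject ID in column 2, E-value in column 11). The function name, the threshold, and the sample hit IDs are hypothetical.

```python
def filter_hits_by_evalue(tabular_text, max_evalue=1e-10):
    """Return unique subject IDs whose E-value is at or below max_evalue.

    Assumes 12 tab-separated columns per hit line:
    query, subject, %identity, length, mismatches, gap opens,
    q.start, q.end, s.start, s.end, E-value, bit score.
    """
    hit_ids = []
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a hit line in the assumed format
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue <= max_evalue and subject_id not in hit_ids:
            hit_ids.append(subject_id)
    return hit_ids


if __name__ == "__main__":
    # Made-up hit lines for illustration; only the query ID comes from the text.
    sample = "\n".join([
        "\t".join(["hsa:6469", "hitA", "98.5", "460", "7", "0", "1", "460", "1", "460", "0.0", "900"]),
        "\t".join(["hsa:6469", "hitB", "60.1", "200", "80", "2", "5", "204", "9", "208", "2e-50", "180"]),
        "\t".join(["hsa:6469", "hitC", "30.0", "50", "35", "1", "10", "59", "3", "52", "0.5", "25"]),
    ])
    print(filter_hits_by_evalue(sample))
```

The surviving IDs would then be used to fetch the sequences passed to the alignment step.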
This document discusses using Web APIs, specifically the Google Chart API, to generate charts and graphs. It provides:
1) An overview of what a Web API is and how it can be used to access functionality from another application over HTTP.
2) A description of the Google Chart API and how it can be used to generate charts as PNG images by specifying parameters in a URL.
3) Details on common parameters for the Google Chart API including how to specify chart size, data, type, and colors.
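A chart request of the kind described above is just a URL with query parameters. The sketch below builds one using the parameter names cht (chart type), chs (size), chd (data in text encoding) and chco (color). The function name and default values are hypothetical, and the endpoint is an assumption: the image chart service has since been deprecated by Google, so the URL may no longer return an image.

```python
from urllib.parse import urlencode

# Assumed endpoint of the (now deprecated) image chart service.
CHART_ENDPOINT = "https://chart.googleapis.com/chart"


def build_chart_url(values, width=300, height=150, chart_type="lc", color="0000FF"):
    """Build a chart URL from a list of numbers.

    cht  = chart type (lc is a line chart)
    chs  = size as WIDTHxHEIGHT in pixels
    chd  = data, here in text encoding: t:v1,v2,...
    chco = series color as a hex RGB string
    """
    params = {
        "cht": chart_type,
        "chs": f"{width}x{height}",
        "chd": "t:" + ",".join(str(v) for v in values),
        "chco": color,
    }
    # Keep ':' and ',' unescaped so the URL stays readable.
    return CHART_ENDPOINT + "?" + urlencode(params, safe=":,")


if __name__ == "__main__":
    print(build_chart_url([10, 40, 25]))
```

Because the whole request lives in the URL, the result can be pasted into a browser or placed in an image tag without any client library.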
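8. Research before whole-genome sequencing
In 1991, Linda B. Buck and Richard Axel (The Nobel Prize in Physiology or Medicine for 2004)
cloned olfactory receptor genes from the rat.
Because the receptors were expected to be G protein-coupled, they designed primers based on the
sequence characteristics of that family, ran PCR (a), and determined the nucleotide sequences of the resulting products (b).
They also confirmed that the genes are expressed in the organ responsible for olfaction (c). (18 loci)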
(a) Figure 2: PCR result  (b) Figure 4: Amino acid alignment  (c) Figure 3: Northern blot analysis
(from Buck & Axel, Cell 1991, 65:175-87.)
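From 1992 to 1998, building on this information about the rat olfactory receptor genes, olfactory receptor
genes were cloned one after another in other organisms, including mouse, human, catfish, and medaka.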