Homo sapiens
A Marvel
A machine:
that can reason about itself,
find ways to augment itself,
challenge nature – evolution,
change and shape its own destiny.
My excitement stems from this episode
The other day I wanted to know: how many
nucleotides (A, C, G, T) are there in the genome?
Back in the day, one would run to
libraries and scan through books.
Even before that, we would seek counsel from a
learned one. They used to vanish, taking their
knowledge with them. So we came up with the written form
and books. Then we came up with libraries, collections
of books that anyone can tap into.
We were not satisfied with the library we created, so we
invented the World Wide Web.
Disclaimer
My interest is in analyzing data
Bioinformatics is an interdisciplinary field that intersects with
biology, computer science, mathematics and statistics. It
concerns itself with the development and use of methods and
software tools for collecting and analyzing biological data.
– https://github.com/topics/bioinformatics
Broad Disclaimer
I am NOT the original author of anything you find in this
presentation. The figures and ideas all come from various sources, and I
have tried to provide the links. I make no claim or pretense about
the ownership of this material. This presentation is for the benefit
of students who are considering bioinformatics, to give them a
broad overview of the field and its promise, and to help them understand how
technology can help find cures for many ailments.
So I Binged it – my augmented memory
Time to seek more knowledge
It just does not make sense!!!
Why is it divided into 24 linear molecules?
What is a chromosome and why are there 24 chromosomes?
22 duplicate pairs, X and X in females, X and Y in males
And how do chromosomes relate to those 24 linear molecules?
How is a chromosome related to DNA?
Why are there pairs of chromosomes?
What is a cell? What is a cell nucleus?
What is an individual mitochondrion?
Protein-coding DNA genes and non-coding DNA?
(Why do they call them "DNA genes"?)
Time to seek more knowledge
What is a genome? Why is it a human genome? Can there be a pig genome?
How is the human nuclear genome different from the human genome?
What is a nucleotide? Why is there a sequence?
What is a mitochondrion, and what is its genome?
Why do scientists call them genomes?
Wiki Summary Table
There is no such thing as enough, especially
when it comes to knowledge
https://www.genome.gov/25520925/2007-national-dna-day-online-chatroom-transcript/
Choosing whether or not to have genetic testing is a personal decision. For example, a person who has a family history of early onset breast cancer
may wish to have genetic testing to learn whether he or she has inherited a mutation in the BRCA1 or BRCA2 gene that greatly increases their risk for
developing breast and other cancers so that they can take preventive actions.
More
April 25th – DNA DAY
Genomics
https://www.pbs.org/wnet/religionandethics/2013/08/09/january-25-2013-1000-dollar-genome/14569/
When researchers first mapped the human genome, it took almost 10 years and cost $3
billion. Today the process takes three weeks, and the price tag is rapidly approaching $1,000.
What is genomics?
The branch of molecular biology concerned with the structure, function,
evolution, and mapping of genomes. All the genetic material in a person is
referred to as their genome.
https://www.cbsnews.com/news/could-gene-therapy-cure-sickle-cell-anemia-60-minutes/
https://www.cbsnews.com/video/the-chairman-aclu-genetic-revolution/ Sickle cell Anemia
https://www.cbsnews.com/video/woman-feels-no-pain-due-to-rare-gene-mutation-researchers-say
https://www.amazon.com/The-Evolution-of-Us/dp/B0788TJT52
http://bioinformaticsalgorithms.com/videos.htm
Why, What and How Learning cycle
From the sickle cell episode on CBS, we learn that
there are 7,000 genetic diseases.
From the hair loss gene story, we learn that the
process involves a lot of data collection, storage,
and association/statistical analysis.
So we seek to learn what the science of genes is
…
This is not a substitute for a proper course in the
Bio departments...
Body-cell-chromosome-DNA
Cells
https://rarediseases.info.nih.gov/files/glossary/english/cell_lg.jpg
A cell is the basic building block of living things.
All cells can be sorted into one of two groups: eukaryotes and prokaryotes.
A eukaryote has a nucleus and membrane-bound organelles, while a prokaryote does not.
Plants and animals are made of numerous eukaryotic cells, while many microbes, such as
bacteria, consist of single cells.
An adult human body is estimated to contain between 10 and 100 trillion cells.
Chromosome
https://rarediseases.info.nih.gov/files/glossary/english/chromosome_lg.jpg
We have heard about the telomere in the Evolution of US II
Chromosomes
Chromosomes are the structures in our cells that are made up of DNA. Our
chromosomes are contained in the nucleus of the trillions of cells that make up our
body. We have 23 pairs of chromosomes (46 total). For every pair of chromosomes,
one chromosome comes from our mother and one from our father. One pair
consists of the sex chromosomes, called the X and Y chromosomes. These two
chromosomes determine if we will be male or female. Females have two X
chromosomes and males have an X and a Y. Except for genes carried on the sex
chromosomes, we inherit two copies of every gene, one from each parent.
The diagram below shows how our
bodies are made of trillions of cells that
contain DNA packaged in chromosomes.
A chromosome is an organized package of
DNA found in the nucleus of the cell.
Different organisms have different numbers
of chromosomes.
Humans have 23 pairs of chromosomes:
22 pairs of numbered chromosomes,
called autosomes,
and one pair of sex chromosomes, X and Y.
Each parent contributes one chromosome to each pair so that offspring get half of
their chromosomes from their mother and half from their father.
DNA structure and function
• Undisturbed, DNA forms a double helix of 2
intertwined chains with complementary
sequences of nucleotides (A<->T, G<->C)
• Bases are attached to a sugar-phosphate backbone;
a base pair = 2 nucleotides on opposite
strands connected by hydrogen bonds (A only
with T, and G only with C)
• Major function of DNA = encode the sequence of
amino acids in proteins
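The pairing rule above (A only with T, G only with C) can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics library: the names `COMPLEMENT`, `complement`, and `reverse_complement` are chosen here for the example, and the input is assumed to contain only the letters A, C, G, and T.

```python
# Watson-Crick pairing from the slide: A<->T, G<->C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the opposite strand as it would be read in its own direction."""
    return complement(strand)[::-1]


example = "ATGC"
print(complement(example))          # TACG
print(reverse_complement(example))  # GCAT
```

Because the two strands of the helix run in opposite directions, the reverse complement is usually the more useful form when comparing sequences.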
BIO for CSCI
3/31/19 17
Genomes and chromosomes
• The genome of an organism is the complete DNA sequence of its full set of
chromosomes
• Looks like a very long word in a 4-letter alphabet
• Chromosome = single, very long, continuous piece of DNA and proteins
• Assembly of transcription units that are precisely duplicated in each cell
generation
• Species-specific in number, size, and shape
• The number of chromosomes an organism has is not a measure of its
complexity (amphibians have more than people, and
tulips have 10 times as much DNA in their cells)
• How does DNA wind?
http://www.youtube.com/watch?v=gefKBQ81ngE
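Since a genome "looks like a very long word in a 4-letter alphabet", the opening question (how many nucleotides are there?) becomes a string-counting problem. The sketch below uses a short toy string; `nucleotide_counts` is a name made up for this example, and a real genome would be read from a file rather than typed in.

```python
from collections import Counter


def nucleotide_counts(sequence: str) -> dict:
    """Count how often each of the four letters A, C, G, T occurs."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


toy_genome = "GATTACA"  # toy sequence, not real genomic data
counts = nucleotide_counts(toy_genome)
print(counts)                # {'A': 3, 'C': 1, 'G': 1, 'T': 2}
print(sum(counts.values()))  # 7 nucleotides in total
```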
DNA
What is DNA?
DNA (deoxyribonucleic acid) is
the basic material of heredity. It
is a chemical made of four kinds
of building blocks called bases.
The four types of bases are:
adenine (A), cytosine (C),
guanine (G), and thymine (T).
These four bases are strung
together in different combinations
or sequences. DNA sequences
are mostly the same in everyone.
In fact, about 99% of the DNA
sequence is identical in all
humans. The 1% of the DNA that
differs between people is what
makes us each unique. The
differences in the DNA between
people are called variants.
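To make the idea of variants concrete, here is a small sketch that compares a sample sequence against a reference, position by position. It assumes both sequences have the same length and differ only by substitutions; real genomes need an alignment step first. The function names and the two toy sequences are invented for this example.

```python
def find_variants(reference: str, sample: str) -> list:
    """List (position, reference_base, sample_base) wherever the two differ."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def percent_identity(reference: str, sample: str) -> float:
    """Percentage of positions where the two sequences carry the same base."""
    differences = len(find_variants(reference, sample))
    return 100.0 * (len(reference) - differences) / len(reference)


reference = "ACGTACGTAC"  # toy data
sample = "ACGTTCGTAC"     # one substitution at position 4
print(find_variants(reference, sample))     # [(4, 'A', 'T')]
print(percent_identity(reference, sample))  # 90.0
```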
Human Genome Project
~ 3 billion base pairs
~30,000 genes (an early estimate; other slides in this deck use about 20,000 to 23,000)
Genes
Genes are segments of DNA that we inherit from our parents and pass on to our
children. Most genes tell our body how to make products called proteins. Proteins
are components of our body that perform a specific function so that our bodies work
properly. We have about 23,000 genes that instruct our body on everything from how
our heart will form and beat, to what color our eyes will be.
You can think of a gene as a book that contains instructions for how to create a
specific protein. The order of the bases in DNA is like the letters that make up a
sentence in a book. If there is a “typo” or a change in a letter in a word, it can alter
the meaning of the sentence. For example, changing the letter “T” in the word tag to
the letter “G” would result in the word gag, changing our understanding of the
sentence. Similarly, a variant or mutation in a gene can alter the instructions a gene
provides to the body, causing the body not to make a working protein. Such variants
may lead to problems with a person’s growth and development, or cause disease.
These variants are called mutations.
Gene and Codons
• Gene = hereditary unit of DNA
• Other DNA segments have structural purposes, or are involved in regulating the
expression of genetic information
• Non-coding sequences (portions for which no function has yet been identified,
about 97% in people)
• Each 3-letter nucleotide sequence forms a codon
• Since there are 4 letters, there are 4^3 = 64 possible codons
• 3 stop codons, and 1 start codon
• The non-stop triples (redundantly) represent 20 amino acids
• Example: Proline = CCC, CCT, CCA, and CCG
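The codon arithmetic above can be checked by enumerating every 3-letter word over the 4-letter alphabet. The sketch below assumes the standard genetic code written in DNA letters (stop codons TAA, TAG, TGA; start codon ATG); the variable names are chosen for this example only.

```python
from itertools import product

BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet
START_CODON = "ATG"                  # also encodes methionine
PROLINE_CODONS = {"CCC", "CCT", "CCA", "CCG"}

# Every possible triple of bases: 4 * 4 * 4 = 64 codons.
all_codons = ["".join(triple) for triple in product(BASES, repeat=3)]
coding_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

print(len(all_codons))     # 64
print(len(coding_codons))  # 61 codons shared among 20 amino acids

# Redundancy: all four codons that begin with "CC" stand for proline.
print({codon for codon in all_codons if codon.startswith("CC")} == PROLINE_CODONS)
```

With 61 coding triples and only 20 amino acids, most amino acids are represented by more than one codon, as the proline example shows.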
Hair Loss Gene
Hair loss is no laughing matter...
Billion dollar industry
And nothing of significance can be
achieved without passion or
dedicated and dogged effort.
https://magazine.columbia.edu/article/radical-solutions-
Hair Loss Voyage
https://magazine.columbia.edu/article/radical-solutions-baldness
20 years of research – a lifetime of dedication
The tale of AA – alopecia areata
Overcoming rejections/objections
Story of persistence and resolute effort.
Treasure trove of knowledge
Let us pause and take stock
The human body has an estimated thirty-seven trillion cells.
Almost every cell has our entire set of genes, about 20 thousand, in its nucleus.
Genes are the basic unit of heredity in all life forms.
Genes are discrete segments of DNA. The entire set of DNA in a cell,
genes included, is called the genome: the complete set of chemical instructions
for the organism.
DNA molecules are shaped like a twisted ladder and coiled into threadlike
bundles called chromosomes.
Genes encode messages for specific proteins to build and run an organism.
Genes determine everything about the organism.
The Human Genome Project (the sequencing of the three billion subunits,
or base pairs, of the DNA molecule) gave geneticists a powerful new tool
for identifying genetic factors in complex, multi-gene diseases.
Collaborating with two Pakistani genetic researchers and experimenting on mice,
Christiano found a gene that controls the growth of human hair.
Gene Registry
Christiano got a substantial National Institutes of Health grant to create a DNA registry
in the US for alopecia areata — the first step toward identifying the genes that cause
that disease. That meant finding people from extended families in which multiple
members had severe hair loss, as well as finding thousands of individual patients
whose relatives were not affected. “All were valuable to the gene-hunting efforts,”
Christiano says. The registry was compiled by Christiano at Columbia and four other
institutions over a period of years.
How did Christiano do it?
ULBP3: But one thing leads to another...
The team zeroed in on chromosome six, where they detected what Christiano calls
“the smoking gun for AA” — a gene called ULBP3. “This was the gene that landed us
in the hair follicle,” Christiano says.
Genome/Exome
All the genetic material in a person is referred to as their genome. Whole genome
sequencing is a type of genetic test that looks at the sequence of all of the DNA in a
person’s body. The portion of the DNA that is needed for a person’s body to make
proteins is referred to as that person’s exome. The vast majority of known genetic
diseases are caused by mutations in the exome. Whole exome sequencing only looks at
the portion of DNA that is used to make proteins. Whole genome sequencing and whole
exome sequencing may also be referred to as just genome or exome sequencing. Using
exome and genome sequencing without the term “whole” may be more appropriate, since
these tests do not usually look at a person’s whole sequence.
Genome or exome sequencing is usually done to try to figure out the cause of a condition
that seems to be genetic. It is often performed when there are many possible genes that
could be causing a patient’s condition. It might also be performed when standard genetic
testing has not identified the cause or a diagnosis. Samples from a person’s family
members, especially parents, may also be requested when a person has exome or
genome testing. This allows the laboratory to compare the DNA sequences of the patient
and family members in order to better understand which genetic variants are inherited.
Because genetic variants are often inherited from an individual’s parents, sequencing may
identify information that is relevant to family members. For example it could identify a
chance that they or their child may have or develop a certain medical condition.
Classifying the Mutation
1. “Negative for a disease-associated mutation” means that no mutation was found in any
gene known to be involved or suspected to be involved with the condition. This result does
not exclude a possible genetic cause because not all genetic changes are found by
current testing.
2. “Positive for a disease-associated mutation” means that a mutation was found in a gene
that is known to be associated OR suspected to be associated with the condition.
Disease-Associated Mutation (Known Disease Gene): A disease-causing mutation was
found in a gene known to be associated with the condition. Such a result may lead to
changes in recommendations for medical management and treatment. Testing other family
members as well as prenatal testing may be available to determine if other individuals are
affected.
Suspected Disease-Associated Mutation (Uncertain Disease Gene): A mutation was
identified in a gene that does NOT have a well known association with the condition. The
gene may be suspected to be associated with the condition based on the gene’s function
in the body or how it interacts with other genes. In such cases, it is not certain if the
mutation found is the cause of the person’s condition. Additional information may become
available in the future to clarify what this result means.
https://pediseq.research.chop.edu/discover/for-parents/results.html#Primary
WGS (whole genome sequencing)
uncovering the genetic and molecular basis of human
language through analysis of whole genome sequencing
(WGS) data
the effect of genetic variation on the development and
function of the brain, particularly in the context of
neurodevelopmental conditions such as autism and
language impairment.
whole genome sequencing, RNA-seq, ChIP-seq, etc.
https://pediseq.research.chop.edu/discover/for-parents/introduction-to-genetic-sequencing.html
Findings: Primary vs Incidental
Incidental Findings
Incidental findings are genetic variants or results unrelated to the
condition that prompted sequencing. Incidental findings could tell
us about other symptoms or diseases that a person may have or
might develop, how a person might respond to certain medications,
or the potential risks of having a child with a specific genetic
disease. One example of an incidental finding would be finding a
variant related to diabetes in someone who had exome sequencing
to identify the cause of her hearing loss.
Primary Findings
There are three main types of results related to the condition
that prompted sequencing (“primary findings”): Negative,
Positive and Variants of Uncertain Significance.
https://pediseq.research.chop.edu/discover/for-parents/results.html#Primary
https://rarediseases.org/for-patients-and-families/information-resources/rare-disease-information/
Mutations are natural
Looking for mutations that may be the cause ….
April 25th – DNA DAY
Mutations
Without mutations, there would be no evolution or variation
within a species
– humans would not have evolved from the ancestors we share with chimpanzees and other apes
– on the other hand, some mutations are NOT compatible with
life or healthy living: they cause chronic and terminal disease
Types of Mutations
Deletion
Duplication
Inversion
Insertion
Translocation
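The five mutation types listed above can be illustrated as simple string operations. This is a simplified sketch: the function names are invented for the example, the segments are labelled with letters A to F (as in textbook chromosome diagrams) rather than real bases, and an inversion here only reverses the segment order.

```python
def deletion(seq: str, start: int, length: int) -> str:
    """Remove a segment from the sequence."""
    return seq[:start] + seq[start + length:]


def duplication(seq: str, start: int, length: int) -> str:
    """Repeat a segment immediately after itself."""
    segment = seq[start:start + length]
    return seq[:start + length] + segment + seq[start + length:]


def inversion(seq: str, start: int, length: int) -> str:
    """Reverse the order of a segment in place."""
    segment = seq[start:start + length]
    return seq[:start] + segment[::-1] + seq[start + length:]


def insertion(seq: str, position: int, fragment: str) -> str:
    """Add new material at a position."""
    return seq[:position] + fragment + seq[position:]


def translocation(donor: str, recipient: str, start: int, length: int) -> tuple:
    """Move a segment from one chromosome to the end of another."""
    segment = donor[start:start + length]
    return deletion(donor, start, length), recipient + segment


chromosome = "ABCDEF"
print(deletion(chromosome, 2, 2))              # ABEF
print(duplication(chromosome, 2, 2))           # ABCDCDEF
print(inversion(chromosome, 2, 2))             # ABDCEF
print(insertion(chromosome, 2, "XY"))          # ABXYCDEF
print(translocation(chromosome, "UVW", 2, 2))  # ('ABEF', 'UVWCD')
```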
How do mutations occur?
Spontaneous
Induced
Opportunities to be a real life Hunter!
As of 2007 – we had not found a way to correct a mutant gene....
By 2019 – some 7,000 diseases have been found to be genetic
And scientists have found a way to cure sickle cell disease....
http://www.youtube.com/watch?v=D3fOXt4MrOM
http://www.youtube.com/watch?v=4jtmOZaIvS0
References
• https://rarediseases.info.nih.gov/diseases
• https://magazine.columbia.edu/article/radical-solutions-baldness
• https://pediseq.research.chop.edu/discover/for-parents/results.html#Primary
• https://rarediseases.org/for-patients-and-families/information-resources/rare-disease-information
• https://www.edinformatics.com/math_science/human-genome.html
• https://www.edx.org/course/case-studies-functional-genomics-harvardx-ph525-7x-0
• https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3656720/#R1
• http://www.edinformatics.com/math_science/gluten-and-celiac-disease/genetics-of-celiac-disease.html
(inheritance pattern is unknown)
• https://www.genome.gov/25520925/2007-national-dna-day-online-chatroom-transcript
• https://links.bioinformatics.ca/
More URLs
https://drive.google.com/file/d/1r8vVlVxYzkaLV9G3rSM3z9YfPJKWbG0-/view
https://drive.google.com/file/d/1XZghbyTCmyc5xmo7twj_LjvDP4BMhgxA/view
http://icgc.org/icgc/media International Cancer Genome Consortium
https://cancercollaboratory.org/
https://dcc.icgc.org/ Cancer datasets
Genome-Wide Association Study
• A study of all or most of the genes of different individuals
to see how much the genes vary from individual to
individual. Different variations are then associated with
different traits and diseases.
• Single nucleotide polymorphisms (SNPs) among the 3 billion base
pairs of the human genome
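The association step can be sketched for a single SNP by comparing how often an allele appears in people with a disease (cases) and without it (controls). This is a toy illustration under stated assumptions: the genotypes are made up, the function names are invented, and an odds ratio stands in for the statistical tests and multiple-testing corrections that a real genome-wide study applies across many SNPs.

```python
def allele_counts(genotypes: list, allele: str) -> tuple:
    """Return (copies of the allele, copies of any other allele)."""
    with_allele = sum(genotype.count(allele) for genotype in genotypes)
    total_alleles = 2 * len(genotypes)  # two copies per person
    return with_allele, total_alleles - with_allele


def odds_ratio(cases: list, controls: list, allele: str) -> float:
    """Odds of carrying the allele in cases relative to controls."""
    a, b = allele_counts(cases, allele)
    c, d = allele_counts(controls, allele)
    # Note: this sketch does not handle zero counts (division by zero).
    return (a * d) / (b * c)


# Toy genotypes at one SNP position; each string is one person's two alleles.
cases = ["AA", "AG", "AG", "GG"]
controls = ["GG", "GG", "AG", "GG"]

print(allele_counts(cases, "A"))         # (4, 4)
print(allele_counts(controls, "A"))      # (1, 7)
print(odds_ratio(cases, controls, "A"))  # 7.0
```

An odds ratio well above 1 suggests the allele is more common among cases, which is the kind of signal that points researchers toward a region of the genome.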
Sickle cell anemia: a haemoglobin Glu->Val mutation causes clumping
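The Glu->Val change can be traced back to a single letter. In the standard genetic code GAG encodes glutamic acid (Glu) and GTG encodes valine (Val), and the sickle cell mutation is commonly described as this A-to-T substitution in the beta-globin gene. The sketch below uses a two-entry codon table and a hypothetical helper, `point_mutation`, purely for illustration.

```python
# Only the two codons needed here, taken from the standard genetic code.
CODON_TO_AMINO_ACID = {"GAG": "Glu", "GTG": "Val"}


def point_mutation(codon: str, position: int, new_base: str) -> str:
    """Replace the base at one position of a codon."""
    return codon[:position] + new_base + codon[position + 1:]


normal_codon = "GAG"
mutant_codon = point_mutation(normal_codon, 1, "T")  # change the middle base

print(normal_codon, "->", CODON_TO_AMINO_ACID[normal_codon])  # GAG -> Glu
print(mutant_codon, "->", CODON_TO_AMINO_ACID[mutant_codon])  # GTG -> Val
```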
Cancer genome
Comparative Genomics
• Besides human, more than 500
genomes have been sequenced, and ~1,000
genomes are being sequenced.
• Addressing important problems
systematically
- unique genes in human?
- suitable anti-infectious drug
targets in pathogens?
• Humans and chimps share 99.9%
of genes and 98.5% of the genome.
Encyclopedia of DNA Elements (ENCODE)
• 97% of the genome is non-coding DNA. Is it junk?
• 90% of disease mutations are not in the coding region.
• At least 20% of non-coding DNA is functional.
Genetic Circuit:
Social Network of Gene Products
[Figure: DNA sequence ACTGTCGAACTGGACTTCAGGTTGACTCGGAACGT carrying genes A to E; levels: DNA, Gene, Protein, Circuit; labels: Component, Structure, Function, Data Integration]
Biology and Medicine
in the Context of Genetic Circuit
Classification
Genetic circuit as
a basic functional entity
Bacteria evolution
Isn’t This a CSCI Course?
• It is a grand challenge to handle biological data
- Highly complex – multi-scale, heterogeneous
- Huge amount – millions of data points/day from a single lab
• Impact of big data everywhere
• Career opportunities
• …….
Fundamental underlying questions
• How are different kinds of information related in the world?
• How can we model that information usefully?
• How can we provide that information efficiently and accurately?
CSCI for BIO

A voyage-inward-02

  • 1.
    Homo Sapien A Marvel Amachine: that can reason about itself, find ways to augment itself, challenge nature – evolution, change and shape its own destiny.
  • 2.
    My Excitement stemsfrom this episode The other day I wanted to know how many nucleotides (ACGT) are there in genome? Back in the day, one would run to Libraries and scan through books. Even before that we would seek counsel from a learned one. They used to vanish taking their knowledge with them. So we came up with written form and books. Then we came up with libraries collection of books that any one can tap into. We were not satisfied with that Library we created, we invented the world wide web.
  • 3.
    Disclaimer My interest isin analyzing data Bioinformatics is an interdisciplinary field that intersects with biology, computer science, mathematics and statistics. It concerns itself with the development and use of methods and software tools for collecting and analyzing biological data. – https://github.com/topics/bioinformatics Broad Disclaimer I am NOT the original author of anything you find in this presentation. Figures, ideas all come from various sources and I have tried to provide the link. I have no claim or pretense about the ownership of this material. This presentation is for the benefit of students who are considering Bioinformatics to give them a broad overview of the field, its promise and understand how technolgy can help find cure to many ailments.
  • 4.
    So I bingedit – My augmented memory
  • 5.
    Time to seekmore knowledge It just does not make sense!!! Why is it divided into 24 linear molecules? What is a chromosome and why are there 24 chromosomes? 22 duplicate pairs, X and X in females, X and Y in males And how do chromosomes relate to those 24 linear molecules? What is or how is a chromosome related to DNA? Why are there pairs of chromosomes? What is a cell? What is a cell nuclei? What is an individual mitochondria? Protein-coding DNA genes and non-encoding DNA ? (why do they call it DNAGenes?)
  • 6.
    Time to seekmore knowledge What is a genome? Why is it a human genome?Can there be pig genome? How is a human nuclear genome different from human genome? What is nucleotide? Why is there a sequence? What is mitochondria and its genome? Why do scientists call them genome?
  • 7.
  • 8.
    There is nosuchthing as enough: especially when it comes to knowledge https://www.genome.gov/25520925/2007-national-dna-day-online-chatroom-transcript/ Choosing to have or not to have genetic testing is a personal one. For example, a person who has a family history of early onset breast cancer may wish to have genetic testing to learn whether he or she has inherited a BRCA1 or BRCA2 gene that greatly increases their risk for developing breast and other cancers so that they can take preventive actions.
  • 9.
  • 10.
  • 11.
    Genomics https://www.pbs.org/wnet/religionandethics/2013/08/09/january-25-2013-1000-dollar-genome/14569/ When researchers firstmapped the human genome, it took almost 10 years and cost $3 billion. Today the process takes three weeks, and the price tag is rapidly approaching $1,000. What is genomics? The branch of molecular biology concerned with the structure, function, evolution, and mapping of genomes. All the genetic material in a person is referred to as their genome. https://www.cbsnews.com/news/could-gene-therapy-cure-sickle-cell-anemia-60-minutes/ https://www.cbsnews.com/video/the-chairman-aclu-genetic-revolution/ Sickle cell Anemia https://www.cbsnews.com/video/woman-feels-no-pain-due-to-rare-gene-mutation-researchers-say https://www.amazon.com/The-Evolution-of-Us/dp/B0788TJT52 http://bioinformaticsalgorithms.com/videos.htm
  • 12.
    Why, What andHow Learning cycle From the Sickle Cell Episode on CBS we learn there are 7000 genetic diseases From the HairLoss gene presented we learn the process involves lot of data collection, storage and association/statistical analysis So we seek to learn what is the science of genes … This is not a substitute for a proper course in the Bio departments...
  • 13.
  • 14.
    Cells https://rarediseases.info.nih.gov/files/glossary/english/cell_lg.jpg A cell isthe basic building block of living things. All cells can be sorted into one of two groups: eukaryotes and prokaryotes. A eukaryote has a nucleus and membrane-bound organelles, while a prokaryote does not. Plants and animals are made of numerous eukaryotic cells, while many microbes, such as bacteria, consist of single cells. An adult human body is estimated to contain between 10 and 100 trillion cells.
  • 15.
  • 16.
    Chromosomes Chromosomes are thestructures in our cells that are made up of DNA. Our chromosomes are contained in the nucleus of the millions of cells that make up our body. We have 23 pairs of chromosomes (46 total). For every pair of chromosomes, one chromosome comes from our mother and one from our father. One pair of chromosomes is the sex chromosomes called the X and Y chromosomes. These two chromosomes determine if we will be male or female. Females have two X chromosomes and males have an X and a Y. Except for genes carried on the sex chromosomes, we inherit two copies of every gene, one from each parent. The diagram below shows how our bodies are made of millions of cells that contain DNA packaged in chromosomes. A chromosome is an organized package of DNA found in the nucleus of the cell. Different organisms have different numbers of chromosomes. Humans have 23 pairs of chromosomes-- 22 pairs of numbered chromosomes, called autosomes, and one pair of sex chromosomes, X and Y. Each parent contributes one chromosome to each pair so that offspring get half of their chromosomes from their mother and half from their father.
  • 17.
    DNA structure andfunction  Undisturbed, DNA forms a double helix of 2 intertwined chains with complementary sequences of nucleotides (A<->T, G<->C)  To a sugar-phosphate backbone are attached base pairs = 2 nucleotides on opposite strands connected by hydrogen bonds (A only with T, and G only with C)  Major function of DNA = encode sequence of amino acids in proteins BIO for CSCI 3/31/19 17
  • 18.
    Genomes and chromosomes Genome of an organism is its complete DNA sequence of a full set of chromosomes for an organism  Looks like a very long word on a 4-letter alphabet  Chromosome = single, very long, continuous piece of DNA and proteins  Assembly of transcription units that are precisely duplicated in each cell generation  Species specific in number, size, and shape  How many chromosomes organism has is not a measure of the complexity of the organism (amphibians have more than people, and tulips have 10 times as much DNA in their cells)  How DNA wind? 3/31/19 18 http://www.youtube.com/watch?v=gefKBQ81ngE
  • 19.
    DNA What is DNA? DNA(deoxyribonucleic acid) is the basic material of heredity. It is a chemical made of four kinds of building blocks called bases. The four types of bases are: adenine (A), cytosine (C), guanine (G), and thymidine (T). These four bases are strung together in different combinations or sequences. DNA sequences are mostly the same in everyone. In fact, about 99% of the DNA sequence is identical in all humans. The 1% of the DNA that differs between people is what makes us each unique. The differences in the DNA between people are called variants. Human Genome Project ~ 3 billion base pairs ~30,000 genes
  • 20.
    Genes Genes are segmentsof DNA that we inherit from our parents and pass on to our children. Most genes tell our body how to make products called proteins. Proteins are components of our body that perform a specific function so that our bodies work properly. We have about 23,000 genes that instruct our body on everything from how our heart will form and beat, to what color our eyes will be. You can think of a gene as a book that contains instructions for how to create a specific protein. The order of the bases in DNA is like the letters that make up a sentence in a book. If there is a “typo” or a change in a letter in a word, it can alter the meaning of the sentence. For example, changing the letter “T” in the word tag to the letter “G” would result in the word gag, changing our understanding of the sentence. Similarly, a variant or mutation in a gene can alter the instructions a gene provides to the body, causing the body not to make a working protein. Such variants may lead to problems with a person’s growth and development, or cause disease. These variants are called mutations.
  • 21.
    Gene and Codons Gene= hereditary unit of DNA  Other DNA segments have structural purposes, or are involved in regulating the expression of genetic information  non-coding sequences (portions for which no function has yet been identified, about 97% in people)  Each 3-letter nucleotide sequence forms a codon  Since there 4 letters, there are 43 = 64 possible codons  3 stop codons, and 1 start codon  The other triples (redundantly) represent 20 amino acids  Example: Proline = CCC, CCT, CCA, and CCG BIO for CSCI 3/31/19 21
  • 22.
    Hair Loss Gene Hairlossis no laughing matter... Billion dollar industry And nothing of significance can be achieved without passion or dedicated and dogged effort. https://magazine.columbia.edu/article/radical-solutions-
  • 23.
  • 24.
    The tale ofAA – alopecia areata
  • 25.
    Overcoming rejections/objections Story ofpersistence and resolute effort.
  • 26.
  • 27.
    Let us pauseand take stock The human body has an estimated thirty seven trillion cells. Each cell has our entire set of genes, about 20 thousand, in the nuclei. Genes are the basic unit of heredity in all life forms. Genes are discrete segments of DNA. Entire chain of Genes in a DNA in a cell is called the genome –the complete set of chemical instructions for the organism. DNAs are molecules shaped like a twisted ladder, coiled in a thread like bundles called chromosomes. Genes encode messages for specific proteins to build and run an organism. Genes determine everything about the organism. The Human Genome Project — the sequencing of the three billion subunits, or base pairs, of the DNA molecule — gave geneticists a powerful new tool for identifying genetic factors in complex, multi-gene diseases. Colloborating with two Pakistani genetic researchers and experiments on mouse Christiana found the gene that controls the growth of human hair.
  • 28.
    Gene Registry Christiano gota substantial National Institutes of Health grant to create a DNA registry in the US for alopecia areata — the first step toward identifying the genes that cause that disease. That meant finding people from extended families in which multiple members had severe hair loss, as well as finding thousands of individual patients whose relatives were not affected. “All were valuable to the gene-hunting efforts,” Christiano says. The registry was compiled by Christiano at Columbia and four other institutions over a period of years.
  • 29.
  • 30.
    ULBP3:But one thingleads to another... The team zeroed in on chromosome six, where they detected what Christiano calls “the smoking gun for AA” — a gene called ULBP3. “This was the gene that landed us in the hair follicle,” Christiano says.
  • 31.
    Genome/Exome All the geneticmaterial in a person is referred to as their genome. Whole genome sequencing is a type of genetic test that looks at the sequence of all of the DNA in a person’s body. The portion of the DNA that is needed for a person’s body to make proteins is referred to as that person’s exome. The vast majority of known genetic diseases are caused by mutations in the exome. Whole exome sequencing only looks at the portion of DNA that is used to make proteins. Whole genome sequencing and whole exome sequencing may also be referred to as just genome or exome sequencing. Using exome and genome sequencing without the term “whole” may be more appropriate, since these tests do not usually look at a person’s whole sequence. Genome or exome sequencing is usually done to try to figure out the cause of a condition that seems to be genetic. It is often performed when there are many possible genes that could be causing a patient’s condition. It might also be performed when standard genetic testing has not identified the cause or a diagnosis. Samples from a person’s family members, especially parents, may also be requested when a person has exome or genome testing. This allows the laboratory to compare the DNA sequences of the patient and family members in order to better understand which genetic variants are inherited. Because genetic variants are often inherited from an individual’s parents, sequencing may identify information that is relevant to family members. For example it could identify a chance that they or their child may have or develop a certain medical condition.
  • 32.
    Classifying the Mutation
    1. “Negative for a disease-associated mutation” means that no mutation was found in any gene known to be involved or suspected to be involved with the condition. This result does not exclude a possible genetic cause because not all genetic changes are found by current testing.
    2. “Positive for a disease-associated mutation” means that a mutation was found in a gene that is known to be associated OR suspected to be associated with the condition.
      • Disease-Associated Mutation (Known Disease Gene): A disease-causing mutation was found in a gene known to be associated with the condition. Such a result may lead to changes in recommendations for medical management and treatment. Testing other family members as well as prenatal testing may be available to determine if other individuals are affected.
      • Suspected Disease-Associated Mutation (Uncertain Disease Gene): A mutation was identified in a gene that does NOT have a well-known association with the condition. The gene may be suspected to be associated with the condition based on the gene’s function in the body or how it interacts with other genes. In such cases, it is not certain if the mutation found is the cause of the person’s condition. Additional information may become available in the future to clarify what this result means.
    https://pediseq.research.chop.edu/discover/for-parents/results.html#Primary
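    The decision logic described above can be written down as a small Python function. This is a sketch of the wording on this slide only, not a clinical classifier; the function name classify_primary_result and the gene names GENE_A and GENE_B are placeholders invented for the example.

```python
from typing import Optional, Set


def classify_primary_result(
    mutated_gene: Optional[str],
    known_disease_genes: Set[str],
    suspected_genes: Set[str],
) -> str:
    """Map a test outcome onto the result categories described on the slide."""
    if mutated_gene is None:
        # No mutation found in any relevant gene.
        return "Negative for a disease-associated mutation"
    if mutated_gene in known_disease_genes:
        return "Positive: disease-associated mutation (known disease gene)"
    if mutated_gene in suspected_genes:
        return "Positive: suspected disease-associated mutation (uncertain disease gene)"
    # A mutation in a gene that is neither known nor suspected to be involved
    # does not count as a positive result for this condition.
    return "Negative for a disease-associated mutation"


# Placeholder gene names, assumed for illustration only.
known = {"GENE_A"}
suspected = {"GENE_B"}

print(classify_primary_result(None, known, suspected))
print(classify_primary_result("GENE_A", known, suspected))
print(classify_primary_result("GENE_B", known, suspected))
```

    Note that a negative result here only means nothing was found; as the slide says, it does not exclude a genetic cause.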
  • 33.
    WGS (whole genome sequencing)
      • Uncovering the genetic and molecular basis of human language through analysis of whole genome sequencing (WGS) data
      • The effect of genetic variation on the development and function of the brain, particularly in the context of neurodevelopmental conditions such as autism and language impairment
      • Whole genome sequencing, RNA-seq, ChIP-seq, etc.
    https://pediseq.research.chop.edu/discover/for-parents/introduction-to-genetic-sequencing.html
  • 34.
    Findings: Primary vs Incidental
    Incidental Findings
    Incidental findings are genetic variants or results unrelated to the condition that prompted sequencing. Incidental findings could tell us about other symptoms or diseases that a person may have or might develop, how a person might respond to certain medications, or the potential risks of having a child with a specific genetic disease. One example of an incidental finding would be finding a variant related to diabetes in someone who had exome sequencing to identify the cause of her hearing loss.
  • 35.
    Primary Findings
    There are three main types of results related to the condition that prompted sequencing (“primary findings”): Negative, Positive and Variants of Uncertain Significance.
    https://pediseq.research.chop.edu/discover/for-parents/results.html#Primary
    https://rarediseases.org/for-patients-and-families/information-resources/rare-disease-information/
  • 36.
    Mutations are natural
    Looking for mutations that may be the cause ...
  • 37.
  • 38.
  • 39.
    Mutations
    Without mutations there would be no evolution and no variation within a species: Homo sapiens would never have diverged from the ancestors we share with chimpanzees and other apes. On the other hand, some mutations are NOT compatible with life or with healthy living, and lead to chronic or terminal disease.
  • 40.
  • 41.
  • 42.
    How do mutations occur?
      • Spontaneous
      • Induced
  • 43.
    Opportunities to be a real-life Hunter!
    As of 2007, we had not found a way to correct a mutant gene.
    By 2019, some 7,000 diseases had been found to be genetic, and scientists had found a way to cure sickle cell disease.
  • 44.
  • 45.
    References
      • https://rarediseases.info.nih.gov/diseases
      • https://magazine.columbia.edu/article/radical-solutions-baldness
      • https://pediseq.research.chop.edu/discover/for-parents/results.html#Primary
      • https://rarediseases.org/for-patients-and-families/information-resources/rare-disease-information
      • https://www.edinformatics.com/math_science/human-genome.html
      • https://www.edx.org/course/case-studies-functional-genomics-harvardx-ph525-7x-0
      • https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3656720/#R1
      • http://www.edinformatics.com/math_science/gluten-and-celiac-disease/genetics-of-celiac-disease.html (inheritance pattern is unknown)
      • https://www.genome.gov/25520925/2007-national-dna-day-online-chatroom-transcript
      • https://links.bioinformatics.ca/
  • 46.
  • 47.
    Genome-Wide Association Study
    A study of all or most of the genes of different individuals to see how much the genes vary from individual to individual. Different variations are then associated with different traits and diseases.
      • Single Nucleotide Polymorphisms (SNPs) in the 3 billion base pairs of the human genome
      • Haemoglobin Glu->Val mutation causes clumping: sickle cell anemia
      • Cancer genome
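    To make "associating a variation with a disease" concrete, here is a minimal Python sketch of one common way to test a single SNP: a chi-square test on a 2x2 table of allele counts in cases versus controls. The slide does not say which test is meant, so this choice is an assumption, and the counts below are invented for illustration. A real study repeats such a test for a very large number of SNPs and must correct for that.

```python
import math
from typing import Tuple


def allelic_chi_square(
    case_alt: int, case_ref: int, control_alt: int, control_ref: int
) -> Tuple[float, float]:
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi-square statistic, p-value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts: alternate vs reference alleles in cases and in controls.
chi2, p_value = allelic_chi_square(
    case_alt=60, case_ref=140, control_alt=30, control_ref=170
)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.2e}")
```

    With these made-up counts the alternate allele is twice as frequent in cases (30%) as in controls (15%), which is why the statistic comes out large.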
  • 48.
    Comparative Genomics
      • Besides human, more than 500 genomes have been sequenced, and ~1,000 more are being sequenced.
      • Addressing important problems systematically:
          - unique genes in human?
          - suitable anti-infectious drug targets in pathogens?
      • Human and chimp share 99.9% of genes and 98.5% of the genome.
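    A figure such as "share 98.5% of the genome" is, at its simplest, a percent identity between aligned sequences. The Python sketch below shows that calculation on two short, already-aligned toy strings. The function name percent_identity, the use of "-" as a gap character and the example sequences are assumptions made for this illustration; real genome comparisons first require alignment and handle far more cases.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of aligned, non-gap positions where the two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must already be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences differing at one of ten positions.
identity = percent_identity("ACTGTCGAAC", "ACTGTCGTAC")
print(f"{identity:.1f}% identical")
```

    Whether gaps are skipped or counted as mismatches changes the number, which is one reason published similarity figures for the same two species can differ.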
  • 49.
    Encyclopedia of DNA Elements (ENCODE)
      • 97% of the genome is non-coding DNA. Is it junk?
      • 90% of disease mutations are not in the coding region.
      • At least 20% of non-coding DNA is functional.
    3/31/19
  • 50.
    Genetic Circuit: Social Network of Gene Products
    [Figure: the DNA sequence ACTGTCGAACTGGACTTCAGGTTGACTCGGAACGT annotated with gene A to gene E; levels labeled DNA, Gene, Protein, Circuit; columns labeled Component, Structure, Function; nodes A to E joined under "Data Integration"]
  • 51.
    Biology and Medicine in the Context of Genetic Circuit
      • Classification
      • Genetic circuit as a basic functional entity
      • Bacteria evolution
  • 52.
    Isn’t This a CSCI Course?
      • It is a grand challenge to handle biological data
      • Highly complex: multi-scale, heterogeneous
      • Huge amount: millions of data points per day from a single lab
      • Impact of big data everywhere
      • Career opportunities
      • ...
    BIO for CSCI
  • 53.
    Fundamental underlying questions
      • How are different kinds of information related in the world?
      • How can we model that information usefully?
      • How can we provide that information efficiently and accurately?
    CSCI for BIO