2. What is Data?
• Data is a collection of facts, such as
values or measurements.
• It can be numbers, words, measurements,
observations or even just descriptions of
things.
3. General
• DATA - Numbers or other identifiers
derived from observation, experiment or
calculation
• INFORMATION - a collection of data and
associated explanations, interpretations,
and other material concerning a particular
object, event or process
5. Primary Data
• Data that has been collected from first-hand
experience is known as primary
data.
• Primary data has not been published yet
and is generally considered more reliable,
authentic and objective. Because it has not
been changed or altered by others,
its validity is usually greater than that of
secondary data.
6. Sources of Primary Data
• Experiments
• Survey
– Questionnaire
– Interview
– Observations
7. Secondary Data
• Data collected from a source that has
already been published in any form is
called secondary data.
• The review of literature in any research is
based on secondary data, mostly from
books, journals and periodicals.
8. Sources of Secondary Data
• Published Printed Sources:
• Books
• Journals/periodicals
• Magazines/Newspapers
• e-journals
• General websites
9. Sources of data
• Sequencing programs
• Molecular studies
• Elucidation of metabolic pathways / Cellular mechanisms
• Cytological studies
• Clinical studies
• Physiological studies
• Mutational experiments
• Data from simulation
10. Qualitative vs Quantitative
• Data can be qualitative or quantitative.
• Qualitative data is descriptive information (it
describes something)
• Quantitative data is numerical information
(numbers).
11. • Quantitative data can also be
discrete or continuous:
• Discrete data can only take certain values
(like whole numbers)
• Continuous data can take any value
(within a range)
12. Qualitative:
He is brown and black
He has long hair
He has lots of energy
Quantitative:
Discrete:
He has 4 legs
He has 2 brothers
Continuous:
He weighs 25.5 kg
He is 565 mm tall
Discrete data is counted; continuous data is measured.
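The distinction above can be sketched in a few lines of Python. This is only an illustration: the `classify` helper and the field names in `dog` are hypothetical, and the sketch assumes that the stored type reflects how the value was obtained (text for descriptions, integers for counts, floats for measurements). A measured value that happens to be stored as a whole number, such as a height of 565 mm, would be misread as discrete unless it is stored as a float.

```python
from numbers import Integral, Real


def classify(value):
    """Return 'qualitative', 'discrete' or 'continuous' for a single value.

    Assumption: strings are descriptions, integers are counts and
    other real numbers are measurements.
    """
    # bool is a subclass of int in Python, so handle it before Integral
    if isinstance(value, (bool, str)):
        return "qualitative"
    if isinstance(value, Integral):
        return "discrete"
    if isinstance(value, Real):
        return "continuous"
    raise TypeError(f"cannot classify value of type {type(value).__name__}")


# The dog from the slide, with hypothetical field names
dog = {
    "colour": "brown and black",
    "hair": "long",
    "legs": 4,
    "brothers": 2,
    "weight_kg": 25.5,
    "height_mm": 565.0,  # measured, so stored as a float
}

kinds = {name: classify(value) for name, value in dog.items()}

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

Counting legs or brothers yields integers; weighing or measuring yields values that could fall anywhere within a range.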
14. General characteristics
• Diversity – biological data are intrinsically complex and are organized
in loose hierarchies that reflect our understanding of
complex living systems, ranging from genes and proteins,
to protein-protein interactions, biochemical pathways and
regulatory networks, to cells and tissues, organisms and
populations
• Variability – different individuals and species vary
tremendously, so biological data naturally varies as well
15. • Non-reproducibility – Representations of
the same data by different biologists will
most likely be different (even when using
the same system)
• Redundancy – Lack of uniqueness
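Redundancy can be illustrated with a small sketch. It assumes, purely for illustration, that each record is an (identifier, sequence) pair and that two records are redundant when their sequences match after trimming whitespace and ignoring case. The `find_redundant` helper, the identifiers and the sequences are all made up; real databases use more elaborate criteria.

```python
from collections import defaultdict


def find_redundant(records):
    """Group record identifiers by normalised sequence.

    Returns only the sequences that appear under more than one identifier.
    """
    by_sequence = defaultdict(list)
    for record_id, sequence in records:
        # Normalise so that "atggcc" and "ATGGCC " count as the same sequence
        key = sequence.strip().upper()
        by_sequence[key].append(record_id)
    return {seq: ids for seq, ids in by_sequence.items() if len(ids) > 1}


# Hypothetical records: rec1 and rec2 hold the same sequence
records = [
    ("rec1", "ATGGCC"),
    ("rec2", "atggcc"),
    ("rec3", "ATGTTT"),
]

redundant = find_redundant(records)
print(redundant)
```

Here the same sequence is stored under two identifiers, which is the lack of uniqueness the slide refers to.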
16. The importance of biological data
• Data has become an essential
commodity for biological research.
• Ten years ago, if a medical researcher
needed to find a gene involved in a certain
disease, he or she might have needed to
invest 3 years of laboratory work.
• Today, thanks to genomic information
stored in large public databases, the same
task may take less than 30 minutes.