R programming for Data Science
Part-1
www.dataspoof.info
Steps to do any Data Science project
• Identify the problem (question)
• Collect & prepare the data
• Explore the data
• Communicate the results
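The four steps above can be sketched in a few lines of R. This is a minimal illustration, not something taken from the slides: the question ("do heavier cars use more fuel?") and the choice of the built-in `mtcars` dataset are assumptions made for the example.

```r
# 1. Identify the problem (assumed question): do heavier cars use more fuel?

# 2. Collect and prepare the data: mtcars ships with base R (datasets package).
cars <- mtcars[, c("mpg", "wt")]
cars <- cars[complete.cases(cars), ]   # keep only rows without missing values

# 3. Explore the data.
print(summary(cars))
mpg_wt_cor <- cor(cars$mpg, cars$wt)   # Pearson correlation

# 4. Communicate the results.
cat(sprintf("Correlation between weight and mpg: %.2f\n", mpg_wt_cor))
```

A negative correlation here would suggest that heavier cars tend to travel fewer miles per gallon, which is the kind of statement the final step reports.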
Data Collection
Data collection is the process of gathering information from a
specific source, which can be used to answer relevant questions
and evaluate outcomes.
Data can help us in:
• learning more about customers, items, products, etc.
• discovering trends in the current system, organization, etc.
• segmenting elements into different groups based on
their individual needs.
• making decisions that improve the quality of the
system.
• improving the quality of the product or service based
on the feedback obtained.
Data Sources
Data Format
Define Data
Data is a set of facts such as numbers, words, measurements, observations or descriptions
of things.
There are two types of data:
• Qualitative data: descriptive information (describes something).
• Quantitative data: numerical information (numbers).
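In R, the distinction roughly corresponds to character (or factor) vectors versus numeric vectors. The sketch below uses made-up values; the variable names `eye_colour` and `height_cm` are hypothetical and do not come from the slides.

```r
# Hypothetical survey answers, invented for illustration.
eye_colour <- c("blue", "black", "honey")   # qualitative: describes something
height_cm  <- c(172.5, 160.0, 181.2)        # quantitative: numbers

is.character(eye_colour)   # TRUE
is.numeric(height_cm)      # TRUE

# Arithmetic is meaningful for quantitative data only.
mean_height <- mean(height_cm)
```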
Qualitative vs Quantitative
Types of Data Values
►Numeric:
•Discrete - integer values. Example: number of cars in a car park.
•Continuous - any value in a pre-defined range (float, double). Example: average mark (e.g., 63.4).
►Categorical: values are selected from a predefined set of categories.
•Ordinal - categories can be meaningfully ordered. Example: grades (A, B, C, D, E, F).
•Nominal - categories have no inherent order. Example: eye colours (blue, black, honey, etc.).
•Binary - a special case of nominal, with only 2 possible categories. Example: binary value (1, 0).
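One way to represent each of these value types in R is sketched below. The mapping (integer, double, ordered factor, factor, logical) is a common convention rather than something the slides prescribe, and all values are invented for illustration.

```r
cars_in_park <- 12L                                  # discrete: integer
average_mark <- 63.4                                 # continuous: double

# Ordinal: an ordered factor, with levels listed from lowest to highest.
grades <- factor(c("B", "A", "F", "C"),
                 levels = c("F", "E", "D", "C", "B", "A"),
                 ordered = TRUE)

# Nominal: a plain factor, no ordering between levels.
eye_colour <- factor(c("blue", "black", "honey"))

# Binary: a logical vector, convertible to 1/0.
passed <- c(TRUE, FALSE, TRUE)

typeof(cars_in_park)     # "integer"
typeof(average_mark)     # "double"
is.ordered(grades)       # TRUE
grades[1] > grades[4]    # TRUE: B ranks above C
is.ordered(eye_colour)   # FALSE
as.integer(passed)       # 1 0 1
```

Comparisons such as `>` work only on ordered factors; R returns `NA` with a warning if they are applied to a nominal factor.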
Types of Data Values (continued)
►Date: datetime, timestamp. Example: 11.10.2018.
►Text: multidimensional data.
►Time series: data points indexed in time order.
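Dates and time series both have dedicated base R classes. The sketch below assumes the example date 11.10.2018 is written as day.month.year (the slides do not say which); the monthly `sales` figures are hypothetical.

```r
# Parse the example date, assuming day.month.year order.
d <- as.Date("11.10.2018", format = "%d.%m.%Y")
class(d)                   # "Date"
format(d, "%Y-%m-%d")      # "2018-10-11"

# Hypothetical monthly figures, indexed in time order from January 2018.
sales <- ts(c(10, 12, 9, 15, 14, 18), start = c(2018, 1), frequency = 12)

# Select March and April 2018 by time index rather than by position.
spring <- window(sales, start = c(2018, 3), end = c(2018, 4))
```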
Types of Data Category
There are two main categories:
• Experimental data: data collected from strictly controlled, designed experiments, with efforts
made to ensure statistical validity.
Examples: medical clinical trials, election polls.
• Observational data: data collected from "real-world" settings, without control over the
underlying phenomena being captured. It is easier to collect, but results and conclusions drawn
from such data may be biased or inconclusive.
Examples: almost all data used in data mining, business analytics and data science is
observational.
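One thing that typically separates a designed experiment from observational data is that the experimenter decides who receives which treatment, often by random assignment. The sketch below illustrates that idea only; the subject IDs and group names are hypothetical and the slides do not describe a specific design.

```r
set.seed(42)   # only to make the sketch reproducible

# Hypothetical participant IDs.
subjects <- paste0("subject_", 1:10)

# Randomly assign exactly five subjects to each group.
group <- sample(rep(c("treatment", "control"), each = 5))

trial <- data.frame(subject = subjects, group = group)
print(table(trial$group))
```

With observational data there is no such assignment step: the groups are whatever happened in the real world, which is why conclusions drawn from it may be biased.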
Various data types are:
• Numbers
• Strings
• Relational data
• Factors or categorical variables
• Dates and times
• Description
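Several of these types can sit side by side as columns of an R data frame, and two data frames sharing a key give a small example of relational data. The tables and column names below (`customers`, `orders`, `segment`, and so on) are hypothetical, invented for illustration.

```r
# Hypothetical customer table mixing several column types.
customers <- data.frame(
  id      = c(1L, 2L, 3L),                                         # numbers
  name    = c("Asha", "Ben", "Chen"),                              # strings
  segment = factor(c("retail", "wholesale", "retail")),            # factor
  joined  = as.Date(c("2018-10-11", "2019-01-05", "2019-03-20")),  # dates
  stringsAsFactors = FALSE
)

# Hypothetical orders table that refers to customers by id.
orders <- data.frame(customer_id = c(1L, 1L, 3L),
                     amount      = c(20.5, 13.0, 7.25))

# Inspect the class of every column.
print(sapply(customers, class))

# Relational data: join the two tables on the shared key.
merged <- merge(customers, orders, by.x = "id", by.y = "customer_id")
```

`merge()` performs an inner join by default, so the customer without any orders does not appear in `merged`.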
