Data Wrangling
Dr. Ferdin Joe John Joseph
Faculty of Information Technology
Thai – Nichi Institute of Technology, Bangkok
Today’s Lesson
• Data Wrangling – Course at a Glance
• Data Wrangling or Cleaning
• Introduction to Python
• Jupyter Notebook
• Python Libraries
• Install Packages using pip
• Demonstration
Faculty of Information Technology, Thai - Nichi Institute of
Technology, Bangkok
DSA 201 – A Road Map
• Attendance (10%)
• Mid Exam (30%)
• Assignments and Presentations (20%)
• Final Exam (40%)
Why Data Wrangling?
• Data Quality Issues
• Bad Data
• Incorrect Analysis
• Invalid Insights
• Wrong Decisions
• Poor Outcomes
• Loss in Revenue
Data Wrangling - Definition
• Data wrangling is the process of converting raw data into data that
can be analyzed to generate valid, actionable insights.
Analogy
Data Wrangling – Other names
• Data Preprocessing
• Data Preparation
• Data Cleansing
• Data Scrubbing
• Data Munging
• Data Transformation
• Data Fold, Spindle, Mutilate
Steps in Data Wrangling
• Import Data
• Merge data sets and Rebuild missing data
• Standardize and Normalize
• Deduplicate, Verify and Enrich
• Export Data and Insights
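The steps above can be sketched in a few lines of pandas. This is only a minimal illustration, not the course's own demonstration: the two small tables, the column names (id, name, city, amount), the mean-fill strategy for missing values, and the min-max scaling are all assumptions chosen for the example.

```python
import io
import pandas as pd

# Hypothetical raw data, inlined so the sketch is self-contained.
customers_csv = io.StringIO(
    "id,name,city\n"
    "1,Alice,bangkok\n"
    "2,Bob,Chiang Mai\n"
    "2,Bob,Chiang Mai\n"
    "3,Carol,BANGKOK\n"
)
orders_csv = io.StringIO(
    "id,amount\n"
    "1,100\n"
    "2,\n"
    "3,300\n"
)

# 1. Import data
customers = pd.read_csv(customers_csv)
orders = pd.read_csv(orders_csv)

# 2. Merge data sets and rebuild missing data
#    (assumption: fill a missing amount with the column mean)
df = customers.merge(orders, on="id", how="left")
df["amount"] = df["amount"].fillna(df["amount"].mean())

# 3. Standardize text and normalize numbers
#    (assumption: title-case city names, min-max scale amounts to [0, 1])
df["city"] = df["city"].str.title()
amount_min, amount_max = df["amount"].min(), df["amount"].max()
df["amount_scaled"] = (df["amount"] - amount_min) / (amount_max - amount_min)

# 4. Deduplicate, verify and enrich
df = df.drop_duplicates().reset_index(drop=True)
assert df["id"].is_unique, "ids should be unique after deduplication"
df["is_bangkok"] = df["city"] == "Bangkok"

# 5. Export data (to_csv returns a string when no path is given)
csv_text = df.to_csv(index=False)
print(csv_text)
```

In practice each step depends on the data set: a different fill strategy or scaling method may be more appropriate than the ones assumed here.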
How are we going to wrangle data?
Introduction to Python
• Open Source
• Object-oriented
• Preferred language for data science
Install Miniconda
• Miniconda needs to be installed on laptops
• https://docs.conda.io/en/latest
• For offline practice
Preferred tool for practice
• Google Colab Notebooks
• An internet connection is required to use it
Most Common Libraries
• Pandas
• NumPy
• Matplotlib
• scikit-learn
Install packages using pip
• Numerous packages are available
• Not all packages are installed by default
• Packages can be installed using pip
• Syntax: pip install <library name>
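Before running pip, it can help to check which packages are already present. The sketch below uses only the Python standard library; the helper name is_installed and the package list are illustrative assumptions, not part of the course material. In a notebook cell, the pip command is typically run with a leading "!" (for example, !pip install pandas).

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


# Map each pip package name to the name used in an import statement.
# Note: scikit-learn is installed as "scikit-learn" but imported as "sklearn".
packages = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
}

for pip_name, module_name in packages.items():
    if is_installed(module_name):
        print(f"{pip_name}: already installed")
    else:
        print(f"{pip_name}: missing, run: pip install {pip_name}")
```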
Demonstration
• Get familiar with Colab
Lesson for Next Week
• Basics of Python
• Data Types
• Variables
• Operators and Operands
• Lists
• Demonstration
