This document outlines the details of an introductory data science course, including its mission, vision, core values, and schedule. It introduces data science and related fields such as data mining and analytics. It discusses the data science process and common job roles. Finally, it provides an overview of data science skills in high demand and lists several resources for data, tools, and learning.
1. MISSION
CHRIST is a nurturing ground for an individual's holistic development to make effective contribution to the society in a dynamic environment.

VISION
Excellence and Service

CORE VALUES
Faith in God | Moral Uprightness
Love of Fellow Beings
Social Responsibility | Pursuit of Excellence

MTH341C - PRINCIPLES OF DATA SCIENCE
Week 1: 18 to 23 July 2022
Department of Data Science and Statistics, CHRIST (DEEMED TO BE UNIVERSITY), BANGALORE, KARNATAKA, INDIA

Introduction to Data Science
Dr. UMME SALMA M
Assistant Professor
Ummesalma.m@christuniversity.in
2. Excellence and Service
CHRIST
Deemed to be University
Class Details
● Programme
○ MSC Mathematics
● Course
○ MTH341C
○ PRINCIPLES OF DATA SCIENCE
● Unit 1
○ Introduction To Data Science and Big Data
● Topic 1
○ Data Science Market
● Material
○ Online resources
8. Data Science Family
● Data Science is a broader field of science than mere data analysis.
● Data Mining is mainly about finding useful information in a dataset and utilizing that information to uncover hidden patterns.
● Data Analytics involves tools and techniques for producing analytics [information resulting from the systematic analysis of data or statistics] and overlaps closely with Data Mining.
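The "hidden patterns" idea behind data mining can be made concrete with a tiny market-basket example: counting which items are frequently bought together is the simplest form of association-rule mining. This is only an illustrative sketch; the transactions below are invented.

```python
# Toy association-pattern mining: count co-occurring item pairs
# across shopping baskets (all data invented for illustration).
from collections import Counter
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
]

pair_counts = Counter()
for basket in transactions:
    # count every unordered pair of items in the basket
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most frequent pair is the strongest "hidden pattern".
print(pair_counts.most_common(1))   # [(('bread', 'milk'), 3)]
```

Real association-rule miners add support and confidence thresholds on top of exactly this kind of counting.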
13. Data Science Steps
Step 1: The first step of this process is setting a research goal. The main purpose here is making sure all the stakeholders understand the what, how, and why of the project.
Step 2: The second phase is data retrieval. You want to have data available for analysis, so this step includes finding suitable data and getting access to the data from the data owner. The result is data in its raw form, which probably needs polishing and transformation before it becomes usable.
Step 3: Data transformation converts the raw form into a directly usable form. To achieve this, you'll detect and correct different kinds of errors in the data, combine data from different data sources, and transform it. Once you have successfully completed this step, you can progress to data visualization and modeling.
Step 4: Data exploration helps you gain a deep understanding of the data. You'll look for patterns, correlations, and deviations based on visual and descriptive techniques. The insights you gain from this phase will enable you to start modeling.
Step 5: Data modeling is the phase in which you attempt to gain the insights or make the predictions stated in your project charter. Now is the time to bring out the heavy guns, but remember: research has taught us that often (but not always) a combination of simple models tends to outperform one complicated model.
Step 6: Presentation and automation is all about presenting your results and automating the analysis, if needed.
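The six steps above can be sketched, very loosely, as a toy pipeline in plain Python. Everything here is invented for illustration (the records, field names, and the passing threshold); a real project would use libraries such as pandas, but the shape of the process is the same.

```python
# A minimal, hypothetical sketch of the data science process:
# Step 1 (the research goal) is this comment: find low-scoring students.

def retrieve_data():
    """Step 2: data retrieval -- here, a hard-coded 'raw' dataset."""
    return [
        {"name": " Alice ", "score": "82"},
        {"name": "Bob", "score": "57"},
        {"name": "Bob", "score": "57"},        # duplicate record
        {"name": "Carol", "score": "ninety"},  # unparseable value
    ]

def transform(raw):
    """Step 3: detect and correct errors, deduplicate, fix types."""
    cleaned, seen = [], set()
    for rec in raw:
        name = rec["name"].strip()
        try:
            score = int(rec["score"])
        except ValueError:
            continue  # drop records we cannot repair
        if name not in seen:
            seen.add(name)
            cleaned.append({"name": name, "score": score})
    return cleaned

def explore(data):
    """Step 4: simple descriptive statistics in place of visual EDA."""
    scores = [r["score"] for r in data]
    return {"n": len(scores), "mean": sum(scores) / len(scores)}

def model(data, threshold=60):
    """Step 5: a deliberately trivial 'model' -- flag low performers."""
    return [r["name"] for r in data if r["score"] < threshold]

# Step 6: present the results.
data = transform(retrieve_data())
print(explore(data))   # {'n': 2, 'mean': 69.5}
print(model(data))     # ['Bob']
```

Note how the threshold-based "model" is exactly the kind of simple model Step 5 recommends trying before anything complicated.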
14. Data Science Steps Outcome
Step 1 Outcome: A clear understanding of the goals of the research and its context.
A project charter requires teamwork, and your input covers at least the following:
■ A clear research goal
■ The project mission and context
■ How you're going to perform your analysis
■ What resources you expect to use
■ Proof that it's an achievable project, or proof of concepts
■ Deliverables and a measure of success
■ A timeline
Step 2 Outcome: Sometimes you need to go into the field and design a data collection process yourself, but most of the time you won't be involved in this step.
Step 3 Outcome: Getting access to data is another difficult task. Organizations understand the value and sensitivity of data and often have policies in place so everyone has access to what they need and nothing more. Don't be afraid to shop around.
15.
Step 4 Outcome: Cleansed data.
Data cleansing is a subprocess of the data science process that focuses on removing errors in your data so your data becomes a true and consistent representation of the processes it originates from. It also covers combining data from different data sources.
Step 5 Outcome: A working model based upon the requirement.
Step 6 Outcome: A deployed model.
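The "combining data from different data sources" part of the Step 4 outcome amounts to a join on a shared key. A minimal sketch, with invented record sets and an invented key `sid`:

```python
# Joining two hypothetical sources on a shared student id.
students = {"s1": {"name": "Asha"}, "s2": {"name": "Ravi"}}
marks = {"s1": 78, "s2": 91, "s3": 64}   # s3 has no matching student

# Keep only ids present in both sources (an inner join).
combined = {
    sid: {**info, "mark": marks[sid]}
    for sid, info in students.items()
    if sid in marks
}
print(combined)
# {'s1': {'name': 'Asha', 'mark': 78}, 's2': {'name': 'Ravi', 'mark': 91}}
```

Whether unmatched records (like `s3`) are dropped or kept with missing values is a cleansing decision you make explicitly, not an accident of the tooling.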
23.
Source: https://analyticsindiamag.com/why-you-may-not-be-getting-a-call-back-for-that-data-science-job/
25.
Source: https://www.gartner.com/smarterwithgartner/gartner-top-10-data-and-analytics-trends-for-2021/
26. Data Repositories
• Google Dataset Search
• Kaggle
• Data.Gov
• Datahub.io
• UCI Machine Learning Repository
• Earth Data
• CERN Open Data Portal
• Global Health Observatory Data Repository
• NCBI
• CERT
• NCRB
• Indiastat
27. Resources
● https://www.kdnuggets.com
● https://www.kaggle.com/
● https://www.analyticsvidhya.com/
● https://towardsdatascience.com
● https://machinelearningmastery.com/
● https://pydata.org/
● https://www.meetup.com/topics/data-science/
● arXiv; GitHub; MOOCs
28. THANK YOU
Next Topic: Unit 1: Chapter 1
Data Science in a Big Data World
Next session: Monday 12.00 PM