DATA SCIENCE
Introduction to Data Science
Presented by Prof. Priyanka Jadhav
Jens Martensson
1.1 What is Data Science, Importance of Data Science
1.2 Big Data and Data Science, the Current Scenario
1.3 Industry Perspective, Types of Data: Structured vs. Unstructured Data
1.4 Quantitative vs. Categorical Data
1.5 Big Data vs. Little Data, Data Science Process
1.6 Role of Data Scientist
What is Data Science
The study of data to extract meaningful insights
for business.
Data Science is a multidisciplinary field that
uses scientific methods, processes, algorithms,
and systems to extract knowledge and insights
from structured and unstructured data. It
combines aspects of mathematics, statistics,
computer science, and domain knowledge to
interpret data for decision-making and problem-
solving.
Importance of Data Science
Data science is important because it combines tools, methods, and
technology to generate meaning from data
Comparison
1. Structured data –
Structured data is data whose elements are addressable for effective analysis. It is organized into a
formatted repository, typically a database, and covers any data that can be stored in a SQL database as
tables with rows and columns. Such data has relational keys and maps easily into pre-designed fields.
Today it is the most commonly processed kind of data and the simplest to manage.
Example: Relational data.
2. Semi-structured data –
Semi-structured data is information that does not reside in a relational database but has some
organizational properties that make it easier to analyze. With some processing it can be stored in a
relational database (this can be very hard for some kinds of semi-structured data), but the
semi-structured form exists to save space. Example: XML data.
3. Unstructured data –
Unstructured data is data that is not organized in a predefined manner or does not have a predefined
data model, so it is not a good fit for a mainstream relational database. Alternative platforms exist for
storing and managing it. It is increasingly prevalent in IT systems and is used by organizations in a
variety of business intelligence and analytics applications. Example: Word, PDF, Text,
Media logs.
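The three data types above can be illustrated with a minimal Python sketch. The sample records, names, and field names below are hypothetical and exist only for illustration; only standard-library modules are used.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: fixed columns, every row has the same fields (hypothetical sample).
structured_csv = "id,name,age\n1,Asha,31\n2,Ravi,27\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: tags give some organization, but records may differ in shape.
# The second customer has no <email> element.
semi_structured_xml = """
<customers>
  <customer id="1"><name>Asha</name><email>asha@example.com</email></customer>
  <customer id="2"><name>Ravi</name></customer>
</customers>
""".strip()
root = ET.fromstring(semi_structured_xml)
customers = [
    {"id": c.get("id"), "name": c.findtext("name"), "email": c.findtext("email")}
    for c in root.findall("customer")
]

# Unstructured: free text with no predefined data model.
# Without further processing, only generic operations such as counting words apply.
unstructured_text = "Ravi called on Monday to complain about a late delivery."
word_count = len(unstructured_text.split())

print(rows[0]["name"])         # field access by column name
print(customers[1]["email"])   # missing element comes back as None
print(word_count)
```

The structured rows can be queried by column directly, the XML records need a check for missing fields, and the free text needs further processing before any field-level analysis is possible.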
Big Data:
The definition of big data is data that contains greater variety, arriving in increasing
volumes and with more velocity.
Big data is a term that describes large, hard-to-manage volumes of data – both
structured and unstructured – that inundate businesses on a day-to-day basis. But it's
not just the type or amount of data that's important, it's what organisations do with the
data that matters.
Big Data Examples to Know
Transportation: assisting with GPS navigation and with traffic and weather alerts.
Retail: tracking consumer behavior and shopping habits to deliver hyper-personalized
product recommendations tailored to individual customers.
Payments: monitoring payment patterns and analyzing them against historical customer
activity to detect fraud in real time.
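As a rough illustration of comparing a payment against historical customer activity, the sketch below flags an amount that lies far from the customer's past average. This is a simple z-score rule chosen for illustration, not a description of how any real fraud system works. The function name `is_suspicious`, the threshold of 3.0, and the sample amounts are all assumptions.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag a payment whose amount is far from the customer's historical mean.

    history   -- past payment amounts for one customer (hypothetical data)
    amount    -- the new payment to check
    threshold -- how many standard deviations count as unusual (assumed value)
    """
    if len(history) < 2:
        return False  # not enough history to estimate the spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # all past payments identical
    return abs(amount - mu) / sigma > threshold

history = [42.0, 38.5, 45.0, 40.0, 41.5]
print(is_suspicious(history, 43.0))   # close to past payments
print(is_suspicious(history, 900.0))  # far outside past payments
```

Production systems typically combine many more signals (location, merchant, time of day), so this sketch shows only the basic idea of scoring new activity against a historical baseline.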
Why is big data needed in the current scenario?
Large data sets are meant to be comprehensive and encompass as much
information as the organization needs to make better decisions. Big data
insights let business leaders quickly make data-driven decisions that
impact their organizations. They also provide better customer and market insights.
Quantitative variables are any variables where the data represent
amounts (e.g. height, weight, or age).
Categorical variables are any variables where the data represent
groups. This includes rankings (e.g. finishing places in a race),
classifications (e.g. brands of cereal), and binary outcomes (e.g. coin
flips).
You need to know what type of variables you are working with to
choose the right statistical test for your data and interpret
your results.
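The difference in how the two variable types are summarized can be shown with a short Python sketch. The sample values and variable names are hypothetical: amounts are averaged, while groups are counted.

```python
from collections import Counter
from statistics import mean

# Quantitative variable: the values are amounts, so arithmetic is meaningful.
heights_cm = [160.0, 172.5, 168.0, 181.0]
avg_height = mean(heights_cm)

# Categorical variable: the values are group labels, so we count them instead.
cereal_brands = ["Oats", "Flakes", "Oats", "Puffs"]
brand_counts = Counter(cereal_brands)
most_common_brand = brand_counts.most_common(1)[0][0]

print(avg_height)          # average of the amounts
print(brand_counts)        # frequency of each group
print(most_common_brand)   # the mode, the usual summary for categories
```

Averaging the brand labels would be meaningless, which is why the variable type has to be known before a summary or statistical test is chosen.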
Role of Data Scientist:
A data scientist is a tech professional
who collects, analyzes, and interprets vast
amounts of data using analytical, statistical,
and programming skills. They are responsible
for mining valuable information from various
sources and transforming it into actionable
insights that can drive business growth.
Thank
You

Unit 1 Introduction to DATA SCIENCE .pptx