DATA SCIENCE: TOOLS,
TECHNIQUES and APPLICATIONS
Dr. Meenakshi Srivastava
Dr. Ranjana Rajnish
Assistant Professor
Amity University
msrivastava@lko.amity.edu
What and Why?
• WHAT is Data Science?
• WHY is Data Science important?
• WHY are Data Scientists in high demand?
• WHY Data Science in academia?
Applications of Data Science:
Some Examples
I. HEALTHCARE
• Survival analysis
– Analyze survival statistics for different patient
attributes (age, blood type, gender, etc.) and
treatments.
• Medication (dosage) effectiveness
– Analyze the effects of administering different types and
dosages of medication for a disease.
• Re-admission risk
– Predict the risk of re-admission based on patient
attributes, medical history, diagnosis and treatment.
II. MARKETING
• Predicting Lifetime Value (LTV)
What for: if you can predict the
characteristics of high-LTV customers, you can
support customer segmentation, identify
upsell opportunities and support other
marketing initiatives.
• Demand Forecasting
III. LOGISTICS
• How many of which items will customers need, and
where will they need them?
(Enables lean inventory and prevents out-of-stock
situations.)
MOST IMPORTANT QUESTION
HOW DOES DATA SCIENCE DO ALL THIS?
What is Data Science?
Data Science is a broad umbrella term
for applying scientific methods, mathematics,
statistics, etc. to data sets in
order to extract KNOWLEDGE and
INSIGHT.
DATA SCIENCE: A MASH-UP OF
DISCIPLINES
• Another View
THE DATA SCIENCE UNICORN
• In medieval times, a Unicorn was a
rare and mythical creature with
great powers.
• In today’s world, a similar mythical
creature is the Data Science Unicorn,
who knows Technology, Data Science
and Business equally well.
• Such a professional is the most
valuable resource of any data
science team.
• Many data professionals are
experts in the first two areas,
technology and data science, but
lack business/domain skills.
You Are All OUR FUTURE UNICORNS
How To Become A Data Science
UNICORN ?
Data Science UNICORN: Do Whatever Is
Necessary To Extract Value from the Data
• Statistics: Take a sample (data) and answer questions about the process that
produced this sample.
– Is it a normal distribution? Estimate its mean.
• Machine Learning: Take a sample (data) and build a model to answer
questions about future samples.
– Given a sample of named faces, design a model for naming a new, unseen
face.
• Data Mining: Mine a huge data store for interesting patterns or
relationships.
– Given a DB of transactions, apply tools and algorithms to find frequent product
bundles.
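The statistics item above ("estimate its mean") can be sketched in a few lines. This is only an illustration using Python's standard library; the sample values and the function name `estimate_mean` are made up for the example and are not from the slides.

```python
import math
import statistics

# Hypothetical sample: ages of eight patients drawn from an unknown process.
sample = [34, 45, 29, 52, 41, 38, 47, 36]

def estimate_mean(values):
    """Return the sample mean and the standard error of that estimate."""
    mean = statistics.mean(values)
    # Sample standard deviation (divides by n - 1).
    std_dev = statistics.stdev(values)
    # The standard error shrinks as the sample grows.
    std_err = std_dev / math.sqrt(len(values))
    return mean, std_err

mean, std_err = estimate_mean(sample)
print(f"estimated mean = {mean:.2f} +/- {std_err:.2f}")
```

The standard error gives a rough sense of how far the estimate may be from the true mean of the process that produced the sample.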
Machine Learning
Machine
Learning refers to a
computer’s ability to
learn from a dataset and
adapt accordingly
without having been
explicitly programmed
to do so.
Examples: Regression,
Decision Trees, Neural
Networks, etc.
Data Mining
• To most people, data mining
goes something like this: tons of
data are collected, then quant
wizards work their arcane magic,
and then they know all of this
amazing stuff.
• BUT WHAT DO THEY ACTUALLY DO?
• They can tell us that "one of
these things is not like the other",
or they can show us categories and
then sort things into predetermined
categories/classes.
HOW TO DO ALL THIS ??
COMPUTATIONAL TOOLS
• With the help of existing computational tools,
you can analyze your data easily.
• No programming skills required.
• No in-depth knowledge of Statistics, Machine
Learning, Data Mining, etc. is required.
Common Computational Tools
• RapidMiner (Free edition; originally open source):
This is very popular since it is ready-made software
that requires no coding and provides
advanced analytics. Written in Java, it incorporates
multifaceted data mining functions such as data
preprocessing, visualization and predictive analysis, and
can be integrated with WEKA and R to
build models directly from scripts written in
those two tools.
• WEKA (Open Source & Free):
This is a Java-based, customizable tool that is
free to use. It includes visualization,
predictive analysis and modeling techniques such as
clustering, association, regression and
classification.
• R Programming Language (Open Source and Free):
R is a programming language and environment whose
core is written in C and Fortran. Data miners write
scripts in it, and it is widely used to build
statistical and analytical software for data mining. It
supports graphical analysis, both linear and
nonlinear modeling, classification, clustering and
time-series analysis.
• Python-based Orange and NLTK:
Python is very popular due to its ease of use and its
powerful features. Orange is an open source
tool written in Python, with useful data
analytics, text analysis and machine-learning
features embedded in a visual programming
interface. NLTK (Natural Language Toolkit), also written
in Python, is a powerful language-processing toolkit
with text-processing, classification and
corpus-handling features that can easily be
built up for customized needs.
• Rattle (Open Source and Free):
Rattle is a GUI tool built on the R
statistical programming language. It exposes the
statistical power of R by providing considerable
data mining functionality through
an extensive and well-developed UI. It also has
a built-in Log tab that records the equivalent R
code for every activity performed in the GUI.
• DataMelt (Open Source and Free):
DataMelt, also known as DMelt, is a computation
and visualization environment that provides an
interactive framework for data analysis and
visualization. It is designed mainly for engineers,
scientists and students.
How Computational Tools Work
• They use methods developed in Statistics,
Machine Learning and Data Mining.
• These pre-developed methods can be easily
applied to your data set.
• They provide built-in support for data
visualization.
WHAT ALL CAN I DO WITH MY DATA?
• Regression:
In statistics, regression is a classic technique to
identify the quantitative relationship between two
or more variables by fitting a straight line (or curve)
to the variable values.
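As a minimal sketch of fitting a straight line, the following uses ordinary least squares for one input variable. The function name `fit_line` and the dosage/response numbers are hypothetical and serve only as an illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: dosage (mg) versus measured response.
dosage = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(dosage, response)
print(f"response ~ {slope:.2f} * dosage + {intercept:.2f}")
```

The fitted slope estimates how much the response changes per unit of dosage; the tools listed earlier expose the same idea through menus instead of code.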
Cont…
• Classification:
This is a machine-learning technique that learns from
labeled training examples in order to assign labels to
new observations. With this, we can classify
observations into one or more labels. Predicting the
likelihood of sales, online fraud detection, and
cancer classification (in medical science) are
common applications of classification.
Google Mail uses this technique to classify
e-mails as spam or not.
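One simple way to classify a new observation from labeled training examples is k-nearest neighbours. The sketch below is an illustration only and is not how any real mail service filters spam; the two features (number of links, number of exclamation marks) and all data are invented. It assumes Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (links, exclamation marks) -> label.
train = [
    ((0, 0), "ham"), ((1, 0), "ham"), ((0, 1), "ham"),
    ((5, 6), "spam"), ((6, 5), "spam"), ((7, 7), "spam"),
]

print(knn_classify(train, (6, 6)))
print(knn_classify(train, (1, 1)))
```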
• Clustering:
This technique is all about organizing similar items
into groups from a given collection of items.
User (market) segmentation and image compression are
among the most common applications of clustering.
Social network analysis,
organizing computing clusters, and
astronomical data analysis are further
applications.
– Google News
uses this technique to group similar news items
into the same category.
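Grouping similar items can be sketched with k-means, one common clustering algorithm (the slides do not name a specific one). The example below works on one-dimensional values, takes the starting centroids as an argument to stay deterministic, and uses invented customer-spend figures.

```python
def k_means(points, centroids, iterations=10):
    """Cluster 1-D points around the given starting centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each point joins its nearest centroid.
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical monthly spend of six customers.
spend = [10, 12, 11, 50, 52, 49]
centroids, clusters = k_means(spend, centroids=[10, 50])
print(clusters)
```

Here the two groups correspond to low-spend and high-spend customers, which is the user segmentation use case mentioned above.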
Cont…
• Recommendation:
Recommendation algorithms power
recommender systems, which are among
the most immediately recognizable applications of
machine learning in use today. Web content
recommendations may include similar websites,
blogs, videos, or related content. Also,
recommendation of online items can be helpful
for cross-selling and up-selling.
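A very simple recommender for cross-selling counts which items are bought together ("customers who bought X also bought Y"). This co-occurrence approach is just one basic technique chosen for illustration; the baskets and the function name `recommend` are hypothetical.

```python
from collections import Counter

def recommend(baskets, item, top_n=2):
    """Return the items most often bought together with the given item."""
    co_counts = Counter()
    for basket in baskets:
        if item in basket:
            for other in basket:
                if other != item:
                    co_counts[other] += 1
    return [name for name, _ in co_counts.most_common(top_n)]

# Hypothetical purchase history, one set of items per order.
baskets = [
    {"laptop", "mouse", "bag"},
    {"laptop", "mouse"},
    {"laptop", "bag", "mouse"},
    {"phone", "case"},
    {"laptop", "mouse", "keyboard"},
]

print(recommend(baskets, "laptop"))
```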
• Association Rules:
This data mining technique helps to find
associations between two or more items. It
discovers hidden patterns in the data set.
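Association rules are usually scored with two standard measures: support (how often the items occur together) and confidence (how often the rule holds when its left-hand side occurs). A minimal sketch on invented transactions:

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of consequent given antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical shopping baskets.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]

print(support(transactions, {"bread", "butter"}))
print(confidence(transactions, {"bread"}, {"butter"}))
```

A rule such as "bread implies butter" is typically kept only if both its support and its confidence exceed thresholds chosen by the analyst.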
• Outlier Detection:
This type of data mining technique identifies
data items in the dataset which
do not match an expected pattern or expected
behavior. It can be used in a variety
of domains, such as intrusion detection, fraud
detection or fault detection. Outlier detection is also
called Outlier Analysis or Outlier Mining.
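One basic way to flag items that do not match the expected pattern is the z-score: measure how many standard deviations each value lies from the mean. This is a simple stand-in for the more robust methods real tools offer; the threshold of 2 and the transaction amounts are assumptions made for the example.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mean = statistics.mean(values)
    std_dev = statistics.pstdev(values)  # population standard deviation
    if std_dev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std_dev > threshold]

# Hypothetical daily transaction amounts with one suspicious entry.
amounts = [10, 11, 9, 10, 12, 10, 11, 95]
print(find_outliers(amounts))
```

Because a single extreme value also inflates the mean and standard deviation, median-based measures are often preferred on small data sets.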
• Prediction:
Prediction uses a combination of other
data mining techniques such as trend analysis, sequential
patterns, clustering, classification, etc. It
analyzes past events or instances in the right
sequence to predict a future event.
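As a deliberately naive illustration of using past values, in order, to predict the next one, the sketch below forecasts with a simple moving average. It is not the combined approach the slide describes, only the smallest runnable example of the idea; the demand figures and the function name are hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    if len(history) < window:
        raise ValueError("history is shorter than the window")
    recent = history[-window:]
    return sum(recent) / window

# Hypothetical monthly demand, oldest first.
demand = [100, 110, 120, 130, 140]
print(moving_average_forecast(demand))
```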
ADVANTAGES
Use computational tools to predict the
behavior of your compound.
Use computational tools to analyze the same
data from a different perspective.
Cost cutting.
Time saving.
A clear vision for your research.
QUESTIONS ?
THANKS
