Data Science & Predictive Analytics
Presented by: Abdul Ahad Abro
Computer Engineering Department, Ege University, Turkey
Prof. Dr. Aybars UĞUR
Presentations 1
April 4, 2017
Contents
• Data Science
• Concept of Data Science
• Better Data Makes Better Data Science
• Why You Need to Learn Data Science
• Real-Life Data Science Example
• Data Science Techniques
• Data Mining
• Data Mining Software: Orange and Weka
• Big Data
• Predictive Analytics
• Predictive Analytics Process, Applications, and Software
What Is Data Science?
Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes, and systems to extract knowledge from data in various forms, either structured or unstructured, similar to Knowledge Discovery in Databases (KDD) [4].
Or: data science is an umbrella term that covers many other fields, such as machine learning, data mining, big data, statistics, data visualization, and data analytics.
Concept of Data Science
Data science is a concept to unify statistics, data analysis, and their related methods in order to understand and analyze actual phenomena with data. It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, in particular from the subdomains of machine learning, classification, cluster analysis, data mining, databases, and visualization [4].
Five Questions Data Science Can Answer
Data science uses numbers and names (also called categories or labels) to predict answers to questions.
It might surprise you, but data science answers only five kinds of questions:
• Is this A or B? (Classification algorithms)
• Is this weird? (Anomaly detection algorithms)
• How much or how many? (Regression algorithms)
• How is this organized? (Clustering algorithms)
• What should I do next? (Reinforcement learning algorithms)
Each question is answered by a separate family of machine learning methods, called algorithms [13].
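To make three of these question types concrete, here is a minimal sketch in plain Python. The toy data, the function names, and the choice of methods (1-nearest-neighbor, a standard-deviation rule, and a least-squares line) are assumptions made for illustration; they are not taken from the slides, and each question type has many other algorithms.

```python
from statistics import mean, stdev

# Hypothetical toy data: (height_cm, weight_kg) points with a known label.
labeled = [((150, 50), "A"), ((152, 53), "A"), ((180, 85), "B"), ((178, 80), "B")]

def classify(point, examples):
    """Is this A or B? 1-nearest-neighbor: copy the label of the closest example."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(examples, key=lambda ex: sq_dist(point, ex[0]))[1]

def is_weird(value, history, threshold=3.0):
    """Is this weird? Flag values far from the mean, measured in standard deviations."""
    return abs(value - mean(history)) > threshold * stdev(history)

def fit_line(xs, ys):
    """How much or how many? Least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

print(classify((155, 55), labeled))
print(is_weird(100, [10, 12, 11, 9, 10, 11]))
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope * 5 + intercept)
```

Clustering and reinforcement learning are left out only to keep the sketch short.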
How Does Data Science Work?
Algorithm = Recipe
Your Data = Ingredients
Computer = Blender
Your Answer = Smoothie
Better Data Makes Better Data Science
Is your data:
• Relevant?
• Connected?
• Accurate?
• Enough to work with?
[Image slide; source: [3]]
Data Science Isn't Just for Data Scientists
Data science creates new opportunities and helps individuals make better decisions in almost every field. This means business people who don't want to do technical work should also learn data science.
[Three image slides; source: [3]]
Data Science Techniques
• Linear Regression
• Logistic Regression
• Jackknife Regression *
• Density Estimation
• Confidence Interval
• Test of Hypotheses
• Pattern Recognition
• Clustering - (aka Unsupervised Learning)
• Supervised Learning
• Time Series
• Decision Trees
• Random Numbers
• Monte-Carlo Simulation
• Bayesian Statistics
• Naive Bayes
• Principal Component Analysis - (PCA)
• Ensembles
• Neural Networks
• Support Vector Machine - (SVM)
• Nearest Neighbors - (k-NN)
Data Science Techniques
• Feature Selection - (aka Variable Reduction)
• Indexation / Cataloguing *
• (Geo-) Spatial Modeling
• Recommendation Engine *
• Search Engine *
• Attribution Modeling *
• Collaborative Filtering *
• Rule System
• Linkage Analysis
• Association Rules
• Scoring Engine
• Segmentation
• Predictive Modeling
• Graphs
• Deep Learning
• Game Theory
• Imputation
• Survival Analysis
• Arbitrage
• Lift Modeling
• Yield Optimization
• Cross-Validation
• Model Fitting
• Relevancy Algorithm *
• Experimental Design
Search Engine -- Data Science Technique
A web search engine is a software system that is designed to search for information on the World Wide Web. The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). The information may be a mix of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories [1].
A search engine maintains the following processes in near real time:
• Web crawling
• Indexing
• Searching
Web search engines get their information by web crawling from site to site. The "spider" checks for the standard filename robots.txt, addressed to it, before sending certain information back to be indexed depending on many factors, such as the titles, page content, JavaScript, Cascading Style Sheets (CSS), headings, as evidenced by the standard HTML markup of the informational content, or its metadata in HTML meta tags [1].
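The three processes above (crawling, indexing, searching) can be sketched at toy scale as follows. The page texts, URLs, robots.txt content, and the agent name "toy-spider" are hypothetical. A real crawler would fetch pages over HTTP, while this sketch works on an in-memory dictionary. The robots.txt check uses Python's standard `urllib.robotparser` module.

```python
from collections import defaultdict
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt and pages; a real spider would download both.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""
PAGES = {
    "http://example.com/a": "Data science unifies statistics and data analysis",
    "http://example.com/b": "Data mining searches for patterns in data",
    "http://example.com/private/c": "Internal notes on data",
}

def allowed_pages(pages, robots_txt, agent="toy-spider"):
    """Crawling step: keep only the URLs that robots.txt lets this agent fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: text for url, text in pages.items() if parser.can_fetch(agent, url)}

def build_index(pages):
    """Indexing step: map each lowercase word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Searching step: return the URLs that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(allowed_pages(PAGES, ROBOTS_TXT))
print(sorted(search(index, "data patterns")))
```

The word-to-URLs mapping is commonly called an inverted index. It lets a query be answered by set lookups rather than by rereading every page.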
Search Engine -- Data Science Technique
[Image; source: [12]]
Typically, when a user enters a query into a search engine, it is a few keywords. The index already has the names of the sites containing the keywords, and these are instantly obtained from the index. The real processing load is in generating the web pages that are the search results list: every page in the entire list must be weighted according to information in the indexes. Then the top search result item requires the lookup, reconstruction, and markup of the snippets showing the context of the keywords matched [1].
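As a rough illustration of weighting pages and marking up snippets, the sketch below scores each page with TF-IDF and wraps the matched keyword in asterisks. TF-IDF is one common weighting scheme chosen here as an assumption; the slide does not say which weighting a search engine uses. The corpus and function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical mini-corpus standing in for pages already retrieved from the index.
DOCS = {
    "page1": "data science uses data to answer questions",
    "page2": "search engines rank pages for a query",
    "page3": "data mining finds patterns",
}

def rank(query, docs):
    """Weight each page by the summed TF-IDF of the query keywords, best first."""
    n_docs = len(docs)
    tokens = {name: text.split() for name, text in docs.items()}
    scores = {}
    for name, words in tokens.items():
        counts = Counter(words)
        score = 0.0
        for term in query.split():
            # Document frequency: number of pages that contain the term.
            df = sum(1 for other in tokens.values() if term in other)
            if df:
                score += (counts[term] / len(words)) * math.log(n_docs / df)
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)

def snippet(text, keyword, width=2):
    """Return the keyword with up to `width` words of context, marked with asterisks."""
    words = text.split()
    if keyword not in words:
        return ""
    i = words.index(keyword)
    start, end = max(0, i - width), i + width + 1
    return " ".join(words[start:i] + ["*" + keyword + "*"] + words[i + 1:end])

ranking = rank("data patterns", DOCS)
print(ranking)
print(snippet(DOCS[ranking[0]], "patterns"))
```

Here the page containing the rarer keyword ("patterns") ranks first, because rare terms receive a higher inverse-document-frequency weight.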
Applications / Uses of Data Science
[Image; source: [12]]
Most common applications of data science that we use in our daily lives:
Internet Search
When we speak of search, we think "Google", right? But there are many other search engines, such as Yahoo, Bing, Ask, AOL, and DuckDuckGo. All these search engines (including Google) use data science algorithms to deliver the best result for our search query in a fraction of a second. Google alone processes more than 20 petabytes of data every day. Had there been no data science, Google wouldn't have been the "Google" we know today [1].
Applications / Uses of Data Science
[Image; source: [12]]
Recommender Systems
Who can forget the suggestions about similar products on Amazon? They not only help you find relevant products among the billions of products available, but also add a lot to the user experience.
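One simple way to produce "similar product" suggestions is to count which items are bought together. The sketch below does this on hypothetical baskets. It is an illustrative assumption, not a description of how Amazon's recommender works.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase baskets; real systems learn from far larger histories.
BASKETS = [
    {"laptop", "mouse", "laptop bag"},
    {"laptop", "mouse"},
    {"phone", "phone case"},
    {"laptop", "laptop bag"},
    {"mouse", "mouse pad"},
]

def co_purchase_counts(baskets):
    """Count how often each ordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        counts.update(permutations(basket, 2))
    return counts

def recommend(item, baskets, top_n=2):
    """Suggest the items most often bought together with `item`."""
    counts = co_purchase_counts(baskets)
    related = {other: n for (first, other), n in counts.items() if first == item}
    # Sort by count (descending), then by name, so ties are broken deterministically.
    return sorted(related, key=lambda other: (-related[other], other))[:top_n]

print(recommend("laptop", BASKETS))
```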
Applications / Uses of Data Science
[Image; source: [12]]
• Image Recognition
• Speech Recognition
• Gaming
• Price Comparison Websites
• Self-Driving Cars
• Delivery Logistics
• Fraud and Risk Detection
• Airline Route Planning
Data Mining
Data are any facts, numbers, or text that can be processed by a computer.
The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use.
Data mining refers to the science of collecting all the past data and then searching for patterns in this data. You look for consistent patterns and/or relationships between variables. Once you find these insights, you validate the findings by applying the detected patterns to new subsets of data. The ultimate goal of data mining is prediction [14].
Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.
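The find-then-validate loop described above can be sketched as follows: search one subset of the data for a pattern, then check how well that pattern holds on records that were set aside. The records (hours studied versus passing an exam), the threshold rule, and the 70/30 split are all invented for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical records: (hours_studied, passed_exam), with some noise added.
records = [(h, h + random.uniform(-1.5, 1.5) > 5) for h in range(1, 11) for _ in range(10)]
random.shuffle(records)
mining_set, validation_set = records[:70], records[70:]

def accuracy(threshold, data):
    """Share of records for which the rule 'pass if hours > threshold' is right."""
    return sum((hours > threshold) == passed for hours, passed in data) / len(data)

def mine_threshold(data):
    """Search candidate thresholds and keep the one that fits the data best."""
    return max(range(0, 11), key=lambda t: accuracy(t, data))

threshold = mine_threshold(mining_set)
print("pattern found: pass if hours >", threshold)
print("accuracy on mined data:", accuracy(threshold, mining_set))
print("accuracy on new data:", accuracy(threshold, validation_set))
```

If accuracy on the held-out records is close to accuracy on the mined records, the pattern is more likely to be useful for prediction than a coincidence in the original subset.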
Data mining consists of five major elements:
• Extract, transform, and load transaction data onto the data warehouse system.
• Store and manage the data in a multidimensional database system.
• Provide data access to business analysts and information technology professionals.
• Analyze the data by application software.
• Present the data in a useful format, such as a graph or table.
Why Data Mining?
Data, data everywhere, yet...
• I can't find the data I need.
• I can't get the data I need.
• I can't understand the data I found.
• I can't use the data I found.
[Image slide: "This picture is worth a thousand words"]
Data Mining Software
• Orange
Orange is an open-source data visualization, machine learning, and data mining toolkit. It features a visual programming front-end for explorative data analysis and interactive data visualization, and can also be used as a Python library.
• Weka
Weka is a workbench that contains a collection of visualization tools and algorithms for data analysis and predictive modeling, together with graphical user interfaces for easy access to these functions.
Orange (Interface)
WEKA (Interface)
Thinking machines are what WEKA does best. WEKA is a collection of machine learning algorithms. Machine learning is a form of artificial intelligence.
Big Data
Source: [8]
Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation, search, sharing, transfer, visualization, querying, updating, and information privacy.
Big Data
[Image; source: [12]]
Big data is not about the size of the data; it's about the value within the data.
Big Data
With datafication comes big data, which is often described using the four Vs:
• Volume: Refers to the vast amounts of data generated every second.
• Velocity: Refers to the speed at which new data is generated and the speed at which data moves around.
• Variety: Refers to the different types of data we can now use.
• Veracity: Refers to the messiness or trustworthiness of the data.
Predictive Analytics
Predictive analytics is the branch of advanced analytics that is used to make predictions about unknown future events. Predictive analytics uses many techniques from data mining, statistics, modeling, machine learning, and artificial intelligence to analyze current data and make predictions about the future. It uses a number of data mining and analytical techniques to bring together the management, information technology, and modeling business process to make predictions about the future. The patterns found in historical and transactional data can be used to identify risks and opportunities for the future.
Predictive Analytics
Data mining and text analytics, along with statistics, allow business users to create predictive intelligence by uncovering patterns and relationships in both structured and unstructured data.
Predictive Analytics Process
1. Define Project: Define the project outcomes, deliverables, scope of the effort, and business objectives, and identify the data sets that are going to be used.
2. Data Collection: Data mining for predictive analytics prepares data from multiple sources for analysis. This provides a complete view of customer interactions.
3. Data Analysis: Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the objective of discovering useful information and arriving at conclusions.
4. Statistics: Statistical analysis makes it possible to validate the assumptions and hypotheses and test them using standard statistical models.
5. Modeling: Predictive modeling provides the ability to automatically create accurate predictive models about the future. There are also options to choose the best solution with multi-model evaluation.
6. Deployment: Predictive model deployment provides the option to deploy the analytical results into the everyday decision-making process to get results, reports, and output by automating the decisions based on the modeling.
7. Model Monitoring: Models are managed and monitored to review model performance and ensure that they are providing the results expected.
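Several of the steps above can be traced in a few lines of plain Python. This is a minimal sketch under stated assumptions: the ad-spend and sales figures, the two candidate models, the 4/2 train/holdout split, and the monitoring tolerance are all hypothetical. Steps 1 (project definition) and 4 (statistical testing) are omitted.

```python
from statistics import mean

# Step 2, data collection: hypothetical (ad_spend, sales) pairs from two sources.
source_a = [(1.0, 12.1), (2.0, 13.9), (3.0, 16.2)]
source_b = [(4.0, 18.0), (None, 20.0), (5.0, 19.8), (6.0, 22.1)]

# Step 3, data analysis: combine the sources and drop incomplete rows.
rows = [(x, y) for x, y in source_a + source_b if x is not None and y is not None]
train, holdout = rows[:4], rows[4:]

def fit_mean(data):
    """Candidate model 1: always predict the average outcome."""
    avg = mean(y for _, y in data)
    return lambda x: avg

def fit_line(data):
    """Candidate model 2: least-squares line through the data."""
    mx, my = mean(x for x, _ in data), mean(y for _, y in data)
    slope = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x, _ in data)
    return lambda x: my + slope * (x - mx)

def mean_abs_error(model, data):
    """Average absolute gap between predictions and observed outcomes."""
    return mean(abs(model(x) - y) for x, y in data)

# Step 5, modeling: fit both candidates and keep the one with lower holdout error.
candidates = {"mean": fit_mean(train), "line": fit_line(train)}
best_name = min(candidates, key=lambda name: mean_abs_error(candidates[name], holdout))

# Step 6, deployment: expose the chosen model as the function that decisions call.
predict = candidates[best_name]

# Step 7, model monitoring: flag the model when recent errors exceed a tolerance.
def needs_review(model, recent, tolerance=1.0):
    return mean_abs_error(model, recent) > tolerance

print(best_name, predict(7.0), needs_review(predict, holdout))
```

Choosing between candidates on held-out rows is a small-scale version of the multi-model evaluation mentioned in step 5.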
Applications of Predictive Analytics
1. Customer relationship management (CRM)
2. Health care
3. Collection analytics
4. Fraud detection
5. Risk management
6. Direct marketing
7. Underwriting
Predictive Analytics Software
• Microsoft Azure Machine Learning
• Anaconda
• MATLAB
• SAP Predictive Analytics
• SAS Predictive Analytics
• IBM Predictive Analytics
• Microsoft R
• ...
References
[1] https://en.wikipedia.org/wiki/Web_search_engine
[2] https://www.analyticsvidhya.com/blog/2015/09/applications-data-science/
[3] http://www.cherhan.net/why-you-need-learn-data-science/
[4] https://en.wikipedia.org/wiki/Data_science
[4] https://www.youtube.com/watch?v=m7kpIBGEdkI
[5] https://www.coursera.org/learn/big-data-introduction/lecture/Fonq2/steps-in-the-data-science-process
[6] https://www.codementor.io/data-science/tutorial
[7] http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm
[8] https://www.slideshare.net/dwellman/what-is-big-data-24401517
[9] http://whatis.techtarget.com/definition/machine-learning
[10] http://businessintelligence.com/bi-insights/7-ways-big-data-affects-everyday-life/
[11] https://www.quora.com/What-are-some-of-the-real-life-examples-where-usage-of-machine-learning-algorithms-had-huge-impact
[12] http://giphy.com/
[13] https://www.youtube.com/watch?v=vgmL808eSw4
[14] https://en.wikipedia.org/wiki/Data_mining
Data Science & Predictive Analytics

  • 1. Data Science & Predictive Analytics. Presentation 1. Presented by: Abdul Ahad Abro. Prof. Dr. Aybars UĞUR, Computer Engineering Department, Ege University, Turkey. April 4, 2017.
  • 2. Contents: Data Science; Concept of Data Science; Better Data Make Better Data Science; Why You Need to Learn Data Science; Real-Life Data Science Example; Data Science Techniques; Data Mining; Data Mining Software (Orange and Weka); Big Data; Predictive Analytics; Predictive Analytics Process, Applications, and Software.
  • 3. What Is Data Science? Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes, and systems to extract knowledge from data in various forms, either structured or unstructured, similar to Knowledge Discovery in Databases (KDD) [4]. Put another way, data science is an umbrella that covers many other fields, such as machine learning, data mining, big data, statistics, data visualization, and data analytics.
  • 4. Concept of Data Science: Data science is a concept to unify statistics, data analysis, and their related methods in order to understand and analyze actual phenomena with data. It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, in particular from the subdomains of machine learning, classification, cluster analysis, data mining, databases, and visualization [4].
  • 5. Five Questions Data Science Can Answer: Data science uses numbers and names (also called categories or labels) to predict answers to questions. It might surprise you that data science answers only five kinds of question: • Is this A or B? (classification algorithms) • Is this weird? (anomaly detection algorithms) • How much or how many? (regression algorithms) • How is this organized? (clustering algorithms) • What should I do next? (reinforcement learning algorithms) Each question is answered by a separate family of machine learning methods, called algorithms [13].
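As an illustration of the first question ("Is this A or B?"), here is a minimal sketch of a classification algorithm: a 1-nearest-neighbour classifier in plain Python. The training points, their two features, and the `classify` helper are hypothetical and are not taken from the slides; the sketch only shows how labelled examples can be used to predict a label for a new point.

```python
from math import dist

# Hypothetical labelled examples: (feature_1, feature_2) -> label.
training = [
    ((150.0, 50.0), "A"),
    ((155.0, 55.0), "A"),
    ((180.0, 85.0), "B"),
    ((185.0, 90.0), "B"),
]

def classify(point):
    """Return the label of the closest training example (1-nearest neighbour)."""
    _, label = min(training, key=lambda item: dist(item[0], point))
    return label

print(classify((152.0, 52.0)))  # nearest training examples carry label "A"
print(classify((182.0, 88.0)))  # nearest training examples carry label "B"
```

The other four questions would call for different algorithm families; this sketch covers classification only.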
  • 6. How Does Data Science Work? Algorithm = recipe. Your data = ingredients. Computer = blender. Your answer = smoothie.
  • 7. Better Data Make Better Data Science: Is your data relevant? Connected? Accurate? Enough to work with?
  • 9. Data Science Isn't Just for Data Scientists: Data science creates new opportunities and helps individuals make better decisions in almost every field. This means business people who don't want to do technical work should also learn data science.
  • 13. Data Science Techniques: Linear Regression, Logistic Regression, Jackknife Regression *, Density Estimation, Confidence Interval, Test of Hypotheses, Pattern Recognition, Clustering (aka Unsupervised Learning), Supervised Learning, Time Series, Decision Trees, Random Numbers, Monte-Carlo Simulation, Bayesian Statistics, Naive Bayes, Principal Component Analysis (PCA), Ensembles, Neural Networks, Support Vector Machine (SVM), Nearest Neighbors (k-NN).
  • 14. Data Science Techniques (continued): Feature Selection (aka Variable Reduction), Indexation / Cataloguing *, (Geo-) Spatial Modeling, Recommendation Engine *, Search Engine *, Attribution Modeling *, Collaborative Filtering *, Rule System, Linkage Analysis, Association Rules, Scoring Engine, Segmentation, Predictive Modeling, Graphs, Deep Learning, Game Theory, Imputation, Survival Analysis, Arbitrage, Lift Modeling, Yield Optimization, Cross-Validation, Model Fitting, Relevancy Algorithm *, Experimental Design.
  • 15. Search Engine (Data Science Technique): A web search engine is a software system that is designed to search for information on the World Wide Web. The search results are generally presented in a line of results, often referred to as search engine results pages (SERPs). The information may be a mix of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories [1]. A search engine maintains the following processes in near real time: web crawling, indexing, and searching. Web search engines get their information by web crawling from site to site. The "spider" checks for the standard filename robots.txt, addressed to it, before sending certain information back to be indexed depending on many factors, such as the titles, page content, JavaScript, Cascading Style Sheets (CSS), and headings, as evidenced by the standard HTML markup of the informational content, or its metadata in HTML meta tags [1].
  • 17. Search Engine (Data Science Technique): Typically, when a user enters a query into a search engine, it is a few keywords. The index already has the names of the sites containing the keywords, and these are instantly obtained from the index. The real processing load is in generating the web pages that are the search results list: every page in the entire list must be weighted according to information in the indexes. Then the top search result item requires the lookup, reconstruction, and markup of the snippets showing the context of the keywords matched [1]. [12]
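The indexing and searching steps described in slides 15 and 17 can be sketched with a small inverted index in Python. This is a simplified illustration rather than how any real search engine works: the `pages` dictionary stands in for the crawler's output, the URLs are made up, and weighting pages by raw keyword counts is an assumption chosen for brevity.

```python
from collections import Counter, defaultdict

# Hypothetical "crawled" pages: URL -> page text (stands in for the crawler's output).
pages = {
    "example.org/a": "data science uses data to answer questions",
    "example.org/b": "a search engine builds an index of web pages",
    "example.org/c": "data mining searches data for patterns",
}

def build_index(pages):
    """Indexing step: map each keyword to the pages containing it, with counts."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] += 1
    return index

def search(index, query):
    """Searching step: look up each keyword and rank pages by total matches."""
    scores = Counter()
    for word in query.lower().split():
        scores.update(index.get(word, {}))
    return [url for url, _ in scores.most_common()]

index = build_index(pages)
print(search(index, "data patterns"))
```

Because the keyword-to-page mapping is built ahead of time, answering a query is only a lookup plus a ranking pass, which matches the slide's point that the matching sites are "instantly obtained from the index".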
  • 18. Applications / Uses of Data Science: The most common applications of data science that we use in our daily lives. Internet Search: When we speak of search, we think "Google". Right? But there are many other search engines, such as Yahoo, Bing, Ask, AOL, and DuckDuckGo. All these search engines (including Google) use data science algorithms to deliver the best result for a search query in a fraction of a second. Consider that Google processes more than 20 petabytes of data every day. Had there been no data science, Google wouldn't have been the "Google" we know today [1]. [12]
  • 19. Applications / Uses of Data Science: Recommender Systems: Who can forget the suggestions about similar products on Amazon? They not only help you find relevant products among the billions available, but also add a lot to the user experience. [12]
  • 20. Applications / Uses of Data Science: Image Recognition, Speech Recognition, Gaming, Price Comparison Websites, Self-Driving Cars, Delivery Logistics, Fraud and Risk Detection, Airline Route Planning.
  • 21. Data Mining: Data are any facts, numbers, or text that can be processed by a computer. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Data mining refers to the science of collecting all the past data and then searching for patterns in this data. You look for consistent patterns and/or relationships between variables. Once you find these insights, you validate the findings by applying the detected patterns to new subsets of data. The ultimate goal of data mining is prediction [14]. Data mining is the analysis step of the "knowledge discovery in databases" (KDD) process.
  • 22. Data mining consists of five major elements: (1) Extract, transform, and load transaction data onto the data warehouse system. (2) Store and manage the data in a multidimensional database system. (3) Provide data access to business analysts and information technology professionals. (4) Analyze the data with application software. (5) Present the data in a useful format, such as a graph or table.
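The five elements listed in slide 22 can be walked through in a few lines of Python. This is only a toy sketch under stated assumptions: the transaction data is invented, and an in-memory SQLite database (a relational store from the Python standard library) stands in for the data warehouse and multidimensional database the slide mentions.

```python
import csv
import io
import sqlite3

# Hypothetical transaction data; in practice this would come from an operational system.
raw = """customer,product,amount
alice,book,12.50
bob,book,8.00
alice,pen,1.50
"""

# 1. Extract and transform: parse the CSV and convert amounts to numbers.
rows = [(r["customer"], r["product"], float(r["amount"]))
        for r in csv.DictReader(io.StringIO(raw))]

# 2. Load and store: an in-memory SQLite database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# 3-4. Provide access and analyze: total spend per customer via a SQL query.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()

# 5. Present the result as a small table.
for customer, total in totals:
    print(f"{customer:<10}{total:>8.2f}")
```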
  • 23. Why Data Mining? Data, data everywhere, yet: I can't find the data I need. I can't get the data I need. I can't understand the data I found. I can't use the data I found.
  • 24. This picture is worth a thousand words.
  • 25. Data Mining Software: Orange: Orange is an open-source data visualization, machine learning, and data mining toolkit. It features a visual programming front end for explorative data analysis and interactive data visualization, and can also be used as a Python library. Weka: Weka is a workbench that contains a collection of visualization tools and algorithms for data analysis and predictive modeling, together with graphical user interfaces for easy access to these functions.
  • 27. WEKA (Interface): Thinking machines is what WEKA does best. WEKA is a collection of machine learning algorithms. Machine learning is a form of artificial intelligence.
  • 29. Big Data: Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation, search, sharing, transfer, visualization, querying, updating, and information privacy. [12]
  • 30. Big Data: Big data is not about the size of the data; it's about the value within the data.
  • 31. Big Data: With datafication comes big data, which is often described using the four Vs. Volume: the vast amounts of data generated every second. Velocity: the speed at which new data is generated and the speed at which data moves around. Variety: the different types of data we can now use. Veracity: the messiness or trustworthiness of the data.
  • 33. Predictive Analytics: Predictive analytics is the branch of advanced analytics used to make predictions about unknown future events. It uses many techniques from data mining, statistics, modeling, machine learning, and artificial intelligence to analyze current data and make predictions about the future. It brings together management, information technology, and business process modeling to make these predictions. The patterns found in historical and transactional data can be used to identify future risks and opportunities.
  • 34. Predictive Analytics: Data mining and text analytics, along with statistics, allow business users to create predictive intelligence by uncovering patterns and relationships in both structured and unstructured data.
  • 35. Predictive Analytics Process: 1. Define project: Define the project outcomes, deliverables, scope of the effort, and business objectives, and identify the data sets that are going to be used. 2. Data collection: Data mining for predictive analytics prepares data from multiple sources for analysis. This provides a complete view of customer interactions. 3. Data analysis: Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the objective of discovering useful information and arriving at conclusions. 4. Statistics: Statistical analysis makes it possible to validate assumptions and hypotheses and to test them using standard statistical models.
  • 36. Predictive Analytics Process (continued): 5. Modeling: Predictive modeling provides the ability to automatically create accurate predictive models about the future. There are also options to choose the best solution with multi-model evaluation. 6. Deployment: Predictive model deployment provides the option to deploy the analytical results into the everyday decision-making process, producing results, reports, and output by automating decisions based on the modeling. 7. Model monitoring: Models are managed and monitored to review their performance and ensure that they provide the expected results.
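A compressed sketch of the process in slides 35 and 36, written in plain Python, may help make the steps concrete. Everything here is an assumption made for illustration: the `history` data is invented, a straight-line least-squares fit stands in for "predictive modeling", and mean absolute error on held-out records stands in for the validation and monitoring steps. Project definition and deployment are outside what a short script can show.

```python
# Steps 2-3: hypothetical historical data as (input, outcome); None marks a missing value.
history = [(1.0, 3.1), (2.0, 4.9), (3.0, None), (4.0, 9.2), (5.0, 10.8), (6.0, 13.1)]
clean = [(x, y) for x, y in history if y is not None]

# Hold out the most recent records to check the model on data it has not seen.
train, holdout = clean[:-2], clean[-2:]

def fit_line(points):
    """Step 5: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, points):
    """Steps 4 and 7: average size of the prediction error on a data set."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)

model = fit_line(train)
print("holdout error:", round(mean_abs_error(model, holdout), 3))
```

Re-running `mean_abs_error` on each new batch of records is one simple way to monitor a model after deployment: a rising error suggests the model no longer provides the expected results.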
  • 38. Applications of Predictive Analytics: 1. Customer relationship management (CRM) 2. Health care 3. Collection analytics 4. Fraud detection 5. Risk management 6. Direct marketing 7. Underwriting
  • 39. Predictive Analytics Software: Microsoft Azure Machine Learning, Anaconda, MATLAB, SAP Predictive Analytics, SAS Predictive Analytics, IBM Predictive Analytics, Microsoft R, among others.
  • 40. References: [1] https://en.wikipedia.org/wiki/Web_search_engine [2] https://www.analyticsvidhya.com/blog/2015/09/applications-data-science/ [3] http://www.cherhan.net/why-you-need-learn-data-science/ [4] https://en.wikipedia.org/wiki/Data_science [4] https://www.youtube.com/watch?v=m7kpIBGEdkI [5] https://www.coursera.org/learn/big-data-introduction/lecture/Fonq2/steps-in-the-data-science-process [6] https://www.codementor.io/data-science/tutorial [7] http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm [8] https://www.slideshare.net/dwellman/what-is-big-data-24401517 [9] http://whatis.techtarget.com/definition/machine-learning [10] http://businessintelligence.com/bi-insights/7-ways-big-data-affects-everyday-life/ [11] https://www.quora.com/What-are-some-of-the-real-life-examples-where-usage-of-machine-learning-algorithms-had-huge-impact [12] http://giphy.com/ [13] https://www.youtube.com/watch?v=vgmL808eSw4 [14] https://en.wikipedia.org/wiki/Data_mining