Data Science & Predictive Analytics
1. Presented by: Abdul Ahad Abro
Computer Engineering Department, Ege University, Turkey
Prof. Dr. Aybars UĞUR
Presentations 1
April 4, 2017
2. Contents
Data Science
Concept of Data Science
Better Data Make Better Data Science
Why Do You Need to Learn Data Science?
Real-Life Data Science Example
Data Science Techniques
Data Mining
Data Mining Software: Orange & Weka
Big Data
Predictive Analytics
Predictive Analytics Process, Applications & Software
3. What is Data Science?
Data science, also known as data-driven science, is
an interdisciplinary field about scientific methods,
processes and systems to extract knowledge from
data in various forms, either structured or
unstructured, similar to Knowledge Discovery in
Databases (KDD) [4].
Or:
Data science is an umbrella term that covers many other fields, such as machine learning,
data mining, big data, statistics, data visualization and data analytics.
4. Concept of Data Science
Data science is a concept to unify statistics, data analysis and their related methods in order
to understand and analyze actual phenomena with data. It employs techniques and theories
drawn from many fields within the broad areas of mathematics, statistics, information
science, and computer science, in particular from the subdomains of machine learning,
classification, cluster analysis, data mining, databases, and visualization [4].
5. Five Questions Data Science Can Answer
Data science uses numbers and names (also called categories or labels) to predict
answers to questions.
It might surprise you, but data science answers only five kinds of questions:
• Is this A or B? (Classification Algorithms)
• Is this weird? (Anomaly Detection Algorithms)
• How much or how many? (Regression Algorithms)
• How is this organized? (Clustering Algorithms)
• What should I do next? (Reinforcement Learning Algorithms)
Each of these questions is answered by a separate family of machine learning methods,
called algorithms [13].
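As a rough illustration of the first two questions, the sketch below answers "Is this A or B?" with a nearest-mean rule and "Is this weird?" with a z-score test. The function names, the threshold of 3 standard deviations and the sample numbers are all made up for this example; real classification and anomaly detection algorithms are considerably more elaborate.

```python
from statistics import mean, pstdev

def nearest_centroid_label(train, x):
    """'Is this A or B?': return the label whose mean value is closest to x."""
    centroids = {label: mean(values) for label, values in train.items()}
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def is_weird(history, x, threshold=3.0):
    """'Is this weird?': flag x if it lies more than `threshold` standard
    deviations away from the mean of the history (assumed rule of thumb)."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return x != mu
    return abs(x - mu) / sigma > threshold

# Made-up example data: heights for two groups, and daily spending
heights_cm = {"A": [150, 152, 155], "B": [180, 182, 185]}
label = nearest_centroid_label(heights_cm, 178)

daily_spend = [20, 22, 19, 21, 20, 23, 18]
flag_big = is_weird(daily_spend, 500)
flag_normal = is_weird(daily_spend, 21)

print(label, flag_big, flag_normal)
```

The same data could be handed to a regression or clustering method instead; which family applies depends only on which of the five questions is being asked.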
6. How Does Data Science Work?
Algorithm = Recipe
Your Data = Ingredients
Computer = Blender
Your Answer = Smoothie
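The analogy can be made concrete with a small sketch. Here the "recipe" is ordinary least-squares line fitting, the "ingredients" are a few invented (x, y) pairs, and the "smoothie" is the predicted value. The function names and the numbers are assumptions chosen for illustration only.

```python
def fit_line(xs, ys):
    """The recipe: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept

# The ingredients: made-up data
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

# The computer blends them by running the recipe
model = fit_line(xs, ys)

# The smoothie: an answer to "how much?" for a new input
answer = predict(model, 5)
print(model, answer)
```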
7. Better Data Make Better Data Science
Is your data:
Relevant?
Connected?
Accurate?
Enough to work with?
9. Data Science Isn't Just for Data Scientists
Data science creates new opportunities and helps individuals make better
decisions in almost every field. This means that business people who do not want
to do technical work should also learn data science.
15. Search Engine -- Data Science Technique
A web search engine is a software system that is designed to search for information on the
World Wide Web. The search results are generally presented in a line of results often
referred to as search engine results pages (SERPs). The information may be a mix of web
pages, images, and other types of files. Some search engines also mine data available in
databases or open directories. [1]
A search engine maintains the following processes in near real time:
Web crawling
Indexing
Searching
Web search engines get their information by web crawling from site to site. The "spider"
checks for the standard filename robots.txt, addressed to it, before sending certain
information back to be indexed depending on many factors, such as the titles, page content,
JavaScript, Cascading Style Sheets (CSS), headings, as evidenced by the standard HTML
markup of the informational content, or its metadata in HTML meta tags. [1]
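The three processes above can be sketched in a few lines. This is a toy model under stated assumptions: the "web" is a hard-coded dictionary of invented example.com pages instead of real HTTP fetches, the robots.txt rules and the ExampleBot user agent are made up, and the index is a plain word-to-URL mapping. Only urllib.robotparser from the Python standard library is used to interpret the robots.txt rules.

```python
from collections import defaultdict
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, parsed from lines instead of fetched
robots = RobotFileParser()
robots.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Hypothetical pages standing in for crawled content
pages = {
    "https://example.com/index.html": "data science extracts knowledge from data",
    "https://example.com/mining.html": "data mining searches for patterns",
    "https://example.com/private/notes.html": "secret data notes",
}

def build_index(pages, robots, agent="ExampleBot"):
    """Crawl + index: map each word to the set of allowed URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        if not robots.can_fetch(agent, url):
            continue  # respect robots.txt: skip disallowed pages
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Search: return URLs that contain every keyword in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(pages, robots)
hits_data = search(index, "data")
hits_mining = search(index, "data mining")
print(sorted(hits_data), sorted(hits_mining))
```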
17. Search Engine -- Data Science Technique
Typically, when a user enters a query into a search engine,
it is a few keywords. The index already has the names of
the sites containing the keywords, and these are instantly
obtained from the index. The real processing load is in
generating the web pages that are the search results list:
every page in the entire list must be weighted according
to information in the indexes. Then the top search result
item requires the lookup, reconstruction, and markup of
the snippets showing the context of the keywords
matched. [1]
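A minimal sketch of the weighting and snippet steps follows. It assumes a deliberately naive weight (how often the keywords occur in the page text) and invented page contents; production search engines combine many more signals. The function names are hypothetical.

```python
def score(text, keywords):
    """Weight a page by how many times the keywords occur in it (assumed metric)."""
    words = text.lower().split()
    return sum(words.count(k) for k in keywords)

def rank(pages, query, top=1):
    """Return the URLs of the `top` highest-weighted pages that match the query."""
    keywords = query.lower().split()
    scored = [(score(text, keywords), url) for url, text in pages.items()]
    scored = [(s, url) for s, url in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [url for _, url in scored[:top]]

def snippet(text, keyword, width=2):
    """Rebuild a short context window around the first keyword match and mark it up."""
    words = text.split()
    lowered = [w.lower() for w in words]
    if keyword not in lowered:
        return ""
    i = lowered.index(keyword)
    start = max(0, i - width)
    window = words[start:i + width + 1]
    window[i - start] = "[" + window[i - start] + "]"
    return " ".join(window)

# Made-up pages
pages = {
    "a.html": "big data and data mining turn raw data into knowledge",
    "b.html": "predictive analytics uses historical data",
    "c.html": "weka is a machine learning workbench",
}

ranking = rank(pages, "data", top=2)
top_snippet = snippet(pages[ranking[0]], "data")
print(ranking, top_snippet)
```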
[12]
18. Applications / Uses of Data Science
The most common applications of data science that we
use in our daily lives:
Internet Search
When we speak of search, we think ‘Google’. Right?
But there are many other search engines, such as Yahoo,
Bing, Ask, AOL and DuckDuckGo. All these search
engines (including Google) make use of data
science algorithms to deliver the best result for our
searched query in a fraction of a second. Considering
that Google processes more than 20 petabytes of data
every day, had there been no data science, Google
wouldn’t have been the ‘Google’ we know today [1].
[12]
19. Applications / Uses of Data Science
Recommender Systems
Who can forget the suggestions about
similar products on Amazon? They not
only help you find relevant products among
the billions available, but also add a lot
to the user experience.
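One simple way to produce such suggestions is to count which items are bought together. The sketch below is a toy co-occurrence recommender over invented shopping baskets; it is not a description of how Amazon's system works, and the function name and data are assumptions for illustration.

```python
from collections import Counter

def recommend(baskets, item, top=2):
    """Suggest the items that most often appear in the same basket as `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            for other in basket:
                if other != item:
                    counts[other] += 1
    # Most frequent first; ties broken alphabetically for a stable result
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top]]

# Made-up purchase history
baskets = [
    {"laptop", "mouse", "bag"},
    {"laptop", "mouse"},
    {"laptop", "bag", "mouse"},
    {"phone", "case"},
    {"laptop", "keyboard"},
]

suggestions = recommend(baskets, "laptop")
print(suggestions)
```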
[12]
21. Data Mining
Data are any facts, numbers, or text that can be processed by a computer.
The overall goal of the data mining process is to extract information from a
data set and transform it into an understandable structure for further use.
Data mining refers to the science of collecting all the past data and then
searching for patterns in this data. You look for consistent patterns and/or
relationships between variables. Once you find these insights, you validate the
findings by applying the detected patterns to new subsets of data. The
ultimate goal of data mining is prediction [14].
Data mining is the analysis step of the "knowledge discovery in databases"
process, or KDD.
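The "find a pattern, then validate it on new data" loop can be sketched with a single association rule. The example below measures how often butter appears in baskets that contain bread, first on past transactions and then on a new subset. The transactions and the rule are invented, and confidence is only one of several measures used in practice.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    matches = sum(1 for t in with_antecedent if consequent in t)
    return matches / len(with_antecedent)

# Made-up past data in which the pattern is discovered
past = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "jam"},
    {"milk"},
    {"bread", "butter"},
]

# Made-up new subset used to validate the pattern
new = [
    {"bread", "butter"},
    {"bread"},
    {"bread", "butter", "jam"},
    {"milk", "butter"},
]

past_confidence = rule_confidence(past, "bread", "butter")
new_confidence = rule_confidence(new, "bread", "butter")
print(past_confidence, new_confidence)
```

If the confidence on the new subset stays close to the value found in the past data, the pattern is a candidate for prediction; a large drop suggests it was an accident of the original sample.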
22. Data mining consists of five major elements:
Extract, transform, and load transaction data onto the data warehouse system.
Store and manage the data in a multidimensional database system.
Provide data access to business analysts and information technology professionals.
Analyze the data by application software.
Present the data in a useful format, such as a graph or table.
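The five elements can be walked through end to end on a tiny scale. In this sketch an in-memory SQLite database stands in for the data warehouse (it is relational, not multidimensional), the raw CSV text is invented, and the "analysis" is a single GROUP BY query; all names are assumptions for illustration.

```python
import csv
import io
import sqlite3

# Made-up raw transaction data
RAW = """customer,amount
alice, 10.50
bob,5
alice,4.50
"""

def run_pipeline(raw_csv):
    # Extract: read the raw rows
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    # Transform: trim names and convert amounts to numbers
    cleaned = [(r["customer"].strip(), float(r["amount"])) for r in rows]
    # Load and store: put the cleaned rows into a database table
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    # Provide access and analyze: total spend per customer
    totals = db.execute(
        "SELECT customer, SUM(amount) FROM sales "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    db.close()
    return totals

totals = run_pipeline(RAW)

# Present: print the result as a small table
for customer, total in totals:
    print(f"{customer:<10}{total:>8.2f}")
```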
23. Why Data Mining?
Data, data everywhere, yet...
I can't find the data I need.
I can't get the data I need.
I can't understand the data I found.
I can't use the data I found.
25. Data Mining Software
Orange
Orange is an open-source data visualization, machine
learning and data mining toolkit. It features a visual programming
front-end for explorative data analysis and interactive data
visualization, and can also be used as a Python library.
Weka
Weka is a workbench that contains a collection of visualization tools
and algorithms for data analysis and predictive modeling, together
with graphical user interfaces for easy access to these functions.
27. WEKA (Interface)
Thinking machines is what WEKA does best. WEKA is a collection of machine learning algorithms.
Machine learning is a form of artificial intelligence.
29. Big Data
Big data is a term for data sets that are so large or complex that traditional data processing application
software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation,
search, sharing, transfer, visualization, querying, updating and information privacy.
[12]
30. Big Data
Big Data is not about the size of the data, it's about the value
within the data.
31. Big Data
With datafication comes big data, which is often described using
the four Vs:
Volume: Refers to the vast amounts of data generated every
second.
Velocity: Refers to the speed at which new data is generated and
the speed at which data moves around.
Variety: Refers to the different types of data we can now use.
Veracity: Refers to the messiness or trustworthiness of the
data.
33. Predictive Analytics
Predictive analytics is the branch of advanced analytics that is
used to make predictions about unknown future events. It uses
many techniques from data mining, statistics, modeling, machine
learning, and artificial intelligence to analyze current data and
make predictions about the future. It uses a number of
data mining and analytical techniques to bring together
management, information technology, and business process
modeling to make predictions about the future. The patterns found in
historical and transactional data can be used to identify risks and
opportunities for the future.
34. Predictive Analytics
Data mining and text analytics, along with statistics,
allow business users to create predictive intelligence
by uncovering patterns and relationships in both
structured and unstructured data.
35. Predictive Analytics Process
1. Define Project: Define the project outcomes, deliverables, scope of the effort and business
objectives, and identify the data sets that are going to be used.
2. Data Collection: Data mining for predictive analytics prepares data from multiple sources for
analysis. This provides a complete view of the customer interactions.
3. Data Analysis: Data analysis is the process of inspecting, cleaning, transforming, and
modeling data with the objective of discovering useful information and arriving
at conclusions.
4. Statistics: Statistical analysis makes it possible to validate assumptions and hypotheses
and to test them using standard statistical models.
36. Predictive Analytics Process
5. Modeling: Predictive modeling provides the ability to automatically create accurate
predictive models about the future. There are also options to choose the best
solution with multi-model evaluation.
6. Deployment: Predictive model deployment provides the option to deploy the analytical results
into the everyday decision-making process to get results, reports and
output by automating the decisions based on the modeling.
7. Model Monitoring: Models are managed and monitored to review the model performance and ensure that it is providing
the results expected.
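Steps 2 to 7 can be compressed into a short sketch. It assumes a trivial setup: the collected data are invented (x, y) pairs with one missing value, the model is a least-squares line, evaluation uses mean absolute error on held-out records, and monitoring is a simple error-threshold check. All function names, the tolerance and the data are hypothetical.

```python
def clean(records):
    """Data analysis: drop records with missing values."""
    return [(x, y) for x, y in records if x is not None and y is not None]

def fit(records):
    """Modeling: least-squares fit of y = slope * x + intercept."""
    n = len(records)
    mean_x = sum(x for x, _ in records) / n
    mean_y = sum(y for _, y in records) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in records)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in records)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, records):
    """Statistics: average absolute gap between predictions and actual values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in records) / len(records)

def needs_retraining(model, recent, tolerance=1.0):
    """Model monitoring: flag the model when its recent error exceeds the tolerance."""
    return mean_abs_error(model, recent) > tolerance

# Data collection: made-up historical records, one with a missing value
history = [(1, 12), (2, 14), (None, 9), (3, 16), (4, 18), (5, 20), (6, 22)]

data = clean(history)
train, test = data[:4], data[4:]      # hold out the last records for validation
model = fit(train)
test_error = mean_abs_error(model, test)

# Deployment: the model now scores new inputs; later data has drifted
recent = [(7, 30), (8, 33)]
retrain = needs_retraining(model, recent)
print(model, test_error, retrain)
```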
38. Applications of Predictive Analytics
1. Customer relationship management (CRM)
2. Health care
3. Collection analytics
4. Fraud detection
5. Risk management
6. Direct marketing
7. Underwriting
39. Predictive Analytics Software
Microsoft Azure Machine Learning
Anaconda
MATLAB
SAP Predictive Analytics
SAS Predictive Analytics
IBM Predictive Analytics
Microsoft R
……..