Big Data & Analytics
Workshop
Yaman Hajja, Ph.D.
yamanhajja@gmail.com
March 24, 2017
Yaman Hajja | Big Data & Analytics
Introduction
What is data?
Data
is a set of values of qualitative or quantitative variables.
is any sequence of one or more symbols given meaning by
specific act(s) of interpretation. [In Computing].
Data → Information
Data requires interpretation to become information.
Data is the new oil of the digital economy
Data in the 21st century is like oil in the 18th century.
Data is the new oil of
the digital economy.
Data infrastructure should become a profit center.
Types of data
Types of data. Translation of document hosted by João Netoat.
Open Data
Open Data
is the idea that some data should be freely available to everyone
to use and republish as they wish, without restrictions from
copyright, patents or other mechanisms of control.
Example:
Linked Datasets as of August 2014. Tungsten Tide.
Datasets for data science projects
Examples:
- Analytics Vidhya
- Kaggle
- DrivenData
- OpenDataSoft
- Open Data Inception
What is data analysis?
Data analysis, also known as data analytics, is a process of
inspecting, cleansing, transforming, and modeling data with the
goal of discovering useful information, suggesting conclusions,
and supporting decision-making.
Data analysis has multiple facets and approaches,
encompassing diverse techniques under a variety of names, in
different business, science, and social science domains.
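The inspect, cleanse, transform and model steps can be illustrated with a minimal Python sketch. The sales figures, the outlier threshold of 1000 and all variable names are made up for illustration; a real project would likely use a data-frame library instead of plain lists.

```python
import statistics

# Made-up raw records: monthly sales figures, some missing or malformed.
raw = ["120", "135", "", "n/a", "150", "  142 ", "9999", "138"]

# Inspect: find entries that are not plain digits once whitespace is trimmed.
invalid = [value for value in raw if not value.strip().isdigit()]

# Cleanse and transform: drop invalid entries, convert the rest to integers.
values = [int(value.strip()) for value in raw if value.strip().isdigit()]

# Cleanse further: discard an implausible outlier (the threshold is an assumption).
values = [v for v in values if v < 1000]

# Model: summarise the cleaned data with a mean and a standard deviation.
mean_sales = statistics.mean(values)
spread = statistics.stdev(values)

print(f"{len(invalid)} invalid entries removed")
print(f"mean = {mean_sales:.1f}, stdev = {spread:.1f}")
```

Each step is deliberately kept on its own line so that the order of the process stays visible.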
What is data analysis?
Statistical data
Statistical data?
Statistical analysis:
is a component of data analytics. In the context of business
intelligence (BI), statistical analysis involves collecting and
scrutinizing every data sample in a set of items from which
samples can be drawn.
A sample,
in statistics, is a representative selection drawn from a total
population.
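As a rough illustration of a sample drawn from a population, the Python sketch below draws a simple random sample and compares its mean with the population mean. The population of simulated ages, the sample size of 500 and the seed are all assumptions chosen for the example.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated ages (illustrative data only).
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.randint(18, 80) for _ in range(10_000)]

# Draw a simple random sample of 500 items without replacement.
sample = rng.sample(population, k=500)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

With a random sample of this size, the two means are usually close, which is what makes a sample useful as a stand-in for the whole population.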
Data analysis process
Data → Information → Intelligence
Understanding Big Data
Big Data
is a term for data sets that are so
large or complex that traditional data
processing application software is
inadequate to deal with them.
Challenges include capture, storage,
analysis, data curation [a], search,
sharing, transfer, visualization, querying,
updating and information privacy.
[a] Organization and integration of data collected from
various sources.
Big Data Characteristics
3 Vs
1. Volume: big data doesn’t sample; it just observes and tracks
what happens
2. Velocity: big data is often available in real-time
3. Variety: big data draws from text, images, audio, video; plus it
completes missing pieces through data fusion
Who can deal with Big Data?
Multidisciplinary!!!
Big Data tools
Some Big Data facts
Big data and business analytics revenues are forecast to reach
$150.8 billion this year, led by banking and manufacturing
investments, according to International Data Corporation
(IDC), an increase of 12.4% over 2016.
Twenty-five years ago, data was growing at a rate of 100GB a
day. Now, data grows at a rate of almost 50,000GB a second.
The world today is awash in data. In 2015, mankind produced as
much information as was created in all previous years of human
civilization. Every time we send a message, make a call, or
complete a transaction, we leave digital traces.
Data scientists vs data analysts
Data Visualization
Data visualization is a general term that describes any effort to
help people understand the significance of data by placing it in a
visual context. Patterns, trends and correlations that might go
undetected in text-based data can be exposed and recognized
more easily with data visualization software.
Example: Data Visualized
Figure: NPLs of the Malaysian banking system against money supply (M1), exchange rate, and charter value (2002 M1 - 2015 M8).
Series: NPLs %, money supply M1 % p.a., exchange rate, charter value %.
Horizontal axis: time (2002 M1 - 2015 M8).
Example#2: Data Visualized
Figure: NPLs of the Malaysian banking system over the business cycle (GDP), with capital ratio (1998 M1 - 2015 M3).
Series: NPLs %, GDP growth %, capital ratio %.
Example#3: Data Visualized
Figure: NPLs of the Malaysian banking system over the business cycle (1998 Q1 - 2015 Q1).
Series: NPLs %, lending interest rate %, inflation (CP) %, unemployment %.
Horizontal axis: time (1998 Q1 - 2015 Q1).
Visualization Types
Social Network Analysis
Social network analysis (SNA) is the process of investigating
social structures through the use of network and graph
theories.
It characterizes networked structures in terms of nodes
(individual actors, people, or things within the network) and the
ties, edges, or links (relationships or interactions) that connect
them. Examples of social structures commonly visualized
through social network analysis include social media networks.
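The node-and-tie description above can be sketched in a few lines of Python. The names and friendships are invented, and degree centrality (the share of other nodes a node is tied to) is only one of several measures an analyst might pick.

```python
from collections import defaultdict

# Hypothetical friendship ties (edges) between people (nodes).
edges = [
    ("Ana", "Ben"),
    ("Ana", "Chen"),
    ("Ana", "Dina"),
    ("Ben", "Chen"),
    ("Dina", "Eli"),
]

# Build an undirected adjacency list.
neighbours = defaultdict(set)
for a, b in edges:
    neighbours[a].add(b)
    neighbours[b].add(a)

# Degree centrality: share of the other nodes each node is tied to.
n = len(neighbours)
degree_centrality = {
    node: len(adjacent) / (n - 1) for node, adjacent in neighbours.items()
}

most_connected = max(degree_centrality, key=degree_centrality.get)
print(most_connected, degree_centrality[most_connected])
```

Dedicated graph libraries offer many more measures; the plain dictionary here is only meant to show the underlying idea.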
Example of Social Network Analysis
Data visualization of Facebook relationships
Network Theory Tools
What exactly is the meaning of an API?
Application Programming Interface (API)
An API is a particular set of rules ('code')
and specifications that software
programs can follow to communicate
with each other.
It serves as an interface between
different software programs and
facilitates their interaction, similar to the
way the user interface facilitates
interaction between humans and
computers.
What exactly is the meaning of an API?
Application Programming Interface (API)
API is a set of subroutine definitions, protocols, and tools for building
application software.
It is a set of clearly defined methods of communication between
various software components. A good API makes it easier to develop
a computer program by providing all the building blocks, which are
then put together by the programmer.
An API may be for a web-based system, operating system, database
system, computer hardware or software library. An API specification
can take many forms, but often includes specifications for routines,
data structures, object classes, variables or remote calls.
Microsoft Windows API, the C++ Standard Template Library and Java
APIs are examples of different forms of APIs.
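One way to picture an API as a set of clearly defined routines is the small Python sketch below. The `TemperatureSource` interface, the `FixedTemperatureSource` class and the temperature reading are all hypothetical; they do not belong to any real library.

```python
from typing import Protocol


class TemperatureSource(Protocol):
    """Hypothetical API: any provider must offer this one routine."""

    def current_celsius(self, city: str) -> float: ...


class FixedTemperatureSource:
    """Toy implementation that returns hard-coded, made-up readings."""

    def __init__(self, readings: dict[str, float]) -> None:
        self._readings = readings

    def current_celsius(self, city: str) -> float:
        return self._readings[city]


def describe(source: TemperatureSource, city: str) -> str:
    # The caller relies only on the interface, not on how data is obtained.
    celsius = source.current_celsius(city)
    fahrenheit = celsius * 9 / 5 + 32
    return f"{city}: {celsius:.1f} C / {fahrenheit:.1f} F"


source = FixedTemperatureSource({"Kuala Lumpur": 31.0})
print(describe(source, "Kuala Lumpur"))
```

Because `describe` depends only on the interface, the toy source could be swapped for one backed by a web service without changing the calling code.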
API
Example of web API
Shiny Weather Data
A web API is an application programming interface (API) for
either a web server or a web browser.
Shiny Weather Data is a web service making different sources of
European gridded climate data available in hourly time series
formats used by common building performance modeling tools.
This web service has been around for a while and has a steadily
growing user group of professional building modelers as well as
students and researchers.
satellite-based time series of solar irradiation for the actual
weather conditions as well as for clear-sky conditions
Portfolio Visualizer
Predictive Analytics
Predictive analytics is the branch of
advanced analytics that is used to
make predictions about unknown future
events.
Predictive analytics uses many
techniques from data mining, statistics,
modeling, machine learning, and
artificial intelligence to analyze current
data and make predictions about the future.
Predictive Analytics
Probability and Statistics
Probability is the measure of the likelihood that an event will
occur. Probability is quantified as a number between 0 and 1
(where 0 indicates impossibility and 1 indicates certainty). The
higher the probability of an event, the more certain that the event
will occur.
A simple example is the tossing of a coin. If the coin is
unbiased, the two outcomes ("head" and "tail") are equally
probable; the probability of "head" equals the probability of
"tail". Since no other outcomes are possible, the probability of
each is 1/2 (or 50%).
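The coin example can be checked empirically with a short simulation in Python. The number of tosses and the random seed are arbitrary choices; the estimate only approaches 1/2 and will rarely equal it exactly.

```python
import random

rng = random.Random(0)  # fixed seed for reproducibility
tosses = 100_000

# Simulate tosses of a fair coin; each toss is "head" with probability 1/2.
heads = sum(rng.random() < 0.5 for _ in range(tosses))

estimate = heads / tosses
print(f"estimated P(head) = {estimate:.3f}")
```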
Probability Theory
Probability Theory is the branch of mathematics concerned
with probability, the analysis of random phenomena.
The central objects of probability theory are random variables,
stochastic processes, and events: mathematical abstractions of
non-deterministic events or measured quantities that may either
be single occurrences or evolve over time in an apparently
random fashion.
Example
Statistics
Statistics is "a branch of mathematics dealing with the
collection, analysis, interpretation, and presentation of masses of
numerical data" (Merriam-Webster dictionary).
In applying statistics to, e.g., a scientific, industrial, or social
problem, it is conventional to begin with a statistical population or
a statistical model process to be studied.
Populations can be diverse topics such as "all people living in a
country" or "every atom composing a crystal".
Statistics deals with all aspects of data including the planning of
data collection in terms of the design of surveys and
experiments.
Normal Distribution
Normal (or Gaussian) distribution is a very common continuous
probability distribution. Normal distributions are important in
statistics and are often used in the natural and social sciences to
represent real-valued random variables whose distributions are
not known.
LINK (Normal Distribution).
Normal Distribution
Probability density function
Figure: The red curve is the standard normal distribution
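For readers without the figure, the standard normal distribution (mean 0, standard deviation 1) has density f(x) = exp(-x^2 / 2) / sqrt(2π). A few of its properties can be checked with `statistics.NormalDist` from the Python standard library, as a minimal sketch:

```python
import math
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Density at the mean equals 1 / sqrt(2 * pi).
peak = standard_normal.pdf(0.0)

# Probability mass within one and two standard deviations of the mean.
within_one = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)
within_two = standard_normal.cdf(2.0) - standard_normal.cdf(-2.0)

print(f"pdf(0)     = {peak:.4f}")
print(f"P(|Z| < 1) = {within_one:.4f}")
print(f"P(|Z| < 2) = {within_two:.4f}")
```

The last two values are the familiar "about 68%" and "about 95%" rules of thumb.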
Other Distributions
p-value
The p-value, or calculated probability, is the probability of finding the
observed, or more extreme, results when the null hypothesis (H0) of a
study question is true. The definition of 'extreme' depends on how
the hypothesis is being tested.
- LINK.
- Seeing Theory website.
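As a small worked case, suppose a coin shows 60 heads in 100 tosses and H0 says the coin is fair. The Python sketch below computes an exact one-sided p-value, taking "more extreme" to mean "at least as many heads". The numbers and the function name `binomial_p_value` are assumptions made for illustration.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) under the null hypothesis."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


p_value = binomial_p_value(successes=60, trials=100)
print(f"p-value = {p_value:.4f}")
```

A two-sided test would also count results with 40 or fewer heads as extreme, which is one way the definition of 'extreme' changes with the test.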
What is Regression Analysis?
Regression analysis is a predictive modelling technique
that investigates the relationship between a dependent (target)
variable and one or more independent (predictor) variables.
This technique is used for forecasting, time series modelling and
finding cause-effect relationships between variables. For
example, the relationship between rash driving and the number of road
accidents by a driver is best studied through regression.
Regression analysis is an important tool for modelling and
analyzing data.
There are multiple benefits of using regression analysis:
- It indicates the significant relationships between the dependent
variable and the independent variables.
- It indicates the strength of the impact of multiple independent
variables on a dependent variable.
Linear Regression
Linear regression is one of the most widely known
modeling techniques. It is usually among the
first few topics that people pick up
while learning predictive
modeling.
Linear Regression establishes a
relationship between dependent
variable (Y) and one or more
independent variables (X) using
a best fit straight line (also
known as regression line).
Linear Regression. Cont.
It is represented by the equation
Y = α + βX + e, where α is the
intercept, β is the slope of the line
and e is the error term. This equation
can be used to predict the value
of the target variable based on given
predictor variable(s).
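A minimal Python sketch of fitting the line by ordinary least squares with a single predictor follows. The data points are made up, and `fit_line` is an illustrative helper rather than a library function; the workshop itself points to R for this task.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return (alpha, beta) minimising the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    beta = s_xy / s_xx                # slope of the regression line
    alpha = mean_y - beta * mean_x    # intercept
    return alpha, beta


# Made-up observations lying close to Y = 1 + 2X.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

alpha, beta = fit_line(xs, ys)
prediction = alpha + beta * 6.0  # predicted Y for a new X = 6

print(f"alpha = {alpha:.2f}, beta = {beta:.2f}, prediction = {prediction:.2f}")
```

The error term e is whatever remains between each observed Y and the fitted line.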
Data Modeling then Forecasting (Simulation
of the model) Example.
Figure: Impulse-response functions (impulse : response) based on a VAR model, for the variables M1, LENR, CA^2, CA and NPLs.
Orthogonalized IRFs with 95% confidence intervals, generated by Monte Carlo with 1000 repetitions.
Horizontal axis: step (1-month), 0 to 30.
Back to R Programming
How to fetch stock data?
Example: How to fetch stock data?
Financial time series forecasting – an easy approach
Yahoo Finance
Back to R Programming
R - Linear Regression
Example
Linear Regression in R.
Back to R Programming
R - Linear Regression
Example
Advanced R
Artificial intelligence (AI)
Definition
AI is intelligence exhibited by machines. In computer science,
the field of AI research defines itself as the study of "intelligent
agents": any device that perceives its environment and takes
actions that maximize its chance of success at some goal.
The term "artificial intelligence" is applied when a machine
mimics "cognitive" functions that humans associate with other
human minds, such as "learning" and "problem solving" (known
as Machine Learning).
In August 2001, robots beat humans in a simulated financial
trading competition.
Artificial intelligence (AI)
List of programming languages for artificial intelligence
Definition
Python is widely used for artificial intelligence. It has a lot of
packages for different kinds of AI: general AI, machine
learning, natural language processing and neural networks.
Companies like Narrative Science use Python to create an
artificial intelligence for Narrative Language Processing.
MATLAB.
C++.
Machine learning
Definition
Machine learning is the subfield of computer science that gives
computers the ability to learn without being explicitly
programmed. Having evolved from the study of pattern recognition
and computational learning theory in artificial intelligence,
machine learning explores the study and construction of
algorithms that can learn from and make predictions on
data. Such algorithms go beyond following strictly static program
instructions by making data-driven predictions or decisions,
through building a model from sample inputs.
Machine learning is employed in a range of computing tasks
where designing and programming explicit algorithms with good
performance is difficult or infeasible; example applications
include spam filtering, optical character recognition (OCR),
search engines and computer vision.
Machine learning
Definition +
Machine learning is a branch in computer science that studies
the design of algorithms that can learn. Typical machine learning
tasks are concept learning, function learning or “predictive
modeling”, clustering and finding predictive patterns.
These tasks are learned through available data that were
observed through experiences or instructions, for example.
Machine learning hopes that including the experience into its
tasks will eventually improve the learning. The ultimate goal is to
improve the learning in such a way that it becomes automatic, so
that humans like ourselves don’t need to interfere any more.
Machine learning
Figure: The machine learning process starts with raw data and ends up with
a model derived from that data.
Common Machine Learning Algorithms
Naïve Bayes Classifier Algorithm
K Means Clustering Algorithm
Support Vector Machine Algorithm
Apriori Algorithm
Linear Regression
Logistic Regression
Artificial Neural Networks
Random Forests
Decision Trees
Nearest Neighbours (k-nearest neighbours, "KNN")
The Role of [R] in machine learning
Much of the work done by a data scientist involves statistics. For
example, machine learning algorithms commonly apply some
kind of statistical technique to prepared data.
But doing this kind of work can sometimes require programming.
What programming language is best for statistical computing?
The answer is clear: It’s the open-source language called R.
Created in New Zealand more than 20 years ago, R has
become the lingua franca for writing code in this area. In
fact, it’s hard to find a data scientist who doesn’t know R.
Example: Machine learning in R using the k-nearest neighbours
algorithm.
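The k-nearest neighbours idea can be sketched as follows. This sketch is written in Python rather than R, and the training points, the labels "A" and "B", and the helper `knn_predict` are invented for illustration; it is not the linked R example.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D points with two labels.
train = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"),
    ((6.5, 7.0), "B"),
    ((7.0, 6.5), "B"),
]

print(knn_predict(train, (1.8, 1.6)))
print(knn_predict(train, (6.2, 6.4)))
```

The choice of k and of the distance measure (Euclidean here) both affect the result and are normally tuned on held-out data.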
Machine learning
Data mining
Definition
Data mining is the computational process of discovering
patterns in large data sets involving methods at the intersection
of artificial intelligence, machine learning, statistics, and
database systems.
It is an interdisciplinary subfield of computer science.
Data mining
Definition 2
Data in digital form are available everywhere and can be used to
predict the future. Usually the statistical approach is used. Data
mining is an extension of traditional data analysis and statistical
approaches in that it incorporates analytical techniques drawn
from a range of disciplines.
Data mining covers the entire process of data analysis,
including data cleaning and preparation and visualization of the
results, and how to produce predictions in real-time so that
specific goals are met.
Source
Data mining process and concept
Figure: Data mining is part of the knowledge discovery process (KDD: knowledge
discovery from data). It can be considered a step in an iterative knowledge
discovery process, as shown in the figure (Fayyad, Piatetsky-Shapiro & Smyth, 1996).
Data mining in "Risk Management"
Data mining creates models through data analysis and
prediction to help solve problems involving both project feasibility
and risk management.
Data mining has been used to analyze a database containing
information on a person’s history, achievements, and expertise.
The goal was to develop a profile of the maturity of a certain
project involving the resource capacity, especially human capital.
Data mining tools
Data mining Cont.
Why Data Mining?
It helps to discover reasons for success and failure.
It helps to understand your customers, products etc.
It improves your organization by mining large databases.
SQL Data Mining Algorithms
A set of clusters illustrates how the cases in a dataset are related.
A decision tree forecasts an outcome and its after-effects.
A set of rules explains how products are grouped in a transaction.
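The idea of rules that describe how products group within transactions can be sketched in Python with two common association-rule measures, support and confidence. These measures, the shopping baskets and the function names are assumptions brought in for illustration; the slide does not name a specific algorithm.

```python
# Made-up shopping baskets (transactions).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset: set[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(itemset <= basket for basket in transactions)
    return hits / len(transactions)


def confidence(antecedent: set[str], consequent: set[str]) -> float:
    """Estimated P(consequent | antecedent) from the transactions."""
    return support(antecedent | consequent) / support(antecedent)


rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})

print(f"support(bread, butter)     = {rule_support:.2f}")
print(f"confidence(bread -> butter) = {rule_confidence:.2f}")
```

A rule such as "bread -> butter" is usually kept only if both measures exceed thresholds chosen by the analyst.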
World wide data
Move On To Big Data!!!
Thank you!

More Related Content

What's hot

Lecture1 introduction to big data
Lecture1 introduction to big dataLecture1 introduction to big data
Lecture1 introduction to big data
hktripathy
 
Big data in healthcare
Big data in healthcareBig data in healthcare
Big data in healthcare
Xavier Rafael Palou
 
Big Data
Big DataBig Data
Big Data
Vinayak Kamath
 
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Simplilearn
 
Team 2 Big Data Presentation
Team 2 Big Data PresentationTeam 2 Big Data Presentation
Team 2 Big Data PresentationMatthew Urdan
 
Big Data & Hadoop Introduction
Big Data & Hadoop IntroductionBig Data & Hadoop Introduction
Big Data & Hadoop Introduction
Jayant Mukherjee
 
Big data
Big dataBig data
Big data
Nimish Kochhar
 
Big Data Ppt PowerPoint Presentation Slides
Big Data Ppt PowerPoint Presentation Slides Big Data Ppt PowerPoint Presentation Slides
Big Data Ppt PowerPoint Presentation Slides
SlideTeam
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyRohit Dubey
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
Md. Salman Ahmed
 
Big data-ppt
Big data-pptBig data-ppt
Big data-ppt
Nazir Ahmed
 
Big Data use cases in telcos
Big Data use cases in telcosBig Data use cases in telcos
Big Data use cases in telcos
Mohamed Zuber Khatib
 
Big Data
Big DataBig Data
Big Data
Rohit Jain
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Vipin Batra
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
Vikram Nandini
 
Big data
Big dataBig data
Big data
Ami Redwan Haq
 
Data Analytics
Data AnalyticsData Analytics
Data Analytics
Srinimf-Slides
 
Big data
Big dataBig data
Big data
Pooja Shah
 

What's hot (20)

Lecture1 introduction to big data
Lecture1 introduction to big dataLecture1 introduction to big data
Lecture1 introduction to big data
 
Big data in healthcare
Big data in healthcareBig data in healthcare
Big data in healthcare
 
Big Data
Big DataBig Data
Big Data
 
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
 
Team 2 Big Data Presentation
Team 2 Big Data PresentationTeam 2 Big Data Presentation
Team 2 Big Data Presentation
 
Big data
Big dataBig data
Big data
 
Big Data ppt
Big Data pptBig Data ppt
Big Data ppt
 
Big Data & Hadoop Introduction
Big Data & Hadoop IntroductionBig Data & Hadoop Introduction
Big Data & Hadoop Introduction
 
Big data
Big dataBig data
Big data
 
Big Data Ppt PowerPoint Presentation Slides
Big Data Ppt PowerPoint Presentation Slides Big Data Ppt PowerPoint Presentation Slides
Big Data Ppt PowerPoint Presentation Slides
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit Dubey
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Big data-ppt
Big data-pptBig data-ppt
Big data-ppt
 
Big Data use cases in telcos
Big Data use cases in telcosBig Data use cases in telcos
Big Data use cases in telcos
 
Big Data
Big DataBig Data
Big Data
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
Big data
Big dataBig data
Big data
 
Data Analytics
Data AnalyticsData Analytics
Data Analytics
 
Big data
Big dataBig data
Big data
 

Viewers also liked

What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
Bernard Marr
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
Philippe Julio
 
Impact of big data on analytics
Impact of big data on analyticsImpact of big data on analytics
Impact of big data on analytics
Capgemini
 
Big-data analytics: challenges and opportunities
Big-data analytics: challenges and opportunitiesBig-data analytics: challenges and opportunities
Big-data analytics: challenges and opportunities
台灣資料科學年會
 
IQ Crash Course - Big Data Analytics
IQ Crash Course - Big Data AnalyticsIQ Crash Course - Big Data Analytics
IQ Crash Course - Big Data Analytics
InterQuest Group
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your Niche
Leslie Samuel
 
How to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & TricksHow to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & Tricks
SlideShare
 
Getting Started With SlideShare
Getting Started With SlideShareGetting Started With SlideShare
Getting Started With SlideShare
SlideShare
 
Decision support system
Decision  support  systemDecision  support  system
Decision support systemNoriha Nori
 
Big Data Use Cases for Different Verticals and Adoption Patterns - Impetus We...
Big Data Use Cases for Different Verticals and Adoption Patterns - Impetus We...Big Data Use Cases for Different Verticals and Adoption Patterns - Impetus We...
Big Data Use Cases for Different Verticals and Adoption Patterns - Impetus We...
Impetus Technologies
 
Big Data Report - 16 JULY 2012
Big Data Report - 16 JULY 2012Big Data Report - 16 JULY 2012
Big Data Report - 16 JULY 2012
Lora Cecere
 
DevOps - Motivadores e Benefícios
DevOps - Motivadores e BenefíciosDevOps - Motivadores e Benefícios
DevOps - Motivadores e Benefícios
Flávio Secchieri Mariotti
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big data
Richard Vidgen
 
The What, Why and How of Big Data
The What, Why and How of Big DataThe What, Why and How of Big Data
The What, Why and How of Big Data
Luca Naso
 
Benefícios e desafios que Big Data & Analytics traz para as empresas na jorna...
Benefícios e desafios que Big Data & Analytics traz para as empresas na jorna...Benefícios e desafios que Big Data & Analytics traz para as empresas na jorna...
Benefícios e desafios que Big Data & Analytics traz para as empresas na jorna...
Flávio Secchieri Mariotti
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Karan Desai
 
Big data Hadoop
Big data  Hadoop   Big data  Hadoop
Big data Hadoop
Ayyappan Paramesh
 
Big data Introduction by Mohan
Big data Introduction by MohanBig data Introduction by Mohan
Big data Introduction by Mohan
Venkata Reddy Konasani
 

Viewers also liked (20)

What is Big Data?
What is Big Data?What is Big Data?
What is Big Data?
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Impact of big data on analytics
Impact of big data on analyticsImpact of big data on analytics
Impact of big data on analytics
 
Big-data analytics: challenges and opportunities
Big-data analytics: challenges and opportunitiesBig-data analytics: challenges and opportunities
Big-data analytics: challenges and opportunities
 
IQ Crash Course - Big Data Analytics
IQ Crash Course - Big Data AnalyticsIQ Crash Course - Big Data Analytics
IQ Crash Course - Big Data Analytics
 
Big data ppt
Big  data pptBig  data ppt
Big data ppt
 
How to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your NicheHow to Become a Thought Leader in Your Niche
How to Become a Thought Leader in Your Niche
 
How to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & TricksHow to Make Awesome SlideShares: Tips & Tricks
How to Make Awesome SlideShares: Tips & Tricks
 
Big data experiments
Big data experimentsBig data experiments
Big data experiments
 
Getting Started With SlideShare
Getting Started With SlideShareGetting Started With SlideShare
Getting Started With SlideShare
 
Decision support system
Decision  support  systemDecision  support  system
Decision support system
 
Big Data Use Cases for Different Verticals and Adoption Patterns - Impetus We...
Big Data Use Cases for Different Verticals and Adoption Patterns - Impetus We...Big Data Use Cases for Different Verticals and Adoption Patterns - Impetus We...
Big Data Use Cases for Different Verticals and Adoption Patterns - Impetus We...
 
Big Data Report - 16 JULY 2012
Big Data Report - 16 JULY 2012Big Data Report - 16 JULY 2012
Big Data Report - 16 JULY 2012
 
DevOps - Motivadores e Benefícios
DevOps - Motivadores e BenefíciosDevOps - Motivadores e Benefícios
DevOps - Motivadores e Benefícios
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big data
 
The What, Why and How of Big Data
The What, Why and How of Big DataThe What, Why and How of Big Data
The What, Why and How of Big Data
 
Benefícios e desafios que Big Data & Analytics traz para as empresas na jorna...
Benefícios e desafios que Big Data & Analytics traz para as empresas na jorna...Benefícios e desafios que Big Data & Analytics traz para as empresas na jorna...
Benefícios e desafios que Big Data & Analytics traz para as empresas na jorna...
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Big data Hadoop
Big data  Hadoop   Big data  Hadoop
Big data Hadoop
 
Big data Introduction by Mohan
Big data Introduction by MohanBig data Introduction by Mohan
Big data Introduction by Mohan
 





Big Data & Analytics (Conceptual and Practical Introduction)

  • 1. Big Data & Analytics Workshop Yaman Hajja, Ph.D. yamanhajja@gmail.com March 24, 2017
  • 4. Introduction. What is data? Data is a set of values of qualitative or quantitative variables. In computing, data is any sequence of one or more symbols given meaning by specific act(s) of interpretation. Data → Information: data requires interpretation to become information.
  • 6. Data is the new oil of the digital economy. Data in the 21st century is like oil in the 18th century. Data infrastructure should become a profit center.
  • 7. Types of data (figure). Translation of a document hosted by João Neto.
  • 9. Open Data is the idea that some data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. Example (figure): Linked Datasets as of August 2014. Tungsten Tide.
  • 10. Datasets for data science projects. Examples: analyticsvidhya, kaggle, drivendata, opendatasoft, opendatainception.
  • 11. What is data analysis? Data analysis, also known as data analytics, is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains.
  • 12. What is data analysis? Statistical data. Statistical analysis is a component of data analytics. In the context of business intelligence (BI), statistical analysis involves collecting and scrutinizing every data sample in a set of items from which samples can be drawn. A sample, in statistics, is a representative selection drawn from a total population.
  • 13. Data analysis process
  • 14. Data → Information → Intelligence
  • 15. Understanding Big Data
  • 16. Understanding Big Data. Big Data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation (the organization and integration of data collected from various sources), search, sharing, transfer, visualization, querying, updating and information privacy.
  • 17. Big Data Characteristics: the 3 Vs. 1. Volume: big data doesn’t sample; it just observes and tracks what happens. 2. Velocity: big data is often available in real time. 3. Variety: big data draws from text, images, audio, and video; plus it completes missing pieces through data fusion.
  • 18. Who can deal with Big Data?
  • 19. Who can deal with Big Data?
  • 21. Big Data tools
  • 22. Big Data tools
  • 23. Some Big Data facts. Big data and business analytics revenues are forecast to reach $150.8 billion this year (2017), an increase of 12.4% over 2016, led by banking and manufacturing investments, according to International Data Corporation (IDC). Twenty-five years ago, data was growing at a rate of 100GB a day. Now, data grows at a rate of almost 50,000GB a second. The world today is awash in data. In 2015, mankind produced as much information as was created in all previous years of human civilization. Every time we send a message, make a call, or complete a transaction, we leave digital traces.
  • 24. Data scientists vs data analysts
  • 25. Data Visualization. Data visualization is a general term that describes any effort to help people understand the significance of data by placing it in a visual context. Patterns, trends and correlations that might go undetected in text-based data can be exposed and recognized more easily with data visualization software.
  • 26. Example: Data Visualized (figure). NPLs of Malaysia banking system over M1, exchange rate, and charter value (2002 M1 - 2015 M8). Series: NPLs %, money supply M1 % pa, exchange rate, charter value %. Horizontal axis: time (2002 M1 - 2015 M8).
  • 27. Example #2: Data Visualized (figure). NPLs of Malaysia banking system over business cycle (GDP) (1998 M1 - 2015 M3) with capital ratio. Series: NPLs %, GDP growth %, capital ratio %.
  • 28. Example #3: Data Visualized (figure). NPLs of Malaysia banking system over the business cycle (1998 Q1 - 2015 Q1). Series: NPLs %, lending interest rate %, inflation (CP) %, unemployment %. Horizontal axis: time (1998 Q1 - 2015 Q1).
  • 29. Visualization Types
  • 30. Social Network Analysis. Social network analysis (SNA) is the process of investigating social structures through the use of network and graph theories. It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links (relationships or interactions) that connect them. Examples of social structures commonly visualized through social network analysis include social media networks.
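The node-and-tie description of a network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the friendship ties, the people's names, and the helper functions `build_graph` and `degree` are all hypothetical and do not come from the workshop.

```python
from collections import defaultdict

# Hypothetical friendship ties (edges) between people (nodes).
edges = [("Ana", "Ben"), ("Ana", "Cho"), ("Ben", "Cho"), ("Cho", "Dev")]

def build_graph(edge_list):
    """Return an undirected adjacency map: node -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in edge_list:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def degree(graph):
    """Count the ties attached to each node (its degree)."""
    return {node: len(neighbours) for node, neighbours in graph.items()}

graph = build_graph(edges)
degrees = degree(graph)
# The node with the most ties is one simple notion of a "central" actor.
most_connected = max(degrees, key=degrees.get)
```

Degree is the simplest centrality measure; dedicated network tools offer many others, but the underlying data structure is the same pairing of nodes and edges.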
  • 31. Example of Social Network Analysis (figure): data visualization of Facebook relationships.
  • 32. Network Theory Tools
  • 33. What exactly is the meaning of an API? An Application Programming Interface (API) is a particular set of rules (“code”) and specifications that software programs can follow to communicate with each other. It serves as an interface between different software programs and facilitates their interaction, similar to the way the user interface facilitates interaction between humans and computers.
  • 34. What exactly is the meaning of an API? An API is a set of subroutine definitions, protocols, and tools for building application software. It is a set of clearly defined methods of communication between various software components. A good API makes it easier to develop a computer program by providing all the building blocks, which are then put together by the programmer. An API may be for a web-based system, operating system, database system, computer hardware or software library. An API specification can take many forms, but often includes specifications for routines, data structures, object classes, variables or remote calls. The Microsoft Windows API, the C++ Standard Template Library and the Java APIs are examples of different forms of APIs.
  • 35. API
  • 36. Example of web API: Shiny Weather Data. A web API is an application programming interface (API) for either a web server or a web browser. Shiny Weather Data is a web service that makes different sources of European gridded climate data available in the hourly time series formats used by common building performance modeling tools. The service has been around for a while and has a steadily growing user group of professional building modelers as well as students and researchers. It offers satellite-based time series of solar irradiation for the actual weather conditions as well as for clear-sky conditions. Also listed: Portfolio Visualizer.
  • 37. Predictive Analytics. Predictive analytics is a branch of advanced analytics used to make predictions about unknown future events. It uses many techniques from data mining, statistics, modeling, machine learning, and artificial intelligence to analyze current data and make predictions about the future.
  • 38. Predictive Analytics
  • 39. Probability and Statistics. Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1 (where 0 indicates impossibility and 1 indicates certainty). The higher the probability of an event, the more certain it is that the event will occur. A simple example is the tossing of a coin. If the coin is unbiased, the two outcomes ("head" and "tail") are equally probable; the probability of "head" equals the probability of "tail". Since no other outcomes are possible, the probability of each is 1/2 (or 50%).
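The coin-toss example can be checked by simulation. The sketch below is an illustration only: the function name `estimate_heads_probability`, the number of tosses, and the fixed seed are assumptions chosen for this example.

```python
import random

def estimate_heads_probability(n_tosses, seed=0):
    """Estimate P(head) for a fair coin as the share of heads in n_tosses."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# With many tosses the observed share of heads settles near 1/2.
estimate = estimate_heads_probability(100_000)
```

The estimate is not exactly 0.5 in any finite run; it only gets closer, on average, as the number of tosses grows.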
  • 40. Probability Theory. Probability theory is the branch of mathematics concerned with probability, the analysis of random phenomena. The central objects of probability theory are random variables, stochastic processes, and events: mathematical abstractions of non-deterministic events or measured quantities that may either be single occurrences or evolve over time in an apparently random fashion.
  • 41. Statistics. Statistics is "a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data" (Merriam-Webster dictionary). In applying statistics to, e.g., a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model process to be studied. Populations can be diverse topics such as "all people living in a country" or "every atom composing a crystal". Statistics deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments.
  • 42. Normal Distribution. The normal (or Gaussian) distribution is a very common continuous probability distribution. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known.
  • 43. Normal Distribution: probability density function. Figure: The red curve is the standard normal distribution.
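For readers without the figure, the normal density can be evaluated directly. This is a minimal sketch using only the standard formulas for the normal density and cumulative distribution; the function names `normal_pdf` and `normal_cdf` are hypothetical, and the defaults (mean 0, standard deviation 1) correspond to the standard normal curve.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Height of the standard normal curve at its centre.
peak = normal_pdf(0.0)
# Share of probability within one standard deviation of the mean.
within_one_sigma = normal_cdf(1.0) - normal_cdf(-1.0)
```

About 68% of the probability lies within one standard deviation of the mean, which is the familiar first step of the 68-95-99.7 rule.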
  • 44. Other Distributions
  • 45. p-value. The p-value, or calculated probability, is the probability of finding the observed, or more extreme, results when the null hypothesis (H0) of a study question is true. The definition of "extreme" depends on how the hypothesis is being tested. See also: the Seeing Theory website.
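A small worked case may make the definition concrete. The sketch below assumes a one-sided test of a coin: H0 says the coin is fair, and "more extreme" is taken to mean "at least as many heads as observed". The function name `p_value_at_least` and the 9-heads-in-10-tosses data are hypothetical.

```python
from math import comb

def p_value_at_least(k, n, p0=0.5):
    """P(at least k heads in n tosses) when the head probability is p0 (H0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# Observed: 9 heads in 10 tosses. Under a fair coin this counts the
# outcomes with 9 or 10 heads: (10 + 1) / 2**10.
p = p_value_at_least(9, 10)
```

Here the p-value is 11/1024, roughly 0.011, so a result this lopsided would be rare if the coin were fair. A two-sided test would also count 0 or 1 heads as extreme and double the value.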
  • 46. What is Regression Analysis? Regression analysis is a form of predictive modelling technique which investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables. This technique is used for forecasting, time series modelling and finding causal relationships between variables. For example, the relationship between rash driving and the number of road accidents by a driver is best studied through regression. Regression analysis is an important tool for modelling and analyzing data. It has multiple benefits: it indicates the significant relationships between the dependent variable and the independent variables, and it indicates the strength of the impact of multiple independent variables on a dependent variable.
  • 47. Linear Regression. It is one of the most widely known modeling techniques. Linear regression is usually among the first few topics people pick up while learning predictive modeling. Linear regression establishes a relationship between a dependent variable (Y) and one or more independent variables (X) using a best-fit straight line (also known as the regression line).
  • 48. Linear Regression (cont.). It is represented by the equation Y = α + βX + e, where α is the intercept, β is the slope of the line and e is the error term. This equation can be used to predict the value of the target variable based on the given predictor variable(s).
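The line Y = α + βX + e can be fitted by ordinary least squares with nothing but arithmetic. This is a minimal sketch for a single predictor, assuming the usual least-squares estimates (slope = covariance of X and Y divided by variance of X); the function name `fit_line` and the five data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates (alpha, beta) for y = alpha + beta * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta = sxy / sxx                 # slope
    alpha = mean_y - beta * mean_x   # intercept
    return alpha, beta

# Hypothetical observations of a predictor X and a target Y.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
alpha, beta = fit_line(xs, ys)

# Use the fitted line to predict Y for a new value of X.
prediction = alpha + beta * 6
```

For these points the fit gives an intercept of about 1.09 and a slope of about 1.97, so the prediction at X = 6 is about 12.91. The error term e is whatever each observed Y differs from the fitted line by.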
  • 49. Data Modeling then Forecasting (Simulation of the model). Example (figure): Impulse-Response Functions. Orthogonalized IRF with 95% CI for every impulse : response pair among M1, LENR, CA^2, CA and NPLs; step = 1 month. Generated by Monte-Carlo with 1000 reps. Based on VAR model.
  • 50. Back to R Programming: How to fetch stock data? Example: Financial time series forecasting – an easy approach (Yahoo Finance).
  • 51. Back to R Programming: R - Linear Regression. Example: Linear Regression in R.
  • 52. Back to R Programming: R - Linear Regression. Example: Advanced R.
  • 53. Artificial intelligence (AI). Definition: AI is intelligence exhibited by machines. In computer science, the field of AI research defines itself as the study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of success at some goal. The term "artificial intelligence" is applied when a machine mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem solving" (known as Machine Learning). In August 2001, robots beat humans in a simulated financial trading competition.
  • 54. Artificial intelligence (AI): list of programming languages for artificial intelligence. Python is widely used for artificial intelligence, with packages for general AI, machine learning, natural language processing and neural networks. Companies like Narrative Science use Python to create artificial intelligence for narrative language processing. Other languages: MATLAB, C++.
  • 55. 50 Machine learning Definition Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed. Evolved from the study of pattern recognition and computational learning theory in artificial intelligence, machine learning explores the study and construction of algorithms that can learn from and make predictions on data—such algorithms overcome following strictly static program instructions by making data driven predictions or decisions, through building a model from sample inputs. Machine learning is employed in a range of computing tasks where designing and programming explicit algorithms with good performance is difficult or infeasible; example applications include spam filtering, optical character recognition (OCR), search engines and computer vision. Yaman Hajja | Big Data & Analytics
  • 56. 51 Machine learning Definition (continued): Machine learning is a branch of computer science that studies the design of algorithms that can learn. Typical machine learning tasks are concept learning, function learning or "predictive modeling", clustering and finding predictive patterns. These tasks are learned from available data gathered through, for example, experience or instruction. The expectation is that incorporating experience into these tasks will improve the learning over time. The ultimate goal is to improve the learning to the point where it becomes automatic, so that humans no longer need to intervene.
  • 57. 52 Machine learning Figure: The machine learning process starts with raw data and ends with a model derived from that data.
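As a rough illustration of "raw data in, model out", the sketch below fits a straight line to a handful of points using ordinary least squares. The data (`hours`, `scores`) are invented for the example and the function name `fit_line` is an assumption, not something taken from the slides; the "model" is just the fitted slope and intercept.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up raw data: hours studied versus test score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

# The "model" derived from the data is the pair (slope, intercept).
slope, intercept = fit_line(hours, scores)


def predict(x):
    """Use the fitted model on a new input."""
    return slope * x + intercept


print(slope, intercept, predict(6.0))
```

Real projects would add the steps the figure implies before fitting, such as cleaning and preparing the raw data, and would check the model on data it has not seen.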
  • 58. 53 Common Machine Learning Algorithms: Naïve Bayes Classifier; K-Means Clustering; Support Vector Machine; Apriori; Linear Regression; Logistic Regression; Artificial Neural Networks; Random Forests; Decision Trees; Nearest Neighbours (k-nearest neighbours, "KNN").
  • 59. 54 The Role of [R] in machine learning: Much of the work done by a data scientist involves statistics. For example, machine learning algorithms commonly apply some kind of statistical technique to prepared data. Doing this kind of work can sometimes require programming. What programming language is best for statistical computing? The answer is clear: it's the open-source language called R. Created in New Zealand more than 20 years ago, R has become the lingua franca for writing code in this area. In fact, it's hard to find a data scientist who doesn't know R. Example: Machine Learning in R using the k-nearest neighbours (KNN) algorithm.
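The linked example on this slide is written in R. As a language-neutral illustration of the k-nearest neighbours idea itself, here is a minimal sketch in plain Python instead; the training points, labels and the `knn_predict` name are invented for the example. A new point is given the label that is most common among its k closest training points.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up two-feature training data with two well-separated classes.
training_data = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((0.9, 1.1), "small"),
    ((5.0, 5.0), "large"),
    ((5.2, 4.9), "large"),
    ((4.8, 5.1), "large"),
]

label_near_small = knn_predict(training_data, (1.1, 1.0), k=3)
label_near_large = knn_predict(training_data, (5.1, 5.0), k=3)
print(label_near_small, label_near_large)
```

In practice the features are usually scaled first, since Euclidean distance is dominated by whichever feature has the largest range, and k is chosen by testing several values on held-out data.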
  • 60. 55 Machine learning
  • 61. 56 Data mining Definition: Data mining is the computational process of discovering patterns in large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. It is an interdisciplinary subfield of computer science.
  • 62. 57 Data mining Definition 2: Data in digital form are available everywhere and can be used to predict the future, usually with a statistical approach. Data mining is an extension of traditional data analysis and statistical approaches in that it incorporates analytical techniques drawn from a range of disciplines. Data mining covers the entire process of data analysis, including data cleaning and preparation, visualization of the results, and producing predictions in real time so that specific goals are met.
  • 63. 58 Data mining process and concept Figure: Data mining is part of the knowledge discovery process (KDD: knowledge discovery from data). It can be considered one step in the iterative knowledge discovery process shown in the figure (Fayyad, Piatetsky-Shapiro & Smyth, 1996).
  • 64. 59 Data mining in "Risk Management": Data mining creates models through data analysis and prediction to help solve problems involving both project feasibility and risk management. In one application, data mining was used to analyze a database containing information on people's history, achievements, and expertise. The goal was to develop a profile of a project's maturity in terms of its resource capacity, especially human capital.
  • 65. 60 Data mining tools
  • 66. 61 Data mining (cont.) Why data mining? It helps to discover reasons for success and failure. It helps to understand your customers, products, etc. It improves your organization by mining large databases. SQL data mining algorithms: Clustering produces a set of clusters showing how the cases in a dataset relate to one another. A decision tree forecasts an outcome and its after-effects. Association rules produce a set of rules explaining how to group the products in a transaction.
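To make the "set of rules" idea concrete, the sketch below computes the two standard measures used to judge an association rule such as "bread → milk": support (how often the items appear together) and confidence (how often the rule holds when its left-hand side is present). It is plain Python rather than SQL, and the baskets and function names are invented for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the share that also contain `consequent`."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Made-up shopping baskets, one set of products per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds, pruning itemsets that are already too rare to qualify.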
  • 67. 62 Worldwide data: Move on to Big Data!