BIG DATA
Lesson 3
Study : Jean-Antoine Moreau (Engineer - Lecturer) 
© Jean-Antoine Moreau 
copying and reproduction prohibited 
Managing my copyright ADAGP.
« In questions of science, the authority of a thousand 
is not worth the humble reasoning of a single 
individual. » 
Galileo Galilei 
Contact http://www.jean-antoine-moreau.fr.nf JAM 2
• Hadoop is an open-source framework;
• It supports databases;
• Hadoop is composed of an application stack;
• For analytical applications, it stores records in a distributed file system:
• HDFS (Hadoop Distributed File System).
• HDFS (Hadoop Distributed File System) is developed in Java and supports the HBase database;
• HBase: a database distributed across the nodes of a server cluster.
MapReduce 
Architecture analytics 
MapReduce allows parallel and distributed computation.
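As an illustration of the map, shuffle and reduce steps, here is a minimal single-machine sketch in Python. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the use of a thread pool to stand in for cluster nodes are assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks, workers=2):
    # Each chunk is mapped independently, so the map calls can run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))


counts = word_count(["big data big", "data cluster"])
```

Because each chunk is mapped on its own, adding workers (or, on a real cluster, nodes) does not change the result, only how the work is spread.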
• Data is distributed across a cluster of server 
nodes that make up the architecture. 
• Hadoop uses block mode.
• A data file is divided into blocks of the same size, which are distributed over the nodes in the cluster for processing.
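The block mode described above can be sketched as follows. This is a simplified illustration, not the HDFS implementation: the tiny block size, the node names and the round-robin placement are assumptions, and in this sketch the last block may be shorter than the others.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut the data into consecutive blocks of block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks, nodes):
    """Assign each block index to a node, round-robin (a simplification)."""
    placement = {node: [] for node in nodes}
    for index, _ in enumerate(blocks):
        placement[nodes[index % len(nodes)]].append(index)
    return placement


blocks = split_into_blocks(b"abcdefghij", block_size=4)
placement = place_blocks(blocks, ["node-1", "node-2"])
```

The `placement` dictionary plays the role of the mapping between blocks and nodes that the file system has to maintain.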
• The HDFS system maintains the mapping of the distributed data.
• Cutting the data into blocks speeds up processing, which brings it closer to real time.
• Unstructured data can undergo analytical processing.
• The distribution of data among the nodes of a cluster allows processing to be parallelized.
• The MapReduce driver:
– manages the nodes in the cluster;
– assigns tasks;
– manages the restart of disabled nodes;
– manages the restart of tasks.
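The driver's role of assigning tasks and restarting them can be sketched in a few lines of Python. This is a hypothetical, single-process illustration: `run_job`, `is_alive` and `execute` are invented names, and the sketch only reschedules a task on another node; it does not restart the node itself.

```python
def run_job(tasks, nodes, is_alive, execute, max_attempts=3):
    """Assign tasks to nodes; reschedule a task when its node is down or it fails."""
    results = {}
    for index, task in enumerate(tasks):
        for attempt in range(max_attempts):
            node = nodes[(index + attempt) % len(nodes)]
            if not is_alive(node):
                continue  # node is down: try the next node
            try:
                results[task] = execute(node, task)
                break
            except RuntimeError:
                continue  # task failed: restart it on another node
        else:
            raise RuntimeError(f"task {task!r} failed after {max_attempts} attempts")
    return results


down = {"node-2"}
results = run_job(
    tasks=["t1", "t2", "t3"],
    nodes=["node-1", "node-2", "node-3"],
    is_alive=lambda node: node not in down,
    execute=lambda node, task: f"{task}@{node}",
)
```

Here task `t2` is first assigned to the disabled `node-2` and is then rescheduled on `node-3`.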
Big Data and Cloud Computing
Big Data uses virtualization: a virtual abstraction over heterogeneous infrastructure.
Big Data uses the data stored in the cloud.
• Virtual Machines (VMs) are used for the 
deployment of Big Data architectures. 
Big Data uses clusters: processing is distributed over the computing nodes in the cluster.
Processing using virtual machines
Scalability of Hadoop nodes: the ability to vary the number of nodes as required
Security 
MapReduce 
Fault tolerance 
Big Data uses the virtual abstraction layer as a resource.
• Big Data is made possible:
– thanks to the cloud;
– thanks to virtual machines;
– thanks to the management of nodes.
• Big Data uses less bandwidth.
Lambda architecture
is a generic term for Big Data architectures that store and process data in real time.
These architectures can handle data both as a stock (stored data) and as a flow (streaming data) at the same time.
• Lambda architectures use tools such as:
– Hadoop;
– Storm;
– Spark Streaming;
– Summingbird;
– Hydra;
– …
• Lambda architectures use NoSQL stores:
– Couchbase;
– MongoDB;
Lambda architectures
Diagram: Events, Process, State, Services.
Lambda architectures
Diagram: Events feed two paths, a Real-Time Process producing a Real-Time Result and a Batch Process producing a Batch View; a Federation combines the two.
Architecture layers
• The batch layer:
– stores the dataset;
– computes the views on the dataset.
• The real-time layer (speed layer):
– this layer only deals with recent data;
– this layer compensates for the high latency of the batch layer by calculating "real-time" views;
– real-time views are calculated incrementally, based on stream processing systems and random read/write databases.
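The interplay between the batch layer and the speed layer can be sketched as a toy event counter. This is only an in-memory illustration under simplifying assumptions: the class name `LambdaCounter` and its methods are invented for the example, and a real system would use distributed stores and stream processors instead of Python objects.

```python
from collections import Counter


class LambdaCounter:
    """Toy lambda architecture that counts events per key."""

    def __init__(self):
        self.master_dataset = []        # batch layer: append-only dataset
        self.batch_view = Counter()     # recomputed from scratch by the batch layer
        self.realtime_view = Counter()  # speed layer: incremental, recent events only

    def ingest(self, key):
        self.master_dataset.append(key)
        self.realtime_view[key] += 1    # incremental update, low latency

    def run_batch(self):
        # High-latency recomputation over the whole dataset.
        self.batch_view = Counter(self.master_dataset)
        # Events now covered by the batch view leave the real-time view.
        self.realtime_view.clear()

    def query(self, key):
        # Serving layer: merge the batch view and the real-time view.
        return self.batch_view[key] + self.realtime_view[key]


system = LambdaCounter()
system.ingest("click")
system.ingest("click")
system.run_batch()
system.ingest("click")
```

After the batch run, the third event exists only in the real-time view, yet `system.query("click")` still sees all three events because the two views are merged at query time.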
• The service layer (serving layer).
The architecture must:
• Guarantee that each event is processed only once;
• Allow scalability without compromising the architecture or the tools used;
• Allow changes in the data.
Tools and Framework 
Diagram components:
• Message Queue;
• Real-time path: Real-Time Processing, Real-Time State, Real-Time Views;
• Batch path: Batch Pump, Batch State, Batch Processing, Batch Views;
• Service: Federated View;
• Services.
Diagram components with example tools:
• Message Queue: RabbitMQ;
• Real-Time Processing: Storm;
• Real-Time State: Memcache;
• Real-Time Views: MongoDB;
• Service (Federated View): web app;
• Batch Pump: Flume;
• Batch State: HDFS;
• Batch Processing: MapReduce;
• Batch Views: HBase.
Tools
• Message queue:
– ActiveMQ;
– HornetQ;
– RabbitMQ;
– Kestrel;
– Kafka;
Storm 
Storm is a distributed real-time computation system for 
processing fast, large streams of data, adding real-time 
data processing to Apache Hadoop. 
• Concepts 
– Storm has two basic units of processing: the Spouts and the Bolts; 
– The Spouts are the elements that generate the data to be processed, 
they may get that data from external sources or generate it 
themselves but their mission is to introduce it to the cluster; 
– Bolts are processing units: they receive data from the Spouts and 
perform work on it, optionally generating more data to be 
processed by other Bolts.
• The data that flows internally on the cluster is organized in Streams: homogeneous lists of named tuples whose structure is known in advance. When a Spout is created, it declares the kind of Tuple it will emit; likewise, when a Bolt is meant to generate further data to be processed, it declares the kind of Tuple it will emit for the next Bolt in the chain.
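The Spout, Bolt and Tuple concepts can be illustrated with plain Python generators. This is a conceptual sketch only and does not use the Storm API: the function names and the dictionary-based tuples are assumptions, and everything runs in one process rather than on a cluster.

```python
from collections import Counter


def sentence_spout():
    """Spout: introduces tuples into the topology (here from a fixed list)."""
    for sentence in ["big data", "big cluster"]:
        yield {"sentence": sentence}


def split_bolt(stream):
    """Bolt: receives 'sentence' tuples and emits one 'word' tuple per word."""
    for tup in stream:
        for word in tup["sentence"].split():
            yield {"word": word}


def count_bolt(stream):
    """Bolt: consumes 'word' tuples and keeps a running count per word."""
    counts = Counter()
    for tup in stream:
        counts[tup["word"]] += 1
    return counts


# Topology: spout -> split bolt -> count bolt
word_counts = count_bolt(split_bolt(sentence_spout()))
```

Each stage only needs to know the field names of the tuples it receives (`sentence`, then `word`), which mirrors the idea that every Spout and Bolt declares the kind of tuple it emits.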
• In StreamBase, events are represented by a data type called a tuple;
• Individual tuples are instances of a tuple subtype that is associated with a specific schema;
• A tuple contains a single value for each field in its schema.
Topology diagram: two Spouts feed four Bolts; Tuples flow along the connections between them.
Tools 
Trident - Storm 
• Trident is a new high-level abstraction for 
doing realtime computing on top of Storm. 
• Its concepts are similar to those of Pig (a programming tool) and Cascading.
• They allow:
– Joins;
– Aggregations;
– Groupings;
– Functions;
– Filters.
• Trident and Storm can be combined with fast, scalable data stores;
• Together they cover everything from real-time filtering to complex event processing, machine learning, …
• Storm is an asynchronous distributed framework, but with a simple distributed RPC (Remote Procedure Call) server it can easily be used synchronously.
• Tuples are processed in batches;
• Each batch of tuples is assigned a unique identifier, the transaction number;
• If a batch is replayed, it receives the same number;
• The batches are processed in order.
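The value of a stable transaction number can be shown with a small state object that applies each batch once, even when a batch is replayed. This is a hedged sketch of the idea, not Trident's API: `TransactionalCounter` and `apply_batch` are hypothetical names.

```python
class TransactionalCounter:
    """Counter that applies each batch once, even if the batch is replayed."""

    def __init__(self):
        self.total = 0
        self.last_txid = 0  # transaction number of the last applied batch

    def apply_batch(self, txid, tuples):
        if txid <= self.last_txid:
            return False  # replayed batch with a known number: skip it
        if txid != self.last_txid + 1:
            raise ValueError("batches must be applied in order")
        self.total += len(tuples)
        self.last_txid = txid
        return True


counter = TransactionalCounter()
counter.apply_batch(1, ["a", "b"])
counter.apply_batch(2, ["c"])
replayed = counter.apply_batch(2, ["c"])  # same batch, same number
```

Because the replayed batch carries the same number, the state can recognize it and avoid counting its tuples twice; processing in order is what makes the single `last_txid` check sufficient.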
• Summingbird is a large-scale data 
processing system enabling developers to 
uniformly execute code in either batch-mode 
(Hadoop/MapReduce-based) or 
stream-mode (Storm-based) or a 
combination thereof, called hybrid mode. 
Spark
• Open-source cluster computing framework for data analytics, originally developed at the University of California, Berkeley.
• It uses the concept of the Resilient Distributed Dataset (RDD).
• Spark can handle real-time data with Spark Streaming.
NoSQL databases
MongoDB, Redis, HBase, …
NoSQL databases can also be used as implementation tools:
• technical bricks; 
• technical components; 
• technical blocks; 
Using Mongo 
Diagram: the pipeline components (message queue, real-time path, batch path, federated view service) annotated with Mongo operations: insert in Mongo, Mongo aggregation, Mongo MapReduce, Mongo collection.
Splout SQL
A tool for indexing and partitioning data from a Hadoop cluster and exposing it through SQL.
Diagram: Source 1 and Source 2 feed Hadoop, which generates the data and deploys it to Splout SQL; the Splout SQL nodes are then updated.
Oracle appliance
– Oracle Big Data Connectors;
– Oracle Database;
– Oracle Exadata;
– Oracle Exalytics;
– InfiniBand;
Big Data Analytical Processes and Models
Including:
Statistics;
Spatial;
Semantics;
Interactive;
Discovery;
Visualization;
Align with the Cloud Operating Model
• Data across the whole dataset:
– Transactions;
– Master data;
– Reference data;
– Summarized data;
• Pre-processing:
– Integration;
– In-database summarization;
• Post-processing;
• Analytical modeling;
• A well-planned private and public security strategy, with integrated rules.
Big Data and IT Governance
• Expand your IT governance to include Big Data:
– Ensure business alignment;
– Grow your skills;
– Manage open source;
– Manage the evolution of tools and technologies;
– Share knowledge;
– Establish standards;
– Manage best practices.
Diagram: moving from the Current State to the Future State across Business Architecture, IT Architecture, IT Infrastructure and IT Governance.
• Big Data projects should take into account existing systems, such as:
– ERP (Enterprise Resource Planning);
– databases;
– SaaS applications;
– data warehouses;
• and also:
– social networks;
– websites;
– Internet data;
– sensors.
Change and adapt the governance of the information system
Governance of the information system:
• Quality of service;
• Security;
• Reliability of the data.
• Hadoop natively uses Kerberos for security;
• But integration engines use their own security methods.
New skills
Few programmers have mastered MapReduce
The stakes
• Training engineers in the new technologies;
• The quality of the data that makes up the information;
• Profitability, which will be long-term.
Issues
Control of management costs
Control of development costs
Mastery of the technologies;
Effective management of the project;
Planning;
Project Plan;
Quality Assurance Plan;
Service Plan (purpose and level of service);
Information Strategy Plan;
Data Governance Plan.
The necessary adjustment of 
the information system department 
• for effective use of: 
– Big Data sources; 
– Data massaging and store layer; 
– Analysis layer; 
– Consumption layer. 
The information system department
must be able to carry out and manage:
….
Big Data source 
The data available for Analysis 
Coming from all channels 
Velocity and volume
The speed at which data arrives and the rate at which it is delivered
from the various data sources.
Manage the collection point 
Where the data is collected 
Identify the data to which you have limited access
• Management of:
– The data massaging and store layer;
– The Hadoop Distributed File System;
– The Relational Database Management System (RDBMS).
Manage the Analysis Layer 
• Produce the desired analytics; 
• Derive insight from the data; 
• Find the entities required; 
• Locate the data sources; 
• Understand the algorithms. 
Manage 
• The consumption layer and components. 
• The output produced by the analysis layer. 
Organization 
Schema 
Layers 
• Consumption layer; 
• Analysis layer; 
• Data massaging and Storage layer; 
• Data sources layer. 
Consumption layer
• Business process management;
• Visualization and discovery;
• Real-time business alerts;
• Data navigation;
• Decision management;
• Reporting engine;
• Self-service query;
• Custom dashboards;
• Transaction interception.
Analysis layer
• Analysis layer entity identification;
• Real-time model and scored result;
• Decision management;
• Scoring of events;
• Model management:
– predictive models;
– statistical models;
– verification models.
Data Massaging and Storage Layer
• Data acquisition;
• Data digest.
Big Data Sources
• Texts, images, audio, spatial, temporal, documents;
• Relational data;
• Domain entities;
• Aggregated data providers;
• Sensors, vendor applications.
• Enterprise systems:
– Customer relationship management systems;
– Billing operations;
– Mainframe applications;
– Enterprise resource planning;
– Web applications.
• Data Management System (DMS)
– It stores legal data, processes, policies, and various other kinds of documents;
– These documents can be converted into structured data that can be used for analytics.
• Data stores include : 
– The enterprise data warehouse; 
– The operational database; 
– Transactional database; 
– Such data may be stored in the distributed file 
system. 
• Smart devices are capable of:
– capturing,
– processing,
– communicating
information.
• Aggregated data providers:
• These providers own or acquire the data and expose it in sophisticated formats, at the required frequencies, and through specific filters.
• Additional data sources
– Geographical information
• Maps;
• Regional details;
• Location details.
– Human-generated content
• Social Media; 
• E-mail; 
• Blogs; 
• Online Information; 
– Sensor data
• Environment;
• Electricity;
• Navigation instruments;
• Ionizing Radiation; 
• Proximity; 
• Position; 
• Acoustic; 
• Automotive, transportation; 
• Thermal; 
• Optical; 
• Chemical; 
• Pressure; 
• Flow; 
• Fluid; 
• Force; 
• Density. 
• Data massaging and store layer 
– Data acquisition; 
– Data digest; 
– Distributed data store. 
• Analysis layer 
– Analysis layer entity identification; 
– Analysis engine; 
– Model Management. 
• Consumption layer:
– The outcome of the analysis is consumed by various users within the organization and by entities external to the organization;
– The business insight gained from analysis;
– A company can use customer preference data and location awareness to deliver personalized offers to customers as they walk down the aisle or pass by the store.
• Transaction interceptor:
– This component intercepts high-value transactions in real time and converts them into a suitable format that can be readily understood by the analysis layer, so that it can perform real-time analysis on the incoming data.
– The transaction interceptor should have the ability to integrate with and handle data from various sources such as sensors, smart meters, microphones, cameras, GPS devices, ATMs, and image scanners.
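One way to picture the transaction interceptor is as a set of per-source adapters that convert raw records into one common format. The sketch below is illustrative only: the `Event` schema, the raw record layouts for the ATM and the smart meter, and the `threshold` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Common format handed to the analysis layer (hypothetical schema)."""
    source: str
    timestamp: float
    amount: float


def from_atm(raw):
    # Hypothetical ATM record: {"ts": ..., "withdrawal": ...}
    return Event("atm", float(raw["ts"]), float(raw["withdrawal"]))


def from_smart_meter(raw):
    # Hypothetical meter record: "timestamp;kwh"
    ts, kwh = raw.split(";")
    return Event("smart_meter", float(ts), float(kwh))


ADAPTERS = {"atm": from_atm, "smart_meter": from_smart_meter}


def intercept(source, raw, threshold=0.0):
    """Convert a raw record to the common format; drop it if below threshold."""
    event = ADAPTERS[source](raw)
    return event if event.amount >= threshold else None


high = intercept("atm", {"ts": 10, "withdrawal": "900"}, threshold=500)
low = intercept("smart_meter", "11;3.5", threshold=500)
```

Supporting a new source then only requires adding one adapter to `ADAPTERS`; the analysis layer keeps receiving the same `Event` structure.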
• Business process management processes:
– The insight from the analysis layer can be consumed by Business Process Execution Language (BPEL) processes, APIs or other business processes, which create business value by automating the functions.
Real Time monitoring 
• Real-time alerts can be generated using the data coming out of the analysis layer.
• The alert can be sent to interested consumers 
and devices, such as smartphones and tablets. 
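Generating alerts from analysis output and sending them to interested consumers can be sketched as follows. The scored events, the threshold value and the use of plain lists as consumer inboxes are assumptions made for the example; a real deployment would push to a notification service.

```python
def generate_alerts(scored_events, threshold):
    """Yield an alert message for each scored event at or above the threshold."""
    for event_id, score in scored_events:
        if score >= threshold:
            yield f"ALERT {event_id}: score {score:.2f}"


def dispatch(alerts, consumers):
    """Deliver every alert to every interested consumer (here: plain lists)."""
    for alert in alerts:
        for inbox in consumers.values():
            inbox.append(alert)


consumers = {"smartphone": [], "tablet": []}
dispatch(generate_alerts([("tx-1", 0.2), ("tx-2", 0.95)], threshold=0.9), consumers)
```

Only the event whose score reaches the threshold produces an alert, and each registered device receives a copy of it.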
• Key performance indicators can be defined for 
operational effectiveness using the data insight 
generated from the analytics components. 
• Real-time data from varied sources can be made available to business users in the form of dashboards, to monitor activity or to measure the effectiveness of a campaign.
• Reporting engine
– The ability to produce reports similar to traditional business intelligence reports is critical.
– Ad-hoc reports, scheduled reports, and self-service queries and analysis can be created by users based on the insight coming out of the analysis layer.
• Visualization and discovery:
– Data can be navigated across various federated data sources within and outside the enterprise.
– The data can vary in content and format, and all of the data (structured, semi-structured, and unstructured) can be combined for visualization and provided to the user.
• This ability enables the organization to combine:
– its traditional enterprise content:
• content management systems;
• data warehouses;
– with:
• new social content.
• Vertical layers : 
– Components of the logical layers; 
– Information integration; 
– Big Data Governance; 
– Quality of Service. 
Governance 
Data governance is about defining guidelines
that help enterprises make the right decisions
about the data.
Big Data governance helps in dealing with
the complexity, volume, and variety of
data that is within the enterprise or is
coming from external sources.
Strong guidelines and processes are required to
monitor, structure, store, and secure the data from
the time it enters the enterprise until it is:
• Processed and stored;
• Analyzed;
• Purged;
• Archived.
Managing high volumes of data in a variety of formats.
Continuously training and managing the
statistical models required to pre-process
unstructured data and analytics.
Setting policy and compliance regulations
for external data regarding its retention
and usage.
Define the data archiving and purging policies. 
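An archiving and purging policy can be expressed as a simple rule on the age of each record. The sketch below is only an illustration: the two thresholds (90 and 365 days) and the function name `retention_action` are assumptions, not values taken from the course.

```python
from datetime import date, timedelta

# Hypothetical thresholds, chosen only for the example.
ARCHIVE_AFTER = timedelta(days=90)
PURGE_AFTER = timedelta(days=365)


def retention_action(created: date, today: date) -> str:
    """Return 'keep', 'archive' or 'purge' according to the record's age."""
    age = today - created
    if age >= PURGE_AFTER:
        return "purge"
    if age >= ARCHIVE_AFTER:
        return "archive"
    return "keep"
```

A scheduled job could call `retention_action` on each record and move or delete the data accordingly.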
Creating the policy for how data can be replicated across various systems.
Setting data encryption policies. 
Quality of Service layer.
• Data Quality
– Completeness in identifying all the data elements required;
– Timeliness for providing data at an acceptable level of freshness;
– Accuracy in verifying that the data respects data accuracy rules;
– Adherence to a common language: data elements fulfil the
language requirements;
– Consistency in verifying that the data from multiple systems
respects the data consistency rules;
– Technical conformance in meeting the data specification and
information architecture.
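Some of the data quality criteria above can be checked automatically. The following is a minimal sketch, assuming a hypothetical record layout (`customer_id`, `amount`, `updated_at`), a hypothetical 24-hour freshness limit, and a hypothetical accuracy rule (amounts are never negative).

```python
from datetime import datetime, timedelta

# Hypothetical schema and thresholds, used only for illustration.
REQUIRED_FIELDS = ("customer_id", "amount", "updated_at")
MAX_AGE = timedelta(hours=24)


def quality_issues(record: dict, now: datetime) -> list:
    """Return the list of quality problems found in one record."""
    issues = []
    # Completeness: every required data element must be present.
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        issues.append("incomplete")
    # Timeliness: the data must be fresh enough.
    updated = record.get("updated_at")
    if updated is not None and now - updated > MAX_AGE:
        issues.append("stale")
    # Accuracy: an example rule, amounts cannot be negative.
    amount = record.get("amount")
    if amount is not None and amount < 0:
        issues.append("inaccurate")
    return issues
```

Records with a non-empty list of issues could be rejected or routed to a correction step before analysis.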
Policies around 
Privacy and Security 
• Policies are required to protect sensitive
data;
• Decisions must be made about data masking
and the storage of such data.
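Data masking can be as simple as hiding most of the characters of a sensitive value before it is stored or displayed. The sketch below is one possible approach; the list of sensitive fields and the function names are assumptions made for the example.

```python
# Hypothetical list of fields considered sensitive.
SENSITIVE_FIELDS = {"email", "phone"}


def mask_value(value: str, visible: int = 2) -> str:
    """Keep the last `visible` characters and replace the others with '*'."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with the sensitive fields masked."""
    return {
        key: mask_value(str(value)) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }
```

Masking of this kind is not encryption: the hidden characters cannot be recovered from the masked copy, so the original must be stored securely if it is still needed.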
• Consider the following data access policies: 
– Data availability; 
– Data criticality; 
– Data authenticity; 
– Data sharing and publishing; 
– Data storage and retention. 
• Constraints of data providers: 
– Political; 
– Technical; 
– Regional; 
• Social media terms of use; 
• Data frequency; 
• Size of fetch; 
• Filters;
System Management 
Systems management is critical for big data
because it involves many systems across clusters.
Monitoring the health of the overall big data 
ecosystem includes: 
• Managing the logs of systems, virtual machines,
applications, and other devices;
• Correlating the various logs to help
investigate and monitor the situation.
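Log correlation often means grouping the entries written by different systems around a shared identifier and reading them in time order. The sketch below assumes that each log entry is a tuple `(timestamp, source, request_id, message)`; this layout and the function name are hypothetical.

```python
from collections import defaultdict


def correlate(logs):
    """Group log entries by request_id and sort each group by timestamp."""
    by_request = defaultdict(list)
    for timestamp, source, request_id, message in logs:
        by_request[request_id].append((timestamp, source, message))
    # Sorting the tuples orders each group by timestamp first.
    return {rid: sorted(entries) for rid, entries in by_request.items()}
```

With entries from a virtual machine and an application sharing the same `request_id`, the result shows the sequence of events that led to an error.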
• Monitoring real-time alerts and notifications. 
• Using a real-time dashboard showing various
parameters.
• Referring to reports and detailed analysis 
about the system. 
• Setting and abiding by service-level agreements.
• Managing storage and capacity. 
• Archiving and managing archive retrieval. 
• Policies Management. 
Use case (UML, Unified Modeling Language) for architectural features:
collect, transform, analyze, return, store.
The new functions 
• Data builder; 
• Data scientist; 
• Data architect; 
• DB design architect; 
• Data Administrator; 
• Business analyst.
Big Data & Cloud Computing
Big Data Processing 
The code (Perl, Python, SQL, C++, Java, …) is:
– Bug free; 
– Easy to read; 
– Easy to maintain; 
– Optimized; 
– Robust; 
– Re-usable; 
– Documented.
Big Data Processing 
in Cloud environment 
Cloud Computing
Aggregation of resources and aggregation of data
into data centers on the Internet.
Cloud Services
IaaS
PaaS
SaaS
Workflow data processing
Processing of streams (flows) of data
• One example of use:
– GPS (Global Positioning System).
[Figure: architecture linking the real world (handsets), Internet web services, a real-time system and an analysis system. Labels: complex event processing, detection, rule judgement, aggregation, rule update, result data, data accumulation, distributed parallel data processing, statistic optimization.]
A copy is stored on another server.
Parallelization of
Complex Event Processing
[Figure: parallel CEP system in a cloud environment. Data streams (1) and (2) go through parallel distribution to CEP processing of stream sets (1) to (n); a load monitor function follows performance and resources and sends notifications.]
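The parallel distribution of data streams over several CEP processes, together with a load monitor, can be sketched as follows. This is a simplified illustration and not the method of the course: the hash-based assignment (CRC32), the function names and the event layout `(stream_key, payload)` are all assumptions.

```python
import zlib


def assign_worker(stream_key: str, n_workers: int) -> int:
    """Map a stream key to a worker index with a stable hash (CRC32)."""
    return zlib.crc32(stream_key.encode("utf-8")) % n_workers


def distribute(events, n_workers):
    """Split (stream_key, payload) events into one list per worker."""
    buckets = [[] for _ in range(n_workers)]
    for key, payload in events:
        buckets[assign_worker(key, n_workers)].append((key, payload))
    return buckets


def overloaded(buckets, max_events):
    """Toy load monitor: indices of workers holding more than max_events."""
    return [i for i, bucket in enumerate(buckets) if len(bucket) > max_events]
```

Because the hash is stable, all events of one stream reach the same worker, so the processing state of that stream stays in one place. A real load monitor would watch CPU and memory rather than a simple event count.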
Basic Structure of CEP
[Figure: data stream, message queue, server running the CEP engine (rules, data, processing state), message queue, operation results, application.]
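The basic structure of a CEP engine (rules applied to incoming data while a processing state is kept) can be illustrated with a toy example. The class `CepEngine`, its sliding window and the shape of the rules are assumptions made for this sketch; real CEP products offer much richer rule languages.

```python
from collections import deque


class CepEngine:
    """Toy CEP engine: rules are evaluated on a sliding window of events."""

    def __init__(self, window_size=3):
        # Processing state: the most recent events of the data stream.
        self.window = deque(maxlen=window_size)
        self.rules = []  # list of (name, predicate) pairs

    def add_rule(self, name, predicate):
        """Register a rule; the predicate receives the window as a list."""
        self.rules.append((name, predicate))

    def process(self, event):
        """Add one event to the state and return the names of matching rules."""
        self.window.append(event)
        snapshot = list(self.window)
        return [name for name, predicate in self.rules if predicate(snapshot)]
```

For instance, a rule such as `lambda w: len(w) == 3 and all(e["value"] > 100 for e in w)` fires only when three consecutive readings exceed 100, which a single-event filter could not detect.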
CEP dynamic load balancing method 
(transition from current system to extra system) 
[Figure: servers 1, 2, 3, (n-m) and n, each holding rules, data and a processing state.]
person using 
the information 
Server 
Balancing 
[Figure: queries (1) to (n) over person information. Labels: person's location, preference, stores in the area extracted, person matched with store products, information extracted, result issued.]
Goal: reduce the CPU usage.
End of the part of the course published on the Internet.
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 

Big Data Lesson 3 Jean-Antoine Moreau

  • 1. BIG DATA Lesson 3 Study: Jean-Antoine Moreau (Engineer - Lecturer) © Jean-Antoine Moreau copying and reproduction prohibited Managing my copyright ADAGP.
  • 2. « In questions of science, the authority of a thousand is not worth the humble reasoning of a single individual. » Galileo Galilei Contact http://www.jean-antoine-moreau.fr.nf
  • 3. Big Data • Hadoop: open source framework; • Database support; • Hadoop is composed of an application stack; • for analytical applications, a distributed file system stores the records; • HDFS (Hadoop Distributed File System).
  • 4. Big Data • HDFS (Hadoop Distributed File System), developed in the Java language, supports the HBase database; • HBase: a database distributed across the nodes of a server cluster.
  • 5. Big Data MapReduce Architecture analytics
  • 6. Big Data MapReduce allows parallel and distributed computation.
  • 7. Big Data • Data is distributed across a cluster of server nodes that make up the architecture.
  • 8. Big Data • Hadoop uses block mode. • A data file is divided into blocks of the same size, which are distributed over the nodes in the cluster for processing.
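The block mode described in slide 8 can be illustrated with a minimal sketch. It is not HDFS code: the function names, the 256-byte block size and the round-robin placement are assumptions chosen for illustration only (real HDFS uses much larger blocks, replicates them, and applies its own placement policy).

```python
from itertools import cycle


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut the data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def assign_blocks(blocks: list[bytes], nodes: list[str]) -> dict[str, list[int]]:
    """Hypothetical round-robin placement: block index -> node (an assumption)."""
    placement = {node: [] for node in nodes}
    for index, node in zip(range(len(blocks)), cycle(nodes)):
        placement[node].append(index)
    return placement


data = b"x" * 1000                      # stand-in for a data file
blocks = split_into_blocks(data, 256)   # 256 bytes is an illustrative size
placement = assign_blocks(blocks, ["node-a", "node-b", "node-c"])
```

The `placement` dictionary plays the role of the mapping that the file system keeps so that each block can be found again on its node.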
  • 9. Big Data • The HDFS system maintains the mapping of the distributed data.
  • 10. Big Data • Splitting data into blocks speeds up processing, which brings it closer to real time. • Unstructured data can undergo analytical processing. • Distributing the data among the nodes of a cluster allows processing to be parallelized.
  • 11. Big Data • The MapReduce driver: – oversees the nodes in the cluster; – assigns tasks; – manages the restart of failed nodes; – manages the restart of tasks.
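The MapReduce model mentioned in slides 5 to 11 can be sketched in a single process as a word count. This is a hedged illustration, not the Hadoop API: `map_phase`, `shuffle`, `reduce_phase` and `run_job` are hypothetical names, and the blocks are plain strings standing in for file blocks.

```python
from collections import defaultdict


def map_phase(block: str):
    """Map: emit a (word, 1) pair for every word in one block."""
    for word in block.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def run_job(blocks):
    """Run map over every block, then shuffle and reduce (sequentially here)."""
    pairs = [pair for block in blocks for pair in map_phase(block)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = run_job(["big data big", "data cluster"])
```

In a real cluster each call to `map_phase` could run on the node holding the block, which is where the parallelism comes from; the sketch runs them one after another.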
  • 12. Big Data and Cloud Computing
  • 13. Big Data uses virtualization: a virtual abstraction of heterogeneous infrastructure
  • 14. Big Data uses the data stored in the cloud
  • 15. Big Data • Virtual Machines (VMs) are used for the deployment of Big Data architectures.
  • 16. Big Data uses clusters: distributed processing across the computing nodes in the cluster
  • 17. Big Data Processing using virtual machines. Scalability of Hadoop nodes: the ability to vary the number of nodes as required
  • 18. Big Data Security MapReduce Fault tolerance
  • 19. Big Data uses the virtual abstraction layer as a resource.
  • 20. Big Data • Big Data is made possible – thanks to the cloud; – thanks to virtual machines; – thanks to the management of nodes; • Big Data uses less bandwidth;
  • 21. Big Data Lambda architecture is a generic term for Big Data architectures that store and process real-time data.
  • 22. Big Data Lambda architecture: these architectures can manage data both as a stock (stored data) and as a flow (stream) at the same time.
  • 23. Big Data • Lambda architectures use the tools: – Hadoop; – Storm; – Spark Streaming; – Summingbird; – Hydra; – …
  • 24. Big Data • Lambda architectures use NoSQL stores: – Couchbase; – MongoDB;
  • 25. Big Data Lambda architectures (diagram): Events; Process; State; Services
  • 26. Big Data Lambda architectures (diagram): Events; Real Time Process; Real Time Result; Batch Process; Batch View; Federation
  • 27. Big Data Architecture Layers • The batch layer: – stores the dataset; – computes the views over the dataset.
  • 28. Big Data Architecture Layers • The Real Time layer (speed layer): – this layer only deals with recent data; – this layer compensates for the high latency of the batch layer by calculating "real-time" views; – real-time views are calculated incrementally, using stream processing systems and databases that support random reads and writes.
  • 29. Big Data Architecture Layers Service layer (serving layer)
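The three layers in slides 27 to 29 can be sketched as follows. This is a minimal illustration under assumptions: the page-view events, the `Counter`-based views and the function names are all hypothetical, and a real deployment would use the batch and streaming tools listed in the slides.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute the view from the whole stored dataset."""
    return Counter(event["page"] for event in master_dataset)


def update_realtime_view(view, event):
    """Speed layer: update the view incrementally, one recent event at a time."""
    view[event["page"]] += 1


def serve(batch, realtime, page):
    """Serving layer: answer a query by merging the batch and real-time views."""
    return batch[page] + realtime[page]


master = [{"page": "home"}, {"page": "home"}, {"page": "docs"}]
batch = batch_view(master)

realtime = Counter()
for event in [{"page": "home"}, {"page": "pricing"}]:  # events since the last batch run
    update_realtime_view(realtime, event)

home_views = serve(batch, realtime, "home")
```

The batch view is complete but stale; the real-time view is fresh but covers only recent events, so the query merges the two.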
  • 30. Big Data The architecture must: • ensure that each event is processed only once; • allow scalability without compromising the architecture or the tools used; • allow changes in the data.
  • 31. Big Data Tools and Framework (diagram): Message Queue; Real-Time Processing; Real-Time State; Real-Time Views; Service Federated View; Batch Pump; Batch State; Batch Processing; Batch Views; Services
  • 32. Big Data Tools and Framework (diagram): Message Queue; Real-Time Processing; Real-Time State; Real-Time Views; Service Federated View; Batch Pump; Batch State; Batch Processing; Batch Views; Services. Example tools shown: RABBITMQ; STORM; MEMCACHE; MONGODB; WEBAPP; FLUME; HDFS; MAPRED; HBase
  • 33. Big Data Tools • Message queue: – ActiveMQ; – HornetQ; – RabbitMQ; – Kestrel; – Kafka;
  • 34. Big Data Tools Storm: Storm is a distributed real-time computation system for processing fast, large streams of data, adding real-time data processing to Apache Hadoop.
  • 35. Big Data • Concepts – Storm has two basic units of processing: the Spouts and the Bolts; – The Spouts are the elements that generate the data to be processed. They may get that data from external sources or generate it themselves, but their mission is to introduce it into the cluster; – Bolts are processing units: they receive data from the Spouts and perform work on it, optionally generating more data to be processed by other Bolts.
  • 36. Big Data • The data that flows internally through the cluster is organized in Streams: homogeneous lists of named tuples whose structure is known in advance. When a Spout is created, it declares the kind of Tuple it will emit; likewise, when a Bolt is meant to generate further data to be processed, it declares the kind of Tuple it will emit for the next Bolt in the chain.
  • 37. Big Data • In StreamBase, events are represented by a data type called a tuple; • Individual tuples are instances of a tuple subtype associated with a specific schema; • A tuple contains a single value for each field in its schema.
  • 38. Big Data Topology (diagram): two Spouts and four Bolts, connected by Tuple streams
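The Spout, Bolt, Tuple and Stream concepts of slides 35 to 38 can be imitated in plain Python. This sketch does not use the Storm API and runs in a single process; the tuple types `Sentence` and `Word` and the function names are hypothetical, chosen to show each component declaring the kind of tuple it emits.

```python
from collections import namedtuple

# Each stream carries one declared kind of named tuple.
Sentence = namedtuple("Sentence", ["text"])
Word = namedtuple("Word", ["word"])


def sentence_spout(lines):
    """Spout: introduces data into the topology as Sentence tuples."""
    for line in lines:
        yield Sentence(text=line)


def split_bolt(stream):
    """Bolt: consumes Sentence tuples and emits one Word tuple per word."""
    for tup in stream:
        for word in tup.text.split():
            yield Word(word=word)


def count_bolt(stream):
    """Terminal bolt: consumes Word tuples and keeps a running count."""
    counts = {}
    for tup in stream:
        counts[tup.word] = counts.get(tup.word, 0) + 1
    return counts


# The topology is the wiring: spout -> split bolt -> count bolt.
word_counts = count_bolt(split_bolt(sentence_spout(["storm spout bolt", "bolt bolt"])))
```

In Storm itself the components run in parallel on different nodes and the stream never ends; the generators here only mimic the chaining.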
  • 39. Big Data Topology
  • 40. Big Data Tools Trident - Storm • Trident is a new high-level abstraction for doing real-time computing on top of Storm.
  • 41. Big Data Tools Trident - Storm • The concepts resemble those of Pig (a programming tool) and Cascading. • They allow: joins; aggregations; groupings; functions; filters.
  • 42. Big Data Tools Trident - Storm • The concepts resemble those of Pig (a programming tool) and Cascading. • They allow: joins; aggregations; groupings; functions; filters.
  • 43. Big Data Tools Trident - Storm • Trident and Storm, combined with fast, scalable data stores, open up many possibilities; • Everything from real-time filtering to complex event processing, machine learning, … • Storm is an asynchronous distributed framework, but with a simple distributed RPC (Remote Procedure Call) server it can easily be used synchronously.
  • 44. Big Data Tools Trident - Storm • Tuples are processed in batches; • Each batch of tuples is assigned a unique identifier, the transaction number; • If a batch is replayed, it receives the same number; • Batches are processed in order.
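The transaction numbers in slide 44 are what make a replayed batch harmless. The sketch below illustrates the idea and is not Trident code: `TransactionalCounter` is a hypothetical class that stores the last applied transaction number next to its value and skips any batch it has already seen.

```python
class TransactionalCounter:
    """Hypothetical state that applies each numbered batch at most once."""

    def __init__(self):
        self.total = 0
        self.last_txid = None  # transaction number of the last applied batch

    def apply_batch(self, txid, values):
        """Apply a batch unless its transaction number was already applied.

        Relies on batches arriving in order, so any txid that is not greater
        than the last applied one must be a replay.
        """
        if self.last_txid is not None and txid <= self.last_txid:
            return False
        self.total += sum(values)
        self.last_txid = txid
        return True


state = TransactionalCounter()
first = state.apply_batch(1, [1, 2])    # applied
replay = state.apply_batch(1, [1, 2])   # same number again: skipped
second = state.apply_batch(2, [4])      # applied
```

Because a replay carries the same number as the original attempt, the total is the same whether or not the batch was retried.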
  • 45. Big Data Tools • Summingbird is a large-scale data processing system enabling developers to uniformly execute code in either batch mode (Hadoop/MapReduce-based) or stream mode (Storm-based), or a combination thereof, called hybrid mode.
  • 46. Big Data Tool Spark • Open-source cluster computing framework for data analytics, originally developed at the University of California, Berkeley. • It uses the concept of a distributed dataset (the Resilient Distributed Dataset, RDD). • Spark can handle real-time data with Spark Streaming.
  • 47. Big Data Tool Spark • Open-source cluster computing framework for data analytics, originally developed at the University of California, Berkeley. • It uses the concept of a distributed dataset (the Resilient Distributed Dataset, RDD). • Spark can handle real-time data with Spark Streaming.
  • 48. Big Data NoSQL databases: MongoDB, Redis, HBase, … NoSQL databases can also be used as implementation tools: • technical bricks; • technical components; • technical blocks;
  • 49. Big Data Using Mongo (diagram): Message Queue; Real-Time Processing; Real-Time State; Real-Time Views; Service Federated View; Batch Pump; Batch State; Batch Processing; Batch Views; Services
  • 50. Big Data Using Mongo (diagram): Message Queue; Real-Time Processing; Real-Time State; Real-Time Views; Service Federated View; Batch Pump; Batch State; Batch Processing; Batch Views; Services. MongoDB roles shown: Insert in Mongo; Mongo; Mongo Aggregation; Insert Mongo; Mongo MapReduce; Mongo Collection
  • 51. Big Data Tool SploutSQL: for indexing and partitioning data from a Hadoop cluster and exposing it through SQL.
  • 52. Big Data Tool SploutSQL (diagram): Source 1; Source 2; Hadoop; Data generate; Deploy; Splout SQL updating
  • 53. Big Data ORACLE appliance – ORACLE Big Data connectors; – ORACLE Database; – ORACLE Exadata; – ORACLE Exalytics; – InfiniBand;
  • 54. Big Data Analytical Processes and Models, including: Statistics; Spatial; Semantics; Interactive; Discovery; Visualization;
  • 55. Big Data Align with the Cloud Operating Model • Data across the whole dataset: Transactions; Master data; Reference; Summarized; • Pre-processing: integration; in-database summarization; • Post-processing; • Analytical modeling; • A well-planned private and public security strategy, with integral rules.
  • 56. Big Data and IT Governance
  • 57. Big Data • Expand your IT governance to include Big Data: – Ensure business alignment; – Grow your skills; – Manage open source; – Manage the evolution of tools and technologies; – Share the knowledge; – Establish standards; – Manage best practices.
  • 58. Big Data (diagram): Future State; Current State; IT Architecture; IT Infrastructure; Governance IT; Business Architecture
  • 59. Big Data • Big Data projects should take into account existing systems, such as: – ERP (Enterprise Resource Planning); • Databases; • SaaS applications; • Data warehouse; – and also: • social networks; • websites; • Internet data; • sensors;
  • 60. Big Data Change: adapt the governance of the information system
  • 61. Big Data Governance of the Information System: quality of service; security; reliability of the data
  • 62. Big Data • Hadoop natively uses Kerberos for security; • But integration engines use their own security methods.
  • 63. Big Data New skills: few computer programmers have mastered MapReduce
  • 64. Big Data The stakes: We need to train engineers in new technologies. The quality of the data that makes up the information. Profitability will be long-term.
  • 65. Big Data Issues: cost control; management control of development costs
  • 66. Big Data Mastery of technologies; Effective project management; Planning; Project Plan; Quality Assurance Plan; Service Plan (purpose and level of service); Information Strategy Plan; Data Governance Plan.
  • 67. Big Data The necessary adjustment of the information system department
  • 68. Big Data • for effective use of: – Big Data sources; – Data massaging and store layer; – Analysis layer; – Consumption layer.
  • 69. Big Data The information system department must have the ability to do and manage: ….
  • 70. Big Data Big Data source: the data available for analysis, coming from all channels
  • 71. Big Data Velocity and volume: the speed at which data arrives and the rate at which it is delivered by the various data sources.
  • 72. Big Data Manage the collection point: where the data is collected. Identify the data to which you have limited access
  • 73. Big Data • Management of: – The data massaging and store layer; – The Hadoop Distributed File System; – The Relational Database Management System (RDBMS).
  • 74. Big Data Manage the Analysis Layer • Produce the desired analytics; • Derive insight from the data; • Find the entities required; • Locate the data sources; • Understand the algorithms.
  • 75. Big Data Manage • The consumption layer and components. • The output produced by the analysis layer.
  • 76. Big Data Organization Schema
  • 77. Big Data Layers • Consumption layer; • Analysis layer; • Data massaging and Storage layer; • Data sources layer.
  • 78. Big Data Consumption layer (diagram): Business Process Management; Visualization; Discovery; Real time Business Alert; Data Navigation; Decision Management; Reporting engine; Self-service Query; Customs DataBoard; Transaction Interception
  • 79. Big Data Analysis Layer (diagram): Entity Identification; Model; Real time Scored Result; Decision Management; Scoring of events; Model Management; Predictive Models; Statistical Models; Verification Models
  • 80. Big Data Data Massaging and Storage Layer (diagram): Data Acquisition; Data Digest
  • 81. © Jean-Antoine Moreau copying and reproduction prohibited Managing my copyright ADAGP. BBiigg DDaattaa Contact http://www.jean-antoine-moreau.fr.nf JAM 81 Big Data Sources texts images audio spacial temporal documents Relational Data Domaines entities Aggregates data provider Sensor vendor application
  • 82. • Enterprise systems: – Customer relationship management system; – Billing operations; – Mainframe applications; – Enterprise resource planning; – Web applications.
  • 83. • Data Management System (DMS): – It stores legal data, processes, policies, and various other kinds of documents; – These documents can be converted into structured data that can be used for analytics.
  • 84. • Data stores include: – The enterprise data warehouse; – The operational database; – The transactional database; – Such data may be stored in the distributed file system.
  • 85. • Smart devices are capable of: – Capturing; – Processing; – Communicating information.
  • 86. • Aggregated data providers; • These providers own or acquire the data and expose it in sophisticated formats, at the required frequencies, and through specific filters.
  • 87. • Additional data sources: geographical information • Maps; • Regional details; • Location details.
  • 88. • Additional data sources: human-generated content • Social media; • E-mail; • Blogs; • Online information.
  • 89. • Additional data sources: sensor data • Environment; • Electricity; • Navigation instruments; • Ionizing radiation; • Proximity; • Position; • Acoustic; • Automotive, transportation; • Thermal; • Optical; • Chemical; • Pressure; • Flow; • Fluid; • Force; • Density.
  • 90. • Data massaging and storage layer: – Data acquisition; – Data digest; – Distributed data store.
  • 91. • Analysis layer: – Analysis layer entity identification; – Analysis engine; – Model management.
  • 92. • Consumption layer: – The outcome of the analysis is consumed by various users within the organization and by entities external to the organization; – The business insight gained from analysis; – A company can use customer preference data and location awareness to deliver personalized offers to customers as they walk down the aisle or pass by the store.
  • 93. • Transaction interceptor: – This component intercepts high-value transactions in real time and converts them into a suitable format that can be readily understood by the analysis layer, so that it can do real-time analysis on the incoming data. – The transaction interceptor should have the ability to integrate with and handle data from various sources such as sensors, smart meters, microphones, cameras, GPS devices, ATMs, and image scanners.
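The role of the transaction interceptor can be illustrated with a minimal Python sketch. It assumes two hypothetical sources (`atm` and `smart_meter`), hypothetical field names, and an arbitrary high-value threshold; none of these come from the course, and a real interceptor would handle many more sources and formats.

```python
# Hypothetical threshold: amounts at or above this value count as "high value".
HIGH_VALUE_THRESHOLD = 10_000.0


def intercept(raw_event, source):
    """Convert a raw event into a common record, or return None if it is not high value."""
    # Each source names its fields differently; these mappings are illustrative only.
    if source == "atm":
        amount = float(raw_event["withdrawal"])
        timestamp = raw_event["time"]
    elif source == "smart_meter":
        amount = float(raw_event["billed_amount"])
        timestamp = raw_event["read_at"]
    else:
        raise ValueError(f"unknown source: {source}")

    if amount < HIGH_VALUE_THRESHOLD:
        return None  # not forwarded to the analysis layer
    # Common format that the analysis layer is assumed to understand.
    return {"source": source, "amount": amount, "timestamp": timestamp}
```

Every source ends up in the same record shape, so the analysis layer does not need to know where an event came from.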
  • 94. • Business process management: – The insight from the analysis layer can be consumed by Business Process Execution Language (BPEL) processes, APIs, or other business processes, which gain business value by automating the functions.
  • 95. Real-time monitoring
  • 96. • Real-time alerts can be generated using the data coming out of the analysis layer.
  • 97. • The alerts can be sent to interested consumers and devices, such as smartphones and tablets.
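As an illustration of the two slides above, the following Python sketch turns scored results from the analysis layer into alerts and pairs each alert with the subscribed devices. The record fields (`entity`, `score`), the threshold value, and the device names are assumptions made for the example.

```python
def generate_alerts(scored_results, threshold=0.8):
    """Return one alert per scored result whose score reaches the threshold."""
    alerts = []
    for result in scored_results:
        if result["score"] >= threshold:
            alerts.append({
                "entity": result["entity"],
                "message": f"{result['entity']}: score {result['score']:.2f} reached {threshold:.2f}",
            })
    return alerts


def dispatch(alerts, devices):
    """Pair every alert message with every subscribed device (delivery itself is out of scope)."""
    return [(device, alert["message"]) for alert in alerts for device in devices]
```

For example, `dispatch(generate_alerts(results), ["smartphone", "tablet"])` yields one `(device, message)` pair per alert and device.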
  • 98. • Key performance indicators can be defined for operational effectiveness using the data insight generated from the analytics components.
  • 99. • Data in real time can be made available to business users from varied sources in the form of dashboards, for monitoring or for measuring the effectiveness of a campaign.
  • 100. • Reporting engine: – The ability to produce reports similar to traditional business intelligence reports is critical.
  • 101. • Reporting engine: ad-hoc reports, scheduled reports, and self-query and analysis can be created by users based on the insight coming out of the analysis layer.
  • 102. • Visualization and discovery: – Data can be navigated across various federated data sources within and outside the enterprise.
  • 103. • Visualization and discovery: – The data can vary in content and format, and all of the data (structured, semi-structured, and unstructured) can be combined for visualization and provided to the user.
  • 104. • This ability enables the organization to combine: – The traditional enterprise content • Management systems; • Data warehouses; – With • New social content.
  • 105. • Vertical layers: – Components of the logical layers; – Information integration; – Big Data governance; – Quality of service.
  • 106. Governance
  • 107. Data governance is about defining guidelines that help enterprises make the right decisions about the data.
  • 108. Big Data governance helps in dealing with the complexity, volume, and variety of data that is within the enterprise or is coming in from external sources.
  • 109. Strong guidelines and processes are required to monitor, structure, store, and secure the data from the time it enters the enterprise until it is: • Processed; • Stored; • Analyzed; • Purged; • Archived.
  • 110. Managing high volumes of data in a variety of formats.
  • 111. Continuously training and managing the statistical models required to pre-process unstructured data and analytics.
  • 112. Setting policies and compliance regulations for external data regarding its retention and usage.
  • 113. Defining the data archiving and purging policies.
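An archiving and purging policy can be expressed as a small rule on the age of the data. The sketch below is only an illustration: the retention periods (`ARCHIVE_AFTER`, `PURGE_AFTER`) are hypothetical values, not figures from the course or from any regulation.

```python
from datetime import date, timedelta

# Hypothetical retention periods; a real policy would come from governance rules.
ARCHIVE_AFTER = timedelta(days=365)
PURGE_AFTER = timedelta(days=365 * 7)


def retention_action(created_on, today):
    """Return 'keep', 'archive', or 'purge' depending on the age of the data."""
    age = today - created_on
    if age >= PURGE_AFTER:
        return "purge"
    if age >= ARCHIVE_AFTER:
        return "archive"
    return "keep"
```

The purge test comes first so that very old data is never merely archived.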
  • 114. Creating the policy for how data can be replicated across various systems.
  • 115. Setting data encryption policies.
  • 116. Quality of service layer.
  • 117. • Data quality: – Completeness in identifying all the data elements required; – Timeliness for providing data at an acceptable level of freshness; – Accuracy in verifying that the data respects data accuracy rules; – Adherence to a common language: data elements fulfil the requirements of that language; – Consistency in verifying that the data from multiple systems respects the data consistency rules; – Technical conformance in meeting the data specification and information architecture.
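Three of the data quality criteria above (completeness, timeliness, accuracy) lend themselves to simple automated checks. The following Python sketch assumes a hypothetical record layout (`customer_id`, `amount`, `updated_at`), a 24-hour freshness limit, and a single accuracy rule (no negative amounts); all of these are illustrative choices.

```python
from datetime import datetime, timedelta

REQUIRED_FIELDS = ("customer_id", "amount", "updated_at")  # hypothetical schema
MAX_AGE = timedelta(hours=24)                              # hypothetical freshness limit


def check_quality(record, now):
    """Return a list of quality issues found in one record (empty list means none)."""
    issues = []

    # Completeness: every required data element must be present.
    missing = [name for name in REQUIRED_FIELDS if record.get(name) is None]
    if missing:
        issues.append("incomplete: missing " + ", ".join(missing))

    # Timeliness: the data must be fresh enough.
    updated_at = record.get("updated_at")
    if updated_at is not None and now - updated_at > MAX_AGE:
        issues.append("stale: older than 24 hours")

    # Accuracy: one example rule, amounts cannot be negative.
    amount = record.get("amount")
    if amount is not None and amount < 0:
        issues.append("inaccurate: negative amount")

    return issues
```

Consistency and technical conformance would need access to several systems and to the data specification, so they are left out of this sketch.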
  • 118. Policies around privacy and security
  • 119. • Policies are required to protect sensitive data; • Decisions must be made about data masking and the storage of such data.
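Data masking can be sketched in a few lines of Python. The example below hides all but the last characters of a value; the list of sensitive fields and the number of visible characters are assumptions, and a production system would follow its own privacy policy.

```python
SENSITIVE_FIELDS = {"card_number", "phone"}  # hypothetical list of sensitive fields


def mask_value(value, visible=4, mask_char="*"):
    """Hide all but the last `visible` characters of a value."""
    text = str(value)
    if len(text) <= visible:
        return mask_char * len(text)  # too short to reveal anything safely
    return mask_char * (len(text) - visible) + text[-visible:]


def mask_record(record):
    """Return a copy of the record with sensitive fields masked."""
    return {
        key: mask_value(value) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }
```

The original record is left unchanged, so the unmasked data can still be stored separately under stricter access rules.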
  • 120. • Consider the following data access policies: – Data availability; – Data criticality; – Data authenticity; – Data sharing and publishing; – Data storage and retention.
  • 121. • Constraints of data providers: – Political; – Technical; – Regional; • Social media terms of use; • Data frequency; • Size of fetch; • Filters.
  • 122. Systems management
  • 123. Systems management is critical for big data because it involves many systems across clusters.
  • 124. Monitoring the health of the overall big data ecosystem includes:
  • 125. • Managing the logs of: – Systems; – Virtual machines; – Applications; – Other devices.
  • 126. • Correlating the various logs and helping to investigate and monitor the situation.
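One common way to correlate logs from several systems is to group entries that share an identifier and order them in time. The Python sketch below assumes that every log entry carries a hypothetical `correlation_id` and a sortable `timestamp`; the course does not prescribe a particular correlation method.

```python
from collections import defaultdict


def correlate(log_entries):
    """Group log entries by correlation ID and sort each group by timestamp."""
    grouped = defaultdict(list)
    for entry in log_entries:
        grouped[entry["correlation_id"]].append(entry)
    return {
        cid: sorted(entries, key=lambda e: e["timestamp"])
        for cid, entries in grouped.items()
    }
```

Each group then reads as the timeline of one request across systems, virtual machines, and applications.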
  • 127. • Monitoring real-time alerts and notifications.
  • 128. • Using a real-time dashboard showing various parameters.
  • 129. • Referring to reports and detailed analysis about the system.
  • 130. • Setting and abiding by service-level agreements.
  • 131. • Managing storage and capacity.
  • 132. • Archiving and managing archive retrieval.
  • 133. • Policy management.
  • 134. Use case (UML, Unified Modeling Language) for architectural features: collect; transform; analyze; return; store.
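The five functions of the use case above can be chained into a toy pipeline. In this Python sketch the input readings, the parsing rule, and the statistics computed are all invented for illustration; only the five step names come from the slide.

```python
def collect():
    """Collect raw readings (hard-coded here; a real system would read from sources)."""
    return ["20.0", "bad", "22.0", "24.0"]


def transform(raw_values):
    """Convert raw strings to floats, skipping values that cannot be parsed."""
    cleaned = []
    for raw in raw_values:
        try:
            cleaned.append(float(raw))
        except ValueError:
            continue
    return cleaned


def analyze(values):
    """Compute simple statistics on the cleaned values."""
    return {"count": len(values), "mean": sum(values) / len(values)}


def store(result, storage):
    """Append the result to an in-memory store (stand-in for a real data store)."""
    storage.append(result)


def run_pipeline(storage):
    """Collect, transform, analyze, store, then return the result to the caller."""
    result = analyze(transform(collect()))
    store(result, storage)
    return result
```

Calling `run_pipeline([])` returns the analysis result and leaves a copy in the storage list passed in.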
  • 135. The new functions: • Data builder; • Data scientist; • Data architect; • DB design architect; • Data administrator; • Business analyst.
  • 136. Big Data & Cloud Computing
  • 137. Big Data processing
  • 138. The code (Perl, Python, SQL, C++, Java, …) • is: – Bug free; – Easy to read; – Easy to maintain; – Optimized; – Robust; – Reusable; – Documented.
  • 139. Big Data processing in a cloud environment
  • 140. Cloud computing: aggregation of resources; aggregation of data into data centers on the Internet.
  • 141. Cloud services: IaaS; PaaS; SaaS; workflow data processing; processing of data streams (flows).
  • 142. • One example of use: – The GPS • Global Positioning System.
  • 143. Analysis system (diagram labels): Internet; web services; handset; real world; real-time system; rule judgement; detection judgement; processing; result data; rule update; complex event processing; aggregation; statistics; optimization; distributed parallel data; data accumulation.
  • 144. A copy is stored on another server.
  • 145. Parallelization of Complex Event Processing (CEP)
  • 146. Parallel CEP system in a cloud environment (diagram labels): data stream (1); data stream (2); parallel distribution; CEP processing, stream set (1); CEP processing, stream set (n); performance; resources; load monitor function; notification.
  • 147. Basic structure of CEP (diagram labels): data stream; message queue; server; CEP engine; rules; data; processing state; operation results; message queue; application.
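The basic CEP structure above (input queue, engine with rules and processing state, output queue) can be sketched in Python as follows. The class name `CepEngine`, the event fields, and the example rule are hypothetical; real CEP engines offer far richer rule languages and windowing.

```python
from collections import deque


class CepEngine:
    """Minimal CEP engine: input queue, rules with processing state, output queue."""

    def __init__(self, rules):
        self.rules = rules                              # list of (name, function) pairs
        self.state = {name: {} for name, _ in rules}    # processing state per rule
        self.input_queue = deque()                      # incoming data stream
        self.output_queue = deque()                     # operation results for the application

    def submit(self, event):
        self.input_queue.append(event)

    def run(self):
        """Process all queued events in arrival order."""
        while self.input_queue:
            event = self.input_queue.popleft()
            for name, rule in self.rules:
                if rule(self.state[name], event):
                    self.output_queue.append((name, event))


def three_high_in_a_row(state, event):
    """Example rule: fire when a sensor reports three consecutive values above 100."""
    key = event["sensor"]
    state[key] = state.get(key, 0) + 1 if event["value"] > 100 else 0
    return state[key] == 3
```

The rule detects a pattern across several events, not a single reading, which is what makes the event "complex"; the per-rule state dictionary plays the role of the processing state in the diagram.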
  • 148. CEP dynamic load balancing method, transition from the current system to an extra system (diagram labels): Server 1; Server 2; Server 3; Server (n-m); Server n; each server holds rules, data, and a processing state.
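One possible reading of the dynamic load balancing diagram is that streams are moved from an overloaded server to an extra server. The Python sketch below follows that reading; the function names, the load figures, and the rule "move streams until the server is within `max_load`" are assumptions, and moving the associated rules and processing state is not modeled.

```python
def server_load(streams, stream_load):
    """Total load of the streams assigned to one server."""
    return sum(stream_load[s] for s in streams)


def rebalance(assignments, stream_load, max_load, extra_server):
    """Move streams from overloaded servers to extra_server; return new assignments."""
    # Work on copies so the current system is left untouched until the transition.
    result = {server: list(streams) for server, streams in assignments.items()}
    result.setdefault(extra_server, [])
    for server, streams in result.items():
        if server == extra_server:
            continue
        # Keep at least one stream per server; move the rest while overloaded.
        while len(streams) > 1 and server_load(streams, stream_load) > max_load:
            result[extra_server].append(streams.pop())
    return result
```

The sketch does not check whether the extra server itself becomes overloaded; a fuller version would add more servers or pick the least loaded one.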
  • 149. Person using the information; server balancing.
  • 150. Person and information (diagram labels): person's location; preference; query (1) to query (n); stores in the area extracted; information extracted; person matched with store product; result issued. Goal: reduce the CPU usage.
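The matching step in the diagram above can be sketched in Python. The example first extracts the stores in the person's area and only then compares products with preferences, which is one plausible way to limit the work done per query; the planar coordinates in metres, the radius, and the record fields are all hypothetical.

```python
import math


def stores_in_area(person, stores, radius_m):
    """Keep only the stores within radius_m of the person (planar coordinates in metres)."""
    return [
        store for store in stores
        if math.hypot(store["x"] - person["x"], store["y"] - person["y"]) <= radius_m
    ]


def match_offers(person, stores, radius_m=200.0):
    """Match the person's preferences with the products of nearby stores only."""
    offers = []
    for store in stores_in_area(person, stores, radius_m):
        for product in store["products"]:
            if product in person["preferences"]:
                offers.append((store["name"], product))
    return offers
```

Filtering by area before matching means distant stores are never compared against the preferences at all.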
  • 151. End of the part of the course published on the Internet.