Cambriano's Data Governor reduces the Big Data footprint and saves the planet
1. Big Data Governor
Martyn Jones
Cambriano Energy
www.cambriano.es
© 2014 Martyn Richard Jones All rights reserved.
2. To begin at the beginning
• Simply stated, the best application of Big Data is in systems and methods that will significantly reduce the data footprint.
3. 10 Big Data Commandments
I. Do not generate data that is not needed.
II. Do not store data that doesn't need to be stored.
III. Do not index data that doesn't need to be indexed.
IV. Do not replicate data that doesn't need to be replicated.
V. Do not transmit or move data that doesn't need to be transmitted or moved.
VI. Do not integrate data that doesn't need to be integrated.
VII. Do not enrich data that doesn't need to be enriched.
VIII. Do not process data that doesn't need to be processed.
IX. Do not provide access to data that doesn't need to be accessed.
X. Do not archive or back up data that doesn't need to be archived or backed up.
4. Why would we want to reduce the data footprint?
• Years of knowledge and experience in information management strongly suggest that more data does not necessarily lead to better data.
• The more data there is to generate, move and manage, the greater the development and administrative overheads.
• The more data we generate, store, replicate, move and transform, the bigger the data, energy and carbon footprints will become.
5. How can Big Data reduce Big Data?
• We can use it in profiling, in order to identify the data that could be useful.
• We can use it to identify immaterial, surplus and redundant data.
• We can use it to catalogue, categorise and classify certain high-volume data sources.
6. What can we do with the Big Data profile data?
• We can use it to audit, analyse and review the generation, storage and transmission of data.
• We can use it to parameterise data generators and filters.
• We can use it to generate 'Big-Data-by-exception' discrimination rules, and as the basis for data discrimination based on directed machine-learning approaches.
7. So why would we do all of this?
• We hear that Big Data represents a significant challenge.
• The best way of dealing with significant challenges is to formulate an appropriate, coherent and realisable response: a strategy.
• By addressing the data problems upstream we can attempt to turn the Big Data problem into a more manageable data problem or, alternatively, we can choose to remove the problem.
8. How does this work in practice?
• We can reduce the amount of data that we actually generate by eliminating the generation, storage and transmission of superfluous data. We can change logging, monitoring and signal data generators (applications and devices) so that they produce only concise and usable data. This requires modifications to parts of existing applications and application servers.
• We can introduce data governors as intelligent data filters that actively exclude or include data in data flows. This is particularly relevant where we are dealing with very high-volume data throughput and bandwidth, and where the release of data into the data streams is subject to rules of exception. For example, we may decide to exclude any market signal data that simply repeats the same price stated in previous data.
• We can also filter data dimensionally: by association and abstraction of discrete phases, events, facets and values, and by time, affinity and proximity.
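The market signal example above can be sketched in a few lines of Python. This is only an illustration of a rule of exception, not the CE Data Governor itself: the (symbol, price) record shape, the function name and the sample ticks are all assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record shape: (symbol, price). Not taken from the slides.
Tick = Tuple[str, float]


def suppress_repeated_prices(ticks: Iterable[Tick]) -> Iterator[Tick]:
    """Forward a tick only when its price differs from the last price seen for that symbol."""
    last_price: Dict[str, float] = {}
    for symbol, price in ticks:
        if last_price.get(symbol) != price:
            last_price[symbol] = price
            yield symbol, price


# Invented sample data for illustration only.
ticks = [
    ("ACME", 10.0),
    ("ACME", 10.0),   # repeat, not forwarded
    ("ACME", 10.5),
    ("XYZ", 3.0),
    ("ACME", 10.5),   # repeat, not forwarded
    ("XYZ", 3.0),     # repeat, not forwarded
    ("ACME", 10.0),   # price changed again, forwarded
]
forwarded = list(suppress_repeated_prices(ticks))
```

Because the filter is a generator, it can sit inline in a stream and only needs to remember one price per symbol.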
9. What are the benefits?
• Making data smaller reduces the data footprint – lower cost, less operational complexity and greater focus.
• The earlier you filter data, the smaller the data footprint is – lower cost, less operational complexity and greater focus.
• A smaller data footprint accelerates the processing of the data that does have potential business value – lower cost, higher value, less complexity and greater focus.
10. In order to tame Big Data?
• We should only generate data that is required, that has value, and that has a business purpose – whether management oriented, business oriented or technical in nature.
• We should filter Big Data, early and often.
• We should store, transmit and analyse Big Data only when there is a real business imperative that prompts us to do so.
11. Conclusions?
• Taming Big Data is a business, management and technical imperative.
• The best approach to taming the data avalanche is to ensure there is no data avalanche – this is referred to as moving the problem upstream.
• The use of smart 'data governors' will provide a practical way to control the flow of high volumes of data.
12. The Big Data Governor
A brief architectural and functional overview
Martyn Jones, Creative Director, Cambriano Energy
13. The Big Data Governor
• The Big Data Governor is an architectural concept, a set of methods and a technology developed in Spain (EU) by Martyn Jones and associates at Cambriano Energy.
• The Big Data Governor’s role is to help in the purposeful and meaningful reduction of the ever-expanding data footprint, especially as it relates to data volumes and velocity (see Gartner's 3Vs).
• The reduction techniques are based on exclusion, inclusion and exception.
• Its implementation is made through a development environment that can target hardware, firmware, middleware and software forms of hosting and continuously monitored execution.
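A rough sketch of how the three reduction techniques might combine is given below. The slides do not say how exclusion, inclusion and exception rules interact, so the precedence used here (exception first, then exclusion, then inclusion, drop by default) is an assumption, as are the record fields and rule definitions.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Rule = Callable[[Record], bool]


def should_forward(
    record: Record,
    exclusion: List[Rule],
    inclusion: List[Rule],
    exception: List[Rule],
) -> bool:
    """Decide whether a record is released into the data flow.

    Assumed precedence (not stated in the slides):
    an exception rule always forwards, an exclusion rule then drops,
    and anything left is forwarded only if an inclusion rule matches.
    """
    if any(rule(record) for rule in exception):
        return True
    if any(rule(record) for rule in exclusion):
        return False
    return any(rule(record) for rule in inclusion)


# Hypothetical rules over hypothetical log records.
exclusion = [lambda r: r["level"] == "DEBUG"]
inclusion = [lambda r: r["level"] in {"WARN", "ERROR"}]
exception = [lambda r: r["latency_ms"] > 1000]

records = [
    {"level": "DEBUG", "latency_ms": 5},
    {"level": "DEBUG", "latency_ms": 2500},
    {"level": "ERROR", "latency_ms": 20},
    {"level": "INFO", "latency_ms": 15},
]
decisions = [should_forward(r, exclusion, inclusion, exception) for r in records]
```

Dropping by default keeps the footprint small; a deployment that must not lose unclassified data would invert that default.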
14. Reduce the data footprint and maintain fidelity
Business As Usual: All generated data is stored and forwarded
[Diagram. Elements: Application; Data; Data forward, store and analyse. Flow labels: All data, All data, All data.]
15. Reduce the data footprint and maintain fidelity
Business To Be: Model 1: Only significant data is stored and forwarded
[Diagram. Elements: Application; Inline Data; CE Data Governor; Temporal data store; Data forward, store and analyse. Flow labels: All data, All data, Significant data.]
16. CE Data Governor – Exhibit I – IC Fab Testing
[Diagram. Elements: Application; Inline Data; CE Data Governor; Temporal data store; Data forward, store and analyse. Flow labels: All data, All data, Significant data. Stage captions: Integrated Circuit Wafer Production; Testing / Probing; Storage of Test / Probe Results; Analysis of Test / Probe Results.]
17. CE Data Governor – Exhibit I – IC Fab Testing
This exhibit shows where the Data Governor is placed in the Integrated Circuit fabrication and testing/probing chain.
In large plants, the IC probing process generates very large volumes of data at high velocity.
Based on exception rules, the Data Governor reduces the flow of data to the centralised data store.
It also increases velocity and shortens time to analysis.
Greater speed and lower volumes mean that production show-stoppers are spotted earlier, thereby potentially leading to significant production and recuperation cost savings.
18. CE Data Governor – Exhibit II – Internet of Things
[Diagram. Elements: Application; Inline Data; CE Data Governor; Temporal data store; Data forward, store and analyse. Flow labels: All data, All data, Significant data. Stage captions: Internet of Things; Internet ‘Thing’; Storage of IoT Data; Analysis of IoT Data.]
19. CE Data Governor – Exhibit II – Internet of Things
This exhibit shows where the Data Governor is placed in the Internet of Things data flow.
The Data Governor is embedded into an IoT device, and functions as a data exception engine.
Based on exception rules and triggers, the Data Governor reduces the flow of data to the centralised / regionalised data store.
It also increases velocity and shortens time to analysis.
Greater speed and lower volumes mean that important signals are spotted earlier, thereby possibly leading to more effective analysis and quicker time to action.
20. CE Data Governor – Exhibit III – Net Activity
[Diagram. Elements: Application; Inline Data; CE Data Governor; Temporal data store; Data forward, store and analyse. Flow labels: All data, All data, Significant data. Stage captions: Online internet interaction; Activity and event logging; Happy Data; Analysis of Net Activity.]
21. CE Data Governor – Exhibit III – Net Activity
This exhibit shows where the Data Governor is placed in the capture and logging of interactive internet activity.
The Data Governor acts as a virtual device written to by standard and customised log writers, and functions as a data exception engine.
Based on exception rules and triggers, the Data Governor reduces the flow of data generated by internet-browser-activity logging.
It also increases velocity and shortens time to analysis.
Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to action.
22. CE Data Governor – Exhibit IV – Signal Data
[Diagram. Elements: Application; Inline Data; CE Data Governor; Temporal data store; Data forward, store and analyse. Flow labels: All data, All data, Significant data. Stage captions: Signal generation and transmission; Near Zero Latency transmission; Immediate Analysis of Data; Smaller Big Data.]
23. CE Data Governor – Exhibit IV – Signal Data
This exhibit shows where the Data Governor is placed in the stream of continuous signal data.
The Data Governor acts as an inline data exception engine.
Based on exception rules and triggers, the Data Governor reduces the flow of signal data.
It also increases velocity and shortens time to analysis.
Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to action.
24. CE Data Governor – Exhibit V – Machine Data
[Diagram. Elements: Application; Inline Data; CE Data Governor; Filter, concentrate / summarise; Data forward, store and analyse. Flow labels: All data, All data, Significant data. Stage captions: Sensor data; Sensor data (internal storage); Sensor data; Analysis of IoT Data.]
25. CE Data Governor – Exhibit V – Machine Data
This exhibit shows where the Data Governor is placed in the stream of continuous machine-generated data.
The Data Governor acts as an inline data analysis and exception engine.
Exception data is stored locally and periodically transferred to an analysis centre.
Analysis of the totality of data of the same class and origin can be used to drive ANN* and statistical analysis, which can be used to support (for example) the automatic and semi-automatic generation of preventive maintenance rules.
Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to proactivity.
*Adaptive Neural Network
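The "store exceptions locally, transfer periodically" behaviour described for machine data could look roughly like the sketch below. The class name, the out-of-range test, the count-based period (standing in for a timer) and the sensor readings are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


class MachineDataGovernor:
    """Keeps out-of-range readings in a local store and hands over a summary every N readings."""

    def __init__(self, low: float, high: float, transfer_every: int) -> None:
        self.low = low
        self.high = high
        self.transfer_every = transfer_every  # a reading count stands in for a timer here
        self.local_store: List[float] = []
        self.seen_in_period = 0

    def ingest(self, reading: float) -> Optional[Dict[str, Any]]:
        """Process one reading; return a transfer summary when the period ends, else None."""
        self.seen_in_period += 1
        if reading < self.low or reading > self.high:
            self.local_store.append(reading)  # exception data is kept locally
        if self.seen_in_period < self.transfer_every:
            return None
        summary = {
            "readings_seen": self.seen_in_period,
            "exceptions": list(self.local_store),
        }
        self.local_store.clear()
        self.seen_in_period = 0
        return summary


# Invented sensor readings for illustration only.
governor = MachineDataGovernor(low=10.0, high=90.0, transfer_every=5)
readings = [50.0, 95.5, 48.0, 3.2, 51.0, 49.0, 50.5, 52.0, 47.0, 50.0]
transfers = [s for s in map(governor.ingest, readings) if s is not None]
```

Ten readings produce two small summaries instead of ten forwarded records, while the exceptional values themselves are preserved for the analysis centre.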
26. CE Data Governor – Exhibit VI – Other Applications
[Diagram. Elements: Application; Inline Data; CE Data Governor; Temporal data store; Data forward, store and analyse. Flow labels: All data, All data, Significant data. Application areas: Trading; Plant monitoring; Sport; Climate Change.]
27. Designation and Exception Rules – IC Fab
• Taking our example of the IC Fab test/probe chain, a Data Governor should be able to handle a hierarchy or matrix of designation and exception.
• For example, a top-level Data Governor actor could be the Production Run actor.
• The Production Run actor could designate and assign exception rules to a Batch Analysis actor.
• In turn, the Batch Analysis actor could designate and assign exception rules to a Wafer Instance Analysis actor.
28. Designation, Exception Rules, Feedback - IC Fab
[Diagram: one Production Run Actor, two Batch Analysis Actors and five Wafer Instance Analysis Actors arranged as a hierarchy. Arrow labels: Designation and Exception; Exception Feedback.]
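The designation hierarchy and the exception feedback path can be sketched as a small tree of actors. This is a minimal illustration under assumptions: the Actor class, its method names, the actor identifiers and the leakage rule are hypothetical and are not taken from the CE Data Governor.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Rule = Callable[[Dict[str, Any]], bool]


@dataclass(eq=False)
class Actor:
    name: str
    rules: List[Rule] = field(default_factory=list)
    parent: Optional["Actor"] = None
    children: List["Actor"] = field(default_factory=list)
    feedback: List[str] = field(default_factory=list)

    def designate(self, name: str, rules: List[Rule]) -> "Actor":
        """Create a subordinate actor and assign it its exception rules."""
        child = Actor(name=name, rules=list(rules), parent=self)
        self.children.append(child)
        return child

    def observe(self, datum: Dict[str, Any]) -> bool:
        """Check a datum; on an exception, feed it back to every actor above this one."""
        if any(rule(datum) for rule in self.rules):
            actor = self.parent
            while actor is not None:
                actor.feedback.append(f"{self.name}: exception")
                actor = actor.parent
            return True
        return False


# Hypothetical names and threshold, for illustration only.
run = Actor("production-run")
batch = run.designate("batch-7", rules=[])
wafer = batch.designate("wafer-7-03", rules=[lambda d: d["leakage_ua"] > 5.0])

normal = wafer.observe({"leakage_ua": 1.2})
abnormal = wafer.observe({"leakage_ua": 9.8})
```

Rules flow down through designate and exceptions flow back up through feedback, mirroring the two arrow labels on the slide.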
29. Exception Rules – IC Fab
If the status of production.run is red
and the signal.aggregate from batch.actor is green
and the signal from wafer.actor is green
and the sensibility.status of production.run is red
Then the action of data.governor is forward.data
and the action of data.governor is immediate
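One way to read the rule above is as a plain if-then (modus ponens) check over a set of facts. The sketch below encodes it that way in Python; the (entity, attribute) keying of facts and the function name are assumptions, and this is not the CE rule language.

```python
from typing import Dict, List, Tuple

# Facts are keyed by (entity, attribute); this representation is an assumption.
Facts = Dict[Tuple[str, str], str]

# Conditions and actions transcribed from the slide's rule.
RULE_CONDITIONS = [
    (("production.run", "status"), "red"),
    (("batch.actor", "signal.aggregate"), "green"),
    (("wafer.actor", "signal"), "green"),
    (("production.run", "sensibility.status"), "red"),
]
RULE_ACTIONS = ["forward.data", "immediate"]


def fire(facts: Facts) -> List[str]:
    """Return the data.governor actions if every condition holds, otherwise nothing."""
    if all(facts.get(key) == expected for key, expected in RULE_CONDITIONS):
        return list(RULE_ACTIONS)
    return []


matching_facts: Facts = {
    ("production.run", "status"): "red",
    ("batch.actor", "signal.aggregate"): "green",
    ("wafer.actor", "signal"): "green",
    ("production.run", "sensibility.status"): "red",
}
actions = fire(matching_facts)

# Change one fact and the rule no longer fires.
non_matching = dict(matching_facts)
non_matching[("wafer.actor", "signal")] = "red"
no_actions = fire(non_matching)
```

Keeping the conditions as data rather than code means rules could be designated to an actor at run time.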
30. Testing and triggering of exception rules – IC Fab
Production ID in focus: X0635387N
Test ID 1 2 3 4 5 6 7 8 9 10 11 12
A OK
B FAIL
C OK
D OK
E OK
F OK
G
H OK
I OK
J FAIL
K FAIL
Timeline (artificial)
If the status.text of wafer.test("B") is FAIL
and the status.text of wafer.test("J") is FAIL
and the status.text of wafer.test("K") is FAIL
Then the action of data.governor is forward.data
and the action of data.governor is immediate
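The trigger above can be evaluated incrementally as test results arrive along the timeline. In the sketch below the OK/FAIL outcomes come from the slide, but the time step assigned to each test is invented, because the timeline columns are not recoverable from the transcript; the function name is also an assumption.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# (time step, test id, outcome); the time steps below are invented.
Result = Tuple[int, str, str]


def first_trigger(results: Iterable[Result], watched: Set[str]) -> Optional[int]:
    """Return the time step at which every watched test has reported FAIL, else None."""
    status: Dict[str, str] = {}
    for step, test_id, outcome in results:
        status[test_id] = outcome
        if all(status.get(test) == "FAIL" for test in watched):
            return step
    return None


# Outcomes as listed on the slide (test G has no result); step numbers are hypothetical.
results = [
    (1, "A", "OK"), (2, "B", "FAIL"), (3, "C", "OK"), (4, "D", "OK"),
    (5, "E", "OK"), (6, "F", "OK"), (7, "H", "OK"), (8, "I", "OK"),
    (9, "J", "FAIL"), (10, "K", "FAIL"),
]
triggered_at = first_trigger(results, watched={"B", "J", "K"})

# If test K had passed, the rule would never fire.
not_triggered = first_trigger(results[:-1] + [(10, "K", "OK")], watched={"B", "J", "K"})
```

Evaluating on each arrival is what lets the governor forward data immediately, rather than after the whole production run has been tested.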
31. Cambriano Data Governor -
[Diagram. Elements: Big Data; Data Governor; Temporal Data Store; Target Data Store.]
• Blackboard Paradigm
• Classes, objects and instances
• Exclusion, inclusion, aggregation and exception rules – modus ponens
• Non-brittle and brittle constraints and triggers
• Scripting, pluggable components and user exits
• Quantitative analysis
• Qualitative analysis
• Data persistence, aggregation and generalisation
32. DW 3.0 Information Supply Framework with Data Governors
[Diagram. Elements: All digital data; External digital data; Operational applications; OLTP Applications; Data Governor; Data Governor Collector; Data Governor Manager; Data logistics; Operational Data Store; Data Warehouse; Analytics Data Store; Data Marts; Statistical Analysis; Business Intelligence; Scenarios; ‘What if’ analysis; MIS / Reporting; Visualisation; Publication. Legend: Primary data flow; Secondary data flow.]
Published by goodstrat.com Martyn Richard Jones 2015 – martynjones.eu
Cambriano Energy 2015 - http://www.cambriano.es
33. Summary
[Diagram. Elements: Application / Intelligent device; Inline Data; CE Data Governor; Temporal data store; Rules and constraints; Data forward, store and analyse. Flow labels: All data generators, All data generated, Significant data.]
1. Data is generated, captured, created or invented.
2. It is stored to a real device or virtual device.
3. The Data Governor (in all its configurations) acts as a data discrimination and
data exception manager and ensures that significant data is passed on.
4. Significant data is used for ‘business purposes’ and to potentially refine the
rules of the CE Data Governor.
34. The CE Big Data Governor
If you want to know more about the CE Big Data Governor architecture
or wish to discuss your particular needs then please contact Martyn
Jones at martyn.jones@cambriano.es
Direct line: +34 618 471 465
35. Big Data Governor
martyn.jones@cambriano.es and http://www.cambriano.es
Professional web site: http://www.martynjones.eu
Strategy blog: http://www.goodstrat.com
Direct line: +34 618 471 465