SlideShare a Scribd company logo
A Data Lake and a Data Lab to Optimize
Operations and Safety Within a Nuclear Fleet
Hadoop Summit 2016, San José, June 30th
Marie-Luce PICARD, EDF R&D – marie-luce.picard@edf.fr
Jean-Marc RANGOD, EDF-DPNT
Christophe SALPERWYCK, EDF R&D
Special thanks to Raphaël QUERCIA EDF-DTG, Carole MAI and Amandine PIERROT EDF R&D
2
Outline
1. A FEW WORDS ABOUT EDF
2. CONTEXT AND OBJECTIVES
3. A DATA LAKE FOR A NUCLEAR FLEET
4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS
5. A DATA LAB IN PROGRESS
6. AS A CONCLUSION
Brice Richard - Flickr
KC Tan Phoyography - Flickr
3
Outline
1. A FEW WORDS ABOUT EDF
2. CONTEXT AND OBJECTIVES
3. A DATA LAKE FOR A NUCLEAR FLEET
4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS
5. A DATA LAB IN PROGRESS
6. AS A CONCLUSION
Brice Richard - Flickr
KC Tan Phoyography - Flickr
4
ELECTRICITY GENERATION
623.5 TWH
All electricity-related activities
Generation
Transmission & Distribution
Trading and Sales & Marketing
Energy services
Key figures*
€72.9 billion in sales
38.5 million customers
158,161 employees worldwide
84.7% of generation does not emit CO2
2014 INVESTMENTS
€4.5 BILLION
EDF: A GLOBAL LEADER IN ELECTRICITY
*as of 2015
EDF :
AN EFFICIENT,
RESPONSIBLE
ELECTRICITY COMPANY
AND THE CHAMPION
OF LOW-CARBON
GROWTH
WORLD’S LEADING OPERATOR, EXCELLENT
PERFORMANCE IN FRANCE
72.9 GW installed capacity, 54% of the Group’s net generation
capacity
477.7 TWh generated, 77% of the Group’s output
58 reactors operated in France,
15 in the UK
3 EPR under construction:
— 1 in Flamanville (France)
— 2 in Taishan (China)
2 EPR in project phase
 OSART safety audit
17 best practices identified by IAEA
 France
Best generation performance for six years
 UK
World record for safety in the workplace
 China
Strengthened cooperation agreement with CNNC
NUCLEAR
EDF 2015 I P.5
R&D KEY FIGURES
Scientific
partnerships with
actors of Paris-
Saclay
research departments
8
exceptional buildings
4 outstanding hall test
1 Unique equipment,
innovative
communication
tools
Diverse areas of
expertise
1500
work stations
Plenty of
collaborative
spaces
EDF LAB PARIS-SACLAY
9
Main Big Data related challenges for EDF
Power Generation
 Process monitoring and condition-based maintenance
from sensors
 Power generation forecasting for renewables
Energy management
 Load forecasting
 Balancing and optimizing generation and consumption
(using smart metering information, including
renewables)
 Electrical networks
 Smart Grid operations (local)
 Condition-based maintenance
Customers and sales
 New services to customers using smart-metering data
 Smart Homes, Smart Building, Smart Cities management
related to energy
10
Outline
1. A FEW WORDS ABOUT EDF
2. CONTEXT AND OBJECTIVES
3. A DATA LAKE FOR A NUCLEAR FLEET
4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS
5. A DATA LAB IN PROGRESS
6. AS A CONCLUSION
Brice Richard - Flickr
KC Tan Phoyography - Flickr
11
Operations and maintenance of the nuclear fleet
 The maintenance policy of EDF generation fleet is optimized to ensure reliability and safety of
equipment and systems while strengthening our competitiveness:
 Have better diagnosis, improved performance and availability
 Make a better use of data and documents, so far stored into Data silos
 More globally, the IT teams and projects aim at:
 Strengthen performance of operations and maintenance through a global fleet approach
 Simplify the Industrial Information System architecture
 Improve and develop the way we use our data
 Accumulate and archive data through time
… while reducing costs
12
Voluminous and heterogeneous data …. stored in data silos
Source : Wikipedia
One DB by nuclear site, gathering data from
sensors. Use of Data Historians.
 Focus on data:
 High volume:
 data is stored up to 40-60 years (lifetime of the plant)
 SCADA data can be sampled every 20 to 40 ms (but mainly a few
seconds)
 Around 10.000 sensors per plant
 Variety:
 Data is heterogeneous
 Time series, images, documents
 Various data sources
 The actual systems (historians) don’t allow
too many concurrent access, and their SLA are
quite bad
13
A Data Lake for the nuclear fleet
ESPADON : the Data Lake for the nuclear
fleet
One DB by nuclear site, gathering data from
sensors. Use of Data Historians.
Source : Wikipedia
© M. Caraveo, Hadoop cluster NOE data center
14
Outline
1. A FEW WORDS ABOUT EDF
2. CONTEXT AND OBJECTIVES
3. A DATA LAKE FOR A NUCLEAR FLEET
4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS
5. A DATA LAB IN PROGRESS
6. AS A CONCLUSION
Brice Richard - Flickr
KC Tan Phoyography - Flickr
| 15
A data lake for the nuclear fleet: big picture
….
Files
(chemical
information)
Historian -
SCADA
Files
(dosimetry)
E-monitoring
application
Viz
Interactive
queries and
reporting
Web Service
Hadoop cluster – ESPADON
Data Lake
Reports
© M. Caraveo, Hadoop cluster NOE data
center
16
Zoom on data
 4 generations of plants, but high level of normalization of data and sensors (for
example, use of trigrams for identification of elementary systems)
 Two main types of sensors : ANA (for analogic) and TOR (for state events)
 Time series
 Volume
 For the POC, 10 plants, 2 years: about 20 billions of points
 Target (59 plants) : 15 To of data (all plants, whole lifecycle)
Metric, global Date Value Quality
BU2ABP177MT- 2015-04-30T22:05:00.000Z 156.6 Good/M
BU2ABP177MT- 2015-04-30T22:06:00.000Z 156.4 Good/M
BU2ABP177MT- 2015-04-30T22:07:00.000Z 156.2 Good/M
BU2ABP177MT- 2015-04-30T22:08:00.000Z 156.0 Good
BU2ABP177MT- 2015-04-30T22:09:00.000Z 156.2 Good/M
BU2ABP177MT- 2015-04-30T22:10:00.000Z 156.4 Good/M
BU2ABP177MT- 2015-04-30T22:12:00.000Z 156.7 Good/M
BU2ABP177MT- 2015-04-30T22:14:00.000Z 157.1 Good
BU2ABP177MT- 2015-04-30T22:15:00.000Z 157.3 Good
BU2ABP177MT- 2015-04-30T22:16:00.000Z 157.5 Good
BU2ABP177MT- 2015-04-30T22:19:00.000Z 157.3 Good/M
BU2ABP177MT- 2015-04-30T22:20:00.000Z 157.1 Good/M
BU2ABP177MT- 2015-04-30T22:21:00.000Z 157.3 Good/M
BU2ABP177MT- 2015-04-30T22:22:00.000Z 157.1 Good/M
BU2ABP177MT- 2015-04-30T22:24:00.000Z 156.9 Good/M
BU2ABP177MT- 2015-04-30T22:27:00.000Z 157.1 Good/M
BU2ABP177MT- 2015-04-30T22:28:00.000Z 157.3 Good/M
BU2ABP177MT- 2015-04-30T22:29:00.000Z 157.5 Good/M
BU2ABP177MT- 2015-04-30T22:30:00.000Z 157.7 Good/M
17
Data model
 Use of HBASE and PHOENIX
 Distributed key/values store
 Allows models update (normalization requirements evolution, new indicators… new plants)
 Phoenix for SQL compliance + BI tools
 Tables
 3 tables : DDT, ANA, TOR
 Rowkey : <sensorid, timestamp> (queries mainly consider one or several sensors for a period of time)
 Sequential storage ; split into Hfiles and Hregion according to the plant unit
Clé ColumnFamily Colonne Valeur Phoenix type
m
(concat(metriquei
d, timestamp))
0 v H_ValeurANA Float
q H_QualitéANA Char(10)
n H_NiveauxANA varchar(10)
Clé ColumnFamily Colonne Valeur Phoenix type
m
(concat(metriquei
d, timestamp))
0 v H_ValeurTOR Varchar(10)
q H_QualiteTOR Char(10)
n H_NiveauxTOR Varchar(10)
18
Validation and performances evaluation
 POC validation
 Upload of historical data; queries / analyses
 Existing functions: viz, reports, services
 Data injection: SCADA for the whole fleet,
integration of other sources of data
 Results
 6 weeks (estimated) needed to upload historical data
from 59 plants
 Queries for validating the model :
 Use of Jmeter for simulating load
 With or without insertion workload
 ~ < 1 second for drawing a curve for a selected month
 Integration of an existing GUI for viz (realized within a
few days)
 Validation of specific calculation within reports
 ODBC link for specific e-monitoring application
 Integration of various sources of (structured) data into
the data lake
 ‘Real-time’ insertion of data (micro-batch):
 Up to 2M points / s
 Very low latency between insertion and availability (< 10s)
SELECT
MIN(v), MAX(v),
FIRST_VALUE(v) WITHIN GROUP (ORDER BY ts ASC),
LAST_VALUE(v) WITHIN GROUP (ORDER BY ts ASC),
TO_CHAR(ts, 'dd') as day,
TO_CHAR(ts, 'HH') as hour,
TO_CHAR(ts, 'mm') as minute,
count(*) as cnt
FROM
ORLI_ANA
WHERE
m = ? AND
ts > current_time()-1 AND //last 24h
ts < current_time()
GROUP BY
day, hour, minute
Phoenix query (ANA)
19
Outline
1. A FEW WORDS ABOUT EDF
2. CONTEXT AND OBJECTIVES
3. A DATA LAKE FOR A NUCLEAR FLEET
4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS
5. A DATA LAB IN PROGRESS
6. AS A CONCLUSION
Brice Richard - Flickr
KC Tan Phoyography - Flickr
20
Added value of data science algorithms on heterogeneous data:
Operations and maintenance can be better optimized through data analytics run on
data coming from the whole fleet
 Active and reactive power are indicators of constraints on alternators: effect on
their wears
• ~ 50 plants
• 20 years of data
• 10 min interval data
• Phoenix queries allow to select plants and periods of time
• Compute and show reactive power per day or per hour of the
day
• More detailed analysis
• Fleet level analysis
• Interactive queries
21
Added value of data science algorithms on heterogeneous data:
Operations and maintenance can be better optimized through data analytics run on
data coming from the whole fleet
Monitoring and control of contractual agreements when network frequency
varies (plants have to contribute to the global balance)
• Pattern matching
• Response time for different plants
• Different levels of analysis : by plant, by
generation, global
• Generic approach implemented for any
kind of patterns
22
Added value of data science algorithms on heterogeneous data
Prediction of plants cooling according to the quality of incoming water in the
plants
• Correlations?
• According to the plants
• Use of GAM models
• Integration of two internal sources +
external data
• Better understanding
• // Work in progress //
23
Integration of data science and visualization: architecture
Hadoop Cluster Web Service REST
(VM)
Browser
24
Integration of data science: a global approach
Pre-processing
Data quality
Sampling
Synchronization
…
Selection and queries
Threshold
Pattern matching
Period of time
…
Analysis and data science
Reporting
Exploratory analysis
(distribution …)
Modelling
…
25
Outline
1. A FEW WORDS ABOUT EDF
2. CONTEXT AND OBJECTIVES
3. A DATA LAKE FOR A NUCLEAR FLEET
4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS
5. A DATA LAB IN PROGRESS
6. AS A CONCLUSION
Brice Richard - Flickr
KC Tan Phoyography - Flickr
26
A Data Lab in progress: a team, an approach …
… and some questions
Objectives:
Bring value from data analytics
Issues:
 Skills and organization (between entities)
 Architecture :
 Operational Hadoop cluster and loads (use of a multitenant
enterprise cluster)
 Other loads (data science)
 Data prep within Hadoop + edge machine for data science (Spark, R,
Python)
 How to quantify value
 Developments costs and maintenance
 How to industrialize
Source: Xebia
27
Outline
1. A FEW WORDS ABOUT EDF
2. CONTEXT AND OBJECTIVES
3. A DATA LAKE FOR A NUCLEAR FLEET
4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS
5. A DATA LAB IN PROGRESS
6. AS A CONCLUSION
Brice Richard - Flickr
KC Tan Phoyography - Flickr
28
Takeaways
 A Data Lake for our nuclear fleet
 In progress : industrialization and decommissioning of Historian applications
 Great reduction of licensing costs
 A Data Lab under construction
 POCs showing the added value of data science algorithms
 predictive maintenance
 In the context of fleet renovation for plant life extension (major overhaul program): operations & maintenance, generation
costs optimization
 Issues remaining : skills, organization, technical architecture, quantify value
 Perspectives and technical issues:
 Data lakes and labs for other fleets (thermal plants, hydro, renewables)
 Scalable time-series analytics (synchronization, missing data …)
 Handling heterogeneous data (textual, images, graphs …)
 IoT platform
References
A proof of concept with Hadoop: storage and analytics of electrical time-series.
Marie-Luce Picard, Bruno Jacquin, Hadoop Summit 2012, Californie, USA, June 2012: http://www.slideshare.net/Hadoop_Summit/proof-of-
concent-with-hadoop
Massive Smart Meter Data Storage and Processing on top of Hadoop.
Leeley D. P. dos Santos, Alzennyr G. da Silva, Bruno Jacquin, Marie-Luce Picard, David Worms,Charles Bernard. Workshop Big Data 2012,
Conférence VLDB (Very Large Data Bases), Istanbul, Turquie, 2012: http://www.cse.buffalo.edu/faculty/tkosar/bigdata2012/program.php
Searching time-series with Hadoop in an electric power company.
Alice Bérard, Georges Hébrail, BigMine Workshop, KDD2013, Chicago, August 2013: http://bigdata-mining.org/
Real-time energy data-analytics with Storm.
Rémy Saissy, Marie-Luce Picard, Charles Bernard, Bruno Jacquin, Simon Maby, Benoît Grossin, Hadoop Summit 2014, Californie, USA, June
2014: http://fr.slideshare.net/Hadoop_Summit/t-525p212picard
Computing Data Quality Indicators on Big Data Stream Using a CEP
Wenlu Yang, Alzennyr Gomes Da Silva, Marie-Luce Picard, IEEE Xplore - IWCIM 2015, Prague, Novembre 2015.
Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Network
Guillaume Germaine, Thomas Vial, Hadoop Summit Europe 2016, Dublin
http://www.slideshare.net/HadoopSummit/exploring-titan-and-spark-graphx-for-analyzing-timevarying-electrical-networks

More Related Content

What's hot

Association Rule Mining Using WEKA
Association Rule Mining Using WEKAAssociation Rule Mining Using WEKA
Association Rule Mining Using WEKA
Prothoma Diteeya
 
BIGDATA ANALYTICS LAB MANUAL final.pdf
BIGDATA  ANALYTICS LAB MANUAL final.pdfBIGDATA  ANALYTICS LAB MANUAL final.pdf
BIGDATA ANALYTICS LAB MANUAL final.pdf
ANJALAI AMMAL MAHALINGAM ENGINEERING COLLEGE
 
School ERP System
School ERP SystemSchool ERP System
Cloud Presentation and OpenStack case studies -- Harvard University
Cloud Presentation and OpenStack case studies -- Harvard UniversityCloud Presentation and OpenStack case studies -- Harvard University
Cloud Presentation and OpenStack case studies -- Harvard University
Barton George
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
Mr. Fmhyudin
 
Batch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing DifferenceBatch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing Difference
jeetendra mandal
 
Data ingestion
Data ingestionData ingestion
Data ingestion
nitheeshe2
 
Testing Strategies for Data Lake Hosted on Hadoop
Testing Strategies for Data Lake Hosted on HadoopTesting Strategies for Data Lake Hosted on Hadoop
Testing Strategies for Data Lake Hosted on Hadoop
CitiusTech
 
Getting Started with Amazon QuickSight
Getting Started with Amazon QuickSightGetting Started with Amazon QuickSight
Getting Started with Amazon QuickSight
Amazon Web Services
 
Alta disponibilidad en clusteres asterisk
Alta disponibilidad en clusteres asterisk Alta disponibilidad en clusteres asterisk
Alta disponibilidad en clusteres asterisk
Javier Boquin Rivera
 
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivBig Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Amazon Web Services
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
DataWorks Summit
 
Hadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and FutureHadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and Future
DataWorks Summit
 
Apache Atlas: Governance for your Data
Apache Atlas: Governance for your DataApache Atlas: Governance for your Data
Apache Atlas: Governance for your Data
DataWorks Summit/Hadoop Summit
 
Data Visualization Project Presentation
Data Visualization Project PresentationData Visualization Project Presentation
Data Visualization Project Presentation
Shubham Shrivastava
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
Amazon Web Services
 
Big Data Testing Strategies
Big Data Testing StrategiesBig Data Testing Strategies
Big Data Testing Strategies
Knoldus Inc.
 
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Amazon Web Services
 
define_xml_tutorial .ppt
define_xml_tutorial .pptdefine_xml_tutorial .ppt
define_xml_tutorial .ppt
ssuser660bb1
 

What's hot (20)

Association Rule Mining Using WEKA
Association Rule Mining Using WEKAAssociation Rule Mining Using WEKA
Association Rule Mining Using WEKA
 
BIGDATA ANALYTICS LAB MANUAL final.pdf
BIGDATA  ANALYTICS LAB MANUAL final.pdfBIGDATA  ANALYTICS LAB MANUAL final.pdf
BIGDATA ANALYTICS LAB MANUAL final.pdf
 
School ERP System
School ERP SystemSchool ERP System
School ERP System
 
Cloud Presentation and OpenStack case studies -- Harvard University
Cloud Presentation and OpenStack case studies -- Harvard UniversityCloud Presentation and OpenStack case studies -- Harvard University
Cloud Presentation and OpenStack case studies -- Harvard University
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Batch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing DifferenceBatch Processing vs Stream Processing Difference
Batch Processing vs Stream Processing Difference
 
Data ingestion
Data ingestionData ingestion
Data ingestion
 
Testing Strategies for Data Lake Hosted on Hadoop
Testing Strategies for Data Lake Hosted on HadoopTesting Strategies for Data Lake Hosted on Hadoop
Testing Strategies for Data Lake Hosted on Hadoop
 
Getting Started with Amazon QuickSight
Getting Started with Amazon QuickSightGetting Started with Amazon QuickSight
Getting Started with Amazon QuickSight
 
Alta disponibilidad en clusteres asterisk
Alta disponibilidad en clusteres asterisk Alta disponibilidad en clusteres asterisk
Alta disponibilidad en clusteres asterisk
 
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel AvivBig Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
Big Data and Architectural Patterns on AWS - Pop-up Loft Tel Aviv
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 
Hadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and FutureHadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and Future
 
Apache Atlas: Governance for your Data
Apache Atlas: Governance for your DataApache Atlas: Governance for your Data
Apache Atlas: Governance for your Data
 
Data Visualization Project Presentation
Data Visualization Project PresentationData Visualization Project Presentation
Data Visualization Project Presentation
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Big Data Testing Strategies
Big Data Testing StrategiesBig Data Testing Strategies
Big Data Testing Strategies
 
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
Migrating Your Data Warehouse to Amazon Redshift (DAT337) - AWS re:Invent 2018
 
define_xml_tutorial .ppt
define_xml_tutorial .pptdefine_xml_tutorial .ppt
define_xml_tutorial .ppt
 

Viewers also liked

Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
mark madsen
 
Building a Data Analytics PaaS for Smart Cities
Building a Data Analytics PaaS for Smart CitiesBuilding a Data Analytics PaaS for Smart Cities
Building a Data Analytics PaaS for Smart Cities
DataWorks Summit/Hadoop Summit
 
Processing and retrieval of geotagged unmanned aerial system telemetry
Processing and retrieval of geotagged unmanned aerial system telemetry Processing and retrieval of geotagged unmanned aerial system telemetry
Processing and retrieval of geotagged unmanned aerial system telemetry
DataWorks Summit/Hadoop Summit
 
Data Aggregation, Curation and analytics for security and situational awareness
Data Aggregation, Curation and analytics for security and situational awarenessData Aggregation, Curation and analytics for security and situational awareness
Data Aggregation, Curation and analytics for security and situational awareness
DataWorks Summit/Hadoop Summit
 
Why is my Hadoop* job slow?
Why is my Hadoop* job slow?Why is my Hadoop* job slow?
Why is my Hadoop* job slow?
DataWorks Summit/Hadoop Summit
 
Enterprise Data Classification and Provenance
Enterprise Data Classification and ProvenanceEnterprise Data Classification and Provenance
Enterprise Data Classification and Provenance
DataWorks Summit/Hadoop Summit
 
Modernise your EDW - Data Lake
Modernise your EDW - Data LakeModernise your EDW - Data Lake
Modernise your EDW - Data Lake
DataWorks Summit/Hadoop Summit
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
VMware Tanzu
 
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
DataWorks Summit/Hadoop Summit
 
Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
DataWorks Summit/Hadoop Summit
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch data
DataWorks Summit/Hadoop Summit
 
Filling the Data Lake
Filling the Data LakeFilling the Data Lake
Filling the Data Lake
DataWorks Summit/Hadoop Summit
 
Security and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache AtlasSecurity and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache Atlas
DataWorks Summit/Hadoop Summit
 
Logistique : Le transport dans le commerce
Logistique : Le transport dans le commerceLogistique : Le transport dans le commerce
Logistique : Le transport dans le commerce
Thomas Malice
 
The real world use of Big Data to change business
The real world use of Big Data to change businessThe real world use of Big Data to change business
The real world use of Big Data to change business
DataWorks Summit/Hadoop Summit
 
What's new in Hadoop Common and HDFS
What's new in Hadoop Common and HDFS What's new in Hadoop Common and HDFS
What's new in Hadoop Common and HDFS
DataWorks Summit/Hadoop Summit
 
Scalable OCR with NiFi and Tesseract
Scalable OCR with NiFi and TesseractScalable OCR with NiFi and Tesseract
Scalable OCR with NiFi and Tesseract
DataWorks Summit/Hadoop Summit
 
Big Data Analytics in light of Financial Industry
Big Data Analytics in light of Financial Industry Big Data Analytics in light of Financial Industry
Big Data Analytics in light of Financial Industry Capgemini
 
Creative Capital, Information & Communication Technologies, & Economic Growth...
Creative Capital, Information & Communication Technologies, & Economic Growth...Creative Capital, Information & Communication Technologies, & Economic Growth...
Creative Capital, Information & Communication Technologies, & Economic Growth...
Regional Science Academy
 
Apache Hive ACID Project
Apache Hive ACID ProjectApache Hive ACID Project
Apache Hive ACID Project
DataWorks Summit/Hadoop Summit
 

Viewers also liked (20)

Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
 
Building a Data Analytics PaaS for Smart Cities
Building a Data Analytics PaaS for Smart CitiesBuilding a Data Analytics PaaS for Smart Cities
Building a Data Analytics PaaS for Smart Cities
 
Processing and retrieval of geotagged unmanned aerial system telemetry
Processing and retrieval of geotagged unmanned aerial system telemetry Processing and retrieval of geotagged unmanned aerial system telemetry
Processing and retrieval of geotagged unmanned aerial system telemetry
 
Data Aggregation, Curation and analytics for security and situational awareness
Data Aggregation, Curation and analytics for security and situational awarenessData Aggregation, Curation and analytics for security and situational awareness
Data Aggregation, Curation and analytics for security and situational awareness
 
Why is my Hadoop* job slow?
Why is my Hadoop* job slow?Why is my Hadoop* job slow?
Why is my Hadoop* job slow?
 
Enterprise Data Classification and Provenance
Enterprise Data Classification and ProvenanceEnterprise Data Classification and Provenance
Enterprise Data Classification and Provenance
 
Modernise your EDW - Data Lake
Modernise your EDW - Data LakeModernise your EDW - Data Lake
Modernise your EDW - Data Lake
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
 
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
A3RT - the details and actual use cases of "Analytics & Artificial intelligen...
 
Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch data
 
Filling the Data Lake
Filling the Data LakeFilling the Data Lake
Filling the Data Lake
 
Security and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache AtlasSecurity and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache Atlas
 
Logistique : Le transport dans le commerce
Logistique : Le transport dans le commerceLogistique : Le transport dans le commerce
Logistique : Le transport dans le commerce
 
The real world use of Big Data to change business
The real world use of Big Data to change businessThe real world use of Big Data to change business
The real world use of Big Data to change business
 
What's new in Hadoop Common and HDFS
What's new in Hadoop Common and HDFS What's new in Hadoop Common and HDFS
What's new in Hadoop Common and HDFS
 
Scalable OCR with NiFi and Tesseract
Scalable OCR with NiFi and TesseractScalable OCR with NiFi and Tesseract
Scalable OCR with NiFi and Tesseract
 
Big Data Analytics in light of Financial Industry
Big Data Analytics in light of Financial Industry Big Data Analytics in light of Financial Industry
Big Data Analytics in light of Financial Industry
 
Creative Capital, Information & Communication Technologies, & Economic Growth...
Creative Capital, Information & Communication Technologies, & Economic Growth...Creative Capital, Information & Communication Technologies, & Economic Growth...
Creative Capital, Information & Communication Technologies, & Economic Growth...
 
Apache Hive ACID Project
Apache Hive ACID ProjectApache Hive ACID Project
Apache Hive ACID Project
 

Similar to A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear fleet

How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
inside-BigData.com
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
Larry Smarr
 
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
confluent
 
PosterPresentation
PosterPresentationPosterPresentation
PosterPresentationRaj Shekhar
 
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
Larry Smarr
 
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact
Accelerators at ORNL - Application Readiness, Early Science, and Industry ImpactAccelerators at ORNL - Application Readiness, Early Science, and Industry Impact
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact
inside-BigData.com
 
Louise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsLouise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx Systems
Dataconomy Media
 
DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …
Anubhav Jain
 
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
Flink Forward
 
big_data_casestudies_2.ppt
big_data_casestudies_2.pptbig_data_casestudies_2.ppt
big_data_casestudies_2.ppt
vishal choudhary
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Databricks
 
Weather Station Data Publication at Irstea: an implementation Report.
Weather Station Data Publication at Irstea: an implementation Report.  Weather Station Data Publication at Irstea: an implementation Report.
Weather Station Data Publication at Irstea: an implementation Report.
catherine roussey
 
Linked Sensor Data cube
Linked Sensor Data cubeLinked Sensor Data cube
Linked Sensor Data cube
Laurent Lefort
 
TeraGrid Communication and Computation
TeraGrid Communication and ComputationTeraGrid Communication and Computation
TeraGrid Communication and Computation
Tal Lavian Ph.D.
 
SplunkLive! Customer Presentation – Harris
SplunkLive! Customer Presentation – HarrisSplunkLive! Customer Presentation – Harris
SplunkLive! Customer Presentation – Harris
Splunk
 
Blue Waters and Resource Management - Now and in the Future
 Blue Waters and Resource Management - Now and in the Future Blue Waters and Resource Management - Now and in the Future
Blue Waters and Resource Management - Now and in the Future
inside-BigData.com
 
Private Cloud Delivers Big Data in Oil & Gas v4
Private Cloud Delivers Big Data in Oil & Gas v4Private Cloud Delivers Big Data in Oil & Gas v4
Private Cloud Delivers Big Data in Oil & Gas v4
Andy Moore
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use Case
CloudLightning
 
Nfcis2009
Nfcis2009Nfcis2009
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
inside-BigData.com
 

Similar to A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear fleet (20)

How HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental scienceHow HPC and large-scale data analytics are transforming experimental science
How HPC and large-scale data analytics are transforming experimental science
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
 
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
 
PosterPresentation
PosterPresentationPosterPresentation
PosterPresentation
 
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
A High-Performance Campus-Scale Cyberinfrastructure: The Technical, Political...
 
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact
Accelerators at ORNL - Application Readiness, Early Science, and Industry ImpactAccelerators at ORNL - Application Readiness, Early Science, and Industry Impact
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact
 
Louise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsLouise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx Systems
 
DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …DuraMat CO1 Central Data Resource: How it started, how it’s going …
DuraMat CO1 Central Data Resource: How it started, how it’s going …
 
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
FlinkDTW: Time-series Pattern Search at Scale Using Dynamic Time Warping - Ch...
 
big_data_casestudies_2.ppt
big_data_casestudies_2.pptbig_data_casestudies_2.ppt
big_data_casestudies_2.ppt
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
 
Weather Station Data Publication at Irstea: an implementation Report.
Weather Station Data Publication at Irstea: an implementation Report.  Weather Station Data Publication at Irstea: an implementation Report.
Weather Station Data Publication at Irstea: an implementation Report.
 
Linked Sensor Data cube
Linked Sensor Data cubeLinked Sensor Data cube
Linked Sensor Data cube
 
TeraGrid Communication and Computation
TeraGrid Communication and ComputationTeraGrid Communication and Computation
TeraGrid Communication and Computation
 
SplunkLive! Customer Presentation – Harris
SplunkLive! Customer Presentation – HarrisSplunkLive! Customer Presentation – Harris
SplunkLive! Customer Presentation – Harris
 
Blue Waters and Resource Management - Now and in the Future
 Blue Waters and Resource Management - Now and in the Future Blue Waters and Resource Management - Now and in the Future
Blue Waters and Resource Management - Now and in the Future
 
Private Cloud Delivers Big Data in Oil & Gas v4
Private Cloud Delivers Big Data in Oil & Gas v4Private Cloud Delivers Big Data in Oil & Gas v4
Private Cloud Delivers Big Data in Oil & Gas v4
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use Case
 
Nfcis2009
Nfcis2009Nfcis2009
Nfcis2009
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
 

More from DataWorks Summit/Hadoop Summit

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
DataWorks Summit/Hadoop Summit
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
DataWorks Summit/Hadoop Summit
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
DataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
DataWorks Summit/Hadoop Summit
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
DataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
DataWorks Summit/Hadoop Summit
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
DataWorks Summit/Hadoop Summit
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit/Hadoop Summit
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
DataWorks Summit/Hadoop Summit
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
DataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
DataWorks Summit/Hadoop Summit
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
DataWorks Summit/Hadoop Summit
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
DataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
DataWorks Summit/Hadoop Summit
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
DataWorks Summit/Hadoop Summit
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
DataWorks Summit/Hadoop Summit
 

More from DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Recently uploaded

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 

Recently uploaded (20)

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 

A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear fleet

  • 1. A Data Lake and a Data Lab to Optimize Operations and Safety Within a Nuclear Fleet Hadoop Summit 2016, San José, June 30th Marie-Luce PICARD, EDF R&D – marie-luce.picard@edf.fr Jean-Marc RANGOD, EDF-DPNT Christophe SALPERWYCK, EDF R&D Special thanks to Raphaël QUERCIA EDF-DTG, Carole MAI and Amandine PIERROT EDF R&D
  • 2. 2 Outline 1. A FEW WORDS ABOUT EDF 2. CONTEXT AND OBJECTIVES 3. A DATA LAKE FOR A NUCLEAR FLEET 4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS 5. A DATA LAB IN PROGRESS 6. AS A CONCLUSION Brice Richard - Flickr KC Tan Phoyography - Flickr
  • 3. 3 Outline 1. A FEW WORDS ABOUT EDF 2. CONTEXT AND OBJECTIVES 3. A DATA LAKE FOR A NUCLEAR FLEET 4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS 5. A DATA LAB IN PROGRESS 6. AS A CONCLUSION Brice Richard - Flickr KC Tan Phoyography - Flickr
  • 4. 4 ELECTRICITY GENERATION 623.5 TWH All electricity-related activities Generation Transmission & Distribution Trading and Sales & Marketing Energy services Key figures* €72.9 billion in sales 38.5 million customers 158,161 employees worldwide 84.7% of generation does not emit CO2 2014 INVESTMENTS €4.5 BILLION EDF: A GLOBAL LEADER IN ELECTRICITY *as of 2015 EDF : AN EFFICIENT, RESPONSIBLE ELECTRICITY COMPANY AND THE CHAMPION OF LOW-CARBON GROWTH
  • 5. WORLD’S LEADING OPERATOR, EXCELLENT PERFORMANCE IN FRANCE 72.9 GW installed capacity, 54% of the Group’s net generation capacity 477.7 TWh generated, 77% of the Group’s output 58 reactors operated in France, 15 in the UK 3 EPR under construction: — 1 in Flamanville (France) — 2 in Taishan (China) 2 EPR in project phase  OSART safety audit 17 best practices identified by IAEA  France Best generation performance for six years  UK World record for safety in the workplace  China Strengthened cooperation agreement with CNNC NUCLEAR EDF 2015 I P.5
  • 6.
  • 8. Scientific partnerships with actors of Paris- Saclay research departments 8 exceptional buildings 4 outstanding hall test 1 Unique equipment, innovative communication tools Diverse areas of expertise 1500 work stations Plenty of collaborative spaces EDF LAB PARIS-SACLAY
  • 9. 9 Main Big Data related challenges for EDF Power Generation  Process monitoring and condition-based maintenance from sensors  Power generation forecasting for renewables Energy management  Load forecasting  Balancing and optimizing generation and consumption (using smart metering information, including renewables)  Electrical networks  Smart Grid operations (local)  Condition-based maintenance Customers and sales  New services to customers using smart-metering data  Smart Homes, Smart Building, Smart Cities management related to energy
  • 10. 10 Outline 1. A FEW WORDS ABOUT EDF 2. CONTEXT AND OBJECTIVES 3. A DATA LAKE FOR A NUCLEAR FLEET 4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS 5. A DATA LAB IN PROGRESS 6. AS A CONCLUSION Brice Richard - Flickr KC Tan Phoyography - Flickr
  • 11. 11 Operations and maintenance of the nuclear fleet  The maintenance policy of EDF generation fleet is optimized to ensure reliability and safety of equipment and systems while strengthening our competitiveness:  Have better diagnosis, improved performance and availability  Make a better use of data and documents, so far stored into Data silos  More globally, the IT teams and projects aim at:  Strengthen performance of operations and maintenance through a global fleet approach  Simplify the Industrial Information System architecture  Improve and develop the way we use our data  Accumulate and archive data through time … while reducing costs
  • 12. 12 Voluminous and heterogeneous data …. stored in data silos Source : Wikipedia One DB by nuclear site, gathering data from sensors. Use of Data Historians.  Focus on data:  High volume:  data is stored up to 40-60 years (lifetime of the plant)  SCADA data can be sampled every 20 to 40 ms (but mainly a few seconds)  Around 10.000 sensors per plant  Variety:  Data is heterogeneous  Time series, images, documents  Various data sources  The actual systems (historians) don’t allow too many concurrent access, and their SLA are quite bad
  • 13. 13 A Data Lake for the nuclear fleet ESPADON : the Data Lake for the nuclear fleet One DB by nuclear site, gathering data from sensors. Use of Data Historians. Source : Wikipedia © M. Caraveo, Hadoop cluster NOE data center
  • 14. 14 Outline 1. A FEW WORDS ABOUT EDF 2. CONTEXT AND OBJECTIVES 3. A DATA LAKE FOR A NUCLEAR FLEET 4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS 5. A DATA LAB IN PROGRESS 6. AS A CONCLUSION Brice Richard - Flickr KC Tan Phoyography - Flickr
  • 15. | 15 A data lake for the nuclear fleet: big picture …. Files (chemical information) Historian - SCADA Files (dosimetry) E-monitoring application Viz Interactive queries and reporting Web Service Hadoop cluster – ESPADON Data Lake Reports © M. Caraveo, Hadoop cluster NOE data center
  • 16. 16 Zoom on data  4 generations of plants, but high level of normalization of data and sensors (for example, use of trigrams for identification of elementary systems)  Two main types of sensors : ANA (for analogic) and TOR (for state events)  Time series  Volume  For the POC, 10 plants, 2 years: about 20 billions of points  Target (59 plants) : 15 To of data (all plants, whole lifecycle) Metric, global Date Value Quality BU2ABP177MT- 2015-04-30T22:05:00.000Z 156.6 Good/M BU2ABP177MT- 2015-04-30T22:06:00.000Z 156.4 Good/M BU2ABP177MT- 2015-04-30T22:07:00.000Z 156.2 Good/M BU2ABP177MT- 2015-04-30T22:08:00.000Z 156.0 Good BU2ABP177MT- 2015-04-30T22:09:00.000Z 156.2 Good/M BU2ABP177MT- 2015-04-30T22:10:00.000Z 156.4 Good/M BU2ABP177MT- 2015-04-30T22:12:00.000Z 156.7 Good/M BU2ABP177MT- 2015-04-30T22:14:00.000Z 157.1 Good BU2ABP177MT- 2015-04-30T22:15:00.000Z 157.3 Good BU2ABP177MT- 2015-04-30T22:16:00.000Z 157.5 Good BU2ABP177MT- 2015-04-30T22:19:00.000Z 157.3 Good/M BU2ABP177MT- 2015-04-30T22:20:00.000Z 157.1 Good/M BU2ABP177MT- 2015-04-30T22:21:00.000Z 157.3 Good/M BU2ABP177MT- 2015-04-30T22:22:00.000Z 157.1 Good/M BU2ABP177MT- 2015-04-30T22:24:00.000Z 156.9 Good/M BU2ABP177MT- 2015-04-30T22:27:00.000Z 157.1 Good/M BU2ABP177MT- 2015-04-30T22:28:00.000Z 157.3 Good/M BU2ABP177MT- 2015-04-30T22:29:00.000Z 157.5 Good/M BU2ABP177MT- 2015-04-30T22:30:00.000Z 157.7 Good/M
  • 17. 17 Data model  Use of HBASE and PHOENIX  Distributed key/values store  Allows models update (normalization requirements evolution, new indicators… new plants)  Phoenix for SQL compliance + BI tools  Tables  3 tables : DDT, ANA, TOR  Rowkey : <sensorid, timestamp> (queries mainly consider one or several sensors for a period of time)  Sequential storage ; split into Hfiles and Hregion according to the plant unit Clé ColumnFamily Colonne Valeur Phoenix type m (concat(metriquei d, timestamp)) 0 v H_ValeurANA Float q H_QualitéANA Char(10) n H_NiveauxANA varchar(10) Clé ColumnFamily Colonne Valeur Phoenix type m (concat(metriquei d, timestamp)) 0 v H_ValeurTOR Varchar(10) q H_QualiteTOR Char(10) n H_NiveauxTOR Varchar(10)
  • 18. 18 Validation and performances evaluation  POC validation  Upload of historical data; queries / analyses  Existing functions: viz, reports, services  Data injection: SCADA for the whole fleet, integration of other sources of data  Results  6 weeks (estimated) needed to upload historical data from 59 plants  Queries for validating the model :  Use of Jmeter for simulating load  With or without insertion workload  ~ < 1 second for drawing a curve for a selected month  Integration of an existing GUI for viz (realized within a few days)  Validation of specific calculation within reports  ODBC link for specific e-monitoring application  Integration of various sources of (structured) data into the data lake  ‘Real-time’ insertion of data (micro-batch):  Up to 2M points / s  Very low latency between insertion and availability (< 10s) SELECT MIN(v), MAX(v), FIRST_VALUE(v) WITHIN GROUP (ORDER BY ts ASC), LAST_VALUE(v) WITHIN GROUP (ORDER BY ts ASC), TO_CHAR(ts, 'dd') as day, TO_CHAR(ts, 'HH') as hour, TO_CHAR(ts, 'mm') as minute, count(*) as cnt FROM ORLI_ANA WHERE m = ? AND ts > current_time()-1 AND //last 24h ts < current_time() GROUP BY day, hour, minute Phoenix query (ANA)
  • 19. 19 Outline 1. A FEW WORDS ABOUT EDF 2. CONTEXT AND OBJECTIVES 3. A DATA LAKE FOR A NUCLEAR FLEET 4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS 5. A DATA LAB IN PROGRESS 6. AS A CONCLUSION Brice Richard - Flickr KC Tan Phoyography - Flickr
  • 20. 20 Added value of data science algorithms on heterogeneous data: Operations and maintenance can be better optimized through data analytics run on data coming from the whole fleet  Active and reactive power are indicators of constraints on alternators: effect on their wears • ~ 50 plants • 20 years of data • 10 min interval data • Phoenix queries allow to select plants and periods of time • Compute and show reactive power per day or per hour of the day • More detailed analysis • Fleet level analysis • Interactive queries
  • 21. 21 Added value of data science algorithms on heterogeneous data: Operations and maintenance can be better optimized through data analytics run on data coming from the whole fleet Monitoring and control of contractual agreements when network frequency varies (plants have to contribute to the global balance) • Pattern matching • Response time for different plants • Different levels of analysis : by plant, by generation, global • Generic approach implemented for any kind of patterns
  • 22. 22 Added value of data science algorithms on heterogeneous data Prediction of plants cooling according to the quality of incoming water in the plants • Correlations? • According to the plants • Use of GAM models • Integration of two internal sources + external data • Better understanding • // Work in progress //
  • 23. 23 Integration of data science and visualization: architecture Hadoop Cluster Web Service REST (VM) Browser
  • 24. 24 Integration of data science: a global approach Pre-processing Data quality Sampling Synchronization … Selection and queries Threshold Pattern matching Period of time … Analysis and data science Reporting Exploratory analysis (distribution …) Modelling …
  • 25. 25 Outline 1. A FEW WORDS ABOUT EDF 2. CONTEXT AND OBJECTIVES 3. A DATA LAKE FOR A NUCLEAR FLEET 4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS 5. A DATA LAB IN PROGRESS 6. AS A CONCLUSION Brice Richard - Flickr KC Tan Phoyography - Flickr
  • 26. 26 A Data Lab in progress: a team, an approach … … and some questions Objectives: Bring value from data analytics Issues:  Skills and organization (between entities)  Architecture :  Operational Hadoop cluster and loads (use of a multitenant enterprise cluster)  Other loads (data science)  Data prep within Hadoop + edge machine for data science (Spark, R, Python)  How to quantify value  Developments costs and maintenance  How to industrialize Source: Xebia
  • 27. 27 Outline 1. A FEW WORDS ABOUT EDF 2. CONTEXT AND OBJECTIVES 3. A DATA LAKE FOR A NUCLEAR FLEET 4. DATA SCIENCE ALGORITHMS FOR OPTIMIZING OPERATIONS 5. A DATA LAB IN PROGRESS 6. AS A CONCLUSION Brice Richard - Flickr KC Tan Phoyography - Flickr
  • 28. 28 Takeaways  A Data Lake for our nuclear fleet  In progress : industrialization and decommissioning of Historian applications  Great reduction of licensing costs  A Data Lab under construction  POCs showing the added value of data science algorithms  predictive maintenance  In the context of fleet renovation for plant life extension (major overhaul program): operations & maintenance, generation costs optimization  Issues remaining : skills, organization, technical architecture, quantify value  Perspectives and technical issues:  Data lakes and labs for other fleets (thermal plants, hydro, renewables)  Scalable time-series analytics (synchronization, missing data …)  Handling heterogeneous data (textual, images, graphs …)  IoT platform
  • 29. References A proof of concept with Hadoop: storage and analytics of electrical time-series. Marie-Luce Picard, Bruno Jacquin, Hadoop Summit 2012, Californie, USA, June 2012: http://www.slideshare.net/Hadoop_Summit/proof-of- concent-with-hadoop Massive Smart Meter Data Storage and Processing on top of Hadoop. Leeley D. P. dos Santos, Alzennyr G. da Silva, Bruno Jacquin, Marie-Luce Picard, David Worms,Charles Bernard. Workshop Big Data 2012, Conférence VLDB (Very Large Data Bases), Istanbul, Turquie, 2012: http://www.cse.buffalo.edu/faculty/tkosar/bigdata2012/program.php Searching time-series with Hadoop in an electric power company. Alice Bérard, Georges Hébrail, BigMine Workshop, KDD2013, Chicago, August 2013: http://bigdata-mining.org/ Real-time energy data-analytics with Storm. Rémy Saissy, Marie-Luce Picard, Charles Bernard, Bruno Jacquin, Simon Maby, Benoît Grossin, Hadoop Summit 2014, Californie, USA, June 2014: http://fr.slideshare.net/Hadoop_Summit/t-525p212picard Computing Data Quality Indicators on Big Data Stream Using a CEP Wenlu Yang, Alzennyr Gomes Da Silva, Marie-Luce Picard, IEEE Xplore - IWCIM 2015, Prague, Novembre 2015. Exploring Titan and Spark GraphX for Analyzing Time-Varying Electrical Network Guillaume Germaine, Thomas Vial, Hadoop Summit Europe 2016, Dublin http://www.slideshare.net/HadoopSummit/exploring-titan-and-spark-graphx-for-analyzing-timevarying-electrical-networks

Editor's Notes

  1. Nuclear energy supplies competitive, carbon-free electricity that we generate in the best possible safety conditions. In 2014, the International Atomic Energy Agency conducted an audit on how nuclear safety is integrated into the organisation and processes of our central departments: the IAEA found no departure from its standards and identified 17 best practices. → In France, we achieved our best performance in six years thanks to our management of scheduled shutdowns: the average length of extensions was halved. Wintertime fleet availability topped 90%. Our annual output was up 3% (415.9 TWh). • The principle of the “Grand Carénage” maintenance programme was approved. The programme involves renovating the French nuclear fleet over a 10-year period in order to extend its operating life beyond 40 years if all conditions are met. The investment is put at €55 billion for the entire fleet. • The Flamanville EPR worksite is continuing, the first nuclear plant to be built in France for 15 years. → In the UK, output was good (56.3 TWh) despite the unscheduled shutdown of two plants. EDF Energy established a world record for safety in the workplace (0.98 accidents requiring more than one day of lost time per million hours worked by employees and subcontractors). • The Hinkley Point C project to build two EPR in Somerset took a major step forward: in October, the European Commission approved the main terms of the agreements concluded with the British government. → In China, through partnerships, we are taking good advantage of the expertise we have acquired in the design, construction, operation and maintenance of our nuclear fleet. • Construction of two 1,750 MW EPR in Taishan (EDF 30% in partnership with CGN) is ongoing. • We signed an agreement to strengthen cooperation in engineering, operation and maintenance with CNNC, China’s largest state-owned nuclear company.
  2. 29