A proof of concept with Hadoop: storage and analytics of electrical time-series
June 13th 2012
Bruno JACQUIN, Marie-Luce PICARD, Leeley DAIO-PIRES DOS SANTOS, Alzennyr GOMES DA SILVA, David WORMS, Charles BERNARD
Outline

1. A very brief presentation of the EDF Group
2. Smart metering data
3. Massive data management for utilities?
4. A Proof of Concept using Hadoop
5. Conclusion and Perspectives
The EDF Group profile
- A leading player in the energy market, active in all areas of electricity from generation to trading and network management.
- Balance between regulated and deregulated activities.
- Expertise in engineering and operating generation plants and networks.
- Expertise in the design and promotion of energy eco-efficiency solutions.
- Leader in the French and UK electricity markets, solid positions in Italy and numerous other European countries; industrial operations in Asia and the United States.

Key figures (consolidated data at December 31, 2010):
- 37 million customers worldwide
- 630.4 TWh electricity generation worldwide
- 108.9 g of CO2 per kWh generated (CO2 emissions from EDF Group electricity and heat generation)
- 158,842 employees worldwide
- €65.2 billion in sales
EDF worldwide
[Figure: map of Group operations]
Smart-grid projects everywhere in the world ...
[Figure: world map of smart-grid projects. Key: red = electricity, green = gas, blue = water; triangle = trial or pilot, circle = project]
EDF R&D: Creating value and preparing the future
The EDF Group: a bright outlook for smart grids
[Diagram: benefits arranged around a smart meter]
- Lower consumption peaks mean less dependence on high-carbon generation
- Clearer information to raise awareness of energy saving strategies
- Decarbonization of the energy mix through a smoother integration of renewable energies into networks
- Billing based on actual consumption
- Reduction in network losses to boost the competitiveness of the system
- Precision in targeted investments for the maintenance and modernization of networks
- More efficient repairs to networks after extreme weather events
- Promoting the development of electric transportation that emits fewer greenhouse gases
- New energy uses (e.g. electric mobility, storage, etc.)
Smart Grids: what? And what for?
- Environmental, economic, social and policy drivers are leading to a deep change in the energy sector:
  - Climate change, environmental concerns
  - Increased pressure for operational and financial efficiency
  - Increasing awareness of consumers, role of citizens
  - Technological pressure (IT, smart devices)
Source: Wikipedia
A smart grid delivers electricity from suppliers to consumers using digital technology with two-way communications to control appliances at consumers' homes to save energy, reduce cost and increase reliability and transparency. It overlays the electrical grid with an information and net metering system, and includes smart meters. Such a modernized electricity network is being promoted by many governments as a way of addressing energy independence, global warming and emergency resilience issues.
Smart metering data: the Precious Load Curves
What does a load curve look like?
[Figures: three slides of load-curve plots]
Individual load curves:
- Left: same customer, two different days
- Top: same day, two different customers
Massive data management for utilities?
Massive data management in the energy domain: myth or reality?
- Challenges:
  - More complexity in the electric power system (demand response, distributed generation …)
  - Faster evolution of customer indoor equipment (smart meters and devices, Internet of Things …)
  => Core business will involve more IT and data management
- The R&D SIGMA project deals with scalability and Big Data:
  - Skills on Big Data techniques
  - Prototyping on business cases
  - With internal (IT), academic or industrial partners
Massive data management in the energy domain: myth or reality?
- The SIGMA project studies and experiments with appropriate methods and techniques
  - Storage technologies for massive data sets, especially time-series
  - Data processing:
    - Complex Event Processing, real-time analytical processing
    - Large-scale data mining: massively parallel processing, distributed data mining
- Use cases
  - Smart grids, CRM and customer insight, generation optimization: consumption and production forecasting, power plant maintenance
A Proof of Concept using Hadoop
Storing massive time series
- Objective: Proof of Concept for running a large number of queries (variable levels of complexity with variable scopes and frequencies, variable acceptable latencies) on a huge number of load curves
  - Data: individual curves, weather data, contractual information, network data
    - 1 measurement every 10 min for 35 million customers over a year
    - Annual volume of data
      - 1800 billion records; 120 TB uncompressed data
Storing massive time series: objectives
- Build an « operational Data Warehouse » able to:
  - Supply a large volume of data
  - Ingest newly arriving data
    - Pre-processing, synchronization and filling
  - Allow concurrent and simultaneous queries
    - Tactical queries: curve selection compared with a mean curve
    - Analytical queries: aggregated curves
    - Ad-hoc queries
    - ‘Recoflux’ (simplified)
    - Extraction capabilities
Storing massive time series: evaluation
- Evaluation criteria
  - Quantitative:
    - Concurrency (QoS)
    - Performance (SLA)
  - Qualitative:
    - Convergence
    - Agility
Using relational technologies for storing massive time series
- Relational approaches, Very Large DataBases
  - Work carried out with partners: Teradata, Oracle, IBM, EMC², HP
  - Appliances or software offers
  - Shared-nothing or shared-everything; column-based, row-based or hybrid mode?
  - Separation between an operational use (ODS) and an analytical use (DWH)?
Using Hadoop for storing massive time series
- Native distributed file system (HDFS)
- Distributed processing using the Map/Reduce paradigm
- Widely used by dotcoms but very limited industrial deployment; maturity is yet to come, despite the major vendors arriving with offers including integration, appliances and support
- Internal POC concluded in April 2012
The Data model
[Figure: data model diagram]
Compressed data volume on HDFS: 10 TB (x3)
Data generator - CourboGen ©
- Generates load curves and associated data
- Customizable tool: interval, duration, data quality, noise on the curves
- Distributed architecture (NodeJS, Redis)
- Output as data stream
[Figure: visualization of 35M curves for one week]
Design
[Figure: architecture diagram]
- Hive at the center of our DW
- HBase at the forefront of data access
Design
- Hive at the center of our DW
  - Allows ad-hoc and complex analytical queries
  - Customer tables, stored as RCFile, are replicated on all Data Nodes (19)
  - Consumption measurements are partitioned by day and by customer profile criteria
    - Daily volume: 25 GB; average block size: 10 MB
- HBase at the forefront of data access
  - Allows low-latency queries
  - Recent metering data stored in “In Memory” tables
  - Stores a subset of measurements and aggregates in tables with “Bloom filters” enabled
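As background for the last bullet: a Bloom filter is a compact probabilistic set that answers either "definitely absent" or "possibly present". HBase can keep one per store file so that a read for a given row skips files that cannot contain it. The sketch below is a generic, minimal Bloom filter, not HBase's implementation; the class name, the sizes and the row-key strings are illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


# Hypothetical row keys: customer id plus day.
stored_keys = ["136630|2008-01-01", "136631|2008-01-01", "136632|2008-01-01"]
bloom = BloomFilter()
for key in stored_keys:
    bloom.add(key)

for key in stored_keys + ["999999|2008-01-01"]:
    verdict = "possibly present" if bloom.might_contain(key) else "definitely absent"
    print(key, "->", verdict)
```

A key that was added is always reported as possibly present; a key that was never added is usually, but not always, reported as absent, which is why the filter can only be used to skip work, never to confirm a hit.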
Hardware configuration: the cluster
- 20 nodes in 2 racks:
  - 7 x 1U nodes with 4 x 1 TB
  - 13 x 2U nodes with 8 x 1 TB
  - Total: 132 TB; 336 cores (AMD)
- Hadoop distribution: Cloudera CDH3u3 (open source)
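The raw capacity quoted for the cluster follows directly from the node list, and it can be set against the "10 TB (x3)" stored on HDFS. The sketch below does that arithmetic. It is a rough estimate that ignores disk space needed for the operating system, temporary MapReduce data and any node that does not store data, none of which the slides quantify.

```python
# (number of nodes, disks per node, TB per disk) as listed on the slide.
NODE_GROUPS = [(7, 4, 1), (13, 8, 1)]

total_nodes = sum(nodes for nodes, _, _ in NODE_GROUPS)
raw_tb = sum(nodes * disks * tb for nodes, disks, tb in NODE_GROUPS)

# "10 TB (x3)": compressed volume times the replication factor.
stored_tb = 10 * 3
used_fraction = stored_tb / raw_tb

print(f"nodes: {total_nodes}")
print(f"raw disk capacity: {raw_tb} TB")
print(f"replicated data: {stored_tb} TB ({used_fraction:.0%} of raw capacity)")
```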
Time series representation models: options
- TUPLE
    CREATE TABLE cdc_tuple (id_cdc INT, date_releve TINYINT, p INT)
    PARTITIONED BY (day STRING, optarif STRING, psousc TINYINT)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
    STORED AS RCFILE;
- ARRAY
    CREATE TABLE cdc_array (id_cdc INT, values array<array<int>>) …
- COLUMN
    CREATE TABLE cdc_144_cols (id_cdc INT, p1 INT, p2 INT, …, p144 INT) …
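To make the difference between the three models concrete, the sketch below reshapes one synthetic daily curve from the tuple form into an array form and a column form. It is an illustration only: the slot numbering (0 to 143), the flat list used for the array model (the Hive table declares a nested `array<array<int>>` whose inner layout is not given) and the helper names are assumptions.

```python
POINTS_PER_DAY = 144  # one value every 10 minutes


def tuples_to_arrays(rows):
    """TUPLE -> ARRAY: group (id_cdc, slot, p) rows into one list per curve."""
    curves = {}
    for id_cdc, slot, p in rows:
        curve = curves.setdefault(id_cdc, [None] * POINTS_PER_DAY)
        curve[slot] = p
    return curves


def array_to_columns(id_cdc, values):
    """ARRAY -> COLUMN: spread the values over named fields p1..p144."""
    row = {"id_cdc": id_cdc}
    for i, value in enumerate(values, start=1):
        row[f"p{i}"] = value
    return row


# Synthetic curve for one customer: power grows by 10 at each slot.
tuple_rows = [(136630, slot, slot * 10) for slot in range(POINTS_PER_DAY)]
array_rows = tuples_to_arrays(tuple_rows)
column_row = array_to_columns(136630, array_rows[136630])

print(len(tuple_rows), "rows in the tuple model")
print(len(array_rows[136630]), "values in one array row")
print(len(column_row) - 1, "measurement columns in one column row")
```

The tuple model spends one row per measurement, while the array and column models hold a whole daily curve in a single row.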
Time series representation models: options
Getting a daily individual load curve:
    SELECT * FROM cdc_tuple WHERE day = '2008-01-01' AND id_cdc = 136630;
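The query filters on `day`, which is a partition column of `cdc_tuple`, so Hive only needs to read the partition directories for that day (Hive lays partitions out as nested `key=value` directories). The sketch below mimics that pruning step on a list of directory names. The tariff and subscribed-power values are hypothetical, and the remaining `id_cdc` filter, which is applied while scanning the selected partitions, is not modelled.

```python
from itertools import product


def partition_dir(day, optarif, psousc):
    """Relative directory of one partition, in Hive's key=value layout."""
    return f"day={day}/optarif={optarif}/psousc={psousc}"


def prune_by_day(partitions, day):
    """Keep only the partitions that a filter on the day column needs to read."""
    prefix = f"day={day}/"
    return [p for p in partitions if p.startswith(prefix)]


# Hypothetical partition values, for illustration only.
days = ["2008-01-01", "2008-01-02"]
tariffs = ["BASE", "HPHC"]
powers = [6, 9]

all_partitions = [partition_dir(d, o, p) for d, o, p in product(days, tariffs, powers)]
to_scan = prune_by_day(all_partitions, "2008-01-01")

print(f"{len(to_scan)} of {len(all_partitions)} partitions scanned")
for path in to_scan:
    print(path)
```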
Time series representation models: impact
- Computing a global aggregated load curve for 1 day

  Representation model   Daily volume              Query execution time
  Tuple                  10.1 GB (x 3 replicas)    2 min 22 sec
  Column                 8.8 GB (x 3 replicas)     1 min 17 sec
  Array                  16 GB (x 3 replicas)      1 min 18 sec
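A global aggregated load curve is the point-by-point sum of all individual curves. The sketch below computes it from tuple-style rows and from per-curve arrays on a tiny synthetic data set, then derives the tuple-to-column time ratio from the table. The function names and the three-point curves are illustrative assumptions; the timings are the ones reported above.

```python
from collections import defaultdict


def aggregate_tuples(rows):
    """Sum power per time slot over (id_cdc, slot, p) rows."""
    totals = defaultdict(int)
    for _id_cdc, slot, p in rows:
        totals[slot] += p
    return [totals[slot] for slot in sorted(totals)]


def aggregate_arrays(curves):
    """Sum power per time slot over equally long per-customer lists."""
    return [sum(points) for points in zip(*curves)]


# Three synthetic customers with three measurements each.
curves = {1: [100, 200, 300], 2: [10, 20, 30], 3: [1, 2, 3]}
tuple_rows = [
    (cid, slot, p) for cid, points in curves.items() for slot, p in enumerate(points)
]

from_tuples = aggregate_tuples(tuple_rows)
from_arrays = aggregate_arrays(curves.values())
print(from_tuples)
print(from_arrays)

# Ratio of the reported execution times, tuple versus column model.
tuple_seconds = 2 * 60 + 22
column_seconds = 1 * 60 + 17
time_ratio = tuple_seconds / column_seconds
print(f"tuple model takes {time_ratio:.2f}x the column model's time")
```

Both layouts yield the same aggregate; the tuple layout simply has many more rows to group, which is consistent with its longer execution time in the table.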
Results - HBase
- Tactical queries are successfully handled by HBase, offering low latencies under a high concurrent load.

  Representation model                Execution period   Concurrent queries   Queries / sec   Query execution time (seconds)
  Columns (7 * 144)                   1 minute           100                  470             0.21
  Array (7 x 1 array of 144 values)   1 minute           100                  495             0.20
  Columns (7 * 144)                   5 minutes          500                  524             0.19
  Array (7 x 1 array of 144 values)   5 minutes          500                  430             0.18

Query: curve selection
Results – Hive (1)

  Query                                                                                Execution time (tuple representation)
  Aggregation France (sum), 10 min interval                                            1 min, 56 sec
  Load curve aggregated by contractual information                                     2 min, 21 sec
  Analysing consumption trends according to the customers' building characteristics    1 hour, 18 sec
  TOP N customer candidates for a power level update                                   1 hour, 7 min, 35 sec

Results for different queries:
- Planned queries (with adequate partitioning)
- Ad-hoc queries
Results – Hive (2)
- Recoflux scenarios

  Scenario   Sequential mode (minutes)   Parallel mode (minutes)
             1 day       1 week          1 day       1 week
  521        1.44        10.10           1.56        3.00
  522        27.87       195.09          28.50       31.01
  523        7.98        23.94
  524        10.71       74.99           15.97       19.58
  525        6.10        42.70           7.45        8.39
  526        0.86        6.08            0.92        2.43

- Recoflux is a very important business application (power consumption aggregations are computed according to different criteria; updates and temporal data): the results are quite acceptable
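The benefit of parallel mode is easiest to see as a ratio of the one-week timings. The sketch below computes that ratio per scenario from the figures in the table; scenario 523 is skipped because no parallel timing is reported for it. The dictionary layout and function name are illustrative choices.

```python
# Timings in minutes, as (1 day, 1 week), copied from the Recoflux table.
RECOFLUX_MINUTES = {
    521: {"sequential": (1.44, 10.10), "parallel": (1.56, 3.00)},
    522: {"sequential": (27.87, 195.09), "parallel": (28.50, 31.01)},
    523: {"sequential": (7.98, 23.94), "parallel": None},  # not reported
    524: {"sequential": (10.71, 74.99), "parallel": (15.97, 19.58)},
    525: {"sequential": (6.10, 42.70), "parallel": (7.45, 8.39)},
    526: {"sequential": (0.86, 6.08), "parallel": (0.92, 2.43)},
}


def weekly_speedups(results):
    """Sequential / parallel time over one week, per scenario."""
    speedups = {}
    for scenario, modes in results.items():
        if modes["parallel"] is None:
            continue
        _, sequential_week = modes["sequential"]
        _, parallel_week = modes["parallel"]
        speedups[scenario] = sequential_week / parallel_week
    return speedups


speedups = weekly_speedups(RECOFLUX_MINUTES)
for scenario, ratio in sorted(speedups.items()):
    print(f"scenario {scenario}: {ratio:.2f}x faster in parallel mode over one week")
```

For a single day the parallel timings in the table are slightly higher than the sequential ones, so the gain only appears on the one-week runs.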
Using NoSQL technologies for storing massive time series: results
Hadoop / Tableau Software integration: visualization of 700k feeders
Alternative approach for storing massive time-series: conclusions
- Cons
  - Not yet mature, little feedback available from industry
  - Lack of skills in Europe (impact of configuration and tuning, specialized skills)
  - Major vendors' offerings: still young but actively emerging
- Pros
  - Low cost
  - Ability to recycle existing commodity hardware
  - One of the few solutions that allow structured and unstructured data to be coupled
  - Flexibility, despite being a complex system to deploy and manage. Fault tolerant and scalable.
Alternative approach for storing massive time-series: conclusions
- Perspectives
  - Partners offering industrial support
  - Hardware configuration
  - Usage of statistical libraries
  - Connectivity with the relational world
  - Usages:
    - ETL,
    - intelligent and reliable archival solution,
    - high-throughput data presentation (publication)
Conclusions and perspectives
- Hadoop perspectives
  - Non-traditional usage of Hadoop, using a structured schema
  - Will become a component of the company information system (IS) for non-critical usages
  - Any suggestions?
    - Storage mode for time-series?
    - Usages?
- Contacts:
  - marie-luce.picard@edf.fr
  - bruno.jacquin@edf.fr

DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 

Proof of Concept for Hadoop: storage and analytics of electrical time-series

  • 1. A proof of concept with Hadoop: storage and analytics of electrical time-series. June 13th 2012. Bruno JACQUIN, Marie-Luce PICARD, Leeley DAIO-PIRES DOS SANTOS, Alzennyr GOMES DA SILVA, David WORMS, Charles BERNARD
  • 2. Outline: 1. A very brief presentation of the EDF Group; 2. Smart metering data; 3. Massive data management for utilities? 4. A Proof of Concept using Hadoop; 5. Conclusion and Perspectives
  • 4. EDF Group profile. A leading player in the energy market, active in all areas of electricity from generation to trading and network management. Balance between regulated and deregulated activities. Expertise in engineering and operating generation plants and networks. Expertise in the design and promotion of energy eco-efficiency solutions. Leader in the French and UK electricity markets, solid positions in Italy and numerous other European countries; industrial operations in Asia and the United States. Key figures: 37 million customers worldwide; 630.4 TWh electricity generation worldwide; 108.9 g of CO2 per kWh generated (CO2 emissions from EDF Group electricity and heat generation); 158,842 employees worldwide; €65.2 billion in sales. Consolidated data at 12.31.2010.
  • 5. EDF WORLDWIDE Map of Group operations
  • 6. Smart grid projects everywhere in the world ... Key: red = electricity, green = gas, blue = water; triangle = trial or pilot, circle = project. EDF R&D: creating value and preparing the future
  • 7. The EDF Group: a bright outlook for smart grids. Around the smart meter: lower consumption peaks mean less dependence on high-carbon generation; clearer information to raise awareness of energy saving strategies; decarbonization of the energy mix through a smoother integration of renewable energies into networks; billing based on actual consumption; reduction in network losses to boost the competitiveness of the system; precision in targeted investments for the maintenance and modernization of networks; more efficient repairs to networks after extreme weather events; promoting the development of electric transportation that emits fewer greenhouse gases; new energy uses (e.g. electric mobility, storage, etc.)
  • 8. Smart Grids: what? And what for? Environmental, economic, social and policy drivers lead to a deep change of the energy sector: climate change and environmental concerns; increased pressure for operational and financial efficiency; increasing awareness of consumers, role of citizens; technological pressure (IT, smart devices). Source (Wikipedia): A smart grid delivers electricity from suppliers to consumers using digital technology with two-way communications to control appliances at consumers' homes to save energy, reduce cost and increase reliability and transparency. It overlays the electrical grid with an information and net metering system, and includes smart meters. Such a modernized electricity network is being promoted by many governments as a way of addressing energy independence, global warming and emergency resilience issues.
  • 9. Smart metering data: the Precious Load Curves
  • 10. What does a load curve look like? Big Data or "The data deluge"
  • 11. What does a load curve look like? (2)
  • 12. What does a load curve look like? (3) Individual load curves: left: same customer, two different days; top: same day, two different customers
  • 14. Massive data management in the energy domain: myth or reality? Challenges: more complexity in the electric power system (demand response, distributed generation ...); faster evolution of customer indoor equipment (smart meters and devices, Internet of Things ...) => core business will involve more IT and data management. The R&D SIGMA project deals with scalability and Big Data: skills on Big Data techniques; prototyping on business cases; with internal (IT), academic or industrial partners.
  • 15. Massive data management in the energy domain: myth or reality? The SIGMA project studies and experiments with appropriate methods and techniques. Storage technologies for massive data sets, especially time-series. Data processing: Complex Event Processing, real-time analytical processing; large-scale data mining: massively parallel processing, distributed data mining. Use cases: smart grids, CRM and customer insight, generation optimization: consumption and production forecasting, power plant maintenance.
  • 16. A Proof of Concept using Hadoop
  • 17. Storing massive time series. Objective: Proof of Concept for running a large number of queries (variable levels of complexity, with variable scopes, frequencies and acceptable latencies) on a huge number of load curves. Data: individual curves, weather data, contractual information, network data. 1 measurement every 10 min for 35 million customers over a year. Annual volume of data: 1800 billion records; 120 TB of uncompressed data.
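The annual volume quoted on this slide can be checked with a quick back-of-the-envelope calculation. The sketch below assumes a 365-day year and decimal terabytes (1 TB = 10**12 bytes); neither is stated on the slide, and the function names are illustrative only.

```python
# Back-of-the-envelope check of the annual volume quoted on the slide.
# Assumptions (not stated in the slides): 365-day year, 1 TB = 10**12 bytes.

CUSTOMERS = 35_000_000               # one load curve per customer
MINUTES_PER_MEASUREMENT = 10         # one measurement every 10 minutes
UNCOMPRESSED_BYTES = 120 * 10**12    # 120 TB quoted on the slide


def measurements_per_day(step_minutes: int) -> int:
    """Number of points in one daily load curve."""
    return 24 * 60 // step_minutes


def records_per_year(customers: int, step_minutes: int, days: int = 365) -> int:
    """Total number of measurement records for all customers over one year."""
    return customers * measurements_per_day(step_minutes) * days


points = measurements_per_day(MINUTES_PER_MEASUREMENT)
records = records_per_year(CUSTOMERS, MINUTES_PER_MEASUREMENT)
bytes_per_record = UNCOMPRESSED_BYTES / records

print(f"{points} points per daily curve")
print(f"{records / 1e9:.0f} billion records per year")
print(f"about {bytes_per_record:.0f} bytes per uncompressed record")
```

Under these assumptions the count comes to about 1,840 billion records, consistent with the rounded 1800 billion on the slide, and the 120 TB figure implies roughly 65 bytes per uncompressed record.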
  • 18. Storing massive time series: objectives. Build an "operational Data Warehouse" able to: supply a large volume of data; ingest newly arriving data; pre-processing, synchronization and filling; allow concurrent and simultaneous queries (tactical queries: curve selection compared with a mean curve; analytical queries: aggregated curves; ad-hoc queries); 'Recoflux' (simplified); extraction capabilities.
  • 19. Storing massive time series: evaluation. Evaluation criteria. Quantitative: concurrency (QoS); performance (SLA). Qualitative: convergence; agility.
  • 20. Using relational technologies for storing massive time series. Relational approaches, Very Large DataBases. Work carried out with partners: Teradata, Oracle, IBM, EMC², HP. Appliances or software offers. Shared-nothing or shared-everything; column-based, row-based or hybrid mode? Separation between an operational use (ODS) and an analytical use (DWH)?
  • 21. Using Hadoop for storing massive time series. Native distributed file system (HDFS). Distributed processing using the Map/Reduce paradigm. Widely used by large dotcoms but with very limited industrial deployment; maturity is yet to come, despite the major vendors arriving with offers that include integration, appliances and support. Internal POC concluded in April 2012.
  • 22. The data model. Compressed data volume on HDFS: 10 TB (x3)
  • 23. Data generator - CourboGen ©. Generates load curves and associated data. Customizable tool: interval, duration, data quality, noise on the curves. Distributed architecture (Node.js, Redis). Output as a data stream. Visualization of 35M curves for one week.
  • 24. Design. Hive in the center of our DW. HBase at the forefront of data access.
  • 25. Design. Hive in the center of our DW: allows ad-hoc and complex analytical queries; customer tables stored as RCFile are replicated on all Data Nodes (19); consumption measurements are partitioned by day and customer profile criteria; daily volume: 25 GB; average block size: 10 MB. HBase at the forefront of data access: allows low-latency queries; recent metering data stored in "In Memory" tables; stores a subset of measurements and aggregates in tables with "Bloom filters" enabled.
  • 26. Hardware configuration: the cluster. 20 nodes in 2 racks: 7 x 1U nodes with 4 x 1 TB; 13 x 2U nodes with 8 x 1 TB. Total: 132 TB; 336 cores (AMD). Hadoop distribution: Cloudera CDH3u3 (open source).
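The disk total on this slide follows directly from the node counts, and it can be set against the data volume quoted on the data model slide. The sketch below assumes an HDFS replication factor of 3, inferred from the "(x3)" noted next to the 10 TB compressed volume; it ignores space needed for the operating system, temporary MapReduce data and HBase, so the usable figure is an upper bound. All names are illustrative.

```python
# Raw disk capacity of the cluster described on the slide.
# Assumption: replication factor of 3, inferred from the "(x3)" on the data model slide.

NODE_GROUPS = [
    # (number of nodes, disks per node, TB per disk)
    (7, 4, 1),    # 7 x 1U nodes with 4 x 1 TB
    (13, 8, 1),   # 13 x 2U nodes with 8 x 1 TB
]
REPLICATION = 3
COMPRESSED_DATA_TB = 10  # compressed volume on HDFS, from the data model slide


def raw_capacity_tb(groups) -> int:
    """Sum of all disks across all node groups, in TB."""
    return sum(nodes * disks * tb for nodes, disks, tb in groups)


node_count = sum(nodes for nodes, _, _ in NODE_GROUPS)
raw_tb = raw_capacity_tb(NODE_GROUPS)
usable_upper_bound_tb = raw_tb / REPLICATION
replicated_footprint_tb = COMPRESSED_DATA_TB * REPLICATION

print(f"{node_count} nodes, {raw_tb} TB raw")
print(f"at most {usable_upper_bound_tb:.0f} TB of replicated data fits")
print(f"the compressed data set occupies {replicated_footprint_tb} TB on disk")
```

With these inputs the raw capacity matches the 132 TB on the slide, and the replicated data set (30 TB) uses well under a quarter of it.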
  • 28. Time series representation models: options
      TUPLE:
        CREATE TABLE cdc_tuple ( id_cdc INT, date_releve TINYINT, p INT )
        PARTITIONED BY (day STRING, optarif STRING, psousc TINYINT)
        ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
        STORED AS RCFILE;
      ARRAY:
        CREATE TABLE cdc_array ( id_cdc INT, values array< array< int > > ) …
      COLUMN:
        CREATE TABLE cdc_144_cols ( id_cdc INT, p1 INT, p2 INT, …, p144 INT ) …
  • 29. Time series representation models: options. Getting a daily individual load curve: SELECT * FROM cdc_tuple WHERE day='2008-01-01' AND id_cdc = 136630;
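To make the difference between the three layouts concrete, the sketch below pivots toy tuple rows into one array per curve and then into one wide row with columns p1 to p144. It is plain Python on made-up values, not the authors' loading code. Two simplifications are assumptions: the time slot is a 0-based index, and the array form is a flat list of 144 values, whereas the slide's DDL declares a nested array whose inner layout is not described.

```python
from collections import defaultdict

POINTS_PER_DAY = 144  # one value every 10 minutes


def tuples_to_arrays(rows):
    """Pivot (id_cdc, slot, p) tuples into one 144-value list per curve.

    `slot` is assumed to be a 0-based index of the 10-minute interval.
    """
    curves = defaultdict(lambda: [None] * POINTS_PER_DAY)
    for id_cdc, slot, p in rows:
        curves[id_cdc][slot] = p
    return dict(curves)


def array_to_columns(id_cdc, values):
    """Turn one daily array into a wide row with columns p1..p144."""
    row = {"id_cdc": id_cdc}
    for i, p in enumerate(values, start=1):
        row[f"p{i}"] = p
    return row


# Toy data: one curve, 144 tuple rows with made-up power values.
rows = [(136630, slot, 100 + slot) for slot in range(POINTS_PER_DAY)]

arrays = tuples_to_arrays(rows)
wide = array_to_columns(136630, arrays[136630])

print(len(rows), "tuple rows ->", len(arrays), "array row ->",
      len(wide) - 1, "value columns")
```

The tuple layout stores one row per measurement, while the array and column layouts store one row per curve and day, which is one plausible reason for the different volumes and query times reported in the deck.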
  • 30. Time series representation models: impact. Computing a global aggregated load curve for 1 day:
      Representation model | Daily volume            | Query execution time
      Tuple                | 10.1 GB (x 3 replicas)  | 2 min 22 sec
      Column               | 8.8 GB (x 3 replicas)   | 1 min 17 sec
      Array                | 16 GB (x 3 replicas)    | 1 min 18 sec
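The "global aggregated load curve" measured here is, in essence, the point-by-point sum of all individual daily curves. The sketch below shows that operation on toy data in plain Python; it illustrates the computation only and says nothing about how Hive executes it. The function name and the sample values are assumptions.

```python
# Minimal sketch of a "global aggregated load curve": the element-wise sum of
# individual daily curves. Toy data only; names are illustrative.

POINTS_PER_DAY = 144  # 10-minute steps


def aggregate_curves(curves):
    """Return the point-by-point sum of equally long load curves."""
    curves = list(curves)
    if not curves:
        return []
    length = len(curves[0])
    if any(len(curve) != length for curve in curves):
        raise ValueError("all curves must have the same number of points")
    return [sum(points) for points in zip(*curves)]


# Three made-up customers with flat consumption (arbitrary power units).
toy_curves = [[level] * POINTS_PER_DAY for level in (100, 250, 400)]
total = aggregate_curves(toy_curves)

print(len(total), "points, first value:", total[0])
```

In the tuple layout the same result would presumably be obtained by grouping on the time slot, whereas the column and array layouts already hold the 144 positions side by side in each row.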
  • 31. Results - HBase. Tactical queries are successfully handled by HBase, offering low latencies under a high concurrent load. Query: curve selection.
      Representation model              | Execution period | Nb concurrent queries | Queries / sec | Query execution time (seconds)
      Columns (7 x 144)                 | 1 minute         | 100                   | 470           | 0.21
      Array (7 x 1 array of 144 values) | 1 minute         | 100                   | 495           | 0.20
      Columns (7 x 144)                 | 5 minutes        | 500                   | 524           | 0.19
      Array (7 x 1 array of 144 values) | 5 minutes        | 500                   | 430           | 0.18
  • 32. Results - Hive (1). Results for different queries: planned queries (with adequate partitioning) and ad-hoc queries.
      Query                                                                              | Execution time (tuples representation)
      Aggregation France (sum), 10 min interval                                          | 1 min 56 sec
      Load curve aggregated by contractual information                                   | 2 min 21 sec
      Analysing consumption trends according to the customers' building characteristics | 1 hour 18 sec
      Top N customers candidates for a power level update                                | 1 hour 7 min 35 sec
  • 33. Results - Hive (2). Recoflux scenarios (times in minutes):
      Scenario | Sequential, 1 day | Sequential, 1 week | Parallel, 1 day | Parallel, 1 week
      521      | 1.44              | 10.10              | 1.56            | 3.00
      522      | 27.87             | 195.09             | 28.50           | 31.01
      524      | 10.71             | 74.99              | 15.97           | 19.58
      525      | 6.10              | 42.70              | 7.45            | 8.39
      526      | 0.86              | 6.08               | 0.92            | 2.43
      Scenario 523: 7.98 and 23.94 (only two values survive in the transcript; their columns are unclear).
      Recoflux is a very important business application (power consumption aggregations are computed according to different criteria; updates and temporal data): the results are clearly acceptable.
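The gain from the parallel mode is easier to read as a ratio. The sketch below divides the sequential one-week time by the parallel one-week time for each scenario, using only the values given in the table above. Scenario 523 is left out because only two of its four values survive in the transcript. The dictionary layout and function name are illustrative.

```python
# Ratio of sequential to parallel run time for the one-week Recoflux runs.
# Values (minutes) are copied from the results table; scenario 523 is omitted
# because its row is incomplete in the transcript.

ONE_WEEK_MINUTES = {
    # scenario: (sequential, parallel)
    521: (10.10, 3.00),
    522: (195.09, 31.01),
    524: (74.99, 19.58),
    525: (42.70, 8.39),
    526: (6.08, 2.43),
}


def speedup(sequential: float, parallel: float) -> float:
    """How many times faster the parallel run is than the sequential one."""
    return sequential / parallel


speedups = {
    scenario: speedup(seq, par)
    for scenario, (seq, par) in ONE_WEEK_MINUTES.items()
}

for scenario, ratio in sorted(speedups.items()):
    print(f"scenario {scenario}: x{ratio:.1f}")
```

On these figures the parallel mode is roughly 2.5 to 6.3 times faster over one week, whereas for a single day it is slightly slower than the sequential mode in every complete scenario.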
  • 34. Using NoSQL technologies for storing massive time series: results. Hadoop / Tableau Software integration: visualization of 700k feeders
  • 35. Alternative approach for storing massive time-series: conclusions. Drawbacks: not yet mature, little feedback available from industry; lack of skills in Europe (impact of configuration and tuning, smart skills); major vendors' offerings: still young but actively emerging. Advantages: low cost; ability to recycle existing commodity hardware; one of the few solutions that allow structured and unstructured data to be coupled; flexibility, despite being a complex system to deploy and manage; fault tolerant and scalable.
  • 36. Alternative approach for storing massive time-series: conclusions. Perspectives: partners offering industrial support; hardware configuration; usage of statistical libraries; connectivity with the relational world. Usages: ETL; intelligent and reliable archival solution; high-throughput data presentation (publication).
  • 37. Conclusions and perspectives. Hadoop perspectives: non-traditional usage of Hadoop using a structured schema; will become a component of the company's information system (IS) for non-critical usages. Any suggestions? Storage mode for time-series? Usages? Contacts: marie-luce.picard@edf.fr, bruno.jacquin@edf.fr