Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Ugif 12 2011-smart meters-11102011

805 views

Published on

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

Ugif 12 2011-smart meters-11102011

  1. 1. Integrated Data ManagementManaging Large Data Sets for Smart MetersJacques Roy, IBM·Click to add textjacquesr@us.ibm.com © 2011 IBM Corporation
  2. 2. Integrated Data Management Overview • How much data do smart meters generate? • Introducing Informix TimeSeries benefits • Some technical details • Proof points • Solution partners • Demo2 © 2011 IBM Corporation
  3. 3. Integrated Data Management Utility-Scale Smart Meter Deployments, Plans & Proposals* September 2010 Deployment for >50% of end-users Deployment for <50% of end-users *This map represents smart meter deployments, planned deployments, and proposals by investor- owned utilities and large public power utilities. © 2010 The Institute for Electric Efficiency http://www.edisonfoundation.net/IEE/3 © 2011 IBM Corporation
  4. 4. Integrated Data ManagementExample of Changing Storage Requirements Changing Workloads For 10 Million Smart Meters: Now – Each meter read once per month Very soon – Each meter read once every 15 minutes (2920X) Regulations – Need to keep data on line for 3 years (PUC) and, perhaps, save for 7 years Data for a Utility In California 350.4B# of RecordsPer Year for 10M meters 3.65B 120M Frequency Monthly Meter Daily Meter 15 Minute Meter of reads Reads Reads Reads4 © 2011 IBM Corporation
  5. 5. Integrated Data Management Keeping up with Smart Meter Data Large amounts of data causes problems in 2 areas: A) Storage management • Will get expensive and cumbersome to maintain • Query performance • Compliance Reports must be completed before the end of each day • Customer portal queries must be handled in a timely manner • Customer billing5 © 2011 IBM Corporation
  6. 6. Integrated Data Management Introducing IBM Informix Relational Database • Why IBM Informix? • Low cost/Low administration • Manage thousands of servers with one database administrator • Many installations have no DBA’s • High performance • Insert 100’s of thousands of records per second • Analytic functions not available in other database products • High Availability • Scale up, scale out, and disaster recovery • Native time series data support • Load and analyze time series data “The idea was to run the project in three cycles, and use a different type of analysis in each cycle, to see if wecould draw any conclusions in terms of how to promote change in the way people view their energyconsumption. With help from the IBM Hursley team, we quickly found that Informix TimeSeries could deliverspectacular results.” Clive Eisen, Chief Technology Officer at Hildebrand.6 © 2011 IBM Corporation
  7. 7. Integrated Data Management Key Strengths of Informix TimeSeries • Performance • Extremely fast data access • Data clustered and sorted by time on disk to reduce I/O • Handles operations hard or impossible to do in standard SQL • Continuous Real-Time and Batch Data Loaders • Space Savings • Typically saves 50% space over standard relational layout • Toolkit approach • Allows users to develop their own algorithms to run in the database • Algorithms running in the database leverage the buffer pool for speed • Easier • Conceptually closer to how users think of time series7 © 2011 IBM Corporation
  8. 8. Integrated Data ManagementTypical Relational Schema for Smart Meters Data Index Smart_Meters Table meter_id Time KWH Voltage ColN 1 1-1-11 12:00 Value 1 Value 2 …….. Value N Table Grows 2 1-1-11 12:00 Value 1 Value 2 …….. Value N 3 1-1-11 12:00 Value 1 Value 2 …….. Value N … … … … …….. … 1 1-1-11 12:15 Value 1 Value 2 …….. Value N 2 1-1-11 12:15 Value 1 Value 2 …….. Value N 3 1-1-11 12:15 Value 1 Value 2 …….. Value N … … … … …….. … • Each row contains exactly one record = billions of rows • Additional indexes are required for efficient lookups • Data is appended to the end of the table as it arrives • Meter Id’s stored in every record • No concept of a missing row8 © 2011 IBM Corporation
  9. 9. Integrated Data Management Same Table using an Informix TimeSeries Schema Smart_Meters Table meter_id Series 1 [(1-1-11 12:00, value 1, value 2, …, value N), (1-1-11 12:15, value 1, value 2, …, value N), …] 2 [(1-1-11 12:00, value 1, value 2, …, value N), (1-1-11 12:15, value 1, value 2, …, value N), …] 3 [(1-1-11 12:00, value 1, value 2, …, value N), (1-1-11 12:15, value 1, value 2, …, value N), …] 4 [(1-1-11 12:00, value 1, value 2, …, value N), (1-1-11 12:15, value 1, value 2, …, value N), …] … … Table grows • Each row contains a growing set of records = one row per meter • Data append to a row rather than to the end of the table • Meter Ids not stored in individual records • Data is clustered by meter id and sorted by time on disk • Missing values take no disk space, missing interval reads take 2 bytes9 © 2011 IBM Corporation
  10. 10. Integrated Data Management Informix TimeSeries: Key Concepts • DBSpace • A logical unit of storage used by the database server • Made up of one or more physical storage units called chunks • Containers • Specialized storage for TimeSeries in dbspaces • TimeSeries data element: row type • Flexibility to define as many parts as needed • TimeSeries types: regular, irregular • Covers regular intervals and sparse data distribution • Calendar • Defines business patterns • Relational view: VTI and processing return type10 © 2011 IBM Corporation
  11. 11. Integrated Data Management Informix TimeSeries: Space savings • Example of simple meter record: {loc_id, timestamp, register_1, register_2, register_3} 8 bytes 12 bytes 4 bytes 4 bytes 4bytes • TimeSeries savings: no duplication of loc_id, timestamp • Record size: 12 bytes instead of 32 bytes • Impact: 10M meters, 3 years, 15-minute interval • Savings: ~20TB 20 bytes * 10M meters * 96 intervals/day * 3 years * 365 days/year • Index on meter records: • Relational: 1 trillion entries on loc_id, timestamp • TimeSeries: 10M entries on loc_id11 © 2011 IBM Corporation
  12. 12. Integrated Data Management Informix TimeSeries: I/O advantages • Less data to read, less index entries to navigate • Meter data grouped on pages • Reading 1 day of data → One I/O (+ index I/O) • Relational: no grouping guarantees, could be 96 I/Os (+ index I/O)12 © 2011 IBM Corporation
  13. 13. Integrated Data Management Informix TimeSeries: Processing advantages • TimeSeries data is accessed in time order • Relational model has no order guarantees – Sorting required • Library of SQL functions to process TimeSeries • 100+ functions • Ex: Aggregate data from 15-min interval to daily interval Calculate a moving average • Ability to write custom functions • Write business-specific processing • Streamline the processing instead of generic functions • Can greatly reduce code length → Significant performance improvements13 © 2011 IBM Corporation
  14. 14. Integrated Data Management Informix TimeSeries vs Relational Database Systems Comparison of 1M Meter POC at Large US Energy Provider Informix Competitors Relational1M Meter @ 15min POC Improvement TimeSeries Database SystemLoad Time 18 Minutes 7 Hours 23x fasterReport Generate Time Seconds to 11 min 2-7 HOURS 38x fasterStorage Space 350GB 1.3 TB <1/3 storage IBM Informix TimeSeries Relational Database System Based on actual benchmark test run for a large electrical utility company Processing Disk Time Storage (Hours) Space Data Loading Reports Relational IBM Informix Database TimeSeries14 © 2011 IBM Corporation
  15. 15. Integrated Data Management Informix TimeSeries vs Relational Database Systems Comparison of Published Benchmarks for Meter Data Management Daily Total Total DB App DB App Meters Reads Cores RAM cores cores RAM RAM Informix TimeSeries (shared 100M 4.9B 16 500 16 (shared) 500 ) The Competition * 5.5M 1.06B 456 3668 96 360 768 2900 Daily Readings (meters * registers * intervals) Database Resources (CPU cores)Informix TimeSeries 4,900,000,000 Informix TimeSeries 16 The Competition 1,061,500,000 The Competition 96 5 times the performance < 1/5 the resources … with significantly simpler management using a single node system * Based on published Oracle benchmark http://www.oracle.com/us/industries/utilities/performance-benchmark-exadata-wp-161572.pdf15 © 2011 IBM Corporation
  16. 16. Integrated Data Management Informix TimeSeries – Success with leading MDM providershttp://smart-grid... http://www.metering.com/node/20062 MDM offers scalability to 100 million smart meters . “We feel that the Informix “Performance testing results of TimeSeries technology lends AMT-SYBEX’s Affinity Meterflow itself extremely well to the meter data management (MDM) performance characteristics, data application using IBM Informix types and high-volume TimeSeries software has processing required for the next demonstrated the capability to generation of smart metering and offer linear scalability up to 100 smart grid applications.” million meters to load and - David Hubbard, CTO and process meter data at 30-minute co-founder of Ecologic Analytics intervals in less than 8 hours.” 16 © 2011 IBM Corporation
  17. 17. Integrated Data Management Client Success: HildebrandHildebrand is a technology consultant company on the Digital Environment Home EnergyManagement System (DEHEMS) project. The Hildebrand team was asked by the UK government to finda way to scale up its energy monitoring solution and enable it to monitor three million homes. What’s Smart? • Energy management thru real time monitoring • Ability to monitor scales to over 3 million homes • Reduced energy Business Benefits • Helping people make better decisions about energy efficiency in the home. • Collect, store and analyse up to • 50,000 data points per second • Delivers high performance on low-cost hardware by leveraging Informix time series data management technologies.17 © 2011 IBM Corporation
  18. 18. Integrated Data Management Hildebrand: 3 Million Meters • Test involved 3,000,000 Homes • Hardware/Software Used • Data generated every 6 seconds! • Intel with 8 cores running: • 64 bit SUSE Linux v10 • Up to 26 values recorded • 16 GB of memory • Meter ID • Informix workgroup edition • Timestamp • Hildebrand software • 3 Electricity phases • 1 Gas reading • 20 Individual electrical sockets in the house • Data collected every 6 seconds • Aggregated to per-minute readings • Minute readings bulk-loaded into Informix every 10 minutes • Average load was about 50,000 inserts per second18 © 2011 IBM Corporation
  19. 19. Integrated Data Management Consumer Education19 © 2011 IBM Corporation
  20. 20. Integrated Data Management AMT/SYBEX Proving the concept for smart metering data management • The Need: • Extend AMT/Sybex’s Data Transfer Solution to: “We owe a great deal to • Load, validate, store, and provide smart meter interval and IBM, both for the Informix event data to external system technology itself, and for the fantastic support from all • The Challenge: levels of the organisation.” • Process the enormous volume of data in a timely — Gordon Brown, DTS Product manner Owner, AMT-SYBEX • Not to require a huge investment in new hardware Solution components: • The Solution:  IBM® Informix® TimeSeries™ • Combined effort with IBM Research and IBM Informix • Informix TimeSeries provides the core capabilities to  IBM Research store and process the meter and event data: • VEE • Real-time energy monitoring • Analytics for developing new tariff rates • Help to smooth peaks in demand20 © 2011 IBM Corporation
  21. 21. Integrated Data Management AMT/Sybex Tests and Performance Measurements The main tests for AMT-SYBEX were to prove key business functions at high volume: • Can high volumes of data be loaded into Smart DTS? • Can data be analyzed and processed in Smart DTS in a timely manner? To facilitate these requirements, the following tests were carried out: • These results are for 10 million meters storing data every ½ hour data for one day. Module Time Readings/intervals per second Technical Validation and 13 minutes >600,000 Transformation High Speed TimeSeries 50 minutes >150,000 database load (with full logging) Validation and Estimation 33 minutes >240,000 Full processing end to end for 10 million meter points with half-hourly interval data on a single mid-range P-series 8 CPU server is just over 1 ½ hours21 © 2011 IBM Corporation

×