National Institute of Science & Technology Thesis on Synchrophasor Data Storage and Analysis

National Institute of Science & Technology
[1]
M.Tech Final Thesis Presentation
“STORAGE AND ANALYSIS OF SYNCHROPHASOR DATA FOR
EVENT DETECTION USING HADOOP”
Presented By
G Hemanta Kumar
Roll No. CSE201390009
Registration No. 13072005
Under The Supervision of
Dr. Diptendu Sinha Roy
Associate Professor,
Department of Computer Science & Engineering,
National Institute of Science and Technology

[2]
Presentation Outline
• Introduction
• Motivation
• Problem Statement
• Contribution & Development of A New Model
• Synchrophasors/Phasor Measurement Units (PMU)
• Indian Power Grid & Synchrophasors Initiatives Program
• PMU Data Sample
• A New Proposed Model Designed and Implemented
• PMU Data Sample Storage Model Designed
• MapReduce Implementation of Effective Values Finding
• Client Interaction For Power Grid Status Observation
• Conclusion
• Future Work
• References

[3]
Introduction
• A Power Grid (PG) is composed of three main components: Generation, Transmission, and Distribution.
• Modern PGs are massively interconnected, interacting networks overseen by a “Wide Area Measurement System (WAMS)”.
• WAMS is logically the monitoring, decision-making, and control network of the PG. Electrical engineers and computer science engineers work together on this network.
• Stable and accurate operation of the PG needs information-system software models that support fast, reasonably accurate decision-making.

[4]
Motivation
• Synchrophasors/PMUs are devices that measure electrical properties; their defining features are “High Data Sampling Rates” and “Synchronization”.
• Because of the high sampling rate and rapid deployment, storing and processing PMU data samples is becoming a Big Data problem. Processing and storing such a vast amount of data is a major computational challenge.
• NLDC uses a central software package called “SYNCHROWAVE Central Software” to collect the data samples generated by PMUs.

[5]
Problem Statement
• SYNCHROWAVE Central Software runs on vertically scalable hardware units, which are expensive.
• It stores all received data in a file-system-based database on a non-distributed storage architecture.
• It processes the stored PMU datasets in a non-parallel manner.

[6]
Contribution & Development of A New Model
• My idea is to develop a new operational software model for PG use, built on the latest COMPUTATIONAL & STORAGE SYSTEM ARCHITECTURES.
• STORAGE SYSTEM Module Development
    • I configured Apache Hadoop, which provides a reliable distributed file system known as the Hadoop Distributed File System (HDFS).
    • For random access to PMU datasets with the timestamp as the key, I configured Apache HBase over HDFS.
    • Apache HBase is a distributed non-relational database that supports multi-dimensional storage.
    • For storing PMU datasets I designed three multi-dimensional HBase tables: GRID_PROPERTIES, PMU_PROPERTIES, and PMU_DATA_TREND.

[7]
Contribution & Development of A New Model…
• COMPUTATIONAL SYSTEM Module Development
    • A MapReduce implementation of a statistical analysis method was designed to obtain effective electrical properties over overlapping moving windows of a fixed size.
    • Other programs were written to upload PMU datasets into HBase.
• Apache Thrift Gateway was configured so that the new distributed server can be accessed remotely, for PMU record access and analysis, through an R client, a Java client, etc.

[8]
Block Diagram of My Proposed Model
[Block diagram. Components shown: HDFS (distributed storage file system) with a NameNode, ZooKeeper, and multiple Region Server + DataNode nodes; MapReduce (parallel processing); HBase Shell; Admin; HBase API; Thrift Gateway; R client, Java client, and other clients.]

[9]
Synchrophasors/Phasor Measurement Unit (PMU)
• A Synchrophasor/PMU is a device that takes synchronized measurements of power signals; it is deployed at a minimum voltage level of 110 kV (i.e. at a sub-station).
• The maximum sampling rate of a PMU is 60 samples per second (resettable).
• PMUs have Ethernet and serial ports to connect to the WAMS communication network.
• The messages that PMUs send are defined in the IEEE C37.118.2 standard.
• PMUs generally send their information to a PDC.

[10]
Indian Power Grid & Synchrophasors Initiatives Program
• The Indian PG is divided into five regions; these inter-dependent regions are centrally controlled by the National Load Despatch Centre (NLDC), located in New Delhi.
    • Northern Regional Load Despatch Centre (NRLDC)
    • North Eastern Regional Load Despatch Centre (NERLDC)
    • Eastern Regional Load Despatch Centre (ERLDC)
    • Southern Regional Load Despatch Centre (SRLDC)
    • Western Regional Load Despatch Centre (WRLDC)
• The Indian PG started a Synchrophasors Initiatives Program, under which 52 PMUs were deployed nationwide under the control of NLDC. NLDC uses a commercial central software package called SYNCHROWAVE Central Software to collect the data samples generated by the PMUs.
• PMU data samples are collected and transferred to the Regional Phasor Data Concentrator (PDC) at each regional centre; from the Regional PDCs, all samples are transferred to the Central PDC at NLDC. This is a hierarchical data collection structure.

[11]
SYNCHROWAVE Central Software & PMU Data Samples
• SYNCHROWAVE Central software provides both real-time situational awareness and historical analytics.
• It includes four main components:
    • Historian: connects to multiple synchrophasor sources (PMUs or PDCs) and stores all received data in a database (i.e. a file system).
    • Services: connects the Historian to the Central web-based application (i.e. Central and Event Viewer).
    • Admin: the configuration application for the system.
    • Central and Event Viewer: the primary web interface.

[12]
SYNCHROWAVE Central Software… (Block Diagram)

[13]
Collecting Of PMU Data Using SYNCHROWAVE Central Software

[14]
PMU Data Sample
• PMU datasets consist of timestamp, voltage, current, frequency, phase angles, etc.
• Each PMU's readings for a whole day are recorded and stored in a separate file with the extension .swave by the SYNCHROWAVE software at NLDC.
• Each day, 52 files are generated on the Historian storage.
• Each file contains approximately 24*60*60*25 (= 2,160,000) records.
• Each file has around 62 columns.
• The first column holds the timestamp.
• The timestamp format is “<YYYY/MM/DD HH:MM:SS:MMM>”.
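The figures on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the 25 samples-per-second rate is the one implied by the record count above, and the helper name `parse_pmu_timestamp` and the sample timestamp are hypothetical, not part of the SYNCHROWAVE software.

```python
from datetime import datetime

SAMPLES_PER_SECOND = 25   # rate implied by 24*60*60*25 on the slide
PMU_COUNT = 52            # one .swave file per PMU per day
COLUMNS_PER_FILE = 62     # approximate column count from the slide

records_per_file = 24 * 60 * 60 * SAMPLES_PER_SECOND
records_per_day = records_per_file * PMU_COUNT
values_per_day = records_per_day * COLUMNS_PER_FILE


def parse_pmu_timestamp(text: str) -> datetime:
    """Parse 'YYYY/MM/DD HH:MM:SS:MMM' (hypothetical helper).

    The milliseconds follow the last colon, so they are split off
    before the rest is handed to strptime.
    """
    date_part, millis = text.strip("<>").rsplit(":", 1)
    base = datetime.strptime(date_part, "%Y/%m/%d %H:%M:%S")
    return base.replace(microsecond=int(millis) * 1000)
```

Under these assumptions one day of data is about 112 million records across the 52 files, which is the scale that motivates the distributed storage model in the following slides.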

[15]
PMU Data Sample…

[16]
Uses of PMU Data Samples For Fault Detection
• Fault detection & classification: the main objectives are to understand the reasons that led to the event, assess the performance of protective equipment, and identify remedial actions to avoid its recurrence.
• Transmission line faults account for 85-87% of the total number of faults occurring in a power system.
• Transmission line faults are classified as Single Line-to-Ground faults, Line-to-Line faults, Double Line-to-Ground faults, and Three-Phase faults.

[17]
Uses of PMU Data Samples For Fault Detection…
• Researchers across the globe are working on fault detection and fault classification to develop smart auto-alarming control system models. Those models can be broadly classified into the following categories:
    • Complex Mathematical Optimization Model
        • Minimum Volume Enclosing Ellipsoid (MVEE) method, known as the ellipsoid model, by Makarow et al.
    • Statistical Data Analysis Model

[18]
A New Proposed Model Designed and Implemented
• The model is broadly divided into the following modules:
    • Store PMU data samples in HBase tables
    • MapReduce implementation of the computation that finds the max & min effective values in each overlapping moving window
    • Plot graphs to identify possible fault events using an R client
• Components used: Apache Hadoop cluster, Apache HBase, Apache Thrift thread pool

[19]
A New Proposed Model Designed and Implemented…
[Flow diagram: Historian storage sink folder → non-MapReduce program to sink dataset into HBase table → HBase table “GRID_PROPERTIES” → MapReduce program to compute effective values → HBase table “PMU_DATA_TREND” → Apache Thrift thread pool → client access (plot graph / visualise)]

[20]
Apache Hadoop Overview
• Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. The Hadoop framework includes the following modules:
    • Hadoop Distributed File System (HDFS): a distributed file system that provides high-throughput access to application data. HDFS uses a master/slave architecture in which the master is a single NameNode that manages the file system metadata, and one or more slave DataNodes store the actual data.
    • Hadoop MapReduce: a system for parallel processing of large data sets. The MapReduce framework consists of a single master JobTracker and one slave TaskTracker per cluster node.

[21]
Apache Hadoop Overview (Block Diagram)

[22]
Apache HBase Overview
• HBase is an open-source, horizontally scalable, non-relational database designed to provide quick random access to huge amounts of structured data.
• HBase is a distributed column-oriented database where:
    • A table is a collection of rows.
    • A row is a collection of column families.
    • A column family is a collection of columns.
    • A column is a collection of key-value pairs (key: timestamp, value: cell value).
• It provides low-latency access to single rows from billions of records.
• The HBase architecture includes the following components: Master Server, Region Servers, ZooKeeper, HBase Client API, and HBase Shell.

[23]
Apache HBase Overview (Block Diagram)

[24]
HBase Tables Designed to Store PMU Data Samples
• We designed three multi-dimensional tables in HBase:
    • GRID_PROPERTIES table:
        • It stores the readings of all 52 PMUs in a single record, which makes it easy to retrieve the power properties of the whole grid at a specific timestamp with a single record selection query.
        • The timestamp is the RowKey.
        • Each column family represents a particular PMU, and its cells store the voltage, current, frequency, etc. readings of that PMU.
    • PMU_PROPERTIES table:
        • It stores only the PMU identification properties, such as the voltage level at which the PMU is installed (i.e. the voltage level of the grid station), region details, feeder details, supplier details, etc. These help to identify the PMU.
    • PMU_DATA_TREND table:
        • It stores the computed output of the MapReduce program.
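As a rough illustration of the GRID_PROPERTIES layout described above, the sketch below flattens one timestamp's readings into HBase-style "family:qualifier" columns. It is plain Python with no HBase connection; the function name `build_grid_row` and the family names `PMU01`, `PMU02` are hypothetical, while the cell names follow the ones used in the mathematical model of this deck (volt, cur, vAng, cAng, fr).

```python
from typing import Dict, Tuple

# Cell (qualifier) names, following the model slide of this deck.
CELLS = ("volt", "cur", "vAng", "cAng", "fr")


def build_grid_row(
    timestamp: str, readings: Dict[str, Dict[str, float]]
) -> Tuple[str, Dict[str, float]]:
    """Return (row_key, columns) for one GRID_PROPERTIES-style record.

    `readings` maps a PMU identifier (used as the column family) to
    that PMU's cell values; the timestamp becomes the row key.
    """
    columns: Dict[str, float] = {}
    for pmu_id, cells in readings.items():
        for name in CELLS:
            if name in cells:
                columns[f"{pmu_id}:{name}"] = cells[name]
    return timestamp, columns


# Example: two PMUs reporting at the same (made-up) timestamp.
row_key, columns = build_grid_row(
    "2015/01/01 00:00:00:040",
    {"PMU01": {"volt": 400.1, "fr": 49.98}, "PMU02": {"volt": 399.7}},
)
```

Keeping every PMU in the same row is what allows the whole grid state at one timestamp to be read with a single row lookup.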

[25]
GRID_PROPERTIES HBase Table Model Designed

[26]
GRID_PROPERTIES HBase Tables Implemented
 GRID_PROPERTIES Table:

[27]
PMU_PROPERTIES HBase Table Model Designed

[28]
PMU_PROPERTIES HBase Tables Implemented

[29]
PMU_DATA_TREND HBase Table Model Designed

[30]
PMU_DATA_TREND HBase Tables

[31]
Mathematical Representation of Designed Model
• We consider W, an overlapping moving data window consisting of 300 records: W = {R_1, R_2, ..., R_n}, where n = 300, i.e. |W| = 300.
• R represents a record of the GRID_PROPERTIES table. Each record consists of the RowKey and the set of PMUs located in the grid (i.e. 52 PMUs): R = {RowKey, P_1, P_2, ..., P_(m-1)}, where m = 53 (the RowKey plus 52 PMUs), so |R| = 53.
• An individual PMU is represented by P, which is a column family. Each P has five cells: P = {volt, cur, vAng, cAng, fr},
  where volt, cur, vAng, cAng and fr represent voltage, current, voltage phasor angle, current phasor angle, and frequency respectively.
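The window set W can be sketched in Python as a generator of overlapping slices. The window size of 300 comes from the slide; the `step` between window starts is an assumption, since the deck does not state how far consecutive windows overlap, and the function name `moving_windows` is hypothetical.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def moving_windows(
    records: Sequence[T], size: int = 300, step: int = 150
) -> Iterator[Sequence[T]]:
    """Yield overlapping windows of `size` records.

    `step` (assumed value, not from the slides) is the distance between
    the starts of consecutive windows; step < size gives an overlap.
    Trailing records that do not fill a whole window are skipped.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    for start in range(0, len(records) - size + 1, step):
        yield records[start:start + size]


# Example: 600 records produce windows starting at 0, 150 and 300.
windows = list(moving_windows(list(range(600))))
```

With `step` equal to `size` the windows no longer overlap, which is the simpler framing that the map task pseudocode counts with `row_count` and `frame_count`.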

[32]
Mathematical Representation of Designed Model [Equations]…

[33]
MapReduce Implementation of Overlapping Moving Window

[34]
Compute Max & Min For Each Overlapping Moving Window
Max-Min MapReduce Algorithm:
INPUT: scan from the GRID_PROPERTIES table
OUTPUT: put into the PMU_DATA_TREND table
Map Task:
    map(RowKey, columns) {
        // row_count and frame_count are counters kept across calls
        if row_count == 300 then:        // current window is full
            frame_count = frame_count + 1
            row_count = 0
        row_count = row_count + 1
        for each cell in columns do:
            key = cell.name + frame_count
            x = getValue(RowKey, cell)
            context.write(key, x)
    }
Reduce Task:
    reduce(key, values) {
        max = -infinity
        min = +infinity
        for each x in values do:
            // computation to obtain the effective values
            max = maximum(max, x)
            min = minimum(min, x)
        // create a new object to store the effective values
        context.write(key, new object(max, min))
    }
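The Map Task and Reduce Task above can be imitated in memory to see what ends up in PMU_DATA_TREND. This is a single-process sketch, not Hadoop code: the names `map_rows` and `reduce_max_min` are hypothetical, the frames are consecutive blocks of `window_size` rows (the counting scheme of the pseudocode, without overlap), and the sample voltages are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[str, Dict[str, float]]   # (row key, {"family:qualifier": value})
Key = Tuple[str, int]                # (column name, frame number)


def map_rows(rows: Iterable[Row], window_size: int) -> Iterator[Tuple[Key, float]]:
    """Emit ((column, frame), value) for every cell of every row."""
    for index, (_row_key, columns) in enumerate(rows):
        frame = index // window_size   # frame the row falls into
        for column, value in columns.items():
            yield (column, frame), value


def reduce_max_min(pairs: Iterable[Tuple[Key, float]]) -> Dict[Key, Tuple[float, float]]:
    """Group values by key and keep (max, min) per column and frame."""
    grouped: Dict[Key, List[float]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: (max(values), min(values)) for key, values in grouped.items()}


rows = [
    ("t0", {"PMU01:volt": 400.0}),
    ("t1", {"PMU01:volt": 398.0}),
    ("t2", {"PMU01:volt": 401.0}),
    ("t3", {"PMU01:volt": 399.0}),
]
trend = reduce_max_min(map_rows(rows, window_size=2))
```

In a real cluster the grouping step is done by the MapReduce shuffle, and a per-mapper row counter only sees the rows of its own input split, so frame numbering would need to be derived from the row key rather than from a local counter.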

[35]
Plotting Graphs To Identify Possible Fault Events Using R
• Analyse the data, scan for events, and plot visualizations for a better understanding of grid status; store the more effective data samples for future reference.
INPUT: scan records from PMU_DATA_TREND
OUTPUT: events (i.e. effective data samples), plots
• Scan columns from the HBase table “PMU_DATA_TREND”
• Plot graphs for event observation (reading)

[36]
• Max-Min graph: red dots: max. voltage; green dots: min. voltage.

[37]
• Voltage difference plot:

[38]
• Max-Min method plot: blue dots: voltage difference; red line: 3 SD; green line: average voltage.
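One way to read the Max-Min method plot is as a threshold test on the per-window voltage difference. The sketch below assumes that a window is flagged when its max-min difference exceeds the average difference plus three standard deviations; the slides do not spell out the exact rule, so this threshold, the function name `flag_events`, and the sample values are all assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def flag_events(
    max_values: Sequence[float], min_values: Sequence[float], k: float = 3.0
) -> Tuple[List[int], float]:
    """Return (indices of flagged windows, threshold).

    Assumed rule: flag a window when (max - min) is greater than
    mean + k * population standard deviation of all differences.
    """
    diffs = [hi - lo for hi, lo in zip(max_values, min_values)]
    threshold = mean(diffs) + k * pstdev(diffs)
    flagged = [i for i, d in enumerate(diffs) if d > threshold]
    return flagged, threshold


# Example: 19 quiet windows and one window with a large voltage swing.
events, limit = flag_events([401.0] * 19 + [420.0], [400.0] * 20)
```

With very few windows a single outlier inflates the standard deviation enough to hide itself, so a rule of this kind only becomes useful over a reasonably long series of windows.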

[39]
• Histogram plot of voltage difference:

[40]
Software & Hardware Infrastructure
• Software configuration used:
    • “Ubuntu 14.04 LTS 3.13.0.57-generic kernel”
    • “Hadoop cluster setup of Apache Hadoop V1.2.1”
    • “HBase cluster setup of Apache HBase V0.94.27”
    • “Java version 1.2.0_71”
    • “RStudio version 0.98.1091”
    • “Apache Thrift V0.9.0” (helps to connect to the HBase server and access HBase table data from R code)
    • “rhbase” package (provides the methods needed to connect to and access HBase tables in R)
• Hardware configuration used:
    • Intel Core 2 Duo CPU T6600 @ 2.20 GHz x 2 processor, 2 GB main memory, 250 GB physical disk space

[41]
Conclusion
• From the experiment it is concluded that:
    • PMU data can be stored more reliably in a decentralized fashion, with SSH security.
    • Because it runs on low-cost commodity hardware units (i.e. Hadoop clusters and HDFS), the model is more cost-efficient than the current vertically scalable high-end server systems.
    • Using HBase on top of HDFS makes random data reads and writes possible.
    • MapReduce is a parallel programming model for handling large-scale data sets, and it gives faster processing.
    • HBase provides a non-relational tabular database.
    • HBase provides two types of access: the command line, using HBase shell commands, and Java code using the HBase client API.

[42]
Future Work
• Future work will look at developing MapReduce-based clustering and machine learning techniques for classifying detected power faults and for detecting unique events. Machine learning is a pattern recognition technique that can help extract specific patterns present in the power signals; every different event or fault develops a unique type of signal pattern [3]. The aim is therefore to develop a self-learning neural network model for signal pattern identification and early event prediction.

[43]
References
• [1] Phadke, A. G. "Synchronized phasor measurements in power systems." Computer Applications in Power, IEEE 6.2 (1993): 10-15.
• [2] Aghaei, Jamshid, and Mohammad-Iman Alizadeh. "Demand response in smart electricity grids equipped with renewable energy sources: A review." Renewable and Sustainable Energy Reviews 18 (2013): 64-72.
• [3] http://posoco.in/2013-03-12-10-34-42/synchrophasors
• [14] Agrawal, V. K., P. K. Agarwal, and POSOCO DGM. "Experience of commissioning of PMUs pilot project in the northern region of India." dynamics 7 (2010): 8.
• [15] http://hadoop.apache.org/docs/r1.2.1/
• [16] White, Tom. Hadoop: The Definitive Guide. Second Edition. O’Reilly.
• [17] http://hadoop.apache.org/docs/r1.2.1/
• [18] http://HBase.apache.org/
• [19] Song, Yaqi, Yongli Zhu, and Li Li. "Large scale data storage and processing of insulator leakage current using HBase and MapReduce." Power System Technology (POWERCON), 2014 International Conference on. IEEE, 2014.
• [20] Bach, Felix, et al. "Power grid time series data analysis with Pig on a hadoop cluster compared to multi core systems." Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on. IEEE, 2013.
• [21] https://thrift.apache.org/
• [22] Allen, Alicia, et al. PMU Data Event Detection: A User Guide for Power Engineers. National Renewable Energy Laboratory, 2014.
• [23] George, Lars. HBase: The Definitive Guide. O’Reilly.

[44]
THANK YOU…
