Smart Grid research with key findings and conclusions.
1.1 Foreground Information
The existing power grid of the 20th century needs a complete overhaul owing to poor grid
monitoring and control, increasing energy demand, and above all a rising carbon
footprint. Smart grid is an intelligent power grid that monitors electricity usage in real time
and reduces stress on the grid. To achieve this, it requires a plethora of telecommunication
and networking technologies to overcome the aforementioned challenges. Thus, it
is a complete transformation of the existing power grid into a modern, intelligent, digital,
reliable, secure, robust and clean power grid. Smart grid has many facets, such as
replacing analog devices with digital ones, replacing legacy point-to-point links with more
flexible and intelligent communication between control stations, and replacing existing digital electric
and gas meters with smart metering technology. By 2050, power consumption of the US
is expected to rise up to 5 TW per year, implying that many more transmission
lines, transmission substations, and generation plants will be required, thereby creating a
complex, tightly coupled network infrastructure with varying levels of stress and load.
With all these hurdles ahead, very little attention has been given in recent years to
reliable and efficient power transmission.
According to the Energy Independence and Security Act of 2007, the National
Institute of Standards and Technology (NIST) is assigned the primary responsibility for
developing Smart Grid standards, models and protocols. NIST is also being funded by
the Department of Energy (DoE) for this Smart Grid development process.
The DoE's "Grid2030" vision is a fully automated electric power grid with
abundant, affordable, clean, efficient, and reliable electric power anytime and anywhere.
DoE, in collaboration with the North American Electric Reliability Corporation (NERC) and
North American electric utilities, vendors, researchers and academia, formed the North
American Synchrophasor Initiative network (NASPInet) framework [2, 3] for monitoring
and controlling the state of the grid. NASPInet will comprise thousands of Phasor
Measurement Units (PMUs) for measuring current and voltage phase at different
locations. NERC has also independently defined the Critical Infrastructure Protection (CIP)
framework for reliable bulk power transmission. The CIP framework is classified into
eight different categories, of which CIP-002 discusses requirements of routing protocols
for smart grid communications.
1.2 Research Objective
The primary focus of this thesis is to present a broad overview of Smart Grid, and then
propose a solution to one of the Smart Grid challenges of strictly monitoring the grid. The
Smart Grid communication network (bulk power transmission) requires real time grid
monitoring and control to avoid or minimize the impact of any future blackout. The Smart
Grid's WAN has much more stringent requirements than other real-time applications;
for example, the maximum service disruption time must be less than 5 milliseconds.
In other words, its WAN should be resilient and robust enough to sustain a link or node
failure. Hence, arbitrarily selecting a network recovery model is not sufficient
for grid monitoring.
Currently, many utilities across the grid use their own proprietary non-routable
protocols such as Modbus, Fieldbus, and DNP3, over different underlying technologies such
as ATM, Frame Relay, and SCADA. Clearly, there is a need for interoperability among
utilities to gain a better understanding of the behavior of the grid. Hence, there is a need for
a common routing protocol.
To meet the above challenges, this thesis suggests using MPLS technology
for its fast rerouting and packet encapsulation techniques. Furthermore, different network
recovery models were compared through extensive simulations using the ns2 network
simulator, on the basis of which the most suitable model for each class of service was proposed.
To give a broad overview of Smart Grid, this chapter illustrates its drafts, frameworks,
standards, and communication and networking technologies that are likely to be used or
are currently in use in the Smart Grid network.
2.1 Gap Areas
According to the Energy Independence and Security Act (EISA) of 2007, the National
Institute of Standards and Technology (NIST) has been assigned the “primary
responsibility to coordinate the development of a framework that includes protocols,
model and standards for information management to achieve interoperability of Smart
Grid devices and systems” [3, 5]. NIST has made a three-phase plan to rapidly set up
standards to provide a robust process for continued development and to utilize these
standards when needs and opportunities arise. NIST unveiled its first draft in September
2009, for the Smart Grid interoperability standards – “NIST framework and roadmaps for
Smart Grid interoperability standards”. The draft discusses a high level conceptual
reference model for Smart Grid; it has identified nearly 80 existing standards that need to
be made Smart Grid compliant, 14 high priority gaps, and cyber security standards that
require new or revised standards. For prioritizing this work, NIST has classified eight
priority areas that are critical to existing and in near term for Smart Grid technologies and
1. Wide Area Situational Awareness - This concept has been proposed for decades,
but it was never an integral part of the power system. To date, it has been used only for
postmortem analysis of the grid. However, monitoring the condition of
the grid in real or near real time requires a new high speed, reliable and robust
communication network, time-synchronized phasor measurement, and capability
to transmit data from different legacy/modern smart grid devices across wide
geographical areas. Goals of situational awareness are to enable understanding
and optimizing management of power grid components and also to anticipate,
prevent or respond to power disruption problems.
2. Advanced Metering Infrastructure (AMI) - Successful transformation from
existing power grid to smart grid is only achievable through active customer
participation. This technology will have smart meters deployed in every
household to have two way communications between utility and its customers.
The goal of this technology is to reduce stress on the grid and make efficient
3. Distributed Grid Management - DGM aims to maximize performance of different
power grid components such as feeders, transformers, and other components of
networked distribution system, and to integrate them with the transmission
4. Demand Response - The objective behind DR is to give incentives to residential and
business customers to reduce energy usage during peak hours or when power
reliability is at risk. Utilities, grid operators, and power generating companies
will also benefit from DR because it reduces their financial and operational
costs. It is also essential for balancing power supply and demand so that the grid
can run efficiently and smoothly.
5. Electric Storage - Storing energy economically has always been a challenge.
Energy can be stored directly or indirectly. The most significant energy storage
technology is hydroelectricity, where water is stored in dams and transformed into
energy. New storage capabilities such as pumped hydro storage (PHS) and
compressed air energy storage (CAES) would benefit the entire power grid.
6. Cyber Security - Increasing the number of digital devices in the grid and
connectivity of one device to another across WANs and LANs has raised the
concern of cyber security. Moreover, concern for security increases due to
millions of Smart meters communicating to and from the grid. Cyber security
ensures the confidentiality, integrity and availability of data for strict grid
monitoring and control.
7. Electric Transportation - Reducing greenhouse gas emissions is the primary objective
behind Smart Grid. Plug-in Electric Vehicles (eCARs) will significantly reduce
greenhouse gas emissions, foreign oil dependency, and reliance on renewable energy
8. Network Communications - Because Smart Grid will comprise public and private
networks, it requires many communication and networking standards that
need to be tailored according to requirements of different applications, actors, and
2.2 Smart Grid Architecture
Surprisingly, today’s power grid was not built on any planned architecture; it is rather the
outcome of poor ad hoc planning in the past. The reliability of the grid was ensured mainly by
having excess capacity, with unidirectional electricity flowing from centrally located
power plants to end utility customers. The major focus was to meet the increasing energy
demand rather than to change the overall way the system works; thus there is a need
for a distributed Smart Grid architecture with two-way power flow across the grid.
In the future, the Smart Grid network will comprise millions of field devices,
thousands of substations, and millions of smart meters, and hence there is a need for a
robust network architecture to manage all these devices. Its architecture will be akin to
the existing Internet, which comprises many networks of networks; similarly, it will consist
of many systems of systems and their subsystem architectures. According to the NIST
Framework, having a single architecture for Smart Grid is not a practical solution to the
problem. Rather, it will have system and subsystem architectures.
To date, there is no single architecture for Smart Grid. NIST, along with GWAC,
NERC, NASPInet, FERC, EPRI, and NEMA, is designing the Smart Grid architecture. It has
adopted reference models that define characteristics, uses, behavior and other elements in
Smart Grid domains, along with the relationships among these domains. The role of all these
bodies is to finalize frameworks, roadmaps and reference models for the Smart Grid
architecture. The reference model selected should be robust and well documented, since a
well-documented reference model helps in developing new standards and protocols for
ensuring interoperability and cyber security, and in defining the architecture of the Smart
Grid systems and their subsystems.
2.3 Layers of interoperability
NIST describes the conceptual model on the basis of the high-level categorization approach
developed by GWAC, and thus it is worth mentioning the GridWise interoperability
framework, which has identified eight different Smart Grid interoperability categories.
The "GWAC stack", an eight-layer stack, is focused on technical, informational, and
organizational interoperability. The organizational category emphasizes the pragmatic
business and policy aspects of interoperation. The informational category emphasizes the
semantic aspects of interoperation. The technical category emphasizes the syntax of the
information and is primarily based on the OSI reference model; thus this category
comprises all seven layers of OSI. Figure 2.1 depicts GWAC's eight-layer stack,
which lays the foundation of Smart Grid interoperability requirements.
2.3.1 Technical Drivers
The technical drivers consist of basic connectivity, network interoperability, and syntactic
1. Basic Connectivity - "Mechanism to Establish Physical and Logical Connections
of Systems". The basic connectivity category focuses on digital information
exchange between two systems and the establishment of a reliable communication
path. It comprises the physical and data link layers of the OSI reference model.
Common interoperability standards at this level include Ethernet over Fiber,
Ethernet over Twisted pair, WiFi, Frame relay, PPP, and EIA-232.
2. Network Interoperability - "Exchange Messages between Systems across a
Variety of Networks". The network interoperability category focuses on issues
arising due to transportation of information between various domains across
multiple communication networks. This category includes the network,
transport, and session layers of the OSI model. FTP, TCP, UDP, IPv6, ARP,
and IPSec are common interoperability standards.
Figure 2.1 GWAC eight layer model provides a context for determining Smart Grid
3. Syntactic Interoperability - "Understanding of Data Structure in Messages
Exchanged between Systems". Syntactic interoperability refers to mutually agreed
syntax and format for information exchange between domains or transacting
parties. This represents the presentation and application layers of the OSI model.
General standards for interoperability in this category include HTML, XML,
SOAP, and SNMP.
2.3.2 Informational and Organizational Drivers
Informational Drivers - Informational models are expressed in an object-oriented form
in terms of classes or data fields and methods. This category is further subdivided into
two layers: semantic understanding and business context.
1. Semantic Understanding - "Understanding of the Concepts Contained in the
Message Data Structures".
2. Business Context - "Relevant Business Knowledge that Applies Semantics with
Standards for informational drivers include IEC 61970 CIM (Common
Information Model) power model, object models based on XML, OPC Unified
architecture, and IEC 61850 substation automation.
Organizational Drivers - These are subdivided into three layers:
1. Business Procedures - "Alignment between Operational Business Processes and
2. Business Objectives - "Strategic and Tactical Objectives Shared between
3. Economic and Regulatory Policy - "Political and Economic Objectives as
Embodied in Policy and Regulation".
2.4 Conceptual Reference Model
The conceptual reference model for Smart Grid is divided into seven domains, namely,
bulk generation, transmission, distribution, customers, markets, service providers, and
operations. Figure 2.2 shows the interaction of the different Smart Grid domains. These domains
are further divided into sub-domains that encompass actors and applications. Actors are
devices, systems, or programs that make decisions and exchange information necessary
for performing applications. Examples of actors are smart meters, solar panels, IEDs,
PMUs, and control systems. Applications, by contrast, are tasks performed by one or
more actors within a domain. The model described here is proposed by NIST; it acts
as a tool to identify possible actors and applications in Smart Grid, which will assist in
deciding Smart Grid standards and architectures. Figure 2.3 shows a detailed
conceptual model with many communication links between and within networks, such as
SCADA, Enterprise Bus, Field Area Network, and Wide Area Network. Many issues,
such as security, reliability, QoS, latency and interoperability, need to be addressed in
order to fully realize Smart Grid. Since the Smart Grid network will consist of networks
of networks with millions of end devices, there is a need to ensure secure information
exchange. The following two subsections discuss the key outstanding issue of selecting a
common communication protocol and the communication technologies required
within and across its domains.
Figure 2.2 Smart Grid Domains, source: “NIST interoperability framework”.
Note: The blue line in Figure 2.2 represents secure communication flows between seven different domains
and the orange (dashed) line represents flow of electricity from bulk generation to end utility customers.
The NIST Smart Grid Cyber Security Coordination Task Group (CSCTG) is currently
identifying the overall threats, vulnerabilities and risks for the seven Smart Grid domains.
It is considering a layered approach to Smart Grid cyber security to ensure that
even if one layer is compromised, the other layers remain secure; this is referred to as
the "defense-in-depth" strategy.
Figure 2.3 Conceptual Reference Diagram of Smart Grid Domains, source: “NIST
interoperability framework”.
2.4.1 IP Based Networks
IP based networks are the most favorable choice for future Smart Grid applications. This
is attributed to mature IP standards and the widespread acceptance of IP in both public and
private networks. Moreover, IP supports bandwidth sharing and increased reliability with
dynamic routing capability. There are many Smart Grid applications in the customer
domain, such as smart meters, thermostats, electric storage, and appliances. Phasor
Measurement Units (PMUs), electronic storage, and field devices in a transmission and
distribution network require varying classes of service with constraints such as QoS,
minimum latency, maximum packet loss, or minimum bandwidth. All these requirements can be
achieved through IP based communication networks.
2.4.2 Smart Grid Technologies
There are a number of mature Smart Grid technologies, which may be used in different
Smart Grid domains and their sub-domains. NIST has proposed a partial list of
technologies for wired and wireless networks.
Wired Network - WDM, SONET/SDH, Fiber paths, PON, Gigabit Ethernet, PLC
Wireless Network - IEEE 802.11, 802.15, 802.16, 3/4G.
Many other independent bodies have proposed different communication
technologies for Smart Grid, including Internet2 and Ethernet over Fiber for the backhaul
network, Broadband over Power Line (BPL), WiMAX for mid-haul, 3G wireless data and
voice, and ZigBee/Wi-Fi for last-mile networks.
2.5 Advanced Metering Infrastructure
The utilities industry has been investing in Automated Meter Reading (AMR) for
more than two decades. Using AMR technology, utilities can consistently collect meter
data and keep them in a central database to analyze and monitor electricity usage. The
main advantage of this technique is that it reduces the monthly trips required to check meter
readings. Moreover, it helps utilities and power generating companies to manage energy
efficiently. Efficient grid monitoring, however, requires two-way information flow, and hence
Advanced Metering Infrastructure (AMI) has emerged.
AMI is a combination of technologies for measuring, storing and analyzing data
collected from gas and electric meters in real time. It consists of the following
three parts: smart meters, a communication network, and a Meter Data Management
Application (MDMA). The prime objective of this technology is to reduce the stress on and
operating cost of the grid by setting real-time or near-real-time meter pricing. Traditional
electromechanical meters are read once per month, but AMI allows
monitoring of the hourly or daily energy usage patterns of consumers. The energy consumed per
hour or per fifteen minutes is recorded by the smart meter and is sent over
communication networks to utility companies for monitoring and control purposes.
These networks also send real-time pricing and control signals to smart meters for
efficient energy usage. The MDMA is a computer hardware and software application at a
utility center that analyzes energy consumption and sets dynamic meter pricing. It is worth
mentioning that dynamic price control does not itself control devices such as thermostats or
gas, water or electric meters at customer premises. Control of these devices according to
dynamic pricing can be achieved through a sensor network.
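As an illustration of the interval data described above, the following minimal sketch sums fifteen-minute smart meter readings into hourly totals, as an MDMA might do before analysis. The function name, the data layout, and the sample values are assumptions made for this example and are not taken from any AMI standard.

```python
from collections import defaultdict
from datetime import datetime


def hourly_usage(interval_reads):
    """Sum 15-minute interval readings (kWh) into hourly totals.

    interval_reads: iterable of (timestamp, kwh) pairs, where timestamp is a
    datetime marking the start of a 15-minute interval (hypothetical layout).
    Returns a dict mapping the start of each hour to the kWh used in that hour.
    """
    totals = defaultdict(float)
    for timestamp, kwh in interval_reads:
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        totals[hour_start] += kwh
    return dict(totals)


# Made-up sample readings for one meter.
reads = [
    (datetime(2010, 1, 1, 18, 0), 0.25),
    (datetime(2010, 1, 1, 18, 15), 0.5),
    (datetime(2010, 1, 1, 18, 30), 0.25),
    (datetime(2010, 1, 1, 18, 45), 0.5),
    (datetime(2010, 1, 1, 19, 0), 0.75),
]
usage = hourly_usage(reads)
```

Here the four readings between 18:00 and 18:45 are combined into a single hourly value, while the 19:00 reading starts a new hour.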
Many utilities around the world are deploying millions of smart meters at
consumers' premises and business buildings to reduce stress on the grid and
to encourage customers to monitor their usage on an hourly or daily basis.
One of the critical challenges facing these utilities is to ensure that the technologies
selected for AMI are interoperable and comply with yet-to-be-established national
standards. Furthermore, many utilities want to ensure that the technologies they have
selected allow for evolution and growth as Smart Grid standards evolve. Utilities
need to remain motivated to deploy smart meters without being concerned about the
future risk of firmware upgrades as Smart Grid is upgraded. NIST has identified this need
of AMI as one of the eight priority areas requiring immediate attention. Thus, NIST
requested the National Electrical Manufacturers Association (NEMA) to develop national
standards for smart metering technology. The standard is referred to as NEMA SG-AMI
1-2009 - "Requirements for Smart Meter Upgradeability". The objective behind this
standard is to define requirements for smart meter firmware upgradeability for
stakeholders, regulators, vendors, and utility customers.
One of the problems faced by utilities is the use of proprietary communication
protocols by different suppliers; utilities are thus compelled by suppliers to use the same
proprietary protocols to communicate with end customers. ANSI therefore proposed
standards for interoperability that support multiple electric meter manufacturers. The
first standard in the field of electric metering was ANSI C12.18-1996, which is a
point-to-point protocol for transporting table data specified by ANSI C12.19 via an infrared
optical port. ANSI C12.19-1997 - "Utility Industry End Device Data Tables" defines
a set of flexible data structures for use in metering products. It is just a template for
transporting data and does not specify how to store data. The end device only needs to
create data in the proper form and order when information is requested, and to accept
information in the proper form and order when it arrives. Furthermore, the ANSI
C12.21-1998 standard was specified for communication over modem lines between end
customers and utilities. All three standards facilitate the transmission of meter
data over optical ports or modem lines. These standards are widely accepted in
commercial and industrial meters, but their point-to-point communication protocols
hinder a holistic view of grid monitoring.
Finally, a new standard, ANSI C12.22, is specified to overcome the
aforementioned challenge. The main objective of this new standard is to create a common
communication platform. It is an open standard that enables transportation of C12.19
tables over any underlying network and supports interoperability among communication
modules and meters. The protocol does not specify how to transport C12.19 over the
OSI reference model. There are primarily two different types of models proposed in the
standard for transportation of metering data. In the first, a meter with an integrated network
connection, only the OSI application layer protocol is specified, leaving the flexibility to
implement any lower layer protocols. In the second, a meter with a separate Communication
Module (CM), the interface between the CM and the meter is explicitly defined from the
application layer down to the physical layer. Unlike C12.18 and C12.21, C12.22 supports
both session and sessionless communications. In session communication, both ends
record and keep track of information requested and granted. In sessionless
communication, neither end records or tracks any information, and thus it is less
complex to implement. Figure 2.4 illustrates session and sessionless
information exchange in the C12.21 and C12.22 standards.
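The difference in message overhead between the two modes can be sketched as follows. This is only an illustrative model, not the C12.21 or C12.22 wire format: the service names are taken from the labels in Figure 2.4, and the assumption that every service costs exactly one request and one response is ours.

```python
# Service names as labelled in Figure 2.4; the exact sequence used by a real
# meter dialogue is an assumption made for this sketch.
SESSION_SERVICES = ["Identify", "Negotiate", "Logon", "Security", "Read", "Terminate"]


def session_messages(services=SESSION_SERVICES):
    """Model a session: each service is one request plus one response/ACK."""
    messages = []
    for service in services:
        messages.append(("computer", service))
        messages.append(("meter", f"{service} response"))
    return messages


def sessionless_messages():
    """Model a sessionless read: one request carries authentication and the
    read, and one response answers it."""
    return [
        ("computer", "Read with authentication"),
        ("meter", "Read response"),
    ]


print(len(session_messages()), len(sessionless_messages()))  # prints: 12 2
```

Under these assumptions a single table read costs twelve messages in a session and two without one, which is the kind of saving the sessionless mode is meant to provide.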
C12.22 provides improved security compared to C12.18 and C12.21
because these two protocols exchange information in unencrypted form. C12.22 uses
the AES encryption technique [11, 12] to enable strong, secure Smart Grid
communications, since information from meters will be sent across the Internet, which is an
open channel on which intruders can sniff the information. In practice, C12.18 does not require
any encryption technique because it uses a point-to-point communication protocol over an
optical port; likewise, C12.21 transmits over telephone modem lines, and thus it is
very hard for intruders to eavesdrop on the information.
[Figure 2.4: message sequence diagrams between a computer and a meter. Legible message
labels: Identify, Negotiate, Logon, Security, Read, and Terminate, each answered by an
ACK/response; one panel is labelled C12.22.]
Figure 2.4 Information exchange in C12.21 and C12.22 standards.
C12.22 also supports reliable data transfer over TCP, which is required to transmit
packets in a network with high error and retransmission rates. Reliability was not
a concern in the prior metering standards, C12.18 and C12.21, since it is very hard to
eavesdrop on any information in point-to-point communications. As stated earlier,
C12.22 supports sessionless communications, and thus it supports faster meter data
transmission. This means that a single transaction can carry authentication as well as a
billing data read request without the usual overhead of a table read transaction. Currently,
the ANSI C12.22 specification for “C12.22 data transport over IP” is at a draft stage.
Many communication technologies, including ZigBee, Z-Wave, Wi-Fi, BPL,
Internet, WiMAX, mobile networks, and RF, can be adopted for advanced metering
infrastructure (or for last-mile and mid-haul networks). With AMI technology, consumers
are encouraged to use electric appliances during off-peak hours, since meter pricing
will be lowest during that period. For example, the cost of electricity during off-peak
hours, i.e., 10:00 PM to 6:00 AM, will be lower than during peak hours, i.e., 7:00 AM to
9:00 AM. This energy consumption pattern will ease stress on the grid. The states of
California, Texas, and Minnesota will be the forerunners in AMI since they are in the pilot
phase of deploying and testing millions of smart meters.
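The time-of-use example above can be written as a small lookup. Only the two time windows come from the text; the three rates and the existence of a "standard" rate for all remaining hours are hypothetical values chosen to make the sketch runnable.

```python
from datetime import time

# Hypothetical rates in $/kWh; only the time windows come from the text.
OFF_PEAK_RATE = 0.08
PEAK_RATE = 0.20
STANDARD_RATE = 0.12  # assumed rate for hours outside both windows


def price_per_kwh(t):
    """Return the illustrative rate that applies at time of day t."""
    if t >= time(22, 0) or t < time(6, 0):  # off-peak window wraps past midnight
        return OFF_PEAK_RATE
    if time(7, 0) <= t < time(9, 0):  # morning peak window
        return PEAK_RATE
    return STANDARD_RATE


# Running a 2 kWh appliance cycle at 8:00 AM versus 11:00 PM.
peak_cost = 2 * price_per_kwh(time(8, 0))
off_peak_cost = 2 * price_per_kwh(time(23, 0))
```

With these assumed rates, shifting the same load from the morning peak to late evening lowers its cost, which is the incentive AMI pricing relies on.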
AMI faces many challenges, such as choosing a communication technology that will
last at least twenty to thirty years and preventing security breaches by malicious customers.
2.6 Transmission and Distribution Network
The electric transmission network is one of the oldest and most complex network
infrastructures. There are more than 150,000 miles of transmission lines running
across the US, the majority of which are AC transmission lines. This is because the
power lost in AC during long-distance transmission is less than in DC.
Broadly, the electric grid in the US is divided into three independent interconnects:
Eastern, Western, and Texas. All three interconnects are connected to
each other through high-voltage DC lines, in case power needs to be rerouted across
Of the 186 major transmission paths in the eastern interconnect, 50 are used to their
maximum capacity at some point in the year, thus requiring a robust
communication network to monitor the state of the grid. The eastern interconnect is and will
always remain under more stress than the other two since half of the US
population lives on the east coast. That was also why the August 2003 blackout occurred
in the eastern interconnect. This massive blackout was caused by the
Eastlake power plant, which was incapable of meeting high electric demand, thus putting
stress on the transmission line. The problem got worse when FirstEnergy failed to trim
stress in time, which eventually led to power sagging. Generally, all power generation
plants are interconnected through the electric grid. So, if there is a power
outage in some locality, or some generation plant is incapable of meeting the energy demands
of a particular locality, then power is rerouted to that locality from some other
generation plant that can meet the demand. However, owing to the poor
communication network, this information could not be passed to other plants,
leading to the cascading effect. According to some researchers [14, 15, 16], the blackout
could have been confined to a smaller region if we had a robust wide area monitoring and
The existing transmission network is typically operated by a Regional Transmission
Operator (RTO) or Independent System Operator (ISO), whose primary responsibility is to
balance generation with load across the transmission network. Presently, this
transmission network is monitored and controlled through the SCADA system, composed
of different communication devices. The SCADA system will be discussed in more
detail in a later part of this chapter. It is worth mentioning that SCADA systems are unsuitable
for grid monitoring due to their average RTU polling time of 4 seconds; this is
unacceptable for Smart Grid monitoring, where service disruption must be confined to
the order of a few milliseconds.
Figure 2.5 Complex power transmission network with 137 BA with AC and DC
transmission lines.
2.6.1 NASPInet Framework
The Department of Energy (DoE), along with North American electric power industries,
utilities and the North American Electric Reliability Corporation (NERC), formed the
NASPInet framework (formerly known as the Eastern Interconnect Phasor Project, EIPP) for
wide area monitoring and control of the grid. The objectives of this framework are to
decentralize and standardize the existing synchrophasor measurement system.
Synchrophasors are precise measurements of the grid from Phasor Measurement Units
(PMUs). At present, there are 56 PMUs in the western interconnect and 105 in the eastern
interconnect. However, by the year 2019 [20, 21], there will be thousands of PMUs in
1. Phasor Measurement Unit (PMU) – It is a device which calculates the current and
voltage (both phase and magnitude) of different components in a power system,
such as transmission path status, generator load, active and reactive power, etc. If the
calculated value differs from the reference value, a trigger is generated notifying
instability in a part or sub-part of the grid, which technically means that the electric
load differs from the mechanical power. Typically, a PMU samples at the rate of 30
frames/second (fps), in comparison to the 4-second polling time of conventional
techniques, for strict grid monitoring. In the future, depending on the required
level of accuracy, it will sample current and voltage at varying rates of 10, 20,
30, 60 or 120 fps.
2. Phasor Data Concentrator (PDC) - It collects phasor data from PMUs and other
PDCs, as well as event data, and stores and forwards data to PGWs located in control
centers. PDCs time-align sampled data from different utilities using GPS
technology, thus providing a precise and broad view of a region having multiple
transmission stations and substations. Data generated by all PDCs should be
synchronized to Coordinated Universal Time (UTC) to within 1 µs. Archived data is
helpful for analyzing the problem during grid instability.
Figure 2.6 Proposed NASPInet WAN
3. Phasor GateWay (PGW) - Each Balancing Authority Area (BAA) or control
center will have a single PGW, which will forward traffic received from a PDC to
another PGW. The functions of a PGW are monitoring real-time traffic, latency, and
dropped packets, and detecting corrupted packets. NASPInet will have a private
WAN of PGWs with a life span of at least 30 years with minor maintenance of
hardware and software. Currently, there are 150 Balancing Areas in the US, thus
comprising a WAN of 150 PGWs. It is a complex network of interconnected
PGWs that needs to be controlled and monitored very precisely.
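The time alignment performed by a PDC, as described in the list above, can be sketched as grouping frames by their GPS-derived timestamp and releasing only the timestamps for which every expected PMU has reported. The tuple layout, the PMU identifiers, and the rule of dropping incomplete sets are assumptions made for this illustration; a real PDC would also handle waiting times and late frames.

```python
from collections import defaultdict


def align_frames(frames, expected_pmus):
    """Group PMU frames by timestamp and keep only the complete sets.

    frames: iterable of (pmu_id, timestamp_us, phasor) tuples, where
    timestamp_us is a UTC timestamp in microseconds (hypothetical layout).
    expected_pmus: ids of the PMUs that must all report for a timestamp.
    Returns a dict {timestamp_us: {pmu_id: phasor}} in timestamp order.
    """
    by_time = defaultdict(dict)
    for pmu_id, timestamp_us, phasor in frames:
        by_time[timestamp_us][pmu_id] = phasor
    expected = set(expected_pmus)
    return {
        ts: readings
        for ts, readings in sorted(by_time.items())
        if set(readings) == expected
    }


# Two hypothetical PMUs reporting at 30 fps (one frame every 33,333 us);
# pmu_b's second frame has not arrived yet.
frames = [
    ("pmu_a", 0, complex(1.0, 0.0)),
    ("pmu_b", 0, complex(0.0, 1.0)),
    ("pmu_a", 33333, complex(1.0, 0.1)),
]
aligned = align_frames(frames, {"pmu_a", "pmu_b"})
```

In this example only the frames stamped 0 form a complete, time-aligned set; the frame stamped 33,333 is held back because one PMU is missing.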
2.6.1.1 NASPInet WAN
It will be a private network of Local Area Networks (LANs)
and Wide Area Networks interconnecting thousands of PMUs by the year 2019, the year
when the phasor measurement technique will be fully deployed and functional. It has an open
network architecture to allow the addition of future functionality, the use of industry-standard
hardware and software, and replacement of components without interrupting normal operation.
The requirements of this WAN are:
1. Path Redundancy - The NASPInet WAN will have redundant paths to
overcome any single link failure. The network design will guarantee the data
delivery time, maximum interruption time, and system recovery time. Since class
A and B data will require high path availability and the least interruption time,
the NASPInet WAN shall implement independent redundant paths for these
classes. These independent redundant paths will not share any devices or
components along their path, be they virtual channels or physical components.
2. QoS - NASPInet will maintain full end-to-end Quality of Service for all
five classes of service between ingress and egress PGWs. QoS here is
defined in terms of least delivery time and least service disruption time.
3. Traffic Prioritization - The NASPInet WAN shall have traffic prioritization
for each class of service. That is, Class A traffic will have highest
prioritization and Class E the lowest prioritization.
4. Disaster Recovery - The network should support major disaster recovery.
In particular, class A and B data flow should not be interrupted, while
interruption of the other data classes is permissible within the maximum
disruption time.
5. Network Protocol - The future WAN should fully support IPv6; if not all
components, at least the core components of the backbone should support IPv6.
For the pilot or initial implementation, IPv4 can be used.
Although the recommended network for the WAN is private, if the utilities use a public
network, then that network should support guaranteed bandwidth. The network should support
an industry standard protocol that is reliable, scalable, fault tolerant, and able to
provide alternate routes.
NASPInet classifies the different kinds of information generated from PMUs, such as
control feedback, feed-forward, and display, into five different classes of service. Table
2.1 shows the maximum disruption time and acceptable end to end latencies for these
classes. End-to-end latency is calculated as the difference between the time a packet reaches
the ingress PGW and the time it exits from the egress PGW. Table 2.2 shows the traffic
priority for each class of service; the highest priority is for class A and the lowest
is for class D and E data.
1. Class A: It will be assigned to applications requiring least service
disruption and minimum end-to-end latency, such as real time streaming
for feedback control.
2. Class B: Applications which can sustain higher disruption and end to end
latency than “class A” service will be categorized into this service class,
like feed-forward control and state estimator.
3. Class C: Applications with higher tolerance of disruption time will be
categorized into this class, like visualization to system operator.
4. Class D and E: The former service class will be used for off-line and post-
mortem analysis in case of component failure, and the latter for scientific
and research purposes.
Table 2.1 Service Disruption and Latency Time for Different Class of Service
Description    Class of   Data rate       Availability   Service Disruption   End to End
               Service    (fps)           (%)            Time                 Latency
Feedback       A          30, 60, 120     99.9999        <5 msec              <50 msec
Feed forward   B          20, 30, 60      99.999         <25 msec             <100 msec
Display        C          10, 15, 20 or   99.99          <100                 <1 sec
Disturbance    D          30, 60, or      99.99          N/A                  <2 sec
Research       E          30, 60, 120     99.99          N/A                  <2 sec
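The limits in Table 2.1 can be read as a simple admission check. The following is a minimal sketch, not part of the NASPInet specification: the names SERVICE_CLASSES, meets_class and strictest_class_met are hypothetical, and the class C disruption entry ("<100") is assumed to be in milliseconds.

```python
# Limits copied from Table 2.1; None stands for the table's "N/A".
# Assumption: the class C disruption limit "<100" is in milliseconds.
SERVICE_CLASSES = {
    "A": {"max_disruption_ms": 5, "max_latency_ms": 50},
    "B": {"max_disruption_ms": 25, "max_latency_ms": 100},
    "C": {"max_disruption_ms": 100, "max_latency_ms": 1000},
    "D": {"max_disruption_ms": None, "max_latency_ms": 2000},
    "E": {"max_disruption_ms": None, "max_latency_ms": 2000},
}


def meets_class(service_class, latency_ms, disruption_ms=0.0):
    """Return True if the measured values are strictly below the class limits."""
    limits = SERVICE_CLASSES[service_class]
    if latency_ms >= limits["max_latency_ms"]:
        return False
    max_disruption = limits["max_disruption_ms"]
    if max_disruption is not None and disruption_ms >= max_disruption:
        return False
    return True


def strictest_class_met(latency_ms, disruption_ms=0.0):
    """Return the most demanding class (A first) whose limits are satisfied."""
    for name in "ABCDE":
        if meets_class(name, latency_ms, disruption_ms):
            return name
    return None
```

For instance, a path measured at 80 msec end to end latency with a 10 msec disruption fails the class A limits but satisfies class B.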
IEEE C37.118
The prior standard for synchrophasor real time Wide Area Monitoring Systems (WAMS)
was IEEE 1344, which was reaffirmed in 2001. In 2005, it was replaced by IEEE
C37.118-2005 due to some drawbacks which emerged during the August 2003 blackout.
Key features added to this protocol were improved information exchange with
non-phasor systems, sync frame, frame size, station identification number,
configuration, header frame, and command frame. PMUs from different vendors may
show discrepancies in monitoring and controlling the power system, and so the
measurement concept of TVE (Total Vector Error) was added to overcome this drawback.
Table 2.2 Traffic Priority for Varying Class of Service
Note: “4” represents highest priority and “1” least priority.
NASPInet       Class A   Class B   Class C   Class D   Class E
Low latency    4         3         1         2         1
Availability   4         2         3         1         1
Accuracy       4         2         4         1         1-4
Time           4         4         1         2         1-4
High           4         2         4         2         1
Path           4         4         1         2         1
This protocol is defined for real time data transmission to and from a PMU.
The protocol is only required if the PMU device is to be used with other power systems
such as digital fault recorder (DFR), Dynamic System Monitor (DSM), and Digital Signal
Analyzers (DSA) interface to macro dyne. If a PMU only needs to archive data for future
purposes such as analyzing grid instability, then this protocol is not needed. There are four
types of message formats and five types of commands used for communication between
PMU and PDC, of which the data frame is the most often used.
1. Data Frame – It consists of sampled real time measured phasor data such as
magnitude, phase angle, and frequency, sent from PMU to PDC. The data volume is
variable, owing to the varying levels of precision required. That is, high precision
applications are monitored at 60 or 120 fps rather than the usual 30 fps. The size
of one frame is 128 bytes. Figure 2.7 shows the frame transmission order.
2. Configuration Frame – It contains information and processing parameters for the
PMU. There are two types of configuration frames, config-1 and config-2.
Config-1 represents constant configuration information of the PMU, and config-2
represents variable configuration information of the PMU, like a variable data rate.
3. Header Frame – It contains descriptive information that is sent from PMU to
PDC.
4. Command Frame – There are five command frames that are sent from PDC to
PMU, for example to start or stop measuring phasor data.
Transmitted first: SYNC (2), FRAMESIZE (2), IDCODE (2), SOC (4), FRAMESEC (4),
DATA 1, DATA 2, ..., DATA N: transmitted last (field sizes in bytes)
Figure 2.7 PMU frame transmission order.
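The fixed fields at the start of the frame in Figure 2.7 can be decoded with a few lines of code. The sketch below is illustrative only: the function name parse_common_header and the sample values are hypothetical, and big-endian (network) byte order is an assumption of this sketch rather than something stated in the text above.

```python
import struct

# Field widths follow Figure 2.7: SYNC (2), FRAMESIZE (2), IDCODE (2),
# SOC (4), FRAMESEC (4). Big-endian byte order is assumed here.
HEADER = struct.Struct(">HHHII")  # 14 bytes in total


def parse_common_header(frame: bytes) -> dict:
    """Decode the leading fixed-size fields of a PMU frame."""
    if len(frame) < HEADER.size:
        raise ValueError("frame shorter than the 14-byte leading fields")
    sync, framesize, idcode, soc, framesec = HEADER.unpack_from(frame)
    return {
        "sync": sync,
        "framesize": framesize,
        "idcode": idcode,
        "soc": soc,
        "framesec": framesec,
    }


# Hypothetical 128-byte frame: header fields followed by zero-filled data.
sample = HEADER.pack(0xAA01, 128, 7, 1_200_000_000, 0) + bytes(128 - HEADER.size)
fields = parse_common_header(sample)
```

Here fields["framesize"] gives back the 128-byte frame size mentioned above, and the remaining bytes after the first 14 would hold DATA 1 through DATA N.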
Phasor data generated by a PMU is sent to the PDC over UDP/IP, which provides a
connectionless, low-overhead and therefore faster service for time sensitive
applications, although without delivery guarantees. The standard does not define any
particular communication medium between PMU and PDC. Generally, utilities use dial-up
or serial communication links for transporting phasor messages.
Table 2.3 Illustration of PMU-PDC Frames and Commands
Frame      Direction   Message Format              Functions
Data       PMU → PDC   Binary                      Real time phasor magnitude, phase angle
Config-1   PMU → PDC   Binary (machine readable)   Constant parts of PMU configuration
Config-2   PMU → PDC   Binary (machine readable)   Variable part of PMU configuration, no. of
Header     PMU → PDC   ASCII                       Any other information
Command    PDC → PMU   Binary (machine readable)   Start, stop, config-1, config-2, header
                                                   command to and from
2.6.2 NERC CIP Framework
The objective of this framework is to make bulk power transmission reliable and secure.
The North American Electric Reliability Corporation (NERC) has outlined cyber security
requirements for Critical Infrastructure Protection (CIP). It provides a cyber security
framework for the identification and protection of Critical Cyber Assets (CCAs) to support
reliable bulk power transmission. CCAs include power plants, control centers, backup
control centers, and transmission substations that provide bulk power generation,
monitoring and control of real time inter utility data exchange. There are eight different
cyber security standards defined from CIP 002 to CIP 009 that deal with critical cyber
asset identification, security management control, personnel training, electronic security
perimeter(s), physical security, system security management, incident reporting and
resource planning, and recovery plans. CIP 002 is the most important standard with
respect to this thesis since it deals with routing and outlines characteristics in terms of
communications and cyber assets. According to the CIP 002 standard, a CCA
should have at least one of the following three characteristics.
1. The Cyber Asset uses a routable protocol to communicate outside the
Electronic Security Perimeter.
2. The Cyber Asset uses a routable protocol to communicate within a control
center.
3. The Cyber Asset is dial-up accessible.
Dial-up accessible refers to any temporary, interruptible, or not continuously
connected communication access to a critical cyber asset from any remote location. For
example, utilities use a modem over land line, wireless, or VPN with a routable protocol to
connect to a critical cyber asset from one or more locations. Any access to a critical cyber
asset via a permanent communication link or dedicated communication circuit would not
be considered as dial-up access.
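The three characteristics above amount to a logical OR over properties of the asset. The following is a minimal sketch under that reading; the CyberAsset type, its field names and the example assets are hypothetical and are not taken from the CIP 002 text.

```python
from dataclasses import dataclass


@dataclass
class CyberAsset:
    """Hypothetical record of the three characteristics listed above."""
    name: str
    routable_outside_esp: bool = False            # characteristic 1
    routable_within_control_center: bool = False  # characteristic 2
    dial_up_accessible: bool = False              # characteristic 3


def has_cca_characteristic(asset: CyberAsset) -> bool:
    """True if the asset shows at least one of the three characteristics."""
    return (
        asset.routable_outside_esp
        or asset.routable_within_control_center
        or asset.dial_up_accessible
    )


# A serial-only RTU on a dedicated circuit versus a relay reached by modem.
serial_rtu = CyberAsset("serial RTU")
modem_relay = CyberAsset("relay behind modem", dial_up_accessible=True)
```

Under this reading, the serial-only RTU shows none of the characteristics, while the modem-connected relay shows the third one.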
2.7 Power Plant Communications
Supervisory Control and Data Acquisition (SCADA) collects data from sensors,
actuators, Intelligent Electronic Devices (IEDs), and switch relays in a substation, and
reports to the Master Controller, which coordinates and controls the different field
devices. Here, SCADA is discussed in terms of power plant communication, but it is also
used in substation communications. Field devices are controlled in near real time via
the SCADA system. Basic components of SCADA include the RTU (Remote Terminal Unit)
and the MTU (Master Terminal Unit): RTUs collect data from sensors, IEDs and other
field devices, and report to the MTU or HMI (Human Machine Interface).
Communications between RTU and MTU use the contention method
CSMA/CD to handle frame collisions. The MTU/HMI monitors and controls various field
devices.
In the past, the SCADA communication network consisted of leased lines in point
to point and multi-drop configurations. Present communication
networks, however, use fiber optics, frame relay, ATM, Ethernet, and T-1 leased lines for
RTU-MTU communications. One of the major drawbacks of the SCADA system is that the
MTU polls many RTUs over a single link, and the average polling time of any RTU is
nearly 4 seconds. This delay of 4 seconds is unacceptable if the power grid has to have
99.9999% reliability, i.e., power disruption of no more than about half a minute per year.
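The yearly disruption budget implied by an availability figure follows from simple arithmetic. The sketch below works it out; the function name downtime_seconds_per_year is hypothetical, and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, ignoring leap years


def downtime_seconds_per_year(availability_percent):
    """Permitted downtime per year for a given availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


six_nines = downtime_seconds_per_year(99.9999)  # about 31.5 seconds
polls_in_budget = six_nines / 4                 # 4-second polling cycles
```

With a 4-second polling cycle, the whole yearly budget at 99.9999% corresponds to only about eight polling intervals, which illustrates why such a delay is problematic.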
Distributed Network Protocol 3.0 (DNP3.0) and IEC-60870-5 are the two open
communication protocols used in the SCADA system. The latter is used in Europe,
and the former is used by the rest of the world. DNP3.0 is an open standard
communication protocol used to communicate between RTUs, MTUs, and various other
field devices like Digital Relays and DFRs. It is also used for inter utility communications.
DNP3.0 is based on the 3 Layer Enhanced Performance Architecture (EPA) model which
was created by the International Electrotechnical Commission (IEC).
The existing protocol for communication between various SCADA
systems across the WAN is the Inter Control Center Protocol (ICCP), or IEC 60870-6. This
protocol is based on the client server model. For example, control center A may be a
client which requests real time data from another control center B. A control center
can be a client or a server. In general, a client is connected to many servers, and a
single server is connected to many clients. Figure 2.8 shows the existing WAMS network
with a limited number of PMUs and IEDs that still uses ICCP for monitoring the grid.
Figure 2.8 Existing WAMS, using ICCP for grid monitoring.
TECHNOLOGY FOR THE GRID MONITORING
Previous chapters broadly discussed various aspects of Smart Grid. This chapter
discusses the choice of a technology for Smart Grid monitoring, keeping in view the new
and legacy technologies currently installed in different power plants and substations.
3.1 Challenges for Utilities
According to the FAQs of the NERC CIP-002 standard, routable protocols are those that provide
the functionality of layer 3 (routing) and above of the OSI reference model.
However, most present utilities are still using non-routable protocols such as
DNP3.0, Profibus, Modbus, and Fieldbus, which do not have routing
functionality. These protocols interface directly from the application layer to the data link
layer. This means that for these protocols to have routing functionality, they must be run
over IP, like DNP over IP or Modbus over IP.
Within utilities, there are several kinds of networks deployed, such as SCADA,
AMI, ATM, Frame relay, and PSTN, each of which is maintained
separately. With the growing number of smart grid applications, the size of these networks
increases, and it becomes very difficult to manage them individually. Thus,
there is a need for a consolidated network that can accommodate all legacy technologies.
One of the major challenges ahead for utilities is to provision reliable inter-utility
real time communications, so as to avoid future blackouts like those discussed earlier,
which incur catastrophic losses. The communication protocol selected for Smart Grid should be
robust enough to accommodate all existing and new kinds of applications. According to
some researchers, the August 2003 blackout could have been avoided had the grid
been equipped with a highly efficient and reliable communication network between
3.2 Why not ICCP for Grid Monitoring?
Inter Control Center Protocol (ICCP) or IEC 60870-6 is a real time data exchange
protocol for exchanging data between utilities and RTO/ISO across the WAN. ICCP works at
layer 7 of the OSI reference model, and so it utilizes application layer services to
establish and maintain logical associations between control centers. Since ICCP is
independent of the lower layers, it can also be operated over TCP/IP.
Although ICCP is able to run over TCP/IP, it is not suitable for
carrying real time synchrophasor data. In particular, ICCP is not suitable for scenarios
with a high sampling rate of phasor data since it does not generate the time stamps
required to accurately monitor and analyze the grid condition. Time stamped data is
also required for future grid performance analysis, which the existing ICCP protocol does not
provision. Figure 3.1 shows the existing layout of wide area monitoring by the ICCP
protocol with different utilities connecting each other through different technologies and
Figure 3.1 Existing ICCP protocol for data exchange between utilities. (Diagram labels:
Utility A, Utility B, Utility C, Utility D; Modbus over IP, Frame Relay over IP,
SCADA over IP, ATM over IP, DNP3.0 over IP.)
3.3 Why MPLS?
Before discussing the benefits of the MPLS technology, it is worth comparing the
requirements of the smart grid communication network with those of the existing core internet
backbone network. This comparison is necessary since it determines to what degree the smart
grid network will differ from the existing core network. Table 3.1 summarizes the
requirements of the smart grid network in comparison with the features exhibited in the present
core network.
Smart Grid faces two major challenges: grid monitoring and control, and grid security.
Two frameworks address these challenges. NASPInet sets end to end latency
requirements for different classes of service, and NERC CIP requires that the selected protocol
have routing and switching functionality as described in the OSI reference model.
MPLS meets these challenges through packet encapsulation,
network recovery, and VPN features, and adheres to the requirements of both
frameworks. Figure 3.2 shows the block diagram for choosing the MPLS technology for
smart grid communications.
Securing the grid according to NERC CIP standards is essential for utilities,
which have to comply with these standards. Utilities have the following three
options for adhering to CIP standards.
Table 3.1 Comparison between Core Backbone Network and Smart Grid WAN
Network Requirements             Existing core network   Smart grid wide area network
QoS requirement                  Yes                     Yes
Scalability                      Yes                     Yes
End to end latency requirement   Yes                     Yes
High complexity                  Yes                     No (for the near future)
Support of real time traffic     Yes                     Yes
quick traffic reroute
Security                         Yes                     Yes
Load balancing                   Yes                     No (for the near future)
Routable protocol                Yes                     Yes
High bandwidth                   Yes                     Not at present, but surely in future
Complex mesh topology            Yes                     Not at present
1. Removing all routable communications to substations.
2. Reverting back to serial communication over frame relay circuits or
narrow band point to multipoint SCADA (layer 2 only).
3. Enabling IP communications (layer 3) and becoming compliant with
NERC CIP standards.
Avoiding routable communication between substations is not a favorable solution,
hence utilities are implementing layer 2 and layer 3 solutions to comply with NERC CIP
standards. Thus, the most suitable option for utilities is Multiprotocol Label
Switching (MPLS), which offers the layer 2 and layer 3 functionality required by CIP 002.
Figure 3.2 Technology for the grid monitoring: suggested technology adheres to
NASPInet and NERC CIP requirements. (Block diagram: NASPInet, end to end latency for
different class of service; CIP 002, protocol should have switching and routing
functionality as described in OSI; MPLS, packet encapsulation (ATM, Frame relay),
security, VPN, network recovery.)
3.4 Benefits of MPLS
MPLS provides many benefits over existing similar technologies, such as traffic segmentation
with VPN, Traffic Engineering (TE), QoS, network resilience, consolidation, and
security. Many power generating companies use legacy technologies such as ATM,
Frame Relay, Ethernet, and TDM to communicate with one another, and hence MPLS is
an appropriate technology owing to its backward
compatibility by virtue of its multiprotocol encapsulation technique. It provides security
through traffic segmentation; there are three types of MPLS VPNs deployed in
today’s networks, namely, point to point, layer 2 VPN, and layer 3 VPN, for routing
different kinds of traffic over the MPLS backbone. Point-to-point MPLS VPNs employ
Virtual Leased Lines (VLL) to provide layer 2 point-to-point connectivity
between two sites for carrying Ethernet, TDM, and ATM frames. Layer 2 VPN or
Virtual Private LAN Service (VPLS) provides the ability to span VLANs between sites. It is
used to route voice, video and ATM traffic between substations and datacenters. Layer 3
VPN or IP enabled VPN is becoming the choice for utilities since it enables them to
communicate irrespective of the layer 2 technology; for example, utility A may connect over
Ethernet to utilities B and C, which use frame relay or SCADA.
Figure 3.3 Suggested NASPInet WAN, with MPLS technology for grid monitoring.
High speed, real time synchrophasor data transmission and analysis are required
to gain deep visibility into the power grid. This requirement calls for a robust and
resilient network for carrying synchrophasor data. MPLS provides
various network recovery models, like fast reroute and dynamic rerouting, depending upon
the requirements of the particular class of service. Figure 3.3 shows the suggested WAN enabled
with MPLS technology that will provide network resiliency, robustness and security.
NETWORK RECOVERY WITH MPLS
The routing protocols used in the present Internet backbone are OSPF, IS-IS, and BGP.
Although these protocols are robust and survivable, the time they take to
recover from a network failure is in the order of seconds or a few minutes. Such a
disruption time is unacceptable for critical network applications.
Multiprotocol Label Switching provides proactive/reactive alternate paths
(backup paths or protection paths) to quickly reroute the traffic. MPLS is a mature
technology which has been deployed successfully owing to its fast traffic
rerouting capability in case of a link or node failure.
As wide area monitoring of the grid will likely make use of the optical backbone for
carrying data between multiple PGWs, this kind of critical network is equally likely
to suffer a node or link failure owing to a fiber cut or component failure. Since the links will
be carrying high priority data, with some classes of service permitted a maximum
disruption time of only 5 milliseconds, the Smart Grid WAN should provision
network recovery to avoid service disruption. Different network recovery
schemes have been proposed for SONET, IP, Optical Protection, and MPLS. According to
the literature, the MPLS based recovery scheme outperforms the other
technologies for the following reasons with respect to smart grid requirements.
1. IP rerouting is too slow for a core network since such a network needs recovery
times smaller than the convergence time of IP routing protocols.
2. IP rerouting cannot provide the bandwidth protection required by certain
classes of service in smart grid. For example, when PMUs
generate data at 60, 120 or 250 samples/sec in the future, high
bandwidth will be required for link protection.
3. Recovery mechanisms in the optical layer or SONET do not consume
4. Recovery mechanisms in the optical layer or SONET do not
differentiate traffics into different classes of services. That is, if these
technologies are used in smart grid, then class A and class E data will
be treated equally, and the same fast restoration scheme will be
employed. Thus, resources allocated to class E will lead to inefficient
MPLS also establishes interoperability of protection mechanisms between PGWs from
different vendors in an IP or MPLS network, which is required to enable network
recovery across multi-vendor equipment.
Recovery Path or Backup Path - It is the path by which traffic is restored after a link or
node failure. The recovery path can either be an equivalent recovery path with the same
QoS and bandwidth guarantee, or a limited recovery path with compromised QoS.
Label Switch Router (LSR) - It is the core MPLS component used for forwarding traffic
on the basis of the labels attached to it.
Path Switch LSR (PSL) - It is a label switch router responsible for rerouting traffic in
case of a link/node failure.
Path Merge LSR (PML) - It is an LSR at which traffic flowing
through the backup path is merged back to the working path.
Bypass Tunnel - A path that serves to back up a set of working paths. The working paths
must share the same PSL and PML.
Switch Back - The process of switching traffic back from one or more recovery paths to
the original working path.
Switch Over - The process of switching traffic from a working path to one or more
recovery paths.
4.3 Network Recovery Schemes
There are two types of network recovery schemes, namely, protection and rerouting.
Protection switching reroutes traffic onto pre-computed paths, whereas rerouting
finds a new optimal path upon a network failure.
Protection switching pre-computes an end to end recovery path or a segment of a path
according to network policies, traffic, bandwidth, and delay requirements for different classes
of traffic. When a failure, either a link or a node failure, is detected in the network, traffic
is rerouted to the recovery path and the service is restored. The alternate path may be
used either to carry a copy of the original traffic or to reroute traffic upon fault detection.
Protection switching is thus divided into two sub-schemes.
1. 1+1 (one plus one) – The network carries the same traffic on the alternate backup
path as it carries on the working path. The major advantage of this
scheme is that there is no data loss, and service is not disrupted, since traffic
flows on both paths. The disadvantage of this scheme is low efficiency, as
resources have to be allocated throughout the traffic flow.
2. 1:1 (one to one) – The protected traffic flows on the working path, and it is
switched to the alternate backup path in case of a link/node failure. The advantage
of this scheme as compared to 1+1 is that fewer resources need to be allocated.
The disadvantage of this scheme is that once the traffic is rerouted to the
backup path, the low priority traffic flowing in the backup path is dropped or
replaced by this high priority traffic flow. There are a few variants of this
scheme, like 1:n (one to n) and m:n (m to n). In 1:n, “n” working paths are
protected by only one backup path, and in m:n, “n” working paths are
protected by “m” backup paths.
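The resource trade-off between these schemes can be expressed as the number of backup paths provisioned per working path. The following is a minimal sketch of that bookkeeping; the function name protection_profile and its string notation (for example "1:4" for 1:n with n = 4) are hypothetical conveniences.

```python
from fractions import Fraction


def protection_profile(scheme: str) -> dict:
    """Summarize a scheme written as '1+1', '1:1', '1:4' (1:n) or '2:5' (m:n)."""
    if "+" in scheme:
        backups, working = scheme.split("+")
        carries_copy = True   # 1+1: backup path carries a live copy of the traffic
    else:
        backups, working = scheme.split(":")
        carries_copy = False  # 1:1, 1:n, m:n: backup is used only after a failure
    return {
        "backup_per_working": Fraction(int(backups), int(working)),
        "backup_carries_copy": carries_copy,
        # An idle backup path can carry low priority traffic until it is needed.
        "usable_by_low_priority": not carries_copy,
    }
```

For example, protection_profile("1:4") reports one backup path shared by four working paths, whereas "1+1" and "1:1" both provision one backup path per working path but differ in whether that path is continuously in use.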
Rerouting, as the name implies, selects a new optimal path upon detection of
a network failure. The new path is selected based on network policies, like bandwidth
guarantee and network topology. The service restoration time in this kind of recovery
model is typically longer since, after failure detection, the routing table needs to be updated
and to converge; the scheme then has to select the best available backup path according to the
new routing table.
There are three defined recovery cycles: the MPLS recovery cycle model, the MPLS based
reversion cycle model, and the dynamic rerouting cycle model. The MPLS recovery cycle
model detects a fault and reroutes traffic onto the MPLS paths. If the path onto which
traffic is rerouted is not optimal, it uses the other two recovery models. It uses the MPLS
reversion cycle model for explicit path rerouting, and the dynamic rerouting cycle model for
forwarding traffic with hop by hop routing.
4.3.1 Local Repair
Local repair is also known as distributed repair. The key idea behind local repair is to
protect against a single link or node failure. In local repair, the node that detects the failure initiates
the traffic rerouting process. This kind of scheme has the advantage of faster recovery
time and fewer dropped packets, but one of its drawbacks is
that it requires and consumes a greater amount of network resources. There are two types
of local repair topology.
1. Link Recovery/Restoration – The recovery path is configured around an
unreliable link. The alternate backup path selected should be disjoint from
the working path. The traffic on the alternate path is routed by the upstream LSR.
2. Node Recovery/Restoration – The recovery path is configured around a node
that is unreliable. In this model, too, the selected backup path should be
disjoint from the working path.
4.3.2 Global Repair
Global repair is also known as end to end or centralized repair. The key idea behind global repair is
to protect against any link/node failure on the entire path. In global repair, the node that
sends the Fault Identification Signal (FIS) may be distant from or near to the ingress LSR, depending
on how far from the ingress LSR the link/node failure occurred. The global recovery path is
completely link and node disjoint from the working path, and thus global repair has the
advantage of better resource utilization. The major disadvantage of this topology model
is that it has a longer service disruption time than the local repair model, since the
time taken by the FIS to travel back to the PSL depends on the distance between the failed
link/node and the PSL itself.
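The dependence of the disruption time on the distance to the PSL can be illustrated with a simple additive model. This is only a sketch under stated assumptions: the function restoration_time_ms is hypothetical, the model ignores queuing and processing effects, and the 1 msec detection, per-hop and switch-over delays are illustrative values, not measurements.

```python
def restoration_time_ms(hops_to_psl, detect_ms=1.0, per_hop_ms=1.0, switch_ms=1.0):
    """Failure detection + FIS propagation back to the PSL + switch-over."""
    return detect_ms + hops_to_psl * per_hop_ms + switch_ms


# Local repair: the detecting node reroutes itself, so no FIS has to travel.
local_ms = restoration_time_ms(hops_to_psl=0)
# Global repair: the FIS travels from the failure back to the ingress PSL,
# here assumed to be three hops upstream.
global_ms = restoration_time_ms(hops_to_psl=3)

CLASS_A_LIMIT_MS = 5.0  # class A service disruption limit from Table 2.1
```

With these illustrative delays, local repair restores service in 2 msec, while global repair over three hops takes 5 msec and no longer stays below the class A limit.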
4.4 Label Distributions
There are different kinds of signaling protocols used for exchanging labels between
adjacent LSRs. These labels are exchanged to set up a Label Switch Path (LSP) for
establishing a VPN link or rerouting traffic explicitly between LSRs. Below are four
commonly used label distribution protocols.
1. Label Distribution Protocol (LDP) – It is a protocol for exchanging label to
FEC mappings between LSRs to establish an LSP. The two LSRs must adhere to a
common set of procedures for establishing paths. Two adjacent LSRs running
LDP are known as LDP peers. Each LDP peer may have multiple interfaces
through which they are connected, so that they may maintain different LDP
sessions using different interfaces. An LDP session is bidirectional in nature, so that
a path can be established by either of the peers. LDP works closely with the IGP in
selecting a route to the destination on a hop by hop basis, and thus it does not support
Traffic Engineering (TE). There are two types of label space defined,
namely, per-interface and per-platform label space. Per-interface label space is
commonly used by an LER that has an ATM or frame relay interface, and per-platform label
space is used by non-ATM/frame relay interfaces.
2. Constraint-based Routing LDP (CR-LDP) – In LDP, the path selected to the
destination is the shortest path since it is based on the Bellman-Ford algorithm,
but it might not be the best path: although it is the shortest, it could be
congested while other links available in the network are underutilized.
To overcome this drawback, CR-LDP was proposed, which finds the shortest path
subject to traffic engineering constraints. To achieve this goal, the concept of
explicit route (ER) was introduced and
incorporated into LDP. When the ingress LSR wants to establish a tunnel to the egress
LSR, CR-LDP runs the Constrained Shortest Path First (CSPF) algorithm, and
then creates the label request message and inserts an explicit route object in it.
The egress LSR, upon receiving the label request, replies with a label mapping
message.
3. Resource ReSerVation Protocol with Traffic Engineering (RSVP-TE) – This
signaling protocol was proposed for applications which require QoS guarantees
with TE. When the ingress LSR establishes a tunnel to the egress LSR, it sends a PATH
message carrying a label request to its next LSR. This LSR will then make a
temporary resource reservation and pass the message on to the next node. On reaching the
egress LSR, if this LSR can satisfy the resource request, it will establish the
tunnel and send a RESV message carrying the label back toward the ingress
LSR, as the confirmation of the resources and of the requested tunneling path.
4. Multi Protocol Border Gateway Protocol (MP-BGP) - BGP is a routing
protocol used to communicate between different autonomous systems. In the MPLS
domain, it is used to set up a virtual path between nodes across multiple
autonomous systems. The ingress LSR uses BGP for distributing VPN routes to the
egress LSR. BGP by itself does not distribute VPN routes, and so it needs
extensions to the protocol to provision this functionality, which are referred to
as Multi-Protocol BGP (MP-BGP). When the ingress LSR wishes to establish a
VPN route to an egress LSR, the ingress LSR distributes the route to its next
LSR through MP-BGP, which then forwards the message to the particular adjacent
4.5 Network Recovery Models
4.5.1 Makam Model
The Makam model, named after its author, is the first network recovery model
introduced for the MPLS domain [33, 34]. The main idea behind this method is to set up
an end-to-end backup recovery path for any link/node failure. In the event of a failure, the
FIS generated by a particular LSR is forwarded to the upstream ingress LSR. This
ingress LSR is the same as the PSL in this model. Figure 4.1 illustrates a network with the
Makam recovery model. The protection switch path is a pre-computed backup path; the PSL,
on receiving the FIS through the Reverse Notification Tree, reroutes traffic from the
working path to the backup path. A dynamic rerouting path, by contrast, is created only once
the PSL receives the FIS. In general, the protection switch path is used since its service
restoration time is shorter than that of dynamic rerouting. In this model a global recovery
scheme is adopted, implying that the node upon detecting a failure sends the FIS back to the
ingress LSR to reroute traffic to the PML or the egress LSR.
Figure 4.1 Makam model. (Diagram: nodes PGW-0 to PGW-8; legend: traffic, signal.)
4.5.2 Haskin Model
Haskin proposed a reverse backup path. The key idea in this model is to reroute
traffic back to the PSL in case of a node/link failure. Upon detecting a link/node failure, the
LSR which detects the failure sends incoming traffic back to the ingress LSR or PSL, instead
of sending an FIS as in the Makam model. The major advantage of this model is the low
number of packets dropped, since no FIS needs to be sent to the PSL, and the rerouted traffic
itself serves as the notification of the link/node failure to the ingress LSR. The
disadvantage of this approach is inefficient resource utilization, since in general the
backup recovery path is longer than the original working path. Figure 4.2 shows how the
Haskin model works. When the link between PGW-5 and PGW-7 is down, PGW-5 will
send the traffic back to PGW-1. Upon receiving the returned traffic, PGW-1 becomes
aware of the link failure in the working path. PGW-1 then switches traffic from the
working path to the global recovery path.
Figure 4.2 Haskin model. (Diagram: nodes PGW-1 to PGW-8.)
4.5.3 Hundessa Model
Hundessa proposed a model that offers better functionality as compared to the
Haskin model. In the Haskin model, when the PSL reroutes traffic from the original
working path to the global recovery path, packets arrive out of order at the egress LSR.
Hence, Hundessa proposed buffering new packets at the PSL until the packets returned by
the LSR that detected the failure have been rerouted to the global recovery path. The first packet
received back from the LSR that detected the failure is tagged, as are all subsequent
returned packets. Once all tagged packets are rerouted to the global recovery path, all the
buffered packets are then rerouted to the global recovery path, thus avoiding packet
disordering at the egress LSR. Figure 4.3 shows the Hundessa model, in which PGW-5,
which detects the link failure, reroutes the traffic back to the ingress LSR, i.e., PGW-1.
On receiving the packets at PGW-1, they are tagged and rerouted first onto the global
recovery path. At PGW-1, new traffic is buffered until all tagged packets are
rerouted. The advantage of this scheme is that it saves the egress LSR the processing time
of reordering packets.
Figure 4.3 Hundessa model.
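The effect of buffering at the ingress can be sketched in a few lines of Python. This is a
simplified model under stated assumptions, not Hundessa's actual mechanism: packets are
reduced to sequence numbers, the arrival order at the ingress is invented for illustration,
and the ingress is assumed to know the sequence number of the last returned packet
(last_returned_seq), a detail the text does not specify.

```python
from collections import deque


def forward_order(arrivals, last_returned_seq, buffer_new):
    """Return the order in which the ingress sends packets on the recovery path.

    arrivals: list of (seq, returned) tuples in arrival order at the ingress,
              where returned=True marks a packet sent back by the detecting LSR.
    buffer_new: True models the buffering scheme, False models plain
                forwarding in arrival order (as in the Haskin model).
    """
    sent = []
    held = deque()      # new packets waiting for the returned ones to drain
    draining = True     # returned packets are still expected
    for seq, returned in arrivals:
        if returned:
            sent.append(seq)
            if seq == last_returned_seq:
                draining = False
                sent.extend(held)   # release the buffered packets in order
                held.clear()
        elif buffer_new and draining:
            held.append(seq)
        else:
            sent.append(seq)
    sent.extend(held)
    return sent


if __name__ == "__main__":
    # Hypothetical interleaving: packets 3-5 come back, packets 6-8 are new.
    arrivals = [(3, True), (6, False), (4, True), (7, False), (5, True), (8, False)]
    print("without buffering:", forward_order(arrivals, 5, buffer_new=False))
    print("with buffering:   ", forward_order(arrivals, 5, buffer_new=True))
```

With the assumed arrival order, plain forwarding emits the packets out of sequence, whereas
buffering emits them in sequence, so the egress LSR has nothing to reorder.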
4.5.4 Local Protection Model
In this model, the LSR that detects the link/node failure calculates a new routing path to
the egress LSR [31, 37, 38]. This model has the advantage of requiring neither resource
reservation nor a pre-computed global recovery path. However, it is not suitable for
time-sensitive applications: new routes to the egress LSR are calculated dynamically, and
more packets are dropped while the new route is being selected. Figure 4.4 shows how local
rerouting works. When PGW-5 detects the link failure between PGW-5 and PGW-7, it starts
searching for a new path to the egress, PGW-9. This is done by pruning PGW-7 from the local
copy of the network and calculating the new shortest path to PGW-9.
Figure 4.4 Local Protection model.
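The prune-and-recompute step can be sketched with Dijkstra's algorithm. The ladder-shaped
topology and the unit link costs below are assumptions made for illustration only (the
actual links of Figure 4.4 are not listed in the text), and shortest_path is a hypothetical
helper rather than code from the simulator.

```python
import heapq

# Hypothetical undirected links with unit cost, loosely modeled on Figure 4.4.
LINKS = [
    ("PGW-1", "PGW-3"), ("PGW-3", "PGW-5"), ("PGW-5", "PGW-7"), ("PGW-7", "PGW-9"),
    ("PGW-1", "PGW-2"), ("PGW-2", "PGW-4"), ("PGW-4", "PGW-6"), ("PGW-6", "PGW-8"),
    ("PGW-8", "PGW-9"), ("PGW-3", "PGW-4"), ("PGW-5", "PGW-6"), ("PGW-7", "PGW-8"),
]


def build_graph(links):
    """Build an adjacency map {node: {neighbor: cost}} from undirected links."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, {})[b] = 1
        graph.setdefault(b, {})[a] = 1
    return graph


def shortest_path(graph, src, dst, pruned=frozenset()):
    """Dijkstra's algorithm that ignores every node listed in `pruned`."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if nbr in pruned:
                continue
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None  # no route survives the pruning
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    graph = build_graph(LINKS)
    print("before failure:", shortest_path(graph, "PGW-5", "PGW-9"))
    print("after pruning: ", shortest_path(graph, "PGW-5", "PGW-9", pruned={"PGW-7"}))
```

On this assumed topology, PGW-5 reaches PGW-9 through PGW-7 before the failure and through
PGW-6 and PGW-8 once PGW-7 has been pruned. The recomputation happens only after the failure
is detected, which is why this model drops more packets than the pre-computed schemes.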
4.5.5 Fast Reroute
One of the major driving forces for MPLS is its capability to reroute traffic quickly and
achieve SONET-like service quality with minimal service disruption time. In this model, the
point of failure detection is the same as the point of repair, so traffic is rerouted
without sending the FIS signal or calculating a new path upon detecting a network failure.
According to the IETF, many fast reroute techniques have been proposed, but the most
commonly used is link protection. In this technique, an LSP tunnel is set up to provide a
backup path for the working path. When a link fails, the node adjacent to the failed link
quickly switches traffic to the backup path with minimal disruption in service. The selected
backup path should have bandwidth similar to that of the working path for all LSPs; if not,
it should at least have sufficient bandwidth for the LSPs that carry high-priority traffic.
Fast reroute uses two types of labels: the MPLS-generated label is encapsulated into the
physical layer label. When a link is broken, packets are rerouted onto the backup path
according to the physical layer label. Once packets reach the egress LSR or PML, the
physical layer label is stripped off and packets are forwarded according to the
MPLS-generated label. Figure 4.5 shows how fast reroute works.
Figure 4.5 Fast Reroute with link failure.
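The two-label handling described above can be sketched as a push and pop on a label stack.
This is a minimal illustration under stated assumptions: the label values (100 and 900), the
transit nodes PGW-6 and PGW-8 of the backup tunnel, and the Packet class are all
hypothetical, and the outer label simply stands in for the label that steers packets along
the backup path.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    payload: str
    labels: list = field(default_factory=list)  # labels[0] is the outermost label


def fast_reroute(packet, backup_transit_nodes, backup_label):
    """Carry a packet through a pre-computed backup tunnel.

    The node adjacent to the failure pushes the backup label on top of the
    original label, transit nodes forward on the outer label only, and the
    merge point pops it so that forwarding resumes on the original label.
    Returns the (node, outer_label) pairs seen along the tunnel.
    """
    packet.labels.insert(0, backup_label)   # push at the point of repair
    trace = [(node, packet.labels[0]) for node in backup_transit_nodes]
    packet.labels.pop(0)                    # pop at the merge point
    return trace


if __name__ == "__main__":
    pkt = Packet(payload="PMU frame", labels=[100])   # 100: assumed working-path label
    trace = fast_reroute(pkt, ["PGW-6", "PGW-8"], backup_label=900)
    print("forwarding decisions in the tunnel:", trace)
    print("labels after the merge point:", pkt.labels)
```

Because the backup tunnel and its label exist before the failure, the switching node only
has to push one label, with no signaling and no path computation at failure time.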
By the year 2013, the Wide Area Monitoring System (WAMS) will consist of thousands of
PMUs for grid monitoring. It is worth reiterating that all the PMUs will be connected
across a WAN of nearly 150 PGWs (control centers). For simulation purposes, instead of
modeling all 150 PGWs, a set of 25 PGWs was considered for studying the conditions of the
grid. The reason for selecting only 25 nodes is that power is unlikely to be rerouted from
a remote plant. For instance, in case of a power outage in parts of New York, it is very
unlikely that power will be rerouted from power plants in California or Texas. This will
result in improper resource utilization.
BRITE, a random topology generator, was used and modified to generate various network
topologies. This topology generator yields link latency on the order of tens of
milliseconds, which is unacceptable for Smart Grid monitoring. Often, the distance between
two control centers lies in the range of 250-1000 miles, and thus the maximum permissible
link latency is about 4-16 milliseconds. This conclusion is drawn from the fact that in a
fiber link data travels at nearly two-thirds of the speed of light. It is worth mentioning
that the Los Angeles Department of Water and Power (LDWP) and the Bonneville Power
Administration (BPA) are two control centers that are 1000 miles apart.
Figure 5.1 Topology of 25 PGWs (nodes).
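The latency bound can be checked with a short calculation. The sketch below assumes a
straight fiber run of the stated length, a propagation speed of exactly two-thirds of the
speed of light, and no queuing or processing delay; the function names are illustrative.
Under these assumptions the one-way delay over 250 to 1000 miles comes to roughly 2 to 8 ms
and the round-trip delay to roughly 4 to 16 ms, which matches the quoted range if that
range is read as a round-trip bound.

```python
MILES_TO_KM = 1.609344
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
VELOCITY_FACTOR = 2.0 / 3.0   # assumed fraction of c for light in fiber


def one_way_delay_ms(distance_miles):
    """Propagation delay in milliseconds over a fiber link of the given length."""
    distance_km = distance_miles * MILES_TO_KM
    speed_km_per_s = SPEED_OF_LIGHT_KM_PER_S * VELOCITY_FACTOR
    return distance_km / speed_km_per_s * 1000.0


def round_trip_delay_ms(distance_miles):
    """Round-trip propagation delay in milliseconds."""
    return 2.0 * one_way_delay_ms(distance_miles)


if __name__ == "__main__":
    for miles in (250, 1000):
        print(f"{miles:5d} miles: one-way {one_way_delay_ms(miles):.2f} ms, "
              f"round trip {round_trip_delay_ms(miles):.2f} ms")
```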
5.2 Simulation Setup
Generally, every PMU maintains a dedicated telephone link to a single PDC. It is assumed
that each PDC has roughly 150-250 PMUs directly attached to it, and that each PDC maintains
a dedicated fiber link to a single PGW. Thus, every PGW sends Constant Bit Rate (CBR)
packets over UDP/IP. The data generated by each PMU depends on the level of granularity
needed for monitoring: a higher granularity level requires a higher number of frames per
second. According to the IEEE synchrophasor standard C37.118, one frame is 128 bytes long,
and hence the minimum bandwidth required between a PMU and a PDC is
128 bytes/sample * 8 bits/byte * 30 samples/second, i.e., 30.72 Kbps. Table 5.1 shows the
minimum and maximum bandwidth required between any two consecutive PGWs.
Table 5.1 Required bandwidth between consecutive PGWs
# of PMUs    Frames per second
             30 fps        60 fps        120 fps        250 fps
1            30.72 Kbps    61.44 Kbps    122.88 Kbps    256 Kbps
150          4.608 Mbps    9.216 Mbps    18.432 Mbps    38.4 Mbps
250          7.68 Mbps     15.36 Mbps    30.72 Mbps     64 Mbps
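The entries of Table 5.1 follow from the 128-byte frame size, and can be reproduced with
the short Python sketch below. The function names are illustrative, and the calculation
assumes decimal units (1 Kbps = 1000 bps) and ignores UDP/IP header overhead, as the
figures in the table appear to do.

```python
FRAME_BYTES = 128                      # frame size stated in the text
FRAME_RATES = (30, 60, 120, 250)       # frames per second
PMU_COUNTS = (1, 150, 250)


def required_bps(num_pmus, frames_per_second, frame_bytes=FRAME_BYTES):
    """Aggregate payload bandwidth in bits per second."""
    return num_pmus * frames_per_second * frame_bytes * 8


def format_rate(bps):
    """Format a rate in Kbps or Mbps using decimal units."""
    if bps >= 1_000_000:
        return f"{bps / 1_000_000:g} Mbps"
    return f"{bps / 1_000:g} Kbps"


if __name__ == "__main__":
    for pmus in PMU_COUNTS:
        row = [format_rate(required_bps(pmus, fps)) for fps in FRAME_RATES]
        print(f"{pmus:>4} PMUs: " + ", ".join(row))
```

Even the largest case, 250 PMUs at 250 frames per second, needs only 64 Mbps, well within
the capacity of a single fiber link.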
The NS2 simulator was used to simulate Smart Grid monitoring; it is one of the most widely
used network simulators in the research community. NS2 is an event-driven network simulator
created through the joint effort of the University of California, Berkeley, Xerox PARC, the
University of Southern California, and Lawrence Berkeley National Laboratory. It is an
object-oriented simulator written in C++ and OTcl. C++ acts as the back end that runs the
actual simulation, while OTcl works as the front end that takes user input to create and
configure a network; C++ is used for faster computation and OTcl for ease of creating
simulation scenarios. The default signaling protocol of the NS2 package is LDP, which has
become obsolete owing to the capability of modern routers to process IP headers at
"wire speed".
The most common signaling protocol for MPLS applications is RSVP-TE, owing to its ability
to allocate bandwidth efficiently. However, since Table 5.1 shows that bandwidth is not a
constraint for Smart Grid monitoring, RSVP-TE was not used for the simulations. For the
present simulation work, the CR-LDP protocol was chosen, since it allocates bandwidth
without requiring resources to be reserved for any particular application; it thus incurs
less overhead than RSVP-TE and is most suitable for Smart Grid applications. Since the NS2
package does not contain an MPLS module with CR-LDP signaling, the source code from
[38, 39] was adopted for simulating the network recovery models. Figures 5.2 and 5.3 show
the service disruption time and the number of packets dropped due to a link failure.
Comparing Figure 5.2 with Table 2.3, it can be concluded that fast reroute is the most
suitable network recovery model for time-critical applications. Table 5.2 summarizes the
suggested network recovery models for different classes of service.
Figure 5.2 Service disruption time for different MPLS models (Makam, Haskin, Local
Protection, Fast Reroute).
Figure 5.3 Number of packets dropped for different MPLS models (Makam, Haskin, Local
Protection, Fast Reroute).
Table 5.2 Suggested Network Recovery Models for Different Classes of Service
Class of Service    Description             Availability (%)    Network Recovery Model
A                   Feedback Control,       99.9999             Fast reroute
B                   Feed forward control    99.999              Makam, or
C                   Display                 99.99               Local protection
D                   Disturbance analysis    99.99               Local protection
E                   Research                99.99               Local protection
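To put the availability figures of Table 5.2 in perspective, the following sketch converts
each percentage into the downtime it permits per year. It assumes a 365-day year and treats
availability as the simple fraction of time the service is up; the function name is
illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumes a 365-day year


def allowed_downtime_seconds(availability_percent):
    """Maximum downtime per year, in seconds, for a given availability."""
    return (1.0 - availability_percent / 100.0) * SECONDS_PER_YEAR


if __name__ == "__main__":
    for cls, availability in (("A", 99.9999), ("B", 99.999), ("C, D, E", 99.99)):
        seconds = allowed_downtime_seconds(availability)
        print(f"Class {cls}: {availability}% -> {seconds:.1f} s/year "
              f"({seconds / 60:.2f} min)")
```

Under these assumptions, class A tolerates only about 31.5 seconds of downtime per year,
class B about 5.3 minutes, and classes C to E about 52.6 minutes, which is consistent with
assigning the fastest recovery model to class A.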
The Smart Grid will evolve from the existing power grid into a modern, reliable, secure,
and green grid. NIST and DoE, along with CERTS, NASPInet, NERC, and EIRP, will play crucial
roles in this grid modernization; they are still in the initial stages of finalizing Smart
Grid frameworks and roadmaps. By the end of 2010, NIST plans to complete all the priority
action plans, such as guidelines for the IP protocol suite and wireless communications, and
a standard meter data profile. This thesis has proposed a solution to one of the major
challenges of the Smart Grid, i.e., real-time grid monitoring and control. MPLS is proposed
to realize WAMS, owing to its support for packet encapsulation, VPN, QoS, and real-time
communications. Most importantly, it was demonstrated that MPLS provides the required
network recovery models, which can be used for the different classes of service proposed by
NASPInet. Based on the simulation results, the Fast Reroute model is appropriate for
applications like feedback control, situational awareness, state estimation, and early
event detection, although each link has to maintain a separate recovery path in this model.
This overhead is nevertheless well justified for such a critical network.
 National Energy Technology Laboratory, "A Systems View of the Modern Grid
Integrated Communications", Feb. 2007, http://www.netl.doe.gov/moderngrid/docs/
 Yi Hu, "Phasor Gateway Technical Specification for North American Synchro
Phasor Initiative Network (NASPInet)", May 2009, http://www.naspi.org/resources/
 National Institute of Standards and Technology, "NIST Framework and Roadmap
for Smart Grid Interoperability Standards Release 1.0 (Draft)", Sep 2009,
 North American Electric Reliability Corporation, "Reliability Standards for the
Bulk Electric Systems of North America", Nov 2009,
 Don Von Dollen, "Report to NIST on the Smart Grid Interoperability Standards
Roadmap", June 2009,
 "GridWise Interoperability Context Setting Framework", March 2008,
 SAIC Smart Grid Team, "San Diego Smart Grid Study Final Report" Oct. 2006,
 Rick Hornby et al., "Advanced Metering Infrastructure – Implications for
Residential Customers in New Jersey", July 2008,
 NEMA SG-AMI 1-2009 "Requirements for Smart Meter Upgradeability" Sep 2009,
 ANSI C12.21-2006 "Protocol Specification for Telephone Modem Communication",
May 2006, http://www.nema.org/stds/complimentary-docs/upload/ANSI-
 A. Moise and J. Brodkin, "ANSI C12.22, IEEE 1703 and MC1222 Transport Over
IP" Nov. 2009, http://tools.ietf.org/html/draft-c1222-transport-over-ip-02
 Edward Beroset, "An Overview of ANSI C12.22", Accessed Nov. 2009,
 National Transmission Grid Study, May 2002,
 M. Amin, “Towards a self-healing energy infrastructure” IEEE Power Engineering
Society General Meeting, 2006.
 M. Amin, “Towards Self-Healing Energy Infrastructure Systems” IEEE Computer
Applications in Power, pp. 20-28, Vol. 14, No. 1, January 2001.
 M. Amin, “Toward Self-Healing Infrastructure Systems” IEEE Computer Magazine,
pp. 44-53, Vol. 33, No. 8, Aug. 2000.
 North American Electric Reliability Corporation, 2007,
 Carl H. Hauser et al., Trust, and QoS in Next-Generation Control and
Communication for Large Power Systems. International Journal of Critical
 K. Tomsovic et al., Designing the Next Generation of Real-Time Control,
Communication and Computations for Large Power Systems, IEEE Special Issue on
Energy Infrastructure Systems, page 93(5), May 2005.
 Synchrophasor Technology Roadmap, March 2009, http://www.naspi.org/resources/
 R. Hasan et al., “Analyzing NASPInet Data Flows” IEEE PES Power Systems
Conference & Exhibition (PSCE), March 15-18, 2009
 Yi Hu, "Data Bus Technical Specification for NASPInet", May 2009
 “IEEE Standard for Synchrophasors for Power Systems,” IEEE Std C37.118-2005
(Revision of IEEE Std 1344-1995), pp. 1–57, 2006.
 NERC Cyber-Security Standards, June 2006,
 S. Mohagheghi, J. Stoupis and Z. Wang, “Communication Protocols and Networks
for Power Systems- Current Status and Future Trends,” IEEE PES Power Systems
Conference & Exhibition (PSCE), Mar. 2009.
 D. Bakken, "Smart Grid Data Delivery Service R&D in the USA" Jan 2009,
 MPLS-based VPNs: designing advanced virtual networks, by Peter Tomsu, Gerhard
Wieser, 2002.
 "Juniper Networks Smart Grid Networking Solutions", Nov 2009, Juniper White
 Tom Hulsebosch, Dan Belmont, and Mike Manske, "Smart Grid Network: MPLS
Design Approach", Westmonroe White Paper, Accessed Oct. 2009,
 V. Sharma et al., "Framework for Multi-Protocol Label Switching (MPLS)-based
Recovery", Feb 2003, http://tools.ietf.org/html/rfc3469
 Johan Martin Olof Peterson, “MPLS Based Recovery Mechanisms”, Master Thesis,
University of Oslo, May 2005.
 MPLS-based VPNs: designing advanced virtual networks, by Peter Tomsu, Gerhard
 S. Makam et al., "Protection/Restoration of MPLS Networks", Oct 1999,
 S. Makam et al., "Framework for MPLS-based Recovery", July 2000,
 D. Haskin, R. Krishnan, "A Method for Setting an Alternative Label Switched Paths
to Handle Fast Reroute", Nov. 2000, http://tools.ietf.org/html/draft-haskin-mpls-fast-
 Lemma Hundessa and Jordi Domingo Pascual, “Fast rerouting Mechanism for a
Protected Switch Path”, IEEE 2001, pp 527-530.
 "BRITE topology generator", Online, Accessed Sep 2009,
 D. Adami, C. Callegari, S. Giordano, M. Pagano, “Path Computation Algorithms in
NS2” First International Conference on Simulation Tools and Techniques for
Communications, Networks and Systems (SIMUTOOLS 2008), Mar 2-7, Marseille,
 D. Adami, C. Callegari, D. Ceccarelli, S. Giordano, M. Pagano, “Design and
development of MPLS-based recovery strategies in NS2” IEEE Global
Telecommunications Conference (GLOBECOM 2006), Nov. 27-Dec 1, San
Francisco, CA, USA.