JUNE 2018
ADVANCED UTILITY DATA MANAGEMENT
AND ANALYTICS FOR IMPROVED
OPERATION SITUATIONAL AWARENESS
OF EPU OPERATIONS
JWG D2/C2.41
Members
Contributing Members
G. SANTAMARÍA MX A. HERNÁNDEZ MX
M.Y. HERNANDEZ PÉREZ MX D. MARAGAL US
M. BASTOS BR
Copyright © 2018
“All rights to this Technical Brochure are retained by CIGRE. It is strictly prohibited to reproduce or provide this publication in
any form or by any means to any third party. Only CIGRE Collective Members companies are allowed to store their copy on
their internal intranet or other company network provided access is restricted to their own employees. No part of this
publication may be reproduced or utilized without permission from CIGRE”.
Disclaimer notice
“CIGRE gives no warranty or assurance about the contents of this publication, nor does it accept any responsibility, as to the
accuracy or exhaustiveness of the information. All implied warranties and conditions are excluded to the maximum extent
permitted by law”.
ISBN : 978-2-85873-434-4
ADVANCED UTILITY DATA MANAGEMENT AND ANALYTICS FOR IMPROVED OPERATION SITUATIONAL AWARENESS OF EPU OPERATIONS
EXECUTIVE SUMMARY
Objective
The CIGRE Joint Working Group No. D2/C2.41 is a joint effort between the study committees D2 and
C2. It has surveyed and examined current practices, industry trends, and new research on the use of
various data sources and analytics tools to enhance situational awareness of system operators, as well
as on the data-integration and -management technologies to facilitate effective implementation of
data-analytics applications in the control room and to support operation engineers.
Motivation
The increasing complexity and interconnectivity of modern electric grids, together with highly
stringent reliability, economic, and environmental constraints, make it necessary to provide system
operators and operation engineers with better tools for assessing system conditions and for
making critical decisions. Fortunately, the large variety of internal and external data sources
that are available to electric utilities opens up the possibility to implement advanced data-analytics
and -visualization technologies to improve the way the system is operated and controlled. Analytics
algorithms capable of synthesizing actionable information from the raw data can be used to provide
tools that use real-time data streams to support fast, accurate, and adaptable decisions solving critical
problems at the right moment, as well as to plan mitigation actions against anticipated system
security issues.
Using data to make critical operational and business decisions is certainly not new to the electricity
industry. Indeed, techniques for data analysis have been applied to several areas such as load
forecasting, predictive asset maintenance, crew scheduling, outage management, and demand
response, among others. Nevertheless, the maturity and practical implementation of data-analytics
applications to support the operation of power systems remain relatively low compared to other
areas and industries. Therefore, it is very valuable to examine how advanced data-analytics
technologies can be further used to solve the emerging critical challenges in operating electric
systems.
Approach
The content of the technical brochure is broken down into the major areas that are relevant for the
development and implementation of data-analytics tools, which are: data and information sources,
data-analytics techniques to interpret this data, applications of data analytics in system operations,
data integration and modelling to integrate data into operations, and data quality.
This document has six main sections, each of them addressing one of these topical areas. The content
in each section is intended to provide the reader with an informed and comprehensive starting point
to understand the relevant issues and challenges in each area. The sections discuss latest advances in
terms of data-analytics methodologies, data-management and -integration tools, applications
development, and new trends and emerging technologies.
Value
This technical brochure provides useful insight on how advanced data-analytics techniques and tools
that integrate various data sources can be used to improve situational awareness of those who
operate power systems and to support various operation functions. This work is expected to be useful
for Cigre members in the following areas:
Operators of transmission and distribution systems will gain knowledge of how new data-analytics
and -visualization technologies can help improve situational awareness.
Product vendors will be assisted in identifying gaps in the market and potential new uses for
existing products.
Application and system developers will better understand the challenges facing operations
and the need for better analytics tools.
Researchers will be assisted in recognizing new areas for research and applications of this
research.
Consultants and project engineers will find relevant reference material.
CONTENTS
EXECUTIVE SUMMARY
OBJECTIVE
MOTIVATION
APPROACH
VALUE
SUMMARY OF RELEVANT CONCLUSIONS AND TAKEAWAY
TABLES
Table 2-1: Monitored parameters of circuit breakers
Table 2-2: Status of different condition assessment techniques for power transformers
Table 2-3: Different sensors and output data
Table 2.4: Solar measurement and description
Table 2.5: Wind turbine sensors and applications
Table 2-4: Example of the minimum required BESS signals for an EMS (SICAM microgrid control)
1.2 OBJECTIVE
The objective of this technical brochure is to address the increasing importance of situational
awareness in grid operation and to give an overview of the most relevant developments in data
analytics and data integration associated with situational awareness. It aims to identify future needs
by addressing some of the fundamental questions raised by the growing challenge of maintaining and
increasing situational awareness in a complex system:
What does situational awareness mean for the electricity utility industry?
What are the future needs for improved situational awareness and better operator decision-
support tools (including tools for operation engineers, protection, etc.)?
Is the new data that is available through smart grid investment useful for accomplishing the
future needs and requirements for future solutions?
Who are these new data sources for, and do they fulfill their needs?
What data-analytics techniques and tools are needed to transform the large inflow of data
into actionable information?
What is the present status of the use of data analytics to support the operation of power
systems?
What technologies are needed for handling data and performing integration?
What models and data-integration technologies are needed to automate the processes and
enable actionable information, based on data from many sources, to reach the appropriate
users?
How can quality of data be properly assessed and improved to make the analytic solution
more valuable and reliable?
What are organizations doing in this space currently and in the future?
What areas do organizations need to focus on to address this challenge?
Cigre is ideally placed to draw together these different elements because it has a large, knowledgeable
contributor base and can disseminate any learning to a wide audience. Therefore, this technical
brochure aims to assist its members in the following areas:
Transmission and distribution operations: essential for all levels, from distribution
level to system operators, to gain knowledge of how new data-analytics and
-visualization technologies can help improve situational awareness.
Product vendors: assist in identifying gaps in the market and potentially new uses for existing
products.
Application and system developers: better understand what the challenges are for operations
and the need for better analytics tools.
Researchers: assist in recognizing new areas for research and the application of this research.
Consultants and project engineers: provide relevant reference material.
1.3 APPROACH
This technical brochure aims to present the collective thinking from a wide range of industry experts
across a broad range of perspectives to the different challenges involved in providing and maintaining
situational awareness by breaking the problem down into the different areas involved in this
challenge.
The collection of rich data, complemented by system modeling, advanced data analytics, and emerging
decision-support tools, has the potential to enable predictive analytics that can
enhance situational awareness and improve decision-making. In general terms, the development and
successful implementation of data-analytics tools involve addressing specific aspects in various
domains, as depicted in Figure 1-1.
[Figure 1-1 (diagram). Recoverable labels: Use Cases; Data Sources; Data Models and Integration; New Decision Support Tools; Advanced Data Analytics Techniques; Quality and Validation]
Figure 1-1: Aspects to address the development and implementation of analytics techniques using
various data sources
that the data used in the various applications meet the minimum standards of data quality to
guarantee meaningful results.
This technical brochure is designed to provide the reader with an informed and comprehensive
starting point to understand these issues.
Each section aims to discuss and incorporate the latest advances in the relevant area in terms of
technology and approach to this particular challenge. The structure of this technical brochure is as
follows:
Section 1 – Introduction and Background: The second part of this introductory section
introduces the concept of situational awareness in relation to electricity power networks and discusses
why it is important to implement advanced data-analytics applications in the control room and
departments that support system operations.
Section 2 – Data Sources in Electric Power Systems: This section provides a description of the
many data sources that can be found in an electric power system. It covers both traditional sources
commonly used for monitoring, protection, and control and new or non-conventional data sources that
emerge from smart grid technologies. It also describes data sources that are external to the electric
system but can be accessed and used for power system applications and decision-making. It also
describes the communication requirements of each dataset type to ensure that data reaches the
different data-analytics applications with the required quality, velocity, and availability.
Section 3 – Data-Analytics Techniques: The main advanced data-analytics techniques that can be
used for a variety of operation-support tools are described in this section. The description of each of
these techniques includes a definition, technical description with some mathematical details, common
application domains, and potential applications in a smart grid.
Section 4 – Applications of Data Analytics in System Operations: This section describes an
extensive array of applications in power systems and the various tools and techniques identified in the
previous section. A survey of existing practices, tools, and techniques using various sources of data to
improve situational awareness and provide operation decision-making support is presented.
Section 5 – Data Integration and Modeling: This section examines typical data modeling
processes in electric utility transmission organizations to explain how data are assembled in the power
industry for secure and reliable grid operation. To illustrate the concepts, it presents an example of an
actual data-integration project in a large utility in the U.S.
Section 6 – Data Quality and Validation: The importance of good data quality and the methods
of validating this data are presented in this section.
Section 7 – Conclusions: This section summarizes the main findings and conclusions. It identifies
the future states, gaps, and research needs to move the utility industry to a more extensive use of
data-analytics technologies to support the operation of a power system. The brochure concludes by
discussing and presenting several conclusions, but due to the nature of the challenge, it will not
provide “one solution to fit all.” Instead, it aims to leave the reader in a more informed position and
with a valuable source of reference and further reading.
As grid operations become more complex—due to increasing variability in demand and supply
balancing through new (and often “intelligent”) types of loads, renewable integration, and cross-
border integration of systems—situational awareness becomes more challenging. To cope with this,
automation and automated decision-making have become essential for grid operations. However, this
creates a new level of complexity and makes the system less intuitive and transparent. So, to increase
situational awareness, or at minimum keep it on par, new tools for analysis and decision support for
grid operators are essential.
Figure 1-2 describes the levels of situational awareness required for grid operation under the new
conditions. It is a significant challenge to move upwards on these levels. Certainly, it requires an
understanding of the various elements within this problem space, hence the need for the problem to
be broken down into the different sections that are addressed in this brochure.
Therefore, situational awareness from the perspective of electrical power systems can be interpreted
as the continual assessment of the current and future state of the system in order to be able to
respond with the correct measures to reach a desired goal, such as keeping the operating conditions
within the appropriate boundaries, as well as reducing risks and increasing efficiency.
This is not limited to the awareness within the central control room but also incorporates “awareness”
and response from local equipment, often referred to as “edge processing.” More and more localized
control systems are implemented in the grid or at its perimeters, such as in smart inverters from
renewable energy sources feeding into the grid.
Situational awareness thus includes the awareness of how different active control mechanisms work
together. While the local active control in general helps to reach the goals of the central control, it
sometimes can work against it, leading to undesired or even dangerous situations.
Within the operation of transmission systems, there has long been a focus on situational awareness
because maintaining system reliability has always been crucial. This operational situational
awareness is now becoming increasingly important for distribution operation as well, down to
the low-voltage levels of the grid.
The growing amount and variability of data now available to operators at all levels of control
centers is changing the control centers beyond recognition. Operators in control centers routinely
receive system-related information, such as voltage, frequency, current, power flows, and network
topology. However, the knowledge derived from asset data that is accessible to asset managers,
equipment subject-matter experts (SMEs), and field staff is now also finding its way to operators in
control centers and is taken into consideration in operating the grid. A common element across all levels
of control centers is the use of human operators. Even though technology has moved on, there
is currently still a requirement to keep a human in the loop.
Over time, operators build up a mental model of how the network works and behaves under certain
conditions based on their training and experience. This will not change in the future, because this
mental model holds the overall picture that still includes much that cannot be taken over by
(self-learning) algorithms. A mental model therefore remains far superior to, and more flexible than, algorithms
(even though operators will be more and more supported by algorithms). It is thus very important
that the operators have learned the correct models, because an operator who is presented with the
correct information can still make an incorrect decision.
Figure 1-3: Representation of operator mental model based on training and experience
It is therefore very important to understand that situational awareness is in the mind of the operator
(see Figure 1-3), so while addressing all the highly complex analytics and technological challenges, we
must not forget that there are also many traditional things that can be done to improve and maintain
the situational awareness of the operator, such as:
Focused training
Increased experience
What–if simulations
These elements are beyond the scope of this brochure but are addressed by other Cigre working
groups.
It is important to ensure that the advanced analytics information is presented in the best possible
way. This issue can be addressed with appropriate HMI standards applied consistently across the various
visualizations within different systems, so that operators who swap between systems
understand what the key information presented means (e.g. using reserved colors: red means
an operator needs to take action now, and yellow means that a system is moving toward an unsafe
state).
Why is it important
Situational awareness is becoming increasingly important because of the increasing complexity of
power systems. It becomes more and more difficult to completely grasp the dynamics in the grid
because of the risks associated with the increasing interaction of technology on power systems.
As the amount of external data feeds increases, there is a growing need to focus on the inputs into
the power system algorithms. It can be inferred from Figure 1-4 that the increasing number of inputs
into the system requires new analytics and visualization techniques to be developed and integrated
into electricity utilities to create and enhance situational awareness. This is also due to the complexity
of the inputs and the interactions between them. By contrast, the number of outputs in terms of measures
has not really increased (e.g. we still measure voltage and frequency), so visualizing these
measures does not necessarily require complex visualizations.
Examples of trends that make grid operation more challenging and dynamic are:
Variability of demand, both because of new demand (electric vehicles, heat pumps) and moving
demand (demand-side management).
Changing generation mix (increasingly reliant on weather, which also impacts demand) requires a
more active role of the grid operator.
Market vs physical: the merging of markets across larger geographical/geopolitical zones with
unclear impact of different physical power systems (island systems with larger interconnected
systems).
Increased cost awareness in the regulated environment: who will pay for decarbonization and the
increased system costs (reserves, response to weather variability)?
Tools to actively influence the grid become more easily available (e.g. demand response, grid-
connected storage, active switching in the grid, dynamic line rating, grid capacity management,
voltage management).
Therefore, while situational awareness is becoming more prominent, not only do operators need to
become more aware of the situation in their grid, but also equipment itself needs to be aware of the
situation outside its direct environment. An example of this is given below:
“Some years ago a smart MV/LV distribution transformer with automatic tap changing based on power
electronics with embedded controls was developed, build and fully tested at KEMA (now DNV GL)
laboratories, including short circuit capabilities. It was installed in a greenhouse area in western part
of The Netherlands to improve the voltage stability and power quality of the local distribution grid.
The smart transformer functioned according to expectation and it was decided to install a second one,
electrically close, that is on the same MV string. As the smart transformers did not communicate with
each other and no situational awareness and/or damping control loop was envisioned or implemented
they started to react to each other resulting in unstable and oscillating behaviour. The end of the
story was that they were removed from the grid.” (quoted from the DNV GL white paper on power
cybernetics: https://www.dnvgl.com/energy/publications/download/power-cybernetics.html)
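The interaction described in this example can be illustrated with a toy model. The following Python sketch is not a model of the actual devices: the two regulators, the voltage gains, the deadband, and the alternating-turn coordination rule are all hypothetical, chosen only to show how two individually sensible controllers can hunt against each other when neither is aware of the other, and how even a very simple form of coordination can remove the oscillation.

```python
def simulate(coordinated, steps=6, v_base=0.97, own_gain=0.025,
             coupling=0.025, target=1.0, deadband=0.01):
    """Return the (tap_a, tap_b) positions after each control step.

    All numbers are hypothetical per-unit values for illustration only.
    """
    taps = [0, 0]
    history = []
    for step in range(steps):
        # Each regulator sees a voltage that depends on its own tap and,
        # through the shared MV string, on its neighbour's tap.
        volts = [v_base + own_gain * taps[i] + coupling * taps[1 - i]
                 for i in range(2)]
        # Uncoordinated: both act on the same measurement at once.
        # Coordinated: the regulators take turns, one per step.
        acting = [step % 2] if coordinated else [0, 1]
        for i in acting:
            if volts[i] < target - deadband:
                taps[i] += 1      # voltage too low: raise tap
            elif volts[i] > target + deadband:
                taps[i] -= 1      # voltage too high: lower tap
        history.append(tuple(taps))
    return history


print(simulate(coordinated=False))  # taps flip between (1, 1) and (0, 0)
print(simulate(coordinated=True))   # taps settle at (1, 0)
```

In the uncoordinated case both regulators correct the same low voltage at the same time, overshoot together, and then reverse together, indefinitely. Letting only one regulator act per step stands in here for the communication or damping loop that the quoted example says was missing.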
While operators will remain at the core of grid operations, it becomes more and more important that
they are supported by advanced data analytics and analytics visualization in order to grasp the
increased complexity, higher time pressure, and interlocking mechanisms, as indicated in the example
below:
In September 2011, the loss of Arizona Public Service’s (APS) Hassayampa-North Gila 500 kV
transmission line preceded a blackout affecting over 2.7 million customers. That line loss itself did not cause the blackout,
but it did initiate a sequence of events that led to the blackout, exposing grid operators’ lack of
adequate real-time situational awareness of conditions throughout the Western Interconnection. More
effective review and use of information would have helped operators avoid the cascading blackout.
For example, had operators reviewed and heeded their Real Time Contingency Analysis results prior to
the loss of the APS line, they could have taken corrective actions, such as dispatching additional
generation or shedding load, to prevent a cascading outage. The evaluation report recommends that
bulk power system operators improve their situational awareness through improved communication,
data sharing and the use of real-time tools (NERC report, 2012).
Other sectors experienced the effects of increased complexity of systems in combination with human
control. For example, in 88% of aviation accidents, human error was indicated as the cause, 50% of
which was caused by air traffic control operational errors [Measurement of situation awareness in
dynamic systems, Human Factors, 37(1): 65–84. 1995c.]. Like grid operations, these systems have
grown in complexity and time-pressures with an increase in the amount of automation to assist the
operators. However, there is still a human in the loop with the potential for human error.
A major risk of (the necessary) increased automation in the grid is that operators actually become
(relatively) less situationally aware and that automated systems and operators will work against each
other, especially in crises and when under high pressure.
Conclusion
Situational awareness always was and always will be a major element in maintaining the integrity of
the electricity system. However, the importance of situational awareness is growing. As the
operational margins are increasingly variable due to the increase of renewables in the energy mix as
well as growing amounts of “intelligent” demand based on inverters, the system becomes more
decentralized and complex. Therefore, retaining the current level of situational awareness is
challenging. The changes in the power system require an increase of situational awareness on all
control levels, so that the quality of operational decision-making, which is necessary to maintain
system integrity, is kept.
This brochure is about state-of-the-art tools and new data sources that enable operators to be aware
of the situation in the power system and help them to make optimal decisions in operating it. And, as
the complexity of the power system will continue to increase in the future, future needs to increase
situational awareness are addressed.
Situational awareness is about an integrated picture of the electricity system, including:
Situationally aware automation on a higher level of decision-making of grid operations under high
time pressures, taking into account larger parts of the grid instead of the direct environment.
Increasing the situational awareness of operators by visualizing the current situation as well as
future situations and scenarios.
Shifting the main focus of the operators to prepare (“prime”) the system for (near) future critical
situations, using simulations, (short-term) scenarios, and models.
The latter has the additional benefit that operators will gain a much faster and more thorough
understanding of the system dynamics than they would get based on experience of (hopefully rare)
real-life events alone.
The following chapters address all major elements of situational awareness, starting from data and
information sources, data-analytics techniques to interpret these data, applications of these analytics
in system operations, data integration and modelling to integrate data into operations, and finally data
quality and validation.
References:
[1]. Endsley, M.R. (1995b). "Toward a theory of situation awareness in dynamic systems". Human
Factors. 37 (1): 32–64
2.1.2 Recorders
If all relays in a substation are microprocessor-based, the condition of the power
system at the time of a fault/disturbance event can be determined by collecting and analyzing the records triggered
in the various protection relays. However, because electromechanical relays have no capability to
record disturbance information, separate standalone recorders are used to record the disturbance
data. Like microprocessor-based protection relays, the recorders measure voltage and current
from the substation and alarm/indication signals from power system equipment. One of
the primary advantages of standalone disturbance recording equipment is that it can be set
sensitively enough to trigger on any abnormal condition, whereas protection relays trigger only during fault
conditions.
The two types of standalone recorders widely used in power utilities are sequence-of-event recorders
and digital transient/fault recorders.
2.1.2.1 SER: Sequence-of-event recorder
Large power system equipment such as generators, circuit breakers, and motors has complex
operating mechanisms that operate through a sequence of steps. In such equipment, several
actuators, sensors, and control elements are connected in a complex configuration. Each of these
elements often provides an operating status (0/1) indicating whether a measured quantity has exceeded
a threshold or a piece of equipment has operated.
SERs are connected to several of these signals and record the status changes with time stamps. Analysis of
SER data helps to identify the operation time and performance of each of the control elements and
sub-systems. Data-analytics applications can use this data to locate a sluggish device
and warn about a potential failure. Proactive steps can then be taken to replace the device and
prevent a catastrophic failure, such as one that causes motor damage.
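As a minimal sketch of this kind of analysis, the Python example below derives operating times from time-stamped status changes and flags a device whose latest operation exceeds a limit. The event structure, the signal names (`TRIP_CMD`, `BREAKER_OPEN`), the log values, and the 80 ms limit are all hypothetical; real SER exports and acceptance limits will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SerEvent:
    time_ms: float  # time stamp of the status change
    signal: str     # hypothetical signal name
    state: int      # new binary status (0 or 1)


def operating_times(events, command, confirmation):
    """Elapsed time (ms) from each command assertion to the next confirmation."""
    times = []
    pending = None  # time of a command that has not yet been confirmed
    for ev in sorted(events, key=lambda e: e.time_ms):
        if ev.signal == command and ev.state == 1:
            pending = ev.time_ms
        elif ev.signal == confirmation and ev.state == 1 and pending is not None:
            times.append(ev.time_ms - pending)
            pending = None
    return times


def is_sluggish(times, limit_ms):
    """True if the most recent operation took longer than limit_ms."""
    return bool(times) and times[-1] > limit_ms


# Hypothetical SER log: three breaker trips, each slower than the last.
log = [
    SerEvent(0.0, "TRIP_CMD", 1), SerEvent(42.0, "BREAKER_OPEN", 1),
    SerEvent(1000.0, "TRIP_CMD", 1), SerEvent(1055.0, "BREAKER_OPEN", 1),
    SerEvent(2000.0, "TRIP_CMD", 1), SerEvent(2095.0, "BREAKER_OPEN", 1),
]
times = operating_times(log, "TRIP_CMD", "BREAKER_OPEN")
print(times)                     # [42.0, 55.0, 95.0]
print(is_sluggish(times, 80.0))  # True
```

A practical application would more likely trend the operating times over many operations instead of checking only the latest one against a fixed limit.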
2.1.2.2 DTR/DFR: Digital transient/fault recorder
Digital fault recorders are connected to continuous time-varying signals such as voltage, current, pressure,
and temperature, and provide triggering functions to record a disturbance. DFRs
continuously monitor these signals and record the transient waveforms when an event occurs.
DFRs may also record a few binary signals (0/1) to indicate the status of equipment.
Analysis of the disturbance snapshots recorded by DFRs provides insight into the transient performance and
operational characteristics of the power system. The data can be used to assess the behavior and
response of the many connected power system devices. The result of such analysis helps identify
the root cause of the disturbance and enables corrective actions. Data-analytics algorithms can use
DFR data to model and assess system-wide health and performance.
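The triggering behaviour described above can be sketched as follows. This is a simplified illustration that assumes a plain magnitude threshold and fixed pre- and post-trigger sample counts; actual recorders support many other trigger types (rate of change, sequence components, binary inputs), and the function name and sample values are hypothetical.

```python
def capture_disturbance(samples, threshold, pre, post):
    """Return (trigger_index, record) for the first sample whose magnitude
    exceeds threshold, or None if nothing triggers.

    The record keeps `pre` samples before and `post` samples after the trigger.
    """
    for i, value in enumerate(samples):
        if abs(value) > threshold:
            start = max(0, i - pre)
            end = min(len(samples), i + post + 1)
            return i, samples[start:end]
    return None


# Hypothetical current samples (per unit): normal load, then a fault.
current = [1.0, -1.0, 1.0, -1.0, 4.0, -5.0, 3.0, 1.0, -1.0]
print(capture_disturbance(current, threshold=2.0, pre=2, post=3))
# (4, [1.0, -1.0, 4.0, -5.0, 3.0, 1.0])
```

Keeping samples from before the trigger is what lets an analyst compare the pre-disturbance state with the transient itself.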
2.1.2.3 Dynamic swing recorders
Dynamic swing recorders (DSRs) are specifically intended to capture the dynamic response of the power
system to a fault or sudden change. DSRs exist both as standalone devices and integrated
with digital fault recorders. Data is usually stored as RMS or phasor values, at rates from twice a
cycle to once every ten cycles. DSRs are able to capture swing records from one minute to 30
minutes long, or pre- and post-trigger swing data, and they can be used for several purposes, such as
analysis of disturbances, quantification of power system parameter changes, investigation of
system oscillations, and validation of stability models [1].
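To illustrate how a sampled waveform can be reduced to the RMS values mentioned above, the sketch below computes one RMS value per cycle using X_rms = sqrt((1/N) * sum of x_k^2) over the N samples of each cycle. The sampling rate of 32 samples per cycle and the synthetic test signal are assumptions for illustration; the one-value-per-cycle rate falls within the range quoted above.

```python
import math


def rms_per_cycle(samples, samples_per_cycle):
    """Return one RMS value for each complete cycle in `samples`."""
    values = []
    last_start = len(samples) - samples_per_cycle
    for start in range(0, last_start + 1, samples_per_cycle):
        window = samples[start:start + samples_per_cycle]
        values.append(math.sqrt(sum(x * x for x in window) / samples_per_cycle))
    return values


def sine_cycles(rms, cycles, samples_per_cycle):
    """Synthetic sinusoid with the given RMS magnitude (test signal only)."""
    peak = rms * math.sqrt(2)
    n = cycles * samples_per_cycle
    return [peak * math.sin(2 * math.pi * k / samples_per_cycle) for k in range(n)]


# Hypothetical voltage swing: two cycles at 100 V RMS, then two at 80 V RMS.
N = 32
wave = sine_cycles(100.0, 2, N) + sine_cycles(80.0, 2, N)
print([round(v, 3) for v in rms_per_cycle(wave, N)])  # [100.0, 100.0, 80.0, 80.0]
```

Storing one value per cycle instead of every waveform sample is what makes record lengths of many minutes practical.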
2.1.3 Revenue meters
Real-time revenue metering and economic dispatch of generation are two of the most important
functions enabling smooth and efficient power system operation. Revenue meters are located at the
points of interconnection, separating generators, transmission/distribution owners, and load centers.
Metering data consists of highly accurate measurements at the power system frequency,
representing magnitudes of voltages, currents, real power, reactive power, and system frequency.
The difference between regular meters and revenue meters is accuracy. Regular meters are used
for visualization. However, revenue meters are connected to highly accurate revenue-grade
instrument transformers, and the devices contain filters that select the power frequency
components.
The real-time meter information is used in advanced data-analytics algorithms to detect abnormal
conditions in parts of the power system. Timely analysis can help system operators take
appropriate actions to mitigate them.
2.1.4 Synchrophasors
A phasor is the mathematical representation of a continuously time-varying signal in terms of
magnitude and angle. A Synchrophasor is a digitized phasor with a UTC (Coordinated Universal
Time) timestamp on each packet. Phasor measurement units (PMUs) are devices that measure the
voltage and current quantities in a substation and compute Synchrophasor data for voltages, currents,
and real/reactive power flow at 10 to 120 samples/sec, a much higher rate than that of remote
terminal units. Because Synchrophasor data uses a common time reference, power system state
information can be compared consistently across a wide geographical area. Hence,
mathematical operations such as addition, subtraction, multiplication, and division can be performed
directly on Synchrophasor data collected from different sources. This makes it possible to assess the
state of an entire power grid at a much higher granularity and accuracy than previously possible. With
higher penetration of PMUs, complete real-time automated closed-loop control from a centralized EMS
becomes viable.
Highly accurate Synchrophasor data can reveal the condition of a power system in greater detail.
It is possible to view power system oscillations and generator dynamic responses in real time.
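The following sketch illustrates why the common time reference matters: two phasors measured at different locations can be subtracted or compared directly only when they carry the same UTC timestamp. The `Synchrophasor` class, its field names, and the sample values are assumptions made for this example and do not follow any specific PMU data format.

```python
import cmath
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Synchrophasor:
    utc_timestamp: float  # seconds since the epoch, UTC
    magnitude: float      # e.g. volts RMS
    angle_deg: float      # angle relative to the common time reference

    def as_complex(self) -> complex:
        """Return the phasor as a complex number."""
        return cmath.rect(self.magnitude, math.radians(self.angle_deg))


def angle_difference_deg(a: Synchrophasor, b: Synchrophasor) -> float:
    """Angle of a minus angle of b, wrapped to the range [-180, 180)."""
    if a.utc_timestamp != b.utc_timestamp:
        raise ValueError("phasors must share the same UTC timestamp")
    diff = a.angle_deg - b.angle_deg
    return (diff + 180.0) % 360.0 - 180.0


# Hypothetical measurements at two substations for the same instant.
bus_a = Synchrophasor(1528000000.0, 230_000.0, 10.0)
bus_b = Synchrophasor(1528000000.0, 228_500.0, -20.0)

spread = angle_difference_deg(bus_a, bus_b)
voltage_difference = bus_a.as_complex() - bus_b.as_complex()
```

A real application would align samples to the nearest reporting instant rather than require exactly equal timestamps; the strict check is used here only to keep the sketch short.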
2.1.5 Remote terminal unit (RTU)
To achieve centralized control of a power system, real-time values of voltage, current, real power,
reactive power, system frequency, and circuit breaker status information are needed. RTUs are
connected to the circuit breaker status signals and to the continuous time-varying voltage and current
signals, and calculate the magnitudes of these quantities. RTUs can be integrated into SCADA systems
and connected to wide-area communication networks over which the real-time values of these
quantities are transmitted to a control center. Further, RTUs are connected to the trip and control
circuits of generators and circuit breakers to regulate the operation of generators and enable remote
connection/isolation of sections of a power system. Hence, RTUs and SCADA systems are included in
the critical equipment list for achieving centralized control and efficient, stable operation of a power
system.
RTU data is used in state-estimation algorithms to determine an accurate state of the power system
at a given moment. This gives complete visibility of the power system and depicts its real-time health.
Advanced control algorithms are further used to achieve manual and automated closed-loop operation.
Data-analytics applications can use RTU and other types of data to provide enhanced foresight and
situational awareness to the system operator.
2.1.6 Power quality meters
Power quality (PQ) meters are designed to record different power quality variations such as impulsive
and oscillatory transients, sags/swells, interruptions, under/overvoltage, harmonic distortion, and
voltage fluctuations. Usually, the sampling time of PQ meters can be configured according to specific
application requirements. The newest generation of PQ meters can sample at rates of 1024 samples
per cycle for normal conditions and up to 100,000 samples per cycle for transients [7][8].
Traditionally, PQ meter data has been used by power quality engineers for specific PQ monitoring and
assurance purposes. However, the alternative usage of such data has recently been considered and
investigated, including condition monitoring of equipment, fault identification, and fault analysis.
There is an IEEE working group that specifically focuses on this type of data analytics [5][6][7][8].
Various software applications have been developed to process and analyze power quality databases
and automatically combine that data with other power system data from SCADA, GIS, and
network topologies for detection and analysis of events in the grid [9].
2.1.7 SCADA (supervisory control and data acquisition)
A power grid is a highly interconnected system between generators and loads, which are spread
across wide geographical locations. For efficient and reliable operation of a power system, it is
necessary to monitor its state from both a local and central location. SCADA systems connect to RTUs
in substations to monitor the voltage and current quantities and control the operation of circuit
breakers. They also communicate with a central energy-management system for control actions.
SCADA provides the following control capabilities:
- Generators: Control the voltage, frequency, and real/reactive power set points.
- Transformers: Adjust the tap changers where tap changers are available.
- Capacitor/reactor banks: Remotely open/close the banks.
- FACTS (Flexible AC Transmission Systems): Control the set points to regulate power flow or
  system voltage on a section of the transmission system.
- Loads: Non-essential loads are operated dynamically so that they can be shed at peak load and at
  stressed times as part of demand-response programs.
A SCADA system uses information collected from both binary (0/1) and analog continuous time-
varying signals for decision-making. SCADA systems form an important source for data-analytics
applications because the data collection and communication infrastructure is already established and
readily available at energy-control centers. Operational data collected from SCADA systems from
various parts of the power system is continually streamed to a central location.
2.2 DATA FROM EQUIPMENT SENSORS
The urgent need to diagnose aging equipment and asset health has led to the development of a
variety of sophisticated equipment-based sensors, which enable one to assess the health and
performance of different pieces of equipment. Devices that monitor the condition of assets contain
equipment-specific intelligence to identify normal and abnormal responses. A condition-monitoring
system can be standalone with advanced analytics about specific equipment or it can be part of a
multifunction protection relay wherein general health and statistics information is provided. Examples
of a standalone condition-monitoring system include vibration-monitoring systems for turbines, partial
discharge monitoring systems for generator stators, and dissolved gas analysis (DGA) systems for
transformer oil. Common functions embedded in modern multifunction protection relays include circuit
breaker monitoring systems, as well as temperature and overload monitoring for transformers,
motors, generators, and transmission lines. Data-analytics algorithms can use the information from
various condition-monitoring systems to determine the health of major power system equipment in
real time and deliver information about equipment health to system operators for situational
awareness.
In what follows, different types of sensors and condition- and operation-monitoring devices installed
in substations and on transmission lines are briefly described. The list is not intended to be exhaustive
but rather to exemplify the characteristics and possible uses of sensor data. The descriptions include
conventional sensors commonly used in substation equipment, as well as emerging sensors and
systems.
Categories | Parameters
(table fragment) | Battery voltages
(table fragment) | Mechanism charging currents
(table fragment) | Mechanism charging times
The CBM system can support client/server architecture. It consists of the CBM devices attached to the
CBs and software running on a central control unit. The main functions of the control unit are:
- Supervise the operating conditions of the circuit breaker.
- Prevent operation if the circuit breaker is outside its operational capabilities.
- Execute operating commands when it is safe to do so.
- Perform data acquisition of signals from the CB control circuit and record sequences of
  tripping and closing.
When a breaker operates, the recorded files are transmitted to the central control unit using wired or
wireless technologies. The bandwidth required for real-time data transfer of 15 signals sampled at
2 kHz is given as 576 kbps.
The CBM IED monitors 15 electrical signals from the circuit breaker control circuit. The signals are
generated during either tripping or closing of the breaker. Of these 15 signals, 11 are analog and 4
are binary signals. Analog signals include measurement of electrical variables such as phase current,
while binary signals indicate the statuses of different components.
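The brochure does not state how the 576 kbps figure is derived. The sketch below shows the basic arithmetic for such an estimate. The 16-bit sample width and the 20% protocol overhead are assumptions; they happen to reproduce the quoted figure but are not confirmed by the source.

```python
def required_bandwidth_bps(signals, sample_rate_hz, bits_per_sample, overhead_percent=0):
    """Estimate the bit rate needed to stream all signals in real time.

    Integer arithmetic is used so that the result is exact.
    """
    raw = signals * sample_rate_hz * bits_per_sample
    return raw * (100 + overhead_percent) // 100


# 15 signals at 2 kHz; 16 bits per sample is an assumed sample width.
raw_bps = required_bandwidth_bps(15, 2000, 16)

# Adding an assumed 20% framing/protocol overhead.
with_overhead_bps = required_bandwidth_bps(15, 2000, 16, overhead_percent=20)
```

Under these assumptions the raw payload is 480 kbps and the total is 576 kbps. Other combinations of sample width and overhead would give the same result, so this should be read as one plausible decomposition only.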
2.2.2 Transformer
Online monitoring is used continuously during operation and offers possibilities to record the relevant
stresses that can affect the lifetime of a transformer. The evaluation of these data offers the
possibility of detecting incipient faults early. The addition of an embedded web-server, equipped with
powerful data-analysis tools, means that users can manage and interpret information. Table 2-2
illustrates the status of different condition-assessment techniques.
Table 2-2: Status of different condition assessment techniques for power transformers
Method | Offline | Online | Monitoring | Offsite
Ageing of oil (e.g., color, moisture, and tan δ) | 1 | 1 | 3 | 1
Furan in oil analysis | 2 | 2 | N/A | 2
Gas-in-oil analysis (DGA) | 1 | 1 | 1 | 1
PD (IEC 60270) | 1 | 2 | 3 | 1
Unconventional PD-measurement (e.g., UHF PD measurement) | 2 | 2 | 3 | 2
Transfer function (FRA) | 1 | 3 | N/A | 1
Dielectric diagnostic (PDC and FDS) | 2 | N/A | N/A | 2
Thermal monitoring | N/A | N/A | 2 | N/A
Degree of polymerization (DP-value) | N/A | N/A | N/A | 1
1: Generally accepted or standardized; 2: accepted by different users; 3: under investigation or consideration; moisture
measurement.
The control units of modern transformers offer a complete set of communication infrastructure based
on IEC 61850-8, including GOOSE messaging, IEC 61850-9-2 Process bus, IEC 60870-5-103 serial
communication, and DNP 3.0 slave protocol. The control, embedded webserver, and web-based
software units of the transformer work as a SCADA that is used for:
Incorporation of DGA, PD, and bushing monitoring (BM) in one unit.
On-site and online display of DGA, PD, and BM key parameters.
Control the operating conditions of the transformer and execute operating commands.
Correlation of data from external inputs.
Full control and communications via secure, flexible web access.
Extensive analysis tools.
Full compatibility with asset-management systems.
Dissolved Gas Analysis (DGA): Online DGA represents a vastly improved monitoring process. With
online DGA, devices are installed on substation transformers that are capable of:
Sampling and evaluating dissolved gasses and sending DGA data to back office systems.
Integration of online DGA data into operations and maintenance processes.
Capturing of data at least once per day and, in some cases, as often as once per hour.
Capability of analyzing a larger number of data points, which improves trending analyses.
Transmitting online DGA data to an energy-management system (EMS).
EMS triggers alarms using a rule engine with preconfigured asset-specific parameters.
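The last item above, a rule engine with preconfigured asset-specific parameters, can be sketched as follows. The asset identifier, gas names, limits, and severity labels are hypothetical placeholders; they are not taken from any standard, vendor product, or utility practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    gas: str          # dissolved gas the rule applies to
    limit_ppm: float  # concentration above which the rule fires
    severity: str     # label attached to the resulting alarm


# Hypothetical per-asset rule sets; the limits are illustrative only.
RULES = {
    "TX-01": [Rule("H2", 100.0, "warning"), Rule("C2H2", 5.0, "alarm")],
}


def evaluate(asset_id, sample_ppm, rules=RULES):
    """Return (severity, gas, value) for every rule exceeded by the sample."""
    alarms = []
    for rule in rules.get(asset_id, []):
        value = sample_ppm.get(rule.gas)
        if value is not None and value > rule.limit_ppm:
            alarms.append((rule.severity, rule.gas, value))
    return alarms


# One online DGA sample for the hypothetical transformer TX-01.
alarms = evaluate("TX-01", {"H2": 80.0, "C2H2": 7.5})
```

Because online DGA delivers many more data points than periodic oil sampling, a practical rule engine would probably also evaluate rates of change, not only absolute limits as in this sketch.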
Partial Discharge (PD): Partial discharges appear as various forms of voltage and current
impulses of very short duration (nanoseconds). These events radiate
electromagnetic energy with a specific spectral signature for which UHF detection is well suited,
enabling high refresh rates and accuracy. The online PD indicator and its control unit work as
a SCADA system that provides:
Radiated electromagnetic energy with UHF detection process, enabling high accuracy.
Phase-resolved analysis and UHF detection method based on IEC 60270 rules.
Simultaneous operation of PD indicator and SAW temperature monitoring system.
Real-time separation of PD events and ambient noise using high-performance algorithms.
Sample rate: 100 Mbps; Bandwidth: 16 kHz – 100 MHz or 1 MHz – 35 MHz.
Table 2-3 shows different monitoring techniques, sensor types, possible output data, and the purpose
of monitoring.
Monitoring technique | Sensor type | Output data | Purpose of monitoring
PD | UHF sensor; Acoustic wave sensor; Fiber optic sensor | Digital | Insulation: If there is partial discharge detected, it is possible to locate the fault location accurately by using multiple sensors.
Vibration | SKF Acceleration sensor | Voltage | Loose core clamping or bonding bolts
Moisture | Vaisala Humicap MMT318 | Current | Insulation
The sensors/meters/actuators in one DG can be connected directly to the local controller through
ADC, GPIO, or serial communication. The received data and any processed outputs can then be
transmitted by the reduced function device (RFD), which is connected to the local controller. The
data transmitted by the local controller of the DG is received by the central controller through the
full function device (FFD). Alternatively, the sensors/meters/actuators can be connected directly to an
RFD, which transmits data to the central controller. This is applicable for measurements from the CB
and power distribution lines, where no significant computation or control process is required.
2.2.4 BESS: Battery energy storage system
Power generation is shifting from large-scale plants to highly complex, distributed generation in which
cost-efficient integration of renewables is paramount, and the demand for energy continues to rise.
Therefore, a BESS has to provide energy for a large range of applications to optimize asset
performance by stabilizing frequency and voltage and balancing variations in supply and demand.
Typical applications include, but are not limited to:
Generation:
- Frequency regulation
- Renewable integration
- Spinning reserve
- Power plant hybridization
- Ramp rate management
Transmission:
- Voltage support
- Dynamic line rating support
- Renewable integration
- Dynamic stability support
- Loss reduction
- Constraint relief
Distribution:
- Residential and industrial backup power
- Microgrid and island grid support
- Distribution upgrade support
- Peak load reduction
The data exchange of a BESS can vary because manufacturers structure their systems differently.
Figure 2-1 shows the basic BESS elements. Usually, a BESS is controlled by the EMS, shown at the top.
[Figure 2-1: Basic BESS elements. Labels: EMS (Energy Management System); BESS (Battery Energy
Storage System); SMS (Storage Management System); BMS (Battery Management System); SCU (Storage
Control Unit)]
The BMS also tracks and flags the security sensors indicating the state of charge (SoC) and the state
of health (SoH) of the storage battery cells. In case of a fault, the BMS sends the corresponding warning
and/or alarm messages to the BESS control unit so that it can react by disconnecting the battery
racks.
Table 2-6: Example of the minimum required BESS signals for an EMS (SICAM microgrid control)
Battery Name | Signal Type | Description | Value | Unit
BAT1 | DPI (DoublePointIndication) | Status | on / off | N/A
BAT1 | DPI | Status | ready / failure | N/A
BAT1 | SPI (SinglePointIndication) | Operating Mode | Grid forming / Grid supporting | N/A
BAT1 | CO (Command) | Operating Mode | Grid forming / Grid supporting | N/A
BAT1 | CO | Status | on / off | N/A
BAT1 | AI (AnalogInput) | Active Power | | kW
BAT1 | AI | Reactive Power | | kvar
BAT1 | AI | State of Charge | | %
BAT1 | AO (AnalogOutput) | Active Power Setpoint | | kW
BAT1 | AO | Reactive Power Setpoint | | kvar
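To make the structure of Table 2-6 concrete, the signal list can be represented as a small data model, as sketched below. The class and function names are hypothetical and are not part of any EMS product interface. Treating indications and analog inputs as monitoring points, and commands and analog outputs as control points, is an interpretation of the signal type names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class SignalType(Enum):
    DPI = "DoublePointIndication"
    SPI = "SinglePointIndication"
    CO = "Command"
    AI = "AnalogInput"
    AO = "AnalogOutput"


@dataclass(frozen=True)
class BessSignal:
    battery: str
    signal_type: SignalType
    description: str
    values: Optional[Tuple[str, ...]] = None  # allowed states, if discrete
    unit: Optional[str] = None                # engineering unit, if analog


MODES = ("Grid forming", "Grid supporting")

# The ten rows of Table 2-6.
SIGNALS = [
    BessSignal("BAT1", SignalType.DPI, "Status", ("on", "off")),
    BessSignal("BAT1", SignalType.DPI, "Status", ("ready", "failure")),
    BessSignal("BAT1", SignalType.SPI, "Operating Mode", MODES),
    BessSignal("BAT1", SignalType.CO, "Operating Mode", MODES),
    BessSignal("BAT1", SignalType.CO, "Status", ("on", "off")),
    BessSignal("BAT1", SignalType.AI, "Active Power", unit="kW"),
    BessSignal("BAT1", SignalType.AI, "Reactive Power", unit="kvar"),
    BessSignal("BAT1", SignalType.AI, "State of Charge", unit="%"),
    BessSignal("BAT1", SignalType.AO, "Active Power Setpoint", unit="kW"),
    BessSignal("BAT1", SignalType.AO, "Reactive Power Setpoint", unit="kvar"),
]

CONTROL_TYPES = {SignalType.CO, SignalType.AO}


def control_points(signals):
    """Signals assumed to be sent from the EMS to the BESS."""
    return [s for s in signals if s.signal_type in CONTROL_TYPES]


def monitoring_points(signals):
    """Signals assumed to be reported by the BESS to the EMS."""
    return [s for s in signals if s.signal_type not in CONTROL_TYPES]


to_bess = control_points(SIGNALS)
from_bess = monitoring_points(SIGNALS)
```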
Overhead Transmission Structure Sensor System: This system fuses RF sensors with image
processing and environmental data. The data is wirelessly communicated in real time with built-
in alarming functions. The system is used to address outages.
Geospatial Data: Geospatial information systems (GIS) have been used by utilities for years;
nevertheless, the types of data available have increased, which leads to new applications of geospatial
data visualization. Weather event data integrated with geospatial information can be applied to
advanced power system analytics such as predictive modeling, real-time forecasting, and post-event
analysis [10].
Customer Data: Social media data can help the utilities to improve engagement with customers as
well as manage outages during major storm events. This data is a prime example of how a
combination of utility operation data, weather data, and customer data could be integrated to provide
better preparedness in outage management. Moreover, customers can see outage information, as well
as receive news (educational and operational) from their utility oftentimes via a mobile application
[10].
Geographic Information System (GIS): GIS data includes two types of data: spatial and
attribute. The spatial data presents the absolute and relative location of geographic features, for
instance, the coordinates of the location where a substation is situated. The key to effective use of
this data in power system applications is the combination of GIS and GPS with the model of a power
system. GPS provides time references that can be applied to synchronize all events. Most digital
measurement devices such as PMUs, traveling wave fault locators, and lightning detectors and locators
have integrated GPS units to send precise time stamps with measured data. The GIS model of a system
can be correlated with an electrical model, providing an enhanced geographical characterization of a
system [1][15].
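One simple form of this correlation is to match a located event, such as a detected lightning strike, against the coordinates of nearby assets. The sketch below uses the haversine great-circle distance. The substation names, coordinates, and search radius are hypothetical, and a production GIS would normally provide its own spatial queries.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical spatial data: asset name -> (latitude, longitude) in degrees.
SUBSTATIONS = {
    "SUB_A": (0.0, 0.0),
    "SUB_B": (0.0, 1.0),
    "SUB_C": (5.0, 5.0),
}


def assets_near(event_lat, event_lon, radius_km, assets=SUBSTATIONS):
    """Names of the assets within radius_km of the event location."""
    return sorted(
        name
        for name, (lat, lon) in assets.items()
        if haversine_km(event_lat, event_lon, lat, lon) <= radius_km
    )


# An event located halfway between SUB_A and SUB_B.
nearby = assets_near(0.0, 0.5, radius_km=60.0)
```

The GPS time stamp attached to the event would then be used to select the measurements recorded by those assets at the same instant.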
2.4 COMMUNICATION REQUIREMENTS FOR SMART GRID DATA
Network reliability and coverage, bandwidth, packet jitter, and latency requirements are the most
critical issues when developing the technical requirements for the power system. For example, the
communications network needs to provide real-time, low-latency capabilities for applications such as
centralized remedial action schemes (CRAS), tele-protection (less than 10 ms), transmission and
substation SCADA and VoIP applications (100 to 200 ms), phasor measurement (about 20 ms), and
load-control signaling. These requirements drive the need for high-speed fiber optic and/or microwave
communications to support those capabilities. On the other hand, applications such as automatic
meter reading (up to a few seconds) and data beyond SCADA, which are more latency-tolerant, could
use communications technologies such as unlicensed wireless mesh, broadband wireless, licensed
wireless, and satellite. Future trends and applications in generation, transmission, and distribution
systems present different classes of requirements and challenges (general communication
requirements in power system applications are provided in Table 2-7).
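The latency figures quoted above can be treated as budgets against which a candidate communication link is checked. The following sketch does this for the applications that have a numeric figure in the text. The dictionary layout and function names are assumptions, and the approximate values ("about 20 ms", "100 to 200 ms") are simplified to single upper bounds.

```python
# Upper latency bounds in milliseconds, simplified from the figures in the text.
LATENCY_BUDGET_MS = {
    "tele-protection": 10,
    "phasor measurement": 20,
    "substation SCADA": 200,
    "VoIP": 200,
}


def meets_budget(application, link_latency_ms, budgets=LATENCY_BUDGET_MS):
    """True if the link latency is strictly below the application's budget."""
    return link_latency_ms < budgets[application]


def supported_applications(link_latency_ms, budgets=LATENCY_BUDGET_MS):
    """Applications whose budget the link satisfies, in table order."""
    return [app for app in budgets if meets_budget(app, link_latency_ms, budgets)]


# A hypothetical link with 15 ms end-to-end latency.
usable = supported_applications(15)
```

A link with 15 ms latency would therefore be adequate for phasor measurement, SCADA, and VoIP under these simplified bounds, but not for tele-protection. Latency is only one criterion; bandwidth, reliability, and coverage would need similar checks.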
Table 2-7: General requirement of communication in power system
Requirement | Description | Example
Performance | Data transmitting in power system needs different network performance and data (bandwidth, latency, and payload) requirements. | Substation automation GOOSE applications require low-latency communications with latency budgets in order of milliseconds, while a conservation voltage reduction (CVR) application has latency expectation of seconds.
Coverage | For wide-area power system, selecting one or more communication technologies must be done after thorough analysis of its characteristics, cost, and other associated operational challenges. | Rural areas have poor cellular coverage and metropolitan areas are deploying high-speed mobile 4G/LTE technologies.
Cost | Different communication technologies have different cost structure. | Private communication networks are CapEx-intensive with low OpEx, while a service-provider-based public solution such as cellular or satellite requires higher OpEx with lower upfront CapEx.
Life Time | A layered networking architecture ensures integration of innovations over the expected lifetime of the deployment. Over the next few years, newer protocols such as IEC 61850 and beyond are expected to be prevalent. | Field infrastructures are deployed with an average lifetime of 15 to 20 years, which may appear incompatible with the pace of evolution in data communications.
Data Gathering | The number of devices, the amount of data, and the frequency of communications with the devices must be determined. Acceptable latency and required bandwidth for every type of data should also be considered. | Data collection from many sources is necessary: on the power grid (such as sensors, meters, and voltage detection), in the customer premises (such as sensors for high-consuming appliances), and from external sources (such as weather).
Security | Power system is vulnerable to cyber-attacks. | Additional traffic on the network and bandwidth consumption.
Also, other important applications and related communication requirements in modern power system
are provided in Table 2-9.
Table 2-9: Communication requirements in terms of latency and data time window
Application | Origin of Data/Place Data Is Required | Latency Requirement | Data Time Window
State Estimation | All substation/control center | 1 sec | Instant
Transient Stability | Generating substation/application server | 100 ms | 10–50 cycles (167 ms – 830 ms)
Small Signal Stability | Some key locations/application server | 1 sec | Minutes
Voltage Stability | Some key locations/application server | 1–5 sec. | Minutes
Post-Mortem Analysis | All PMU and digital fault recorder data/historian | NA | Instant and Event Data
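The time window for transient stability is given both in cycles and in milliseconds. The conversion depends on the nominal system frequency, as the short sketch below shows. The function name is an assumption. The 167 ms to 830 ms range in the table corresponds to a 60 Hz system, with the upper value rounded.

```python
def cycles_to_ms(cycles, system_frequency_hz):
    """Duration of a number of power-frequency cycles, in milliseconds."""
    return cycles * 1000.0 / system_frequency_hz


# The 10 to 50 cycle window on a 60 Hz system and on a 50 Hz system.
window_60hz_ms = (cycles_to_ms(10, 60), cycles_to_ms(50, 60))
window_50hz_ms = (cycles_to_ms(10, 50), cycles_to_ms(50, 50))
```

On a 50 Hz system the same 10 to 50 cycle window spans 200 ms to 1000 ms, so latency budgets expressed in cycles should be converted with the frequency of the grid in question.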
Several smart grid applications have already been developed, and some are in the process of
development as a future trend in power systems. To understand their communication needs, a brief
and qualitative survey of some of the most important applications in terms of their data requirement
and latency is presented in Figure 2-2 and Table 2-10.
To meet all the performance, coverage, cost, and lifecycle requirements of the network, utilities
require a combination of multiple communication technologies, because no single communication
technology can meet all of their requirements. The dynamic nature and wide range of communication
technologies available today provide power systems with numerous options. However, this also
creates the challenge of choosing the appropriate technology and networking architecture.
Specific technology supporting each particular application varies based on factors such as bandwidth,
latency, and reliability. Table 2-11 lists some of the modern power system applications and the
associated communication technologies that may be employed for each application.
Table 2-11: Technology supporting each particular application (L – Low, M – Medium, H – High)
Infrastructure/Applications | Network Requirements: Bandwidth | Network Requirements: Latency | Network Requirements: Reliability | Technology Option
High-speed Backbone | H | Milliseconds to Seconds | H | Optical Transport (DWDM, SONET); MPLS and IP-based fabric
Inter-utility Area Network | H | Milliseconds to Seconds | M | Wired and wireless carrier/utility company owned wireless networks satellite, microwave
Phasor Measurements | H | Milliseconds | H | Fiber optic, microwave, broadband wireless
Tele-Protection Network | L | Milliseconds | H | IEC 61850, hardened routers/switches
Remedial Action Scheme | L | Milliseconds | H | Fiber optic, microwave
Centralized Remedial Action Scheme | H | Milliseconds | H | Fiber optic, microwave
Protective Relaying | L | Milliseconds | H | Fiber optic, microwave, low latency wireless, copper
Substation LAN | L | Milliseconds | H | IEC 61850, hardened routers/switches
Transmission and Substation SCADA | M | Milliseconds to Seconds | H | IP-based fiber optic, microwave, copper lines, satellite
Field Area Network | M | Seconds to Hours | M | Wired and wireless carrier/utility company owned wireless networks satellite, microwave
T&D Crew of the Future | H | Seconds | H | Broadband wireless
Outage Detection | L | Minutes | H | Fiber optic, microwave, broadband wireless, unlicensed wireless mesh
Distribution Automation (routine monitoring) | L | Minutes | M | Microwave, satellite, unlicensed wireless mesh
Distribution Automation (critical monitoring and control) | L | Seconds | H | Microwave, satellite, unlicensed wireless mesh
Distributed Generation monitoring | L | Seconds | H | Microwave, satellite
Distributed Generation control | L | Seconds | H | Microwave, satellite
Advanced Metering (meter reading, disconnect, communication to HAN) | M | Seconds to Minutes | M | Unlicensed wireless mesh, PLC, Zigbee
Data Beyond SCADA | M | Minutes to Hours | M | Microwave, broadband wireless, satellite
Outage Detection (thru Fault Indicators, Protection systems or advanced meters) | L | Minutes | H | Fiber optic, microwave, broadband wireless, unlicensed wireless mesh
Premise Area Network | H | Seconds to Minutes | M | Wired and carrier owned/utility company owned wireless networks satellite, microwave
Dynamic Pricing | L | Minutes | M | Internet, ZigBee
Plug-in Electric Vehicle | L | Minutes | M | Zigbee, PLC
Demand Response | L | Minutes | H | Zigbee, PLC, paging systems
Home Area Network Interface | L | Minutes | M | Wired or wireless broadband, Zigbee
* EDISON, Southern California.
Different modern technologies can be used to improve the functionalities of power systems
and address the associated problems and challenges. There is not a finalized architecture for future
power system communication infrastructure. However, the following technology options are the most
promising at this point:
2.5 REFERENCES
[1]. Advanced Data Analytics Techniques: Analysis and Applications for Power System Operation and
Planning Support. EPRI, Palo Alto, CA: 2015. 3002007076
[2]. M. Kezunovic, L. Xie, S. Grijalva, P. Chau, et al., Systematic Integration of Large Datasets for
Improved Decision-Making, PSERC, 2015.
[3]. Substation Data Integration and Analysis: Study Report. EPRI, Palo Alto, CA: 2011. 1019916
[4]. J. Perez, “A guide to digital fault recording event analysis,” in 63rd Annual Conference for
Protective Relay Engineers, 2010, pp. 1-17.
[5]. S. Santoso, and D. D. Sabin, “Power quality data analytics: Tracking, interpreting, and predicting
performance,” in IEEE Power and Energy Society General Meeting, 2012, pp. 1-7.
[6]. W. Strang et al., “Considerations for Use of Disturbance Recorders,” System Protection
Subcommittee of the Power System Relaying Committee of the IEEE Power Engineering Society,
2006.
[7]. "Next-generation power quality meters," 2015; Available online
[8]. W. Xu. "Working Group on Power Quality Data Analytics Objective & Scope," 2015;
http://grouper.ieee.org/groups/td/pq/data/downloads/PQDA-Objective-and-Scope.pdf.
[9]. "PQView," 2015; http://www.pqview.com/.
[10]. Sensor Technologies for a Smart Transmission System, EPRI, 2009.
[11]. Integration of Internal and External Data Sources to Support Transmission Operations, Planning,
and Maintenance, EPRI, 2014.
[12]. M. Kezunovic, L. Xie, S. Grijalva, P. Chau, et al., Systematic Integration of Large Datasets for
Improved Decision-Making, PSERC, 2015.
[13]. P.-C. Chen, T. Dokic, and M. Kezunovic, “The Use of Big Data for Outage Management in
Distribution Systems,” in Int. Conf. on Electricity Distrib. (CIRED) Workshop, Rome, 2014.
[14]. K. L. Cummins, E. P. Krider, and M. D. Malone, “The US National Lightning Detection
Network™ and applications of cloud-to-ground lightning data by electric power
utilities,” IEEE Trans. Electromagnetic Compatibility, vol. 40, no. 4, pp. 465-480, 1998.
[15]. P.-C. Chen, T. Dokic, and M. Kezunovic, “The Use of Big Data for Outage Management in
Distribution Systems,” in Int. Conf. on Electricity Distrib. (CIRED) Workshop, Rome, 2014.
[16]. https://www.nrel.gov/docs/fy17osti/67553.pdf
[17]. http://www.te.com/content/dam/te-com/documents/sensors/global/TE_SensorSolutions_Wind-
Turbines.pdf
3. DATA-ANALYTICS TECHNIQUES
Information management is becoming an increasingly relevant process in companies. The goal is to
discover knowledge from the raw data generated during operation of the processes. Traditionally, data
was used for process control; sometimes it was processed to produce graphs of what was going on
with the process (situational awareness). Now decision-makers want the data to be transformed
into useful information for decision-making (decision support).
In the early 1970s and 1980s, decision-support applications such as administrative information,
predictive analytics, and online analytical processing (OLAP) emerged and expanded the
decision-support system domain. In the early 1990s, business intelligence (BI) played a pivotal role in
increasing the value and performance of the enterprise. As a technology-driven process, BI helped
corporate users make critical decisions by analyzing data and presenting information. BI involves a
variety of tools, applications, and methodologies that collect data, prepare it for analysis, and run
queries; the analytical results are then made available to decision-makers as reports, dashboards, and
data visualizations.
The natural evolution of BI is data analytics (DA) or advanced analytics. There is no unique definition
for the term “advanced analytics,” but it usually refers to tool types that are based on predictive
analytics, data mining, statistical analysis, digital signal processing, artificial intelligence, natural
language processing, and other mathematical processes that attempt to recognize and validate data
patterns and trends and draw conclusions therefrom. Data-analysis techniques can be combined with
other analytical disciplines, such as descriptive modelling and decision modelling or optimization, with
the main objective of providing support for making better decisions. Many analytics techniques appeared
in the 1990s. Today, the data sets are significantly larger than before and most of these techniques
adapt well to minimal data preparation.
By using advanced analytics, utilities can study electricity usage data to understand and learn the
state of the load and operations, and customer behavior. The advanced analytics can help to discover
knowledge and facts that benefit business. By examining large volumes of data with details, useful
information from hidden patterns and unknown correlations can be extracted to make better
enterprise decisions.
Data-analytics techniques have been applied across many industries, but the energy
and utility sector lags behind other industries in terms of actual implementation. However, some
implementations of analytics techniques in the utility industry (EPRI, Jan 28, 2016) already
show promising results. In order to enable secure, reliable, and interoperable operation of the power
grid, an information-based framework is to be integrated into the electrical transmission grid. A large
and heterogeneous collection of data from a multitude of measurements, status, or third-party data in
various formats is used in constructing the framework. Data analytics is able to identify hidden
patterns in this data, predict the prospective outcomes, and recommend appropriate decisions. Visualizing the
current situation as well as future situations and scenarios could help to increase the situational
awareness of operators; visualization therefore has to work together with data analytics. There is no unique
classification of advanced analytics techniques, but each technique can contribute to data analytics of
modern power system operation, especially to situational awareness. In this brochure, the
advanced data-analytics techniques are divided into six categories:
1. Data mining and Association Rules
2. k-Nearest Neighbor
3. Supervised Learning and Unsupervised Learning
4. Probabilistic Networks
5. Deep Learning
6. Visual Analytics
These six well-known categories are described in this section. These data and visual analytics
techniques could apply to both real-time data and online and offline simulation data of electrical
transmission grids to prepare short- or long-term scenarios and models. Therefore, operators could
increase the situational awareness by visualizing the output of important information.
relationships. It uses this information to manage local store inventory and identify new merchandizing
opportunities.
3.1.4 Potential applications
Some examples of data-mining applications for electric power utilities are customer relationship
management (CRM) to track behavior; power plant maintenance; electrical transmission grid planning
(Chen, Onwuachumba, Musavi, & Lerley, 2017) and operation; human resource management; fraud
detection; and finding anomalies. See Section 4 for a more detailed description of applications of data
mining in the power industry.
3.2 K-NEAREST NEIGHBOR
3.2.1 Brief definition
The k-nearest neighbor (k-NN) method, which is also referred to as lazy learning, case-based reasoning, and
instance-based learning, is a well-established classification method that is based on the closest training
samples in the feature space. The main idea of k-NN is that a sample’s category is
decided by its k most similar samples. The sample falls into the category that contains the largest
number of its k most similar samples.
The k-NN algorithm is among the simplest of all machine-learning algorithms. The k-NN is a
nonparametric learning algorithm because it does not make any assumptions on the underlying data
distribution. This feature is very advantageous because most practical data do not obey the
common theoretical assumptions. Another feature of k-NN is that it is highly adaptive
to local information. Because a k-NN algorithm utilizes the closest data points for estimation, it is capable of
taking full advantage of local information and forming highly nonlinear and adaptive decision boundaries
for each data point.
3.2.2 Technical description
k-NN finds the group of k training objects that are closest to the test object and assigns the
predominant class in that neighborhood. Three essential elements are included in this process:
A set of labeled objects (e.g., a set of stored records (data)).
A distance measurement or a similarity metric.
The number of nearest neighbors, the value of k.
Once an unlabeled object is provided, the distance of this object to the labeled objects is computed.
Based on the data, k-nearest neighbors are identified, and the class labels of the nearest neighbors
are utilized to determine the class for this unlabeled object. Multiple training and testing sets with
random data from different sets could mitigate bias presented by noise or irrelevant data and thus
improve the performance of k-NN.
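The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming Euclidean distance as the similarity metric and a simple majority vote; the function name knn_classify and the sample values are hypothetical and are not taken from any dataset in this brochure.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest labeled objects.

    `train` is a list of (feature_vector, label) pairs. Euclidean distance
    is the similarity metric assumed in this sketch.
    """
    # Rank the labeled objects by their distance to the unlabeled object.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the class labels of the k closest objects and keep the winner.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples with made-up labels.
samples = [
    ((1.0, 0.5), "normal"), ((1.2, 0.7), "normal"), ((0.8, 0.4), "normal"),
    ((9.5, 8.0), "abnormal"), ((10.1, 7.6), "abnormal"), ((9.0, 8.4), "abnormal"),
]
label = knn_classify(samples, (9.7, 7.9), k=3)
```

The whole training set is kept and searched at prediction time, which is why k-NN is called a lazy learner: no model is fitted in advance.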
3.2.3 Application domains
k-NN is commonly applied to solving classification problems. Offline analysis helps to generate rules
for different data classes, and online analysis could initiate decision trees for classification purpose.
3.2.4 Potential applications in smart grid
One form of such classification is used for classifying historical load consumption data into three
different classes (iTesla (Innovative Tools for Electrical System Security within Large Area), July 29,
2013). The classification is based on training and testing load consumption data, and the training class
is prepared based on the cumulative distribution of the target load.
Another application of the k-NN algorithm is the classification of abnormal data from a PMU. The example
in Figure 3-1 shows that k-NN is trained with phase angle difference data, which defines abnormality.
If the test data contains an abnormal phenomenon, k-NN can detect it based on the
training provided. This could be one of the online data-mining applications for PMUs. Such
classification could serve to validate the PMU data.
Figure 3-2: Supervised learning (upper rectangle) and unsupervised learning (lower rectangle)
Observe that while both SL and UL fit a model to some data according to a parameter vector, the
final objective is different. The former uses a model to relate a dependent variable to its
explanatory features as detailed by the labeled data; then, the model is used to predict unknown
data. The latter uses unlabeled data. Consequently, the true relation between data is unknown
beforehand. Thus, data is grouped, and then the resulting patterns are evaluated.
The remainder of this section presents machine-learning techniques and applications used for power
systems. The applications are neither an exhaustive list nor necessarily the best solutions. Rather, this
list aims to show how ML techniques can be used to obtain robust regression models or to extract
valuable information about the application problems. First, supervised algorithms such as linear regression
(LR), decision trees (DTs), artificial neural networks (ANN), and support vector machines (SVM) are
presented. Then, an unsupervised learning technique called K-means is discussed. Additionally, other
clustering methods are mentioned, and references are provided where needed.
Formally, a generic SL problem can be stated as follows:
Given a dataset of the form $\{(\boldsymbol{x}_i, y_i)\}_{i=1}^{n}$ with $\boldsymbol{x}_i \in X \subseteq \mathbb{R}^N$ and $y_i \in Y \subseteq \mathbb{R}$, where $X$ is an N-dimensional space of
features and $Y$ is the corresponding response, we are asked to estimate the relation
$y_i = f(\boldsymbol{x}_i \mid \boldsymbol{\theta})$, where $\boldsymbol{\theta}$ are the function's parameters. In classification, the response variable is binary,
$Y \in \{\pm 1\}$, whereas in regression the response is continuous, $Y \in \mathbb{R}$. For instance, the problem of
forecasting a generator's failure (given measurements of humidity, vibrations, thermal energy, gases,
and aging) is a classification problem (i.e. fails or not), whereas the prediction of the daily wind power
generation of a wind farm is a regression problem. It is worth mentioning that the relation $Y \sim X$ is
estimated by minimizing an error criterion that ensures that the inferred function generalizes as
accurately as possible to the true underlying process.
3.3.2 Linear regression
3.3.2.1 Brief definition and technical description
The linear regression model is one of the oldest, most renowned, and most used models for statistical
and ML applications (James, Witten, Hastie, & Tibshirani, 2013; Hastie, Tibshirani, & Friedman, 2009).
This model is simple and leads to robust solutions. It is readily interpretable by non-expert users and is
easy to code. Nonetheless, linear regression makes some naive assumptions about the modelled
process (e.g. the process can be approximated by a linear combination of variables, deviations from the
model obey a normal distribution, and so on), assumptions that are rarely met by real-world problems.
However, even though a large list of newer and more robust algorithms exists, LR remains the
workhorse of several industries today. One of the most important characteristics of LR is its high
interpretability (i.e. we know how much the dependent variable will change with respect to each feature);
in addition, further analysis can be performed using the trained model itself (e.g. feature ranking). For
instance, by using LR in a power generation forecasting application, we can know how each of the
measured variables (e.g. humidity, rain, mean transformer losses) impacts the power generation output
in electrical power units.
Further, LR has been subject to several extensions to enhance its robustness and precision (Rao,
Toutenburg, Shalabh, & Heumann, 2008). Moreover, with the advent of big data technologies, LR has
regained popularity and predictive power (Ma & Sun, 2015; Ma & Cheng, 2016). It is worth noting that,
in the literature, LR usually refers to a model with only one explanatory variable, whereas multiple linear
regression (MLR) refers to a model with two or more explanatory variables. In this work, we refer to
both as LR. Colloquially, an LR model can be understood as RESPONSE = FIT + RESIDUAL. In this
expression, RESPONSE stands for the variable of interest; the FIT term represents a linear combination
(a summation) of measured features related to the response; the RESIDUAL term represents the
unpredictable error/noise of the observed values with respect to the model’s prediction. For illustrative
purposes, we will first introduce a one-variable LR model:
$Y = f(X) \rightarrow y_i = \beta_0 + \beta_1 x_i + \epsilon.$
This is the equation of a line, where $\beta_0$ stands for the intercept (i.e. the expected value of Y when X =
0), $\beta_1$ for the slope of the line (the average increase in Y with a one-unit increase in X), and $\epsilon$ is the
irreducible error or noise in the model (James, Witten, Hastie, & Tibshirani, 2013; Hastie,
Tibshirani, & Friedman, 2009; Shalev-Shwartz & Ben-David, 2014). The coefficients $\beta$ are the weights
assigned to each variable and are the parameters of the model. The problem then reduces to finding
$\beta_0$ and $\beta_1$ such that the difference between the sample data labels $Y$ and the predicted labels $\hat{Y}$ is minimized.
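As an illustration, the one-variable fit can be sketched in Python. The sketch assumes ordinary least squares as the error criterion (the sum of squared residuals is minimized), which yields a closed-form solution; the function name fit_simple_lr and the data values are made up for this example.

```python
def fit_simple_lr(xs, ys):
    """Closed-form ordinary least squares fit of y = beta0 + beta1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

# Made-up observations scattered around a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.9, 8.2, 10.8, 14.0]
beta0, beta1 = fit_simple_lr(xs, ys)

# RESPONSE = FIT + RESIDUAL: the residuals are what the line cannot explain.
residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
```

With an intercept in the model, the least-squares residuals sum to zero, which is a quick sanity check on any implementation.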
For two or more variables, the LR model is simply defined as:
$y_i = \beta_0 + \sum_{j=1}^{N} \beta_j x_{i,j} + \epsilon.$
Building energy consumption is the main component of worldwide consumption and carbon dioxide
emissions. Nowadays, LR-based models have been successfully proposed for predicting how much and
when energy will be consumed for single buildings (Asadi, Shams, & Mohammad, 2014) and building
blocks (Ma & Cheng, 2016). On the other hand, understanding the relation between building energy
consumption and its components is essential for developing adequate energy-management policies
(Hsu, 2015; Walter & Sohn, 2016; Chung, 2012). For instance, indoor comfort systems such as
heating, ventilation, and air-conditioning (HVAC) account for 65% of a building’s energy consumption (Lam,
Wan, Liu, & Tsang, 2010). In this regard, a penalized LR has been used for automatic identification of
energy system components such as operational schedule, number of customers, lighting control,
employee behavior, and maintenance in commercial buildings (Hsu, 2015).
In another instance (Braun, Altan, & Beck, 2014), an LR model was proposed to predict the electricity
and gas consumption of a U.K. supermarket. Given the particular conditions of a supermarket building
(i.e. large refrigerated shelves), it was found that climate-driven changes in relative humidity and temperature
are expected to increase electricity consumption by 2.1%, whereas gas consumption will decrease by 13% (Braun,
Altan, & Beck, 2014).
LR applied to energy policies
Energy policies comprise laws and actions that address energy infrastructure development, production,
distribution, and consumption. One such policy with a significant impact on energy consumption
is retrofitting, which improves energy consumption (e.g. lighting, indoor comfort) by replacing older
electrical components with newer ones. LR was employed to assess the cost-saving benefits of
retrofitting commercial and residential buildings (Walter & Sohn, 2016).
Similar results were obtained in the work by Huebner, Hamilton, Chalabi, Shipworth, & Oreszczyn in
2015. Using an LR model, they assessed energy consumption of a U.K. housing stock by categories of
predictors such as building variables, socio-demographics, heating behavior, and psychological factors.
They found that a building’s electrical components explain by far most of the variability in energy
consumption, thus supporting retrofitting policies (Huebner, Hamilton, Chalabi, Shipworth, & Oreszczyn,
2015). Furthermore, the construction of smart buildings and smart energy policies require building
energy consumption benchmarks. In this sense, expert knowledge and non-technical regulations need
to be integrated into benchmarks. Thus, in (Chung, 2012), a fuzzy-LR model was developed for
benchmarking building energy consumption, including expert knowledge.
LR applied to utility companies
Predicting energy load (Hong, Gui, Baran, & Lee, 2010), demand (Kandananond, 2011), and
consumption (Tso & Yau, 2007) plays an important role in decision-making and planning for utility
companies. For instance, long-term load forecasting can be employed for transmission and
distribution (T&D) planning, whereas short-term forecasting can be used for demand-side management
(DSM). DSM is particularly important to reduce peak electricity demand while maximizing utility
generation capacity (Hong, Gui, Baran, & Lee, 2010). Another LR application for utility companies is the
assessment of the reliability and security of a power system (Halilcevic, Gubina, Strmcnik, & Gubina,
2006). In this sense, LR can be employed to identify the critical components of energy transmission in
power supply networks. By knowing this, utilities can perform better managerial actions such as power
reserving and transmission network reinforcement planning (Halilcevic, Gubina, Strmcnik, & Gubina,
2006).
3.3.3 Decision and regression trees
3.3.3.1 Brief definition
Classification and regression trees (DTs) were introduced to the AI area in the mid-1980s. Even though
classification trees and regression trees perform different tasks, they are both referenced here as DTs.
DTs can be used for supervised and unsupervised tasks. However, the latter applications are beyond the
scope of this document; further, even though DTs can perform regression, binary, and multi-class
classification, for pedagogical purposes, we restrict the explanation of the DT algorithm to binary classification.
In such a setup, DTs build a rule model for separating two classes (e.g. 𝑦 = {±1}) graphically presented
in the form of a tree, hence their name. More precisely, DTs perform a partition of the feature space into
subspaces where a simple model (e.g. the most common class) is fitted (Hastie, Tibshirani, & Friedman,
2009).
DTs have positive and negative characteristics. On one hand, they are interpretable as rules providing
an explanation between x measurements and target value y, they can handle different types of data
(e.g. numerical, categorical, nominal) and missing data at the same time, and they are computationally
cheap (Rokach & Maimon, 2015). On the other hand, DTs suffer from high variance (i.e. they tend to
overfit the model to training data, performing poorly with new data), which reduces their performance against
more robust classifiers (James, Witten, Hastie, & Tibshirani, 2013). Nonetheless, DT performance can
be enhanced by constraining tree parameters such as the depth of the tree or by using combinations
of trees to reduce variance. Using statistical methods like bagging, a forest of DTs can be grown and used
as a single classifier/regressor. Such statistical methods, and how to combine the forest into a single
function, are documented elsewhere.
3.3.3.2 Technical description
DT models are composed of branches and nodes. Branches connect each node in a directed way (i.e.
from A to B). Except for the root node, all nodes have an incoming branch from a previous
node, whereas, except for terminal nodes, each node has a pair of outgoing branches. Each node
corresponds to a decision or split of the feature space. Nodes can be of three types: root, internal, or
terminal (i.e. leaf node). The root is the starting node, and it performs the best dichotomic partition of
the feature space between two given classes and connects to a pair of internal/terminal nodes. Internal
nodes correspond to intermediate steps where feature space is further split into more specific sub-
spaces in accordance with some criterion. Terminal nodes correspond to a final decision on the analyzed
point.
It is worth mentioning that terminal nodes rather than a class allocation can be interpreted as the
probability of each class. Moreover, DTs display an explicit hierarchy between features: the root node
(the first variable to perform a split) is the most important feature for the problem, the internal nodes
are the second most important variables, and so on.
Thus, a binary classification DT is a function 𝑓(𝒙) = 𝑦, which predicts the class or probability of any
instance x by taking decisions following binary rules described by nodes. A binary tree is built as follows:
1. Identify the feature that performs the best separation between classes.
a. Find the best split-point of the feature (the value in which the best separation is
obtained).
b. Divide the feature space into two distinct and non-overlapping regions 𝑅𝑖 and 𝑅𝑗 .
2. If the maximum tree depth is reached or the stopping criterion is met, assign to every
observation in the region 𝑅𝑗 the most common class. Else, identify a new feature and its split-
point to separate region 𝑅𝑗 into two new sub-partitions.
3. Repeat step 2.
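Step 1 of the procedure above can be sketched in Python as follows. The criterion for judging the "best separation" is not specified here, so Gini impurity is used as an assumption; the function names and the storm samples (precipitation, thunderbolts) are hypothetical and chosen only for illustration.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (+1 / -1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(1 for y in labels if y == 1) / n
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return (feature_index, split_point, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate split-points: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] < t]
            right = [y for r, y in zip(rows, labels) if r[j] >= t]
            # Impurity of the two regions, weighted by their sizes.
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

# Made-up samples: (precipitation, thunderbolts as 0/1); +1 = fault, -1 = no fault.
rows = [(2.0, 0), (4.0, 1), (6.0, 0), (8.0, 0),
        (12.0, 1), (15.0, 0), (18.0, 1), (20.0, 1)]
labels = [-1, -1, -1, -1, 1, 1, 1, 1]
feature, split_point, impurity = best_split(rows, labels)
```

A full tree would apply best_split recursively to each resulting region until the depth limit or stopping criterion in step 2 is reached.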
As an example, a simple DT for detecting faults in a transmission line during a storm is shown in
Figure 3-3. On the right side, the tree classifier is depicted; on the left, the partition performed in the
feature space is shown. Sampled transmission lines under storm conditions are shown in the feature
space (B part of the figure). Orange dots correspond to lines that suffered a failure, whereas gray
dots are the non-interrupted transmission lines. The features in this example are precipitation, which is
a continuous variable, and thunderbolts, which is a categorical one.
On the A side, the tree constructed for precipitation and thunderbolts is shown; split-points for each
feature are shown on the edges, while the labels under each terminal node correspond to the region defined
on the feature space. Further, the frequency of fault/no-fault cases and the
corresponding probability are also shown on each node. On the B side, the feature space, which is divided into regions R1, R2, and
R3, is presented. In this example, if precipitation is less than a 10 cm^3 threshold (R1), transmission
lines will be classified as no-fail with a 91% probability, whereas failure during a storm with precipitation
below this threshold will have a very low probability (0.09).
Furthermore, so far we have neglected some important concepts like the criterion for selecting partition
features, how to select split-points, and how to determine a DTs depth. Thus, readers are referred to
(James, Witten, Hastie, & Tibshirani, 2013; Hastie, Tibshirani, & Friedman, 2009; Rokach & Maimon,
2015) for more DTs details.
As was mentioned in the LR section, understanding and forecasting energy consumption patterns in a
building allows for reducing CO2 emissions and managing energy load by regulating demand. Properly
managed energy consumption also requires understanding how the electrical components impact the
overall building consumption. For instance, building designers and architects require tools that can allow
them to predict a new building’s energy usage patterns based on atmospheric data, building architecture
and household characteristics, and energy sources (Yu, Haghighat, Fung, & Yoshino, 2010).
Recently, DTs have been employed to forecast Energy Use Intensity levels (i.e. the ratio of annual total
energy use to the building’s floor area) for buildings across Japan (Yu, Haghighat, Fung, &
Yoshino, 2010). One of the most important contributions of this work is the analysis of rules obtained
from the classification tree. By analyzing the hierarchy of features on the DT, they found that different
sets of features impact a building’s consumption in accordance with district temperatures. In this regard,
we could test other data sources like sun movement and clear-sky solar irradiance on a building’s
surface. Thus, by re-training the DT model and analyzing where such variables are located in the
hierarchy of the tree, we may conclude whether such variables are significant or not in characterizing a
building’s energy consumption.
3.3.4 Artificial neural network
3.3.4.1 Brief definition
Artificial neural networks (ANNs) are computational networks that try to simulate the decision process
that occurs in biological networks of neurons in a central nervous system (Graupe, Sep 2013) (Kalogirou,
Dec 2001) (Russell & Norvig, Dec 13, 1994). Similar to biological neurons, an ANN can be described as
a massively parallel-distributed processor that stores knowledge and makes it available for use (Haykin,
1999). According to (Kalogirou, Dec 2001), ANNs “are good for tasks involving incomplete data sets,
fuzzy or incomplete information and for highly complex and ill-defined problems, where humans usually
decide on an intuitional basis. They can learn from examples and are able to deal with non-linear
problems.” An ANN is a group of interconnected artificial neurons, interacting with one another in a
concerted manner. In such a way, excitation is applied to the input of the network. It resembles the
human brain in two respects:
1) Knowledge is acquired by the NN by means of a learning process.
2) Inter-neuron connection strengths known as synaptic weights are used to store the knowledge.
They learn the relationship between the input parameters and the controlled and uncontrolled variables
by studying previously recorded data, similar to the way a nonlinear regression might perform
(Kalogirou, Dec 2001).
3.3.4.2 Technical description
The network consists of three elements:
1) Input layer
2) Hidden layers
3) Output layer (see Figure 3-4 [22]).
The most popular and powerful training algorithm for ANNs is back propagation (BP). One pass of training over
all patterns of a dataset is called an epoch. BP tries to improve the performance of the NN by reducing the total error,
changing the weights along the gradient of that error. The error is expressed by the RMS (root-mean-square) value. A zero-error
value indicates that all the computed output patterns match the expected values and that the network
is therefore well trained (Kalogirou, Dec 2001).
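A minimal Python sketch of back propagation is given below for illustration. It assumes a single hidden layer, sigmoid activations, a squared-error loss, and pattern-by-pattern gradient-descent updates; the network size, learning rate, number of epochs, and toy patterns are all arbitrary choices of this sketch rather than recommendations.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    # Each weight row stores the input weights followed by a bias term.
    hidden = [sigmoid(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1])
    return hidden, output

def rms_error(data, w_hidden, w_out):
    sq = [(t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in data]
    return math.sqrt(sum(sq) / len(sq))

def train(data, n_hidden=2, epochs=2000, rate=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = [rms_error(data, w_hidden, w_out)]
    for _ in range(epochs):  # one epoch = one pass over all patterns
        for x, t in data:
            hidden, out = forward(x, w_hidden, w_out)
            # Output delta: error derivative passed through the sigmoid.
            d_out = (out - t) * out * (1.0 - out)
            # Hidden deltas: the output delta propagated back through w_out.
            d_hid = [d_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                     for j in range(n_hidden)]
            # Move every weight a small step against its gradient.
            for j in range(n_hidden):
                w_out[j] -= rate * d_out * hidden[j]
            w_out[-1] -= rate * d_out
            for j in range(n_hidden):
                for i in range(n_in):
                    w_hidden[j][i] -= rate * d_hid[j] * x[i]
                w_hidden[j][-1] -= rate * d_hid[j]
        history.append(rms_error(data, w_hidden, w_out))
    return w_hidden, w_out, history

# Toy patterns (logical OR), used only to exercise the training loop.
patterns = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
w_hidden, w_out, history = train(patterns)
```

The list history records the RMS error before training and after every epoch, so its trend shows whether the network is converging.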
3.3.4.3 Application domains
According to (Kalogirou, Dec 2001), ANNs are able to learn the key information patterns within a multi-
dimensional information domain, and they are fault-tolerant, robust, and noise-immune (Rumelhart, Hinton,
& Williams, 1986). Data from energy systems are noisy, making them a good candidate to be
analyzed with neural networks. ANNs have been applied to predict and optimize energy use in commercial
buildings, particularly in HVAC systems, without sacrificing comfort (Kreider, Wang,
Anderson, & Dow, December 1992) (Curtiss, Brandemuehl, & Kreider, January 1994).
ANNs have been applied to the diagnosis of line faults of power systems and load forecasting in power
systems. ANNs were used to model the combustion process of incineration plants in order to
optimize the reduction of toxic emissions (Muller & Keller, 1996). In (Milanic & Karba, 1996), ANNs were
used for predictive control of a thermal plant, using the steam flow as input and a simple network
structure because on-line predictions of plants are faster. In (Mandal, Sinha, & Parthasarathy, 1995),
ANNs were applied for short-term load forecasting in electric power systems. The output of the ANN
was the next-hour load, and no weather variables were considered. In (Khotanzad, Abaye, &
Maratukulam, 1995), a recurrent neural network (RNN) load forecaster was used for hourly prediction
of power system loads. In (Datta & Tassou, 1997), ANNs were used for prediction of the
electrical load in supermarkets.
ANNs are used in wind energy systems and can be grouped into three major categories: forecasting
and prediction, prediction and control, and identification and evaluation (Keles, Scelle, Paraschiv, &
Fichtner, 2016).
Forecast methods for day-ahead electricity prices are essential for energy traders and supply companies.
ANNs have been used to successfully forecast day-ahead electricity prices, providing even better results
than ARIMA (Keles, Scelle, Paraschiv, & Fichtner, 2016).
Finally, ANNs are used for the implementation of a wide variety of anomaly-detection systems,
including intrusion detection systems (IDS) for network computers in the electric energy sector as well
as advanced IDS for the smart grid in an ensemble with other algorithms (Aburomman & Reaz, March
2017).
3.3.5 Support vector machine (SVM)
3.3.5.1 Brief definition
The support vector machine (SVM) was developed by Vapnik and others during the 1990s (Scholkopf &
Smola, 2002). SVM was initially developed as a linear classifier, although it is best known for its
capacity to handle noisy nonlinear data. SVM has also been extended to the problems of regression,
probability estimation, clustering, and so on (Scholkopf & Smola, 2002). However, because the main
features of SVM are shared among all distinct SVM applications, we limit the description of the algorithm to
the classification problem.
3.3.5.2 Technical description
To introduce SVM, we first need to introduce the empirical risk minimization (ERM) principle. It is the
most used criterion to train any ML model: It only requires that the model achieves the lowest possible
error on the training sample. Achieving the lowest error rate on a given sample only requires modelling
every possible case. However, such a model will be so particular that it will perform poorly on out-of-
sample points. SVM, on the other hand, was designed based on the structural risk minimization (SRM)
principle (Scholkopf & Smola, 2002). This principle establishes a bound that relates generalization (i.e.
how well the model explains unseen samples) to the simplicity of the model (i.e. if the model is too
complex, it should perform poorly on unseen data). Thus, SVM uses the simplest family of functions,
hyperplanes, to approximate a given sample. Further, to constrain the number of valid hyperplanes
and their complexity, a margin around the hyperplane is added.
Such a margin guarantees that the problem is convex (i.e. it has an optimal solution), reducing
the computational burden of estimating the hyperplane. Another important feature of the SVM
formulation is that the hyperplane is described using a reduced set of the sample points known as the
support vectors (SVs). Thus, in a classification problem, learning the optimal hyperplane with a given
sample reduces to finding the support vectors. Because only the SVs are required to build the hyperplane,
all remaining training points are disregarded. An SVM toy example of two dimensions is shown in Figure
3-6.
support vectors αi ) are found, a new instance can be classified using the hyperplane equation (Scholkopf
& Smola, 2002).
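The classification step can be sketched in Python as follows. The sketch assumes the standard dual form of the linear SVM decision function, f(x) = sign(sum_i alpha_i y_i <x_i, x> + b), in which only the support vectors appear; the function names and the two-point toy solution (alpha = 0.25 for both support vectors, b = 0) are illustrative assumptions and not values from this brochure.

```python
def svm_decision(support_vectors, alphas, labels, bias, x):
    """Signed score of x relative to the hyperplane (linear kernel assumed)."""
    score = bias
    for sv, alpha, y in zip(support_vectors, alphas, labels):
        # Each support vector contributes alpha_i * y_i * <x_i, x>.
        score += alpha * y * sum(a * b for a, b in zip(sv, x))
    return score

def svm_predict(support_vectors, alphas, labels, bias, x):
    """Class of x: +1 on one side of the hyperplane, -1 on the other."""
    return 1 if svm_decision(support_vectors, alphas, labels, bias, x) >= 0 else -1

# Toy maximum-margin solution for two training points, (1, 1) with label +1
# and (-1, -1) with label -1: both are support vectors with alpha = 0.25, b = 0.
svs = [(1.0, 1.0), (-1.0, -1.0)]
alphas = [0.25, 0.25]
ys = [1, -1]
b = 0.0

new_instance = (2.0, 0.5)
predicted = svm_predict(svs, alphas, ys, b, new_instance)
```

In this toy solution the support vectors score exactly +1 and -1, which is the margin condition; a nonlinear SVM would replace the inner product with a kernel function.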
3.3.5.3 Potential application in smart grid
SVM applied to transmission lines fault detection
Transmission line faults account for 85% to 87% of power system faults (Singh, Panigrahi, & Maheshwari,
2011). Power systems and electrical grids require reliable transmission lines, and detection tools designed
to find faults early may decrease the time that a circuit is interrupted. Protective relays are in charge of
detecting energy or hardware faults and mitigating their impact. Initially, these protections were
electromagnetic. Nowadays, however, they are digital and may transmit their measurements over the
internet. The main problem with the detection of any fault lies in the characterization of the current
and voltage signals. Once a proper characterization is chosen (Singh, Panigrahi, & Maheshwari, 2011;
Ray & Mishra, 2016), machine-learning techniques such as DT or SVM can be used to classify fault/non-
fault signals. However, DTs can be heavily biased given that they tend to overfit training data, and such
data is gathered from simulators rather than real measurements. On the other hand, SVM produces
more robust models that are less sensitive to particular simulated conditions. Moreover, with the kernel
trick, complex relations between faults and their characteristic signals can be captured. Furthermore, when
data gathered by the protective relay is added, more solid results may be expected from SVM than from DTs.
3.3.6 K-means and clustering
3.3.6.1 Unsupervised learning
This introduction is necessarily incomplete given the enormous range of topics under the rubric of
“unsupervised learning.” For instance, the goal may be to discover groups of similar data points
(clustering), to determine the distribution of data within the input space (probability density estimation),
or to project the data from a high-dimensional space down to two or three dimensions for visualization
purposes. This document focuses on the first objective.
Clustering can be understood as gathering data into groups of similar individuals known as clusters.
Using similarity/dissimilarity measurements, points are assigned to one or more clusters with other data
that share common features. Further, groups may be ordered by hierarchy or be linked to other groups.
Clustering algorithms are preferred for exploratory purposes, such as when there is no a priori
knowledge about relations existing within data. The most iconic clustering algorithm is called K-means.
3.3.6.2 Brief definition and technical description
K-means is a hard-clustering (Gan, Ma, & Wu, 2007; Wu, 2012), partitional (Kaufman & Rousseeuw,
1990) method. It is hard because each point belongs to exactly one cluster, and it is
partitional because it divides the feature space into non-overlapping regions. Further, K-means uses
a single point to represent each region of the feature space. These points are called centroids or
means, and they geometrically correspond to the center (mean) of the cluster (Wu, 2012). The k means
are refined iteratively by minimizing a dissimilarity function (or maximizing a similarity function) among all
the members of each cluster.
K-means is fast and scalable, with a computational cost that is linear in the dataset size (Gan, Ma,
& Wu, 2007; Wu, 2012). Nevertheless, K-means requires clusters to be convex (e.g. spherical) and
tends to perform poorly on groups of different sizes (Gan, Ma, & Wu, 2007; Wu, 2012). Thus, it is
sensitive to outliers and not well suited for modelling skewed distributions or noisy overlapping groups.
A K-means algorithm works as follows: First, randomly select k points as initial centroids and assign each remaining
point to its closest centroid. Second, using all the points assigned to the i-th cluster, recalculate its centroid.
Third, if the centroids do not change, change only slightly, or another stopping criterion is satisfied, the algorithm
ends. Otherwise, it reassigns the points to the new centroids and returns to the second step.
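The three steps can be sketched in a few lines of Python with NumPy. This is only a minimal illustration under stated assumptions, not a reference implementation: the function names `assign` and `k_means`, the tolerance `tol`, the iteration cap `max_iter`, and the fixed random seed are choices made for this example, and an empty cluster simply keeps its previous centroid.

```python
import numpy as np


def assign(points, centroids):
    """Index of the closest centroid (Euclidean distance) for every point."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def k_means(points, k, max_iter=100, tol=1e-6, seed=0):
    """Plain K-means on an (m, n) array; returns (centroids, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        # Step 2: recompute each centroid as the mean of its assigned points.
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        # Step 3: stop once the centroids have (almost) stopped moving.
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, assign(points, centroids)


# Two well-separated synthetic groups, purely for illustration.
demo_rng = np.random.default_rng(1)
demo = np.vstack([demo_rng.normal(0.0, 0.1, (50, 2)),
                  demo_rng.normal(5.0, 0.1, (50, 2))])
demo_centroids, demo_labels = k_means(demo, k=2)
print(demo_centroids)
```

Because the initial centroids are drawn at random, different seeds may lead to different partitions on less clearly separated data; running the procedure several times and keeping the best result is a common precaution.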
Formally, given a dataset $X = \{x_i\}$, $x_i \in \mathbb{R}^N$, $i = 1, \dots, m$, K-means assigns each $x_i$ to a particular cluster
$c_k \in C$, $k = 1, \dots, K$, by minimizing an objective function. As originally formulated, each partition of the
feature space is determined by minimizing the Euclidean distance between the members of each cluster
and its mean $\mu_k \in \mathbb{R}^N$. The corresponding objective function can be written as:

$$P(W, \mu) = \sum_{k=1}^{K} \sum_{i=1}^{m} w_{i,k} \, D_{eucl}(x_i, \mu_k)^2$$

where $\mu_k$ is the centroid of the k-th cluster, $D_{eucl}$ is the Euclidean distance, and $W$ is an $m \times K$ matrix that
satisfies:
1. $w_{i,k} \in \{0, 1\}$ for $i = 1, 2, \dots, m$ and $k = 1, 2, \dots, K$
2. $\sum_{k=1}^{K} w_{i,k} = 1$ for $i = 1, 2, \dots, m$
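To make the role of the assignment matrix W concrete, the following sketch builds W from a list of cluster labels, checks the two conditions, and evaluates a K-means objective. It assumes that the objective is the sum of squared Euclidean distances between each point and the centroid of its cluster; the function names and the four sample points are invented for this illustration.

```python
import numpy as np


def assignment_matrix(labels, n_clusters):
    """Build the m x K binary matrix W with w[i, k] = 1 iff point i is in cluster k."""
    labels = np.asarray(labels)
    w = np.zeros((len(labels), n_clusters), dtype=int)
    w[np.arange(len(labels)), labels] = 1
    return w


def objective(points, centroids, w):
    """Sum over i and k of w[i, k] * squared Euclidean distance(x_i, mu_k)."""
    sq_dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((w * sq_dists).sum())


x = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
w = assignment_matrix([0, 0, 1, 1], n_clusters=2)
# Each centroid is the mean of the points assigned to its cluster.
mu = np.array([x[w[:, k] == 1].mean(axis=0) for k in range(2)])

assert np.isin(w, [0, 1]).all()      # condition 1: entries are 0 or 1
assert (w.sum(axis=1) == 1).all()    # condition 2: every row sums to 1
print(objective(x, mu, w))           # 4.0: each point lies at distance 1 from its centroid
```

Any other hard assignment of these four points to the same two centroids yields a larger value, which is what the assignment step of the algorithm exploits.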
Figure 3-7: A toy example of clustering transmission lines during storm using K-means
Dots represent transmission lines by fault type: gray corresponds to no fault, blue represents L2L
faults, and orange represents SL2G faults. Centroids are displayed in red: the initial centroids are shown in
the lightest red, whereas the final centroids are shown in vivid red. Dotted red lines display the partitions
corresponding to each centroid. As the optimization procedure unfolds, the centroids
and their partitions are adjusted (lightest red displays the first iteration, whereas vivid red
shows the final centroid/partition). The corresponding numbers for each iteration are shown on the left
side of the figure.
3.3.6.3 Potential applications in smart grid
K-means applied to building energy consumption
As stressed before, understanding and predicting energy consumption patterns in buildings
is key to addressing climate change and to better defining energy management policies, utility planning, and
so on. However, categorizing buildings is a rather challenging task because they are multidimensional
and heterogeneous. On one hand, the number of components and interactions in a building's electrical
system is vast. On the other, the building population itself is heterogeneous and composed of many
sub-groups in different locations, with distinct legislation, energy requirements, and so on.
Thus, ways to group buildings into clusters with similar energy consumption patterns are highly valuable.
For instance, given a dataset of buildings and their energy consumption (characterized by active power,
reactive power, voltage, and so on), the simplest approach would consist of applying K-means to the
data for exploratory purposes. Because no relation between the data is known beforehand, K-means allows
us to explore possible relations. In this example, the Euclidean distance measure is
employed. However, readers must be aware that such a distance measure assumes continuous,
independent features. Afterwards, the number of clusters to be tested is defined, and the results are
displayed. Although numeric performance measures for the clusters exist, visualizing the clusters
may provide more explicit hints about the relations between groups of building energy consumption patterns.
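As a rough illustration of one numeric performance measure, the sketch below standardizes the features (the Euclidean distance is sensitive to feature scale, so unscaled voltages in volts would otherwise dominate powers in kilowatts) and computes the within-cluster sum of squares for two candidate groupings. The building values, the feature set, and the hand-picked groupings are hypothetical; in practice the labels would come from a clustering run such as K-means.

```python
import numpy as np


def standardize(features):
    """Column-wise z-score scaling: zero mean and unit variance per feature."""
    features = np.asarray(features, dtype=float)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged instead of dividing by 0
    return (features - features.mean(axis=0)) / std


def within_cluster_ss(features, labels):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        members = features[labels == c]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return float(total)


# Hypothetical buildings: [active power (kW), reactive power (kvar), voltage (V)]
buildings = np.array([
    [12.0, 3.0, 229.0],
    [14.0, 4.0, 231.0],
    [95.0, 30.0, 228.0],
    [105.0, 34.0, 232.0],
])
scaled = standardize(buildings)
for candidate in ([0, 0, 0, 0], [0, 0, 1, 1]):
    print(len(set(candidate)), "cluster(s):",
          round(within_cluster_ss(scaled, candidate), 3))
```

A lower within-cluster sum of squares indicates tighter groups, but the value always decreases as more clusters are added, so it is usually read together with a plot of the clusters rather than on its own.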
3.4 PROBABILISTIC NETWORKS
Probabilistic networks are representations based on graph theory and probability theory for modeling
domains with uncertainty and for making inferences with uncertain or incomplete information. They
model a domain through a set of random variables and their dependency relationships,
represented as a graph. This structure allows the joint probability distribution to be represented by a set
of local probabilities, which significantly reduces the computational complexity in space and time.
Probabilistic networks include, among others:
Bayesian networks
Bayesian classifiers
Decision networks
These types of models are suitable for representing problems involving uncertainty; applications include
medical and industrial diagnostic systems, user and student modeling, tutoring strategies, planning under
uncertainty, voice and gesture recognition, prediction, image analysis, and robotics. Reference
(Kang, S. B., Advances in Computer Vision and Pattern Recognition, 2015) discusses
Bayesian networks, Bayesian classifiers, and decision networks in detail.
3.4.1 Bayesian networks
3.4.1.1 Brief definition
A Bayesian network (BN) is specified by a graph structure and a set of local parameters. These parameters are the
conditional probabilities of each variable given its parents in the network structure. Therefore, based on these
local parameters, the joint probability distribution can be
represented. Figure 3-8 depicts an example of a simple BN; the
structure of the graph implies a set of conditional independence assertions for this set of variables.
Figure 3-8: Example of a simple Bayesian network
Inference
Inference uses a Bayesian network to compute probabilities. In the general scenario, the goal is to
compute 𝑃(𝑋|𝐸 = 𝑒), where X is the query variable, E = e is the evidence (observed) variable, and the joint
distribution 𝑃(𝑋, 𝐸, 𝑌) is known, where Y is the unobserved variable. In other words, inference propagates
the effects of the observed variables through the Bayesian network to estimate their effect on the unknown
variables. There are two types of inference: single-query inference and conjunctive-query inference.
Pearl’s algorithm, variable elimination, conditioning, junction tree, and stochastic simulation are
algorithms used for inference.
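The general scenario can be illustrated with the simplest exact method, inference by enumeration, which sums the joint distribution over the unobserved variable and then normalizes. The three-node chain (Storm, Fault, Alarm) and all probability values below are hypothetical and serve only to show the mechanics; the algorithms listed above exist precisely because enumeration does not scale to large networks.

```python
# Hypothetical CPTs for a three-node chain: Storm -> Fault -> Alarm.
P_STORM = {True: 0.1, False: 0.9}
P_FAULT_GIVEN_STORM = {True: {True: 0.4, False: 0.6},
                       False: {True: 0.05, False: 0.95}}
P_ALARM_GIVEN_FAULT = {True: {True: 0.9, False: 0.1},
                       False: {True: 0.02, False: 0.98}}


def joint(storm, fault, alarm):
    """P(storm, fault, alarm) as the product of the local conditional probabilities."""
    return (P_STORM[storm]
            * P_FAULT_GIVEN_STORM[storm][fault]
            * P_ALARM_GIVEN_FAULT[fault][alarm])


def posterior_storm(alarm):
    """P(Storm | Alarm = alarm): sum out the unobserved Fault, then normalize."""
    unnormalized = {
        storm: sum(joint(storm, fault, alarm) for fault in (True, False))
        for storm in (True, False)
    }
    total = sum(unnormalized.values())
    return {storm: p / total for storm, p in unnormalized.items()}


print(posterior_storm(alarm=True))
```

Here Storm plays the role of the query variable X, Alarm is the evidence E, and Fault is the unobserved variable Y. With these invented numbers, observing the alarm raises the probability of a storm above its prior of 0.1.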
Structure and parameter learning
The learning problem in Bayesian networks comprises structure learning and parameter learning.
When the structure or topology of the BN is known and sufficient data are available for all the
variables, parameter learning is straightforward: the conditional probability tables (CPTs) of the
variables can be estimated directly from the data. If there are not sufficient data, the uncertainty of
the parameters can be modeled by a second-order probability distribution, such as a Beta distribution.
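For binary variables, parameter learning with a Beta distribution reduces to counting. The sketch below estimates one column of a CPT as the posterior mean of a Beta(alpha, beta) prior updated with the observed counts; with alpha = beta = 1 this is the familiar add-one (Laplace) smoothing. The variable names and records are hypothetical, and a single binary parent is assumed for brevity.

```python
def estimate_cpt(records, child, parent, alpha=1.0, beta=1.0):
    """Estimate P(child=True | parent) from boolean records with a Beta(alpha, beta) prior.

    Returns {parent_value: posterior-mean probability that child is True}.
    """
    cpt = {}
    for parent_value in (True, False):
        rows = [r for r in records if r[parent] == parent_value]
        successes = sum(1 for r in rows if r[child])
        # Posterior mean of the Beta prior after observing the counts.
        cpt[parent_value] = (successes + alpha) / (len(rows) + alpha + beta)
    return cpt


# Hypothetical observations of two binary variables.
records = [
    {"storm": True, "fault": True},
    {"storm": True, "fault": False},
    {"storm": True, "fault": True},
    {"storm": False, "fault": False},
    {"storm": False, "fault": False},
]
print(estimate_cpt(records, child="fault", parent="storm"))
# {True: 0.6, False: 0.25}
```

With no records at all, the estimate falls back to the prior mean (0.5 for alpha = beta = 1), which is how the prior expresses uncertainty when data are scarce.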
There are two main types of methods for structure learning: global methods based on search and score, and
local methods based on conditional independence tests. Obtaining the topology of
the BN is a complex process that requires good estimates of the statistical measures. Depending on the
type of structure, techniques for learning trees, polytrees, or general directed acyclic graphs (DAGs) can be used.
3.4.1.3 Application domains
Bayesian networks provide a compact representation of the joint probability
distribution of the nodes and can be fitted to data, which makes them an efficient way to represent complex
probabilistic systems. Real-world application domains of Bayesian network modeling include systems
biology, gene regulatory networks, medicine, biomonitoring, document classification, information
retrieval, semantic search, image processing, turbo codes, and spam filtering.
An influence is denoted by an arrow, which connects the nodes described above and expresses
relevant knowledge passed from one node to another.
A decision tree is a graphical representation of a decision problem and is complementary to the
influence diagram. It consists of three types of nodes that represent decisions, uncertain events, and
results. Usually, an influence diagram is a much more compact representation than a decision tree.
3.4.3.3 Application domains
These types of models are adequate to represent problems in which decisions have to be made with
uncertain information. Some applications are educational, medical, and industrial diagnostic systems,
such as student and tutor models to select tutorial actions in intelligent tutors given the current and
incomplete information of the context.
3.4.3.4 Potential applications in smart grid
Decision networks can be used to model intelligent power grids, which can be seen as complex and
uncertain systems in which decisions must be made (for example, by intelligent assistants in operation and
maintenance diagnostic systems). Another potential application is the energy market, where decision
networks can support both suppliers and consumers in becoming more flexible and sophisticated in their
operational strategies.
future research directions in this emerging field. Reference (VISUAL-ANALYTICS.EU, n.d.) lists further
research topics and projects, both completed and ongoing.
The visual-analytics research challenges could be categorized in the following areas: visualization
data, users, design, and technology, which include the challenges of:
Dealing with and integrating huge, heterogeneous datasets of variable quality.
Meeting the needs of the users.
Assisting designers of visual analytic systems.
Providing the necessary infrastructure technology.
3.6.3 The visual-analytics process
To gain knowledge from data, computational and visual models are combined with
human interaction in the visual-analytics process. Figure 3-11 shows
a general overview of the different stages (represented by square blocks) and their transitions
(arrows) in the visual-analytics process (Keim, Kohlhammer, Ellis, & Mansmann, 2010).
3.7 REFERENCES
[1] EPRI, "Advanced Data Analytics Techniques: Analysis and Applications for Power System
Operation and Planning Support," Power Delivery & Utilization - Transmission, Jan 28, 2016.
[2] S. Chen, A. Onwuachumba, M. Musavi and P. Lerley, "A Quantification Index for Power Systems
Transient Stability," Energies 2017, 10, 984.
[3] iTesla (Innovative Tools for Electrical System Security within Large Area), "Deliverable D2.4
Data mining methods - Uncertainties modeling for offline and online security assessment," July
29, 2013.
[4] G. James, D. Witten, T. Hastie and R. Tibshirani, An Introduction to Statistical Learning: with
Applications in R, Springer, 2013.
[5] T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning, Springer, 2009.
[6] C. Rao, H. Toutenburg, Shalabh and C. Heumann, Linear Models and Generalizations - Least
Squares and Alternatives, Springer, 2008.
[7] P. Ma and X. Sun, "Leveraging for big data regression," WIREs Comput Stat, vol. 7, no. 1, p.
70–76, 2015.
[8] J. Ma and J. Cheng, "Estimation of the building energy use intensity in the urban scale by
integrating GIS and big data technology," Applied Energy, vol. 183, pp. 182-192, 2016.
[10] W. Chung, "Using the fuzzy linear regression method to benchmark the energy efficiency of
commercial buildings," Applied Energy, vol. 95, pp. 45-49, 2012.
[11] J. Lam, K. Wan, D. Liu and C. Tsang, "Multiple regression models for energy use in air-
conditioned office buildings in different climates," Energy Conversion and Management, vol. 51,
pp. 2692-2697, 2010.
[12] D. Hsu, "Identifying key variables and interactions in statistical models of building energy
consumption using regularization," Energy, vol. 83, pp. 144-155, 2015.
[13] S. Asadi, S. Shams and M. Mohammad, "On the development of multi-linear regression analysis
to assess energy consumption in the early stages of building design," Energy and Buildings, vol.
85, pp. 246-255, 2014.
[14] T. Walter and M. Sohn, "A regression-based approach to estimating retrofit savings using the
Building Performance Database," Applied Energy, vol. 179, pp. 996-1005, 2016.
[15] M. Braun, H. Altan and S. Beck, "Using regression analysis to predict the future energy
consumption of a supermarket in the UK," Applied Energy, vol. 130, pp. 305-313, 2014.
[17] T. Hong, M. Gui, M. Baran and H. Lee, "Modeling and Forecasting Hourly Electric Load by
Multiple Linear Regression with Interactions," in Power and Energy Society General Meeting,
2010.
[18] K. Kandananond, "Forecasting Electricity Demand in Thailand with an Artificial Neural Network
Approach," Energies, vol. 4, pp. 1246-1257, 2011.
[19] G. Tso and K. Yau, "Predicting electricity energy consumption: A comparison of regression
analysis, decision tree and neural networks," Energy, vol. 32, pp. 1761-1768, 2007.
[20] S. Halilcevic, A. Gubina, B. Strmcnik and F. Gubina, "Multiple regression models as identifiers of
power system weak points," Generation Transmission and Distribution, vol. 153, no. 2, pp. 211-
216, 2006.
[21] L. Rokach and O. Maimon, Data Mining with Decision Trees, Link, Singapore: World Scientific
Publishing Co., 2015.
[22] Z. Yu, F. Haghighat, B. Fung and H. Yoshino, "A decision tree method for building energy
demand modeling," Energy and Buildings, vol. 42, pp. 1637-1646, 2010.
[23] D. Graupe, Principles of Artificial Neural Network, World Scientific, Sep 2013.
[24] S. A. Kalogirou, "Artificial neural networks in renewable energy systems applications: a review,"
Renewable and Sustainable Energy Reviews, vol. 5, no. 4, pp. 373-401, Dec 2001.
[25] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, Dec 13,
1994.
[26] S. S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd Edition, Prentice Hall, 1999.
[30] B. Muller and H. Keller, "Neural networks for combustion process modelling," in Proc of the Int
Conf EANN '96, London, UK, 1996.
[31] S. Milanic and R. Karba, "Neural network models for predictive control of a thermal plant," in
Proc of the Int Conf EANN '96, London, UK, 1996.
[32] J. K. Mandal, A. K. Sinha and G. Parthasarathy, "Application of recurrent neural network for
short term load forecasting in electric power system," in Proc of the IEEE Int Conf ICNN '95,
Perth, Western Australia, 1995.
[33] A. Khotanzad, A. Abaye and D. Maratukulam, "An adaptive and modular recurrent neural
network based power system load forecaster," in Proc of the IEEE Int Conf ICNN '95, Perth,
Western Australia, 1995.
[34] D. Datta and S. A. Tassou, "Energy management in supermarkets through electrical load
prediction," in Proc of the First Int Conf on Energy and Environment, Limassol Cyprus, 1997.
[35] D. Keles, J. Scelle, F. Paraschiv and W. Fichtner, "Extended forecast methods for day-ahead
electricity spot prices applying artificial neural networks," Applied Energy, vol. 162, pp. 218-230,
2016.
[36] A. A. Aburomman and M. B. I. Reaz, "A survey of intrusion detection systems based on
ensemble and hybrid classifiers," Computers & Security, vol. 65, pp. 135-152, March 2017.
[37] B. Scholkopf and A. Smola, Learning with Kernels, The MIT Press, 2002.
[38] M. Singh, B. Panigrahi and R. Maheshwari, "Transmission Line Fault Detection and
Classification," in International Conference on Emerging Trends in Electrical and Computer
Technology (ICETECT), 2011.
[39] P. Ray and D. Mishra, "Support vector machine based fault classification and location of a long
transmission line," Engineering Science and Technology, an International Journal, vol. 19, pp.
1368-1380, 2016.
[40] G. Gan, C. Ma and J. Wu, Data Clustering: Theory, Algorithms, and Applications, SIAM, 2007.
[42] L. Kaufman and P. Rousseeuw, Finding Groups in Data, John Wiley & Sons, Inc., 1990.
[43] I. Goodfellow, Y. Bengio and A. Courville, Deep Learning, MIT Press, 2016.
[44] I. Goodfellow, Y. Bengio and A. Courville, Deep learning, MIT Press, 2016.
[45] E. Mocanu, P. Nguyen, M. Gibescu and W. Kling, "Deep learning for estimating building energy
consumption," Sustainable Energy, Grids and Networks, vol. 6, pp. 91-99, June 2016.
[46] J. Kelly and W. Knottenbelt, "Neural NILM: Deep neural networks applied to energy
disaggregation," in ACM BuildSys '15, Seoul, November 4-5, 2015.
[47] J. J. Thomas and K. A. Cook, Illuminating the Path: Research and Development Agenda for
Visual Analytics, IEEE-Press, 2005.
[49] D. Keim, J. Kohlhammer, G. Ellis and F. Mansmann, Mastering the Information Age Solving
Problems with Visual Analytics, Eurographics Association, 2010.
The various displays are intended to expose real-time conditions of the grid, as well as trends of
relevant system variables, to help power system operators maintain adequate situational awareness
and respond to conditions potentially threatening the system stability in an expedited manner.
The way in which system data is presented to the operator can support the strengths and reduce the
effects of limitations of human perception and performance, thereby enhancing operator situational
awareness. As explained in Section 3, there are several principles of display design that help to
understand how humans detect, process, interpret, and act on information [1]. Some ground rules are:
A display should look like the variable that it represents.
Processing a large set of information can be facilitated by dividing this information across several
resources (e.g. using both visual and auditory information) and minimizing the cost in time or
effort to “move” selective attention from one display location to another to access information.
These principles lay the foundations for designing a human-machine interface that matches
human information-processing abilities and prevents the negative consequences of cognitive
biases. A description of the main components of the human information-processing system, and how
they apply to display design in system operations, can be found in [1]. Generally, the key driver for
selecting the appropriate visualization display is the task at hand. For example, if one
wants to understand the overall voltage variation across a region, then contours can be quite
effective, but if the exact voltage to three decimal places is needed, then a numerical display is more
appropriate.
Many techniques have already been applied to the field of power system visualization, with some of
them described in the following subsections.
4.2.1.1 Schematic network diagrams (one-line/single-line diagram)
A schematic network diagram is a simplified notation for visualizing an electrical power system.
Elements on the diagram do not represent the physical size or location of the electrical equipment.
The display is optimized to provide the user a good overview of the network topology.
Figure 4-3: Contour showing voltage magnitudes with values below 0.98 per unit
It is important to carefully select the colors that are used to represent the different elements or
conditions so as to avoid covering or camouflaging other important information [2].
Transparency is also used in some cases for this purpose.
Contour gradients are another variation of contouring, used to represent and compare two classes of
values. Thus, the operator can identify significant deviations in the network at a single glance, as well
as their location and severity. In the example shown in Figure 4-4, generation infeed from renewables
is visualized such that red and yellow spots represent higher generation, while green spots indicate
lower generation compared with the current (in the cited case, today’s) schedule [28].
indicating upcoming problems and high density representing severe problems. Violated limits are
coded additionally by showing a small black ring. With increasing excess of the limit value, the bubble
grows beyond the initial radius, creating a halo effect around the black ring.
Feeders that are connected to a violated bus bar are highlighted in the same color as the bubble to
indicate the impact of the deviation on the network. The operator receives a first-level indication of a
potential problem on a feeder, even if the feeder itself does not have real-time measurements [27].
3D cones
This is an extension of the 2D bubbles visualization presented above, where 3D transparent cones are
used to display variables of interest with less obfuscation of other parameters, as illustrated in
Figure 4-7 and Figure 4-8. The height of the cone indicates the severity of the violation, so non-critical
deviations (limit not yet violated) stay flat, indicating potentially upcoming but not yet critical problems. In
addition, the pointing direction shows the type of violation: low-voltage violations appear as cones
pointing downwards and high-voltage violations as cones pointing upwards. Each cone matches a
bubble with a circle in the 2D view, supporting the user’s orientation in the system. The described
principle can be applied to many other scenarios, for example to represent areas with very low
versus very high demand, or outage indices such as the customer average interruption duration index
(CAIDI) [27].
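The encoding rule for the cones can be expressed as a small mapping from a measured value to a direction and a height. The sketch below is only an illustration of that rule for bus voltages: the function name, the per-unit limits of 0.95 and 1.05, and the choice of the excess beyond the limit as the cone height are assumptions made here, not settings of the cited tool.

```python
def cone_for_voltage(v_pu, low_limit=0.95, high_limit=1.05):
    """Map a bus voltage (per unit) to a cone (direction, height).

    The limits are illustrative defaults, not values taken from any tool.
    """
    if v_pu < low_limit:
        return "down", low_limit - v_pu   # low-voltage violation
    if v_pu > high_limit:
        return "up", v_pu - high_limit    # high-voltage violation
    return "flat", 0.0                    # limit not violated: cone stays flat


for voltage in (0.93, 1.00, 1.08):
    print(voltage, cone_for_voltage(voltage))
```

The same mapping could be reused for other quantities, such as demand or outage indices, by swapping the limits and the measured value.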
4.2.1.4 Animation
Some visualization tools provide the option to display animated vectors to visualize power system
dynamics. Figure 4-9 shows an example of animation of power flows, where the direction of the
transmission line animation corresponds to the direction of flow in the physical system [28].
In this figure, animated power flow arrows display profiles for active and reactive power flows. On
demand, the user can turn the animation for a window off or on. Further, some tools offer the ability
to define thresholds based on percent of thermal limits or other parameters, which can be set so that
flows get animated only when they approach alarming levels. Animated arrows are used to visualize
MW and MVAR flows to identify loop flows or other abnormal patterns.
According to the results of a human-factors experiment conducted by D. A. Wiegmann and other
researchers [4], animation in power system displays can be very effective in helping operators interpret
displays by directing their attention to the most important information for a particular task or
situation. It also enhances an operator’s understanding of system behavior. If properly configured, it
also helps an operator to better assess current system states and the causal factors that underlie
those states and to decide on mitigation measures if a violation of system resources occurs, and it provides
immediate feedback regarding the effectiveness of implemented measures.
Figure 4-11: Visualization of dispersed generation in the general panel at Red Eléctrica de España
Figure 4-13: Integrated system view with Icons and Info boxes
Selection of profiles
Figure 4-15: Distribution network visualization
Works closely with the ERCOT control room system operators, providing round-the-clock
support for analysis and system applications.
Develops and authors congestion management plans for mitigation of temporary and
ongoing grid vulnerabilities.
Gathers relevant and accurate information about grid events and communicates that
information in a timely manner to the shift supervisor and engineering support groups.
e. Shift Supervisor:
Monitors the operation of all desks in the control room.
Continually reviews and analyzes system security.
Provides the primary point of communication with ERCOT management and market
participants.
f. DC Ties:
Schedules and monitors energy transactions into and out of the ERCOT control area across
the asynchronous DC ties.
Coordinates the import of emergency energy across the DC ties into the ERCOT control
area during emergency operations.
g. Reliability Unit Commitment (RUC):
Oversees the weekly reliability unit commitment (WRUC), day-ahead reliability unit commitment
(DRUC), and hourly reliability unit commitment (HRUC) processes.
Performs hourly studies to identify potential voltage problems on the ERCOT system.
Responds to inquiries about RUC commitments.
h. Reliability Risk:
Coordinates with the RUC, real-time, transmission and security, resource, operations, shift
supervisor, and other ERCOT operators as necessary to maintain grid reliability.
Responsible for the safe and efficient operation of all intermittent renewable resource
(IRR) generation assets.
Responds to inquiries about intermittent generation dispatch, wind and solar forecast,
operations, curtailments, and other related tasks.
Figure 4-16 is an overview of the control room main visualization board and various displays. Figure
4-17 shows displays related to generation and load information. The graphic on the right presents
non-spin and quick-start unit information in graphical form, to facilitate interpretation by system operators.
Figure 4-17: ERCOT control room - load and generation details display and quick start/non-spin
graphs
Figure 4-18 is an overview of the visualization tool that is used to display wind generation details. It is
intended to provide the operator with a breakdown of wind generation by zone, allowing the operator to
visualize current wind generation trends on one screen. Figure 4-19 is a snapshot of the real-time
sequence monitor, whose main objective is to summarize results from real-time security assessment
tools, such as state estimation and contingency analysis, with a timer indicating the last execution. It
alerts the shift engineers and operators when real-time applications have not
run successfully.
Figure 4-20 is the system voltage overview display. It gives an overview of voltage levels at some
345-kV and 138-kV busses around the ERCOT system. It also alerts operators when voltage levels are
too high or too low and indicates what reactive devices can be put in service to help control voltage.
To illustrate this idea, this section describes a forecast security tool, which will be part of the
hyper-vision system. On a rolling basis, the system gathers data from forecasting tools such as load,
renewable generation, and market results forecasts, as well as outage planning, and combines that data
with the last SCADA snapshot to feed grid models that have been updated to represent
foreseen system conditions in the near-term future (from a few minutes to 24 hours). The system performs
security analysis through data analytics and modelling tools. If a constraint or reliability issue is
detected, it assesses the effectiveness of possible remedial actions that have been used in the past in
similar situations, or that have been considered in grid studies to solve the type of problem being
analyzed. If no solution is found in the remedial action library for the foreseen constraint, the system
alerts the operator that a detailed evaluation needs to be performed to assess whether and when a
preventive action is to be taken to mitigate the security risk. In that case, the operator performs
studies to design a proper solution to the constraints in question and adds the solution to the remedial
actions library for future use. The system will alert the operator in a timely manner, so that the
mitigation actions can be effectively implemented.
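The decision logic around the remedial action library can be outlined as follows. This is a hypothetical sketch of the workflow described above, not the design of the actual tool: the class `RemedialActionLibrary`, the function `assess_constraint`, the constraint and action names, and the `is_effective` callable (which stands in for the security analysis) are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RemedialActionLibrary:
    """Maps a constraint type to remedial actions known from past events or studies."""
    actions: dict = field(default_factory=dict)

    def candidates(self, constraint_type):
        return self.actions.get(constraint_type, [])

    def add(self, constraint_type, action):
        self.actions.setdefault(constraint_type, []).append(action)


def assess_constraint(constraint_type, library, is_effective):
    """Return (effective_actions, operator_alert) for one forecasted constraint.

    `is_effective` is a caller-supplied callable standing in for the security
    analysis that checks whether an action removes the constraint.
    """
    effective = [a for a in library.candidates(constraint_type) if is_effective(a)]
    # Alert the operator only when no library action solves the constraint.
    return effective, not effective


library = RemedialActionLibrary({"line_overload": ["topology change", "redispatch"]})
found, alert = assess_constraint("line_overload", library, lambda a: a == "redispatch")
print(found, alert)   # ['redispatch'] False

found, alert = assess_constraint("low_voltage", library, lambda a: True)
print(found, alert)   # [] True
# After a detailed study, the operator stores the new solution for future use.
library.add("low_voltage", "switch in capacitor bank")
```

Keeping the effectiveness check separate from the library means the same lookup can be reused with whichever analysis engine evaluates the candidate actions.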
The hyper-vision user interface remains empty as long as no potentially insecure conditions are
detected within the time horizon of the analysis. If a constraint is identified for any future operating
condition considered, it is displayed in the upper timeline along with the proposed remedial actions, as
shown in Figure 4-22. The operator has access to detailed information about the constraint and the
results of the analysis.
Figure 4-22: Example of control actions displayed in the main interface of Apogée
To monitor the process, a time-based supervision display that synthesizes the results of the forecasted
security analysis is also proposed. Figure 4-23 illustrates the concept (labels and names have been
obfuscated to protect sensitive information). The first column is an expandable tree representation.
The first level of the tree displays contingencies that result in constraints. For each of those
contingencies, the field can be expanded to show the second level, which contains further description
of the constraint and recommended remedial actions. The color code is as follows:
Type             Color   Meaning
Contingency      Green   Constraints are detected. There is at least one effective remedial action.
Contingency      Red     Constraints are detected. No effective remedial action found.
Constraint       Black   The constraint is detected.
Remedial action  Green   The remedial action is effective.
In the following sections, we present a brief description of these use cases, with references where
interested readers can find more details.
4.3.1 Real-time situational awareness with PMU data
Outstanding characteristics of Synchrophasor data, namely high resolution and time synchronization,
make it possible to monitor power system dynamic performance as well as grid stresses over a large
geographical area. Synchrophasor applications for on-line or near real-time operations enhance
situational awareness and help detect situations that can threaten reliability of the grid. On-line
applications include, among others, detection of system electromechanical oscillations and evaluation of
the associated damping; voltage and angular stability assessment; voltage sensitivities with respect to
real and reactive power changes; display and analysis of voltage angle difference over wide
geographical areas; improved state estimation; islanding detection and monitoring; and event
detection [12].
Actionable information from these applications is useful when there is sufficient time for an operator
to take action to mitigate the threat. In cases where there would not be enough time for an operator
action, automatic corrective control should be designed and implemented. Even though there is a
great potential to use Synchrophasor data for automatic control, not many applications have been
successfully implemented. One of the main hurdles that prevents extended use of control applications
is data quality and availability. Apart from those main applications, analytics that use Synchrophasor
data have been developed to identify and diagnose a wide range of grid events, such as failing
potential transformers, capacitor bank switching issues, open phases on breakers, negative sequence
concerns, issues created by variable loads (such as arc furnaces), as well as generation and
transmission equipment misoperations [13].
Because of the advantages of Synchrophasor technology to enhance monitoring and situational
awareness of the grid, many electric systems have deployed a large number of PMUs across their
footprints. In the USA, for example, the Smart Grid program led by the Department of Energy in the
recent past resulted in the installation of PMUs at about 1000 substations and an extensive
communication infrastructure to collect and archive the data. Other countries, in particular China and
India, have also implemented plans for large deployment of PMUs and the corresponding
communication and computation infrastructure.
There is an abundant technical bibliography on Synchrophasor technology and applications.
Reference [14] gives a thorough update of Synchrophasor projects in North America, with detailed
explanation of the applications currently implemented. Reference [15] provides a methodology for
identifying and estimating the benefits of using Synchrophasor technology to enhance grid operations
and planning. Also, a wealth of technical information can be found at the North American
Synchrophasor Initiative (www.naspi.org). Several vendors provide platforms and software solutions
for various on-line applications of Synchrophasor data [12][16].
The following visualization examples represent standard applications of typical wide area monitoring
systems based on Synchrophasor technology:
4.3.1.1 Power swing recognition (PSR)
Wide-area monitoring is receiving increasing attention in control center visualizations. These
functions are expected to be integrated into the standard visualizations so that the whole cycle,
from monitoring through to automatic regulation, can be followed.
The “power swing recognition,” also called “oscillation monitoring system” (OMS), can recognize,
evaluate, and display active power swings in the energy supply network. This ensures that power
swings that can be dangerous for network operation are recognized and reported automatically. A
power swing can be observed between two locations by evaluating the phase angle difference of the
PMUs involved, or at a single location by evaluating the active power determined there. If a power
swing measured in terms of phase-angle difference is present, the locations of both PMUs involved are
circled and the associated connection line of the same color is inserted between them (see connection
Paris – Rome in Figure 4-24 [29]). If a power swing measured in terms of active power at an
individual PMU is present, then the PMU where the measurement was made is marked with a circle
(see Copenhagen in the following figure). The assigned color represents the damping ratio and
amplitude quantities, which are required for a meaningful estimate of the actual degree of danger.
These quantities, coupled with the associated limiting values, give a degree of danger that forms the
basis for assessing the potential consequences of the detected power-swing event.
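The mapping from damping ratio and amplitude to a displayed degree of danger can be illustrated with a minimal sketch. The limiting values, the color names, and the `PowerSwing` structure below are assumptions made for illustration only; they are not taken from the product shown in the figure, where such limits would be configured per network.

```python
from dataclasses import dataclass

# Hypothetical limiting values (assumptions, not vendor settings).
DAMPING_ALARM = 0.03    # damping ratio below this is treated as dangerous
DAMPING_WARN = 0.05     # damping ratio below this is treated as a warning
AMPLITUDE_LIMIT = 10.0  # swing amplitude (MW or degrees) considered significant


@dataclass
class PowerSwing:
    location: str         # a single PMU, or "A - B" for an angle-difference pair
    damping_ratio: float  # per unit, e.g. 0.04 means 4 %
    amplitude: float      # MW for active power, degrees for angle difference


def degree_of_danger(swing: PowerSwing) -> str:
    """Combine amplitude and damping ratio with their limits into a color."""
    if swing.amplitude < AMPLITUDE_LIMIT:
        return "green"   # small swing, regardless of damping
    if swing.damping_ratio < DAMPING_ALARM:
        return "red"     # large and poorly damped
    if swing.damping_ratio < DAMPING_WARN:
        return "yellow"  # large and only moderately damped
    return "green"       # large but well damped
```

With these assumed limits, a swing between Paris and Rome with a damping ratio of 0.02 and an amplitude of 25 would be drawn in red, while the same amplitude with a damping ratio of 0.04 would be drawn in yellow.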
Figure 4-27: Phase angle difference of the voltages between different PMUs
allow engineers to correctly select the affected transmission line, determining fault type and
performing fault location calculation.
The use of data-analytics techniques and tools for identification and classification of power system
events has been the subject of extensive research. It is probably one of the most studied areas for
the use of data analytics to support system operations. Reference [9] provides a survey of
data-analytics applications to support system operations, with emphasis on fault and event detection and
analysis.
Techniques used for fault detection, classification, and analysis include artificial neural networks
(ANN), wavelet transform, support vector machines, k-nearest neighbor, decision trees, and association
rules. See Section 3 in this document for an explanation of these techniques. Hybrid methods that
combine various techniques have also been developed. For example, reference [10] presents a new
method based on a combined wavelet transform-extreme learning machine (WT-ELM) technique to
identify, classify, and locate a fault in a series-compensated transmission line.
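As an illustration of the simplest of the techniques listed above, the following sketch classifies a fault type with a k-nearest neighbor vote. It is not the method of any cited reference. The feature choice (per-unit phase current magnitudes during the event), the training samples, and the labels are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical training set: (Ia, Ib, Ic) in per unit during the fault, and fault type.
TRAINING = [
    ((5.0, 1.0, 1.0), "A-G"),
    ((4.6, 1.1, 0.9), "A-G"),
    ((1.0, 5.2, 1.0), "B-G"),
    ((0.9, 4.8, 1.1), "B-G"),
    ((4.9, 5.1, 1.0), "A-B"),
    ((5.2, 4.7, 0.9), "A-B"),
]


def knn_classify(sample, training=TRAINING, k=3):
    """Return the majority label among the k training points nearest to sample."""
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

A production classifier would typically use richer features (for example sequence components or wavelet coefficients) and a much larger labeled event archive; the sketch only shows the voting principle.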
Fault-diagnosis methods based on association rules can take both spatial and temporal characteristics
into account. The resulting set of rules can yield a real-time model that helps to create a list of
preventive actions to be taken. A second suggested advantage of this approach is that the data is
provided by protection equipment such as relays.
Sequences of events such as voltage sags can be analyzed using pattern sequence discovery algorithms,
which are an association-rule-based method. The events recorded at a measuring point have
an associated time of occurrence, and temporal patterns that occur with sufficient frequency are
identified. The time spans between related events that result from this model can be
used for prediction and prevention of successive events.
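The idea of finding temporal patterns that occur with sufficient frequency can be sketched as follows. This is a simplified pairwise version, assuming time-sorted events from a single measuring point; the event names, the time window, and the minimum count are hypothetical, and published pattern-discovery algorithms handle longer sequences.

```python
from collections import defaultdict

# Hypothetical event log: (time in seconds, event type), sorted by time.
EVENTS = [
    (0, "sag"), (2, "recloser_trip"),
    (100, "sag"), (103, "recloser_trip"),
    (200, "sag"), (201, "recloser_trip"),
    (300, "swell"),
]


def frequent_pairs(events, window_s, min_count):
    """Find ordered event pairs that recur within window_s seconds.

    Returns {(first, second): (count, mean_gap_s)} for pairs seen at least
    min_count times. The mean gap is the expected time span between the events.
    """
    gaps = defaultdict(list)
    for i, (t1, e1) in enumerate(events):
        for t2, e2 in events[i + 1:]:
            if t2 - t1 > window_s:
                break  # events are time-sorted, so later ones are further away
            gaps[(e1, e2)].append(t2 - t1)
    return {
        pair: (len(g), sum(g) / len(g))
        for pair, g in gaps.items()
        if len(g) >= min_count
    }
```

For the sample log, `frequent_pairs(EVENTS, window_s=5, min_count=3)` reports that a sag is followed by a recloser trip three times, with a mean gap of 2 seconds, which is the kind of time span that could be used to anticipate the successive event.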
In the same field of fault analysis, but with a different objective, an application using advanced
data-analytics techniques has been proposed for fault-direction detection for protection purposes. It is
based on a Multilayer Feedforward Neural Network (MFNN). It is claimed that the proposed discriminator
is fast, robust, and accurate, and that it is suitable for realizing an ultrafast directional comparison
protection of transmission lines [11].
As concluded in the study presented in [9], only a few of the developed data-analytics algorithms
have been implemented in production mode in electric utilities. One of the difficulties in fully
implementing them is the need for appropriate communication and data integration infrastructure.
Certainly, a system intended to perform fault detection, classification, and location has to operate in
on-line mode with minimal manual interventions in place. This requires a communication system to
retrieve the appropriate data from relays, digital fault recorders, and any other necessary devices and
securely send this data to a centralized location where the algorithms are run. If the methodology
uses data from other sources as well (such as weather data), such data also needs to be available in a
timely manner and properly integrated with the electrical data for the analysis.
Example: Lightning correlation detection
One such use case is the real-time outage and lightning correlation described in [21]. In the event of
a feeder outage, the correlator process combines data on current network topology and geography,
circuit breaker operation, and lightning data from a lightning location service to show the affected
feeder area and the lightning strike that caused the outage of the feeder. This provides valuable
information to the dispatcher to coordinate the field crew and accelerate remedial activities.
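The correlation step can be sketched as a time-and-distance match between a breaker operation and recorded lightning strikes. This is a minimal sketch and not the implementation described in [21]: the feeder geography is reduced to sampled points along the route, and the `Strike` structure, the one-second time tolerance, and the one-kilometer distance tolerance are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Strike:
    time_s: float  # strike time, same clock as the breaker event
    lat: float
    lon: float


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def correlate(trip_time_s, feeder_points, strikes, max_dt_s=1.0, max_km=1.0):
    """Return strikes close to the trip in time and to the feeder in space, nearest first."""
    candidates = []
    for s in strikes:
        if abs(s.time_s - trip_time_s) > max_dt_s:
            continue  # too far from the breaker operation in time
        d = min(distance_km(s.lat, s.lon, lat, lon) for lat, lon in feeder_points)
        if d <= max_km:
            candidates.append((d, s))
    candidates.sort(key=lambda c: c[0])
    return [s for _, s in candidates]
```

The current network topology enters through `feeder_points`: only the points of the feeder section that was actually energized through the tripped breaker would be passed in.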
Figure 4-29: Smart Cable Guard system and web interface, showing the location of increasing
partial discharge activity over time
Example: OpenXDA platform
An open-source platform that integrates several applications has been developed by Grid Protection
Alliance (GPA) in the United States, under the auspices of several sponsors, including EPRI, electric
utilities, government agencies, and other research organizations. One of the applications is the
openXDA software, which is an extensible platform for processing events and trending records from
disturbance-monitoring equipment such as digital fault recorders (DFRs), relays, power quality meters,
and other power system intelligent electronic devices (IEDs). Open PQ Dashboard, another application
of the suite, provides visual displays to quickly convey the status and location of power quality
anomalies and other events throughout the electrical power system (see Figure 4-30 [22]). It is also
used to display results from openXDA in the geo-referenced visualization panel. Summary displays
start with the choice of a geospatial map-view or annunciator panel, both with visualizations suitable
for across-the-room viewing in an operations support center [22].
Figure 4-32: Framework proposed in [17][18] for real-time dynamic security assessment combining
PMU data analytics and high performance dynamic simulation
companies. Digging damage comprises a large part of low-voltage and medium-voltage power failures
in any distribution network, so prevention of cable digging damage is important. Based on various
data sources (location, soil, cable types, subcontractor track record, etc.), a predictive model is a very
useful tool to manage the risk related to digging damage.
- Measurements: currents from SCADA, measured data from weather stations, gridded weather
  data applied to micro locations using weather model and terrain data.
- Reliability analyses: N-1 analyses, line outage distribution factors (LODF) for power flow
  calculations.
- Forecasts: short-term load flow forecasts and short-term weather forecasts for corridors of
  power lines.
- Dynamic thermal ratings (DTR): calculations based on current weather and forecasted
  weather (t0 ... t0+3h).
- Exceptional weather events.
- Visualization.
- Integration platform and data exchange: SUMO BUS.
[Figure: block diagram of the SUMO subsystems and data flows. Recoverable labels: ODIN-VIS visualization platform; physical conductor data; power line spatial (GIS) data; system configuration; commercially available DTR subsystems; DTR data; subsystems ZM, ONAP, LF, LODF, DTR, OIAP, and NOV; exceptional weather assessment and forecast; weather data; load flow calculations; forecast of loads in network nodes; load flow for N-1 state; notification.]
Symbol   Meaning
[icon]   Thunderstorm – lightning activity
[icon]   High air temperatures
[icon]   Low air temperatures
[icon]   Extreme rainfall
Figure 4-40: Exceptional weather events
Figure 4-41: Thunderstorm – lightning activity and rainfall event notification
Visualization
The visualization provides the means to aggregate the vast amount of data in a convenient and easy-
to-understand manner. The results are presented in real time to dispatchers in the network control
center (NCC) via the advanced visualization platform ODIN-VIS (Figure 4-42).
In the center of the screen, a part of the transmission grid is shown. The power lines are colored
according to the ratio of the actual current to the actual rating. On the right side of the screen is the
SUMO panel, which shows the following for each power line:
- “Four quadrant” view of the relative line load:
o Upper left: actual line current versus actual line rating for actual network topology.
o Upper right: forecasted line current versus forecasted line rating for actual network
  topology.
o Lower left: actual line current versus actual line rating for N-1 network topology.
o Lower right: forecasted line current versus forecasted line rating for N-1 network
  topology.
- Exceptional weather events.
- N-1 power line: the power line in the transmission grid that, when tripped, causes the largest
  increase in load on the power line of interest.
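The selection of the N-1 power line in the last item can be illustrated with line outage distribution factors. The sketch below uses the standard linear relation, post-outage flow on the monitored line = pre-outage flow + LODF × pre-outage flow on the tripped line, and picks the outage with the largest loading increase. It is an assumption that SUMO performs the selection this way; the line names, flows, and factors are hypothetical.

```python
def worst_outage(line, flows, lodf):
    """Find the single outage that most increases loading on the monitored line.

    flows: {line_name: pre-contingency flow}
    lodf:  {(monitored, outaged): line outage distribution factor}
    Returns (outaged_line, post_contingency_flow_on_monitored_line).
    """
    candidates = []
    for outaged, outaged_flow in flows.items():
        if outaged == line:
            continue  # the monitored line itself is not a candidate
        post = flows[line] + lodf.get((line, outaged), 0.0) * outaged_flow
        candidates.append((abs(post) - abs(flows[line]), outaged, post))
    if not candidates:
        raise ValueError("no other line available to outage")
    _, outaged, post = max(candidates, key=lambda c: c[0])
    return outaged, post
```

For example, with flows of 100, 80, and 50 on lines L1, L2, and L3 and factors of 0.5 and 0.2 for L1 with respect to L2 and L3, the outage of L2 raises the flow on L1 to 140 and is therefore reported as the N-1 power line for L1.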
The quadrants are colored green if the ratio of the line current versus the rating is less than 90%. If
the ratio is between 90 and 100%, the quadrants are colored orange. If the ratio is 100% or more,
the quadrants are colored red and additionally show the safe remaining operating time.
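The coloring rule stated above is simple enough to write down directly. The following sketch applies the 90% and 100% thresholds from the text to the four quadrants; the function and key names are illustrative, and the safe remaining operating time shown for red quadrants is not modeled here.

```python
def quadrant_color(current_a, rating_a):
    """Color for one quadrant from the ratio of line current to line rating."""
    ratio = current_a / rating_a
    if ratio < 0.90:
        return "green"
    if ratio < 1.00:
        return "orange"
    return "red"  # the display would also show the safe remaining operating time


def four_quadrant_view(actual, forecast, actual_n1, forecast_n1):
    """Each argument is a (line current, line rating) pair in amperes."""
    return {
        "upper_left": quadrant_color(*actual),        # actual, actual topology
        "upper_right": quadrant_color(*forecast),     # forecast, actual topology
        "lower_left": quadrant_color(*actual_n1),     # actual, N-1 topology
        "lower_right": quadrant_color(*forecast_n1),  # forecast, N-1 topology
    }
```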
Q1.1 Analytics practice in electric utilities is lagging other industries such as transportation,
healthcare and financial services, in terms of actual implementation
Q1.2 Data analytics technologies that use multiple data sources can play a significant role in
improving situational awareness tools
Q1.3 The value and accuracy of data analytics solutions that integrate various data sources are
not well understood, and that affects implementation and adoption of the data analytics
technology
Q1.4 There is a need to develop standardized data structures and data models for effective
deployment of enterprise’s data analytics capability
Q1.5 My organization is planning to introduce new data-analytics techniques and tools for
transmission operation improvement within the next 2 to 4 years
Q1.6 Data quality issues are a major barrier to widespread use of Synchrophasor data to
improve system operations
The responses to these questions are shown in Figure 4-44. It can be observed in this figure that for
the first two statements, responses are equally divided between those who agree or strongly agree
with the statement and those who don’t have a strong opinion about it (neutral). None of the
responders seems to disagree. Responses for Q1.4 indicate that there is strong consensus among the
responders about the need for standardized data structures and data models for effective deployment
of enterprise’s data analytics capability. Surprisingly, opinion is quite divided about the
importance of data quality for widespread use of Synchrophasors for improving system operations.
Regarding visualization, the company uses a classical GIS system for overhead lines (OHL) only and
schematic diagrams in the specific SCADA/EMS. It shares system data and visualization with the EAS
(European Awareness System), which is dedicated to interconnection security.
Tohoku Electric Power Co., Inc. – Japan
System characteristics:
- System peak load: about 14,000 MW
- Installed generation capacity: 17,810 MW
- Installed VG capacity (include wind and other variable generation resources): 3,230 MW
- Voltage levels: max 500 kV
- Number of interconnections with other systems, control areas, regions: 2
- Number of substations: 624
The company has implemented a tool for assessing and monitoring reliability in on-line mode. It has
several functions, the dynamic stability assessment module being the most important, which runs
every 30 minutes.
For trending and forecasting analysis, they have developed and implemented a system called
Photovoltaics Output Estimating and Forecasting System, which is intended for control room operators
and operation engineers. It estimates the amount of solar radiation from numerical weather forecasts
and calculates photovoltaics output based on that.
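The conversion from forecast solar radiation to photovoltaic output can be illustrated with a very simple proportional model. This is an assumption for illustration and not the method used by the company: output is scaled linearly with irradiance relative to the standard test condition of 1000 W/m², and a hypothetical performance ratio of 0.8 accounts for all losses.

```python
STC_IRRADIANCE_W_M2 = 1000.0  # irradiance at standard test conditions


def pv_output_mw(irradiance_w_m2, installed_mw, performance_ratio=0.8):
    """Estimate PV output from irradiance with a linear model (illustrative only)."""
    g = max(irradiance_w_m2, 0.0)  # guard against negative forecast values
    return installed_mw * (g / STC_IRRADIANCE_W_M2) * performance_ratio


def area_forecast_mw(irradiance_by_site, installed_mw_by_site):
    """Sum the estimated output over all sites present in both inputs."""
    return sum(
        pv_output_mw(irradiance_by_site[site], installed_mw_by_site[site])
        for site in irradiance_by_site
        if site in installed_mw_by_site
    )
```

An operational system would add temperature effects, panel orientation, and calibration against measured output; the sketch only shows how a radiation forecast becomes a power estimate.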
For visualization in the control room, measurement data, such as transmission line power flow,
generator output, and bus voltages, are displayed on the system diagram every 10 seconds. Results
from security analysis tools are also displayed on the monitor screen using a color code to highlight
any reliability violation. Information about weather conditions and forecasts is also displayed.
Relative to distributed generation, total current wind power output and photovoltaics output in the
entire area are displayed on a dedicated screen, along with generation forecasts for a short- and
medium-term period selected by the user.
For the question about the biggest challenges and needs in visualization, the responder indicated that
it is critical to improve the accuracy of renewable energy output prediction and use those results to
estimate voltage variations and display them on a monitor screen alongside critical visualizations.
Kyushu Electric Power Company – Japan
System characteristics:
- System peak load: about 15,082 MW
- Installed generation capacity: 28,036 MW
- Installed VG capacity (include wind and other variable generation resources): 5,793 MW
- Voltage levels: 500, 220, 110, 66, 22, 6.6 kV
- Number of interconnections with other systems, control areas, regions: 1
- Number of substations: 596
For wide-area monitoring, Kyushu Electric has implemented a system to evaluate the current state of the
power system as well as expected conditions in the near future (30 to 60 minutes). The application
assesses power system reliability in terms of steady-state and dynamic security, including frequency
variations, overload, unscheduled power flows, voltage deviations, and dynamic and voltage stability.
A separate tool is used for voltage and reactive power control. The tool determines optimal control
actions in response to predicted demand and system operating conditions, with the objective of
preventing voltage at critical buses from deviating outside operating margins.
The company has a renewable energy forecast system that forecasts solar photovoltaic output
every 30 minutes. The system uses radiation forecast data purchased from a weather information
provider.
Another application is used for short-term demand forecasts. Future demand is estimated based on
accumulated historical demand, historical weather data, and weather forecast data, also purchased
from a weather service company.
Visualizations to support system operation include:
Weather and weather forecast: Weather information (cloud radar images, forecast information, etc.)
and images from weather video cameras are displayed on a system panel. Temperature, other
weather information, and lightning strike status are displayed on the operator monitors.
Distributed generation: Battery charge/discharge status and area renewable output and status are
shown graphically on a system panel and operator monitors.
Anticipated state of the network: Results from power system reliability assessment tools are displayed
on operator monitors.
Other types of visualizations specific to that control center: Dam level information is displayed on
operator monitors, and earthquake occurrence information is presented on the overall system panel.
Visual information for other sectors of the company: Some power system information and
demand/supply information are made available throughout the company. In addition, power
generation forecast and lightning strike information are available on the company website.
In accordance with other responses, the responder indicated that the biggest challenge in
visualization, and in other related analytics tools, is the need for better renewable energy power
output forecasts.
National Grid - UK
System characteristics:
- System peak load: about 60,000 MW
- Installed generation capacity: 120,000 MW
- Installed VG capacity (include wind and other variable generation resources): 20,000 MW
- Voltage levels: 400/275 kV
- Number of interconnections with other systems, control areas, regions: 4
- Number of substations: 340
- Number of measurement points: 1,000,000
National Grid has a tool called VISOR for wide-area monitoring developed by Psymetrix and the
University of Manchester. The tool performs real-time monitoring and alarming of sub-synchronous
oscillation.
Related to equipment health monitoring in the control room, they have developed and implemented a
tool for monitoring underground cables, which uses data from temperature sensors around cables to
evaluate cable conditions in terms of loading and capacity.
For trending and forecast analysis, the company has developed an energy forecast system to predict
national demand based on weather forecast data plus the contribution of both solar and wind
generation to the generation mix.
Visualization to support system operation includes:
Weather and weather forecast: TBD
Distributed generation: Battery charge/discharge status and area renewable output and status are
shown graphically on a system panel and operator monitors.
Anticipated state of the network: Results from power system reliability assessment tools are displayed
on operator monitors.
Other types of visualizations specific to that control center: Dam level information is displayed on
operator monitors, and earthquake occurrence information is presented on the overall system panel.
Visual information for other sectors of the company: Some power system information and
demand/supply information are made available throughout the company. In addition, power
generation forecast and lightning strike information are available on the company website.
4.5 REFERENCES
[1]. Technology Assessment of Power System Visualization. EPRI, Palo Alto, CA: 2009. 1017795.
[2]. T. J. Overbye, D. A. Wiegmann, A. M. Rich, and Y. Sun, “Human factors aspects of power system
voltage contour visualizations,” IEEE Transactions on Power Systems, pp. 76-82, February 2003
[3]. T. J. Overbye, D. Wiegmann, and R. J. Thomas, “Visualization of Power Systems”, PSERC Final
Project Report, Publication 02-36, Nov. 2002.
[4]. D. A. Wiegmann, G. R. Essenberg, T. J. Overbye, Y. Sun, “Human Factor Aspects of Power System
Flow Animation,” IEEE Trans. on Power Systems, vol. 20, August 2005, pp. 1233-1240.
[5]. ORNL VERDE: Visualizing Energy Resources Dynamically on Earth
http://techportal.eere.energy.gov/technology.do/techID=17
[6]. Woody Rickerson, “A Control Room View of the ERCOT Grid”, ERCOT Public, April 19, 2016
http://www.ercot.com/content/wcm/key_documents_lists/81724/5_A_Control_Room_View_of_the_ERCOT_Grid.pdf
[7]. https://www.coreso.eu/mission/
[8]. http://www.tscnet.eu/
[9]. Advanced Data Analytics Techniques: Analysis and Applications for Power System Operation and
Planning Support. EPRI, Palo Alto, CA: 2015. 3002007076.
[10]. V. Malathi, N. S. Marimuthu, S. Baskar, and K. Ramar, “Application of extreme learning
machine for series compensated transmission line protection,” Engineering Applications of Artificial
Intelligence, vol. 24, no. 5, pp. 880-887, 2011.
[11]. T. S. Sidhu, H. Singh, and M. S. Sachdev, “Design, implementation and testing of an artificial
neural network based fault direction discriminator for protecting transmission lines,” IEEE Trans.
Power Delivery, vol. 10, no. 2, pp. 697-706, 1995.
[12]. Review of Synchrophasor Applications, EPRI, Palo Alto, CA: 2014. 3002002870
[13]. Alison Silverstein, Kyle Thomas, and Jim Kleitsch, “Using Synchrophasor Data to Diagnose
Equipment Mis-operations and Health”, NASPI Work Group Meeting October 22, 2014
[14]. U.S. Department of Energy, “Advancement of Synchrophasor Technology in ARRA Projects”,
March 2016 – https://www.smartgrid.gov/files/20160320_Synchrophasor_Report.pdf
[15]. THE VALUE PROPOSITION FOR SYNCHROPHASOR TECHNOLOGY, North American
Synchrophasor Initiative NASPI Technical Report, October 2015.
https://www.naspi.org/sites/default/files/reference_documents/5.pdf?fileID=1571
[16]. Catalog of Data-Intensive Applications for Transmission Systems. EPRI, Palo Alto, CA: 2015.
3002005231.
[17]. High-Performance Hybrid Simulation/Measurement-Based Tools for Proactive Operator
Decision-Support – U.S. Department of Energy (DOE) DE-OE0000628 - Final Report, September
2014.
[18]. E. Farantatos, A. Del Rosso, N. Bhatt, K. Sun, Y. Liu, L. Min, C. Jing, J. Ning, M. Parashar, “A
Hybrid Framework for Online Dynamic Security Assessment Combining High Performance
Computing and Synchrophasor Measurements”, 2015 IEEE PES General Meeting
[19]. Case Study: Demonstration of the Quantum Weather Storm-Prediction Model and Application,
EPRI, Palo Alto, CA: 2016. 3002004268
[20]. Situational Awareness – Opportunities for the Electric Power Industry, EPRI, Palo Alto, CA:
2016. 3002007606
[21]. Djurica, V., Milev, G., “An Application to Display Lightning Data Using SCALAR Information
System”, (2014), 23rd International Lightning Detection Conference, Tucson, USA
[22]. https://www.gridprotectionalliance.org/products.asp#XDA
[23]. U. Minnaar, “The Characterisation and Automatic Classification of Transmission Line Faults”,
Ph.D. Thesis, University of Cape Town, September 2013.
[24]. Mahfuz Ali Shuvra and Alberto Del Rosso, “Root Cause Identification of Power System Faults
using Waveform Analytics”, accepted for the CIGRE US National Committee 2017 Grid of the
Future Symposium
[25]. Lakota, G., et al., “Real-Time and Short-Term Forecast Assessment Of Power Grid Operating
Limits – SUMO”, 5th International Scientific and Technical Conference - CIGRE B5, Sochi, Russia,
2015.
[26]. Djurica, V., dr. Kosmač, J., Milev, G., “A Multiple Power Line Corridor and Lightning Error-
Ellipse Spatial Processor for Real-Time Correlator”, (2008) 20th International Lightning Detection
Conference, Tucson, USA.
[27]. CIRED Paper / 0406 / June 2015: 2D AND 3D VISUALIZATION STRATEGIES FOR
DISTRIBUTION MANAGEMENT, Siemens AG: Sonja Sander / Siemens AG: Dr. Roland Eichler
[28]. User Interface: Spectrum Power™ 7 / Siemens AG
[29]. User Interface: SIGUARD PDP / Siemens AG
[30]. U.S. Department of Energy – Electricity Delivery & Energy Reliability, “Dynamic Line Rating
Systems for Transmission Lines”, Topical Report, Smart Grid Demonstration Program, April 25,
2014 (available online at www.smartgrid.gov)
[31]. Integrating Dynamic Thermal Circuit Rating into System Operations: Utility Experiences and
Technology Roadmap. EPRI, Palo Alto, CA: 2011. 1021751.
[32]. Increased Power Flow: Overhead Transmission Line Rating Research Advancements. EPRI,
Palo Alto, CA: 2015. 3002005709.
[33]. Cigre Working Group B2.36 Technical Brochure, “Guide for Application of Direct Real-Time
Monitoring Systems”, June 2012.
[34]. http://lindsey-usa.com/dynamic-line-rating/
[35]. http://info.genscape.com/physical-grid-monitoring
[36]. http://www.lineamps.net/about.shtml
[37]. Alarm Grouping and Event Root Cause Analysis for Transmission Control Centers. EPRI, Palo
Alto, CA: 2016. 3002008275.
[38]. Alarm Management Philosophy for Transmission Operations Control Centers, EPRI, Palo Alto,
CA: 2016. 3002008274.
[39]. DNV GL Smart Cable Guard www.dnvgl.com.
TOs, GOs, and other electric utility companies are responsible for providing the information and
data for accurate modeling of their electrical system. Usually, the data and information include,
but are not limited to, the following (PJM Operation Support Division, August 25, 2016):
- Substation topology (including generator substations), facility connectivity, and physical
  location upon request (state and Global Positioning System (GPS) coordinates)
- Equipment names or designations
- Facility physical characteristics, including impedances, transformer taps, transformer tap
  range, transformer nominal voltages, etc.
- Facility limits and ratings
- Voltage control information and recommended set-points
- Recommended contingencies to be studied
- Protective device clearing times, as appropriate, to support real-time transient stability
  analysis
- Buses, breakers, switches, and injections or shunts such as loads, capacitors, SVCs, etc.
- Lines and series devices (reactors or series capacitors)
- Transformers and phase shifters
- Generator auxiliary, station service, or common service loads (MW & MVAR)
- Generator step-ups to be modeled for Bulk Electric System (BES) generators
- Generator “D” curve limits
- Real-time analog and equipment status telemetry for transmission elements, including, but
  not limited to:
o Breaker, switch, or other equipment status required to determine connectivity
o Real (MW) and reactive (MVAR) power flow for lines, transformers (high or low-side), and
phase shifters
o Real (MW) and reactive (MVAR) for loads and/or other injections as appropriate
o Reactive (MVAR) power flow for capacitors and SVCs
Figure 5-1 shows typical data involved in EMS modeling for power system operations. The EMS
modeling includes telemetry data, connectivity data, and electrical parameter data. Note that the basic
connectivity information is necessary to include external system models. Collecting the EMS data is a
cooperative process that, as the figure shows, involves communication, construction design,
transmission and planning, and operations modeling engineers, as well as the RTO.
[5] include angular separation, dynamics oscillations monitoring, disturbance location identification, and
islanding and resynchronization.
5.1.2 Model update procedure and lifecycle
This EMS model requires significant coordination between system operators and stakeholders.
Summer and winter builds are the two updates commonly known as the regularly scheduled builds in
North America, and other regions of the world may have similar model update processes. A new
build usually includes two essential types of changes:
Topology changes
Parameter changes
TSOs and power utilities are responsible for providing data about all construction projects that will
impact the RTO model. They are typically required to notify the RTO from six months to one year in
advance of system topology changes. The EMS network model updates accordingly a few times (e.g.
four times each year) to reflect the topology changes. Thus, to ensure that the EMS update includes a
facility addition, revision, or deletion, all model information must be submitted to the RTO or the TSO
accurately and timely. An example of an EMS model build lifecycle is shown in Figure 5-2.
[Figure 5-2: Example EMS model build lifecycle. Chart segment labels: Jan-Jun 38%; Jun-Sept 25%;
Jul-Sept 19%; Oct-Nov 12%; Dec 6%]
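As an illustration of the two change types, the following minimal Python sketch compares two model builds and separates facility additions, deletions, and reconnections (topology changes) from revisions of electrical data (parameter changes). The `Branch` record, its field names, and the sample values are hypothetical simplifications chosen for this example; they are not taken from any EMS vendor schema or from the CIM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """Hypothetical, simplified branch record (not a CIM or vendor schema)."""
    from_bus: str
    to_bus: str
    reactance_pu: float
    rating_mva: float


def classify_build_changes(old, new):
    """Split the differences between two builds into topology and parameter changes."""
    topology, parameter = [], []
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before is None:
            topology.append((name, "added"))
        elif after is None:
            topology.append((name, "deleted"))
        elif (before.from_bus, before.to_bus) != (after.from_bus, after.to_bus):
            topology.append((name, "reconnected"))
        elif before != after:
            # Same terminals, different electrical data.
            parameter.append((name, "revised"))
    return topology, parameter


# Illustrative data only.
winter_build = {
    "LINE_A": Branch("BUS1", "BUS2", 0.10, 200.0),
    "LINE_B": Branch("BUS2", "BUS3", 0.08, 150.0),
}
summer_build = {
    "LINE_A": Branch("BUS1", "BUS2", 0.10, 250.0),  # rating revised
    "LINE_C": Branch("BUS3", "BUS4", 0.05, 300.0),  # new facility
}

topology_changes, parameter_changes = classify_build_changes(winter_build, summer_build)
print("Topology changes:", topology_changes)
print("Parameter changes:", parameter_changes)
```

A real build process would also have to track submission dates against the notification window, which this sketch does not attempt.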
together constitute the Interface Reference Model (IRM). Communication between application
components of the IRM requires compatibility on two levels:
o Message formats and protocols.
o Message contents must be mutually understood, including application-level issues of
message layout and semantics.
IEC 61970 series, which provides among others a set of general guidelines and infrastructure
capacity lines necessary for the implementation of EMS-API interface standards (Energy
Management System - Application Program Interface). ENTSO-E in Europe, for example, uses
the Common Grid Model Exchange Standard (CGMES), which is a superset of the IEC CIM
standard. It was developed to meet necessary requirements for TSO data exchanges in the
areas of system development and system operation.
IEC 62325 series, which specifies the CIM for communications for deregulated energy
markets. The IEC developed these standards as a framework for energy market
communications encompassing two market styles: European style and North American style
markets.
The foundation of the IEC 62325 series is:
IEC 62325-301 “CIM extensions for markets” standard, which is an abstract model that caters
for the introduction of the objects required for the operation of electricity markets.
IEC 62325-450 “Profile and context modeling rules,” the international standard for the
generation of profiles.
For each standard, there are degrees of freedom that must be defined. Energy companies must adapt
the CIM standard to their needs. For example, the energy company EDF described the M-SITE model,
a UML model derived from the CIM model for network domain requirements. It defines CIM UML
classes as well as specific M-SITE additions to describe networks and extensions used to support a
number of study functions. It is the reference (data dictionary) for defining classes, associations, and
UML attributes used to construct exchange interfaces based on the M-SITE model. Each company must
adopt and adapt the standards to its needs while respecting the basic rules in order to remain CIM
compliant.
5.2.2.2 IEC 61850
IEC 61850 is a standard established by TC 57 of the IEC. This standard defines a common
communication architecture for systems inside the substation (process level, bay level, and station
level). Historically, IEC 61850 is based on IEC 60870 and IEEE UCA.
The information exchange mechanisms rely primarily on well-defined information models. These
information models and the modeling methods are at the core of the IEC 61850 series. The IEC 61850
series models the common information found in real devices, as depicted in Figure 5-5 [7]. All
information made available to be exchanged with other devices is defined in the standard.
The model provides power utility automation systems with an image of the analog world (power
system processes, switchgear).
The new application runs using PMU data from a phasor data concentrator (PDC) and can conceivably
run for every PMU sample set (i.e. 30 to 120 times/second). If this complete data is not available in the
EMS, which is likely because down-sampling to 1 sample/second is typically used, the natural choice
is for the application to reside on the same system as the PDC. In such a scenario, some EMS
information may need to be made available to the application periodically, and the results of the
application transferred to the EMS. This will necessitate a reliable, secure data transfer mechanism,
preferably using a web service [10].
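To make the effect of down-sampling concrete, the sketch below reduces a 30 samples/second stream to 1 sample/second by plain decimation. Plain decimation is an assumption made for illustration; an actual PDC or EMS front end may filter or average before reducing the rate. The function name and the sample data are hypothetical.

```python
def downsample(samples, input_rate_hz, output_rate_hz=1):
    """Keep every k-th sample, where k = input_rate_hz / output_rate_hz."""
    if input_rate_hz % output_rate_hz != 0:
        raise ValueError("input rate must be an integer multiple of the output rate")
    step = input_rate_hz // output_rate_hz
    return samples[::step]


# Two seconds of hypothetical 30 samples/second voltage-angle data (degrees).
pmu_angles = [10.0 + 0.1 * i for i in range(60)]

# What an EMS receiving 1 sample/second would see.
ems_angles = downsample(pmu_angles, input_rate_hz=30)

print(len(pmu_angles), "PMU samples reduced to", len(ems_angles), "EMS samples")
```

Only one in thirty samples survives, which is why an application that needs every sample set is better placed next to the PDC than inside the EMS.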
With the integration of fast Synchrophasor measurements (at rates of 50 to 60 measurements per
second) into the control center data model, the EMS now has real-time visibility of the dynamics of the
power system. This complements the visibility of the steady-state behavior of the grid with traditional
SCADA measurements. Many of the new Synchrophasor analytics complement and corroborate
traditional EMS analytics and can therefore be used together to jointly validate and fine-tune the
analytics for improved precision and accuracy. For example, the oscillation monitoring analytic using a
network dynamic model can be “married” with its counterpart measurement-based analytic to compare
results and to gradually improve the network dynamic model parameters.
5.3.2 Impact of renewable energy on operations data modeling
In many parts of the world, government-mandated Renewable Portfolio Standards (RPS) require
electricity suppliers to obtain a minimum percentage of their power from renewable energy resources
by a certain date, in response to the recent emphasis on environmental issues and concerns about
global warming. Governments around the globe are putting a wide variety of financial incentives in
place to boost the economy and employment and to mitigate the impacts of
the looming climate crisis. These incentives are expected to spur investment and growth in the wind and
solar industries. All these factors are causing wind and solar energy to expand at an ever-quickening
pace, leading to high levels of penetration in a relatively short time. Utilities and power system
operators must prepare to integrate and manage more of these variable renewable electricity sources
on a much larger scale [11].
Despite the many benefits of renewable resources, most of the resources required under the RPS are
variable resources characterized by a high level of variability and uncertainty, which remains a
major concern for utilities in terms of grid operations. First of all, the task of controlling the power
system and balancing supply and demand becomes more of a challenge for the grid operators. In
addition to the inherent variability and unpredictability associated with these resources, the fast
ramping associated with wind and solar photovoltaic resources will further challenge the utility
companies.
The task of balancing and controlling the power system is further complicated by the fact that, in
current practice, in most balancing areas, renewable resources are treated as “must take” resources,
requiring the grid operators to look for additional fast responding resources to compensate for the
variability, uncertainty, and the fast ramping of variable resources. In order to accommodate the
increasing penetration levels of variable resources, balancing areas will need to adopt strategies and
implement new tools to provide better visibility into variable resource operations, to better forecast
their expected generation levels on a short-term basis, and to dispatch and control these resources.
The operator of these resources, on the other hand, will require tools with adequate datasets and
advanced data models to interface with the balancing area operators, and to facilitate and automate
the participation of variable resources in various energy and ancillary services markets.
Integrating data from large utility-scale variable generation presents unique challenges. These
challenges call into question the long-standing set of assumptions that determined how utilities
operated the power systems for decades. Power systems are designed to handle significant amounts
of load variations and other uncertainties. Thus, managing risks is not new for grid operators. The
expected increase in wind and solar generation, however, introduces new operational paradigms: how
to ensure system controllability and observability and how to manage new kinds of variability and
uncertainty.
Operational integration deals with how operating characteristics of wind and solar plants are combined
with existing operating policies (e.g. system balancing, ancillary services, ramping resources up/down)
and decision-support tools deployed to support the utility control-room operators who run the power
grid. Operating policies include different heuristics that are used to ensure balance between load and
generation. With increased variable generation, policies on how this balance is maintained can be
expected to change.
Wind and solar energy generation are intermittent resources and, as such, can make it difficult to
operate the power grids to which they are connected. The primary requirement for integrating this
variable generation with utility operations is having access to forecast information about the quantity
and availability of the power output from wind or solar plants. Thus, reliable forecasting systems are
necessary for achieving increased wind and solar energy penetration. The use of forecasting in control
rooms is the key to managing variability and reducing uncertainty, operational impacts, and costs.
Forecasting allows operators to anticipate generation levels from wind and solar plants and adjust the
remaining generation units accordingly. Accurate short-term wind production forecasts enable grid
operators to make better day-ahead operational decisions, including scheduling the mix of generation
resources to be dispatched. What constitutes a challenge is how to integrate wind forecast data with
existing tools used in control centers. The goal of the data modeling and integration must be to
enhance operators’ local and global situational awareness in light of increased variability and
uncertainty. Toward this end, existing EMS, GMS (generation management system), and MMS
(market management system) applications must be enhanced by incorporating wind forecast
information and by making changes to different applications such as unit commitment, automatic
generation control, and special protection schemes. The high-level information flow and datasets of
variable renewable energy integration for grid operations are shown below.
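[Figure: high-level information flow and datasets of variable renewable energy integration for grid
operations. Recoverable labels: State Estimation; Generation/Load Look Ahead; Balance Analysis;
Applications Interface; Turbine; Facility; Wind Areas; Telemetry; S.E. Setup; Configuration;
Parameter Types; Owner]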
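The way forecasts let operators "adjust the remaining generation units accordingly" can be illustrated with a net-load calculation: the load forecast minus the variable generation forecast gives the demand that the remaining units must serve, and the interval-to-interval difference gives the ramp they must follow. The sketch below is only an illustration under that assumption; the function names, the hourly resolution, and the MW values are hypothetical and not taken from any EMS, GMS, or MMS product.

```python
def net_load(load_forecast_mw, variable_forecast_mw):
    """Load left for the remaining (dispatchable) units in each interval."""
    return [load - var for load, var in zip(load_forecast_mw, variable_forecast_mw)]


def ramps(series_mw):
    """Change between consecutive intervals (positive means ramp up)."""
    return [later - earlier for earlier, later in zip(series_mw, series_mw[1:])]


# Hypothetical hourly forecasts in MW.
load_mw = [1000, 1050, 1100, 1150]
wind_mw = [300, 250, 150, 100]

net_mw = net_load(load_mw, wind_mw)
ramp_mw = ramps(net_mw)
max_up_ramp_mw = max(ramp_mw)

print("Net load (MW):", net_mw)
print("Ramps (MW per interval):", ramp_mw)
print("Largest upward ramp (MW):", max_up_ramp_mw)
```

In this example the load rises while the wind output falls, so the net-load ramp is larger than the load ramp alone, which is the situation that calls for additional fast-responding resources.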
Figure 5-9: Proposed concept to incorporate equipment condition information indices into PRA
calculations
Grid operators used to receive asset and equipment information through asset management or
field personnel, and this may still be the case in many utilities, where operators sometimes have to
scramble to gather adequate information about the equipment under duress and to perform quick
assessments or detailed analyses, such as real-time contingency analysis, to arrive at mitigation
measures. This communication method inherently introduces some delay and the possibility of
miscommunication. Now that most real-time equipment health information is available in the
control center, this firsthand knowledge will help grid operators assess the situation and the
associated risk. However, organizing and integrating the asset health data into the current grid
operation data models, and merging the information into the real-time SCADA data stream and
connected network model, is a widely recognized challenge.
In the past decade, there have been some recommendations within the industry for the network
architecture, communication protocol, and information model needed to integrate and transmit this
equipment health data to grid operators efficiently. This work presents an initial effort to identify
functional specifications for an integrated "equipment health information system for grid operators,"
including conceptual visual displays. The use of CIM for asset health information sharing addresses
how the CIM can be leveraged in defining both the shared semantic data model and the actual data
exchanges required by the integration layers of the framework. Various research institutes and
consulting firms have proposed preliminary results in this area. Figure 5-10 illustrates a
proposal from EPRI for integrating asset health information into the CIM-based EMS data
structure for grid operators and reliability engineers to consume.
Figure 5-10: Overview CIM class model for breaker health integration environment
Power transformers and circuit breakers are two of the most important transmission components due
to their high costs and the large impact on system operations if a failure occurs. The diagram below
presents the CIM extensions for circuit breaker data modeling for grid operations in the EPRI
proposal to include asset health information in the overall network model standard formed by
IEC 61970 and IEC 61968. Figure 5-11 shows the extensions in relation to the framework set up in CIM
for aggregating and contextualizing the data needed for power system operations [14]. With the asset
health information in the same data structure as the SCADA real-time data and the parameters and
connections of the major grid components, operators will be better equipped with
the information they need to operate the grid in a more effective, efficient, secure, and reliable way.
Figure 5-11: Location of UML diagrams and modifications for the breaker health integration
configured; visualization can be set up; dynamic asset management can be optimized by data driven
algorithms; and even switching work order can be automatically generated for review if abnormality is
found somewhere in the grid.
Key to the success of data modelling and NMM implementation in this platform is the power network
model management and the integration of asset-related data into the contextualized business
intelligence and situational awareness data hierarchy for grid operations. The goal is to correlate asset
information with the connectivity nodes in the network models for both planning and operations.
Consequently, Dominion is looking to implement some initial use cases to leverage network model and
connectivity information in business intelligence data structure such as dynamic equipment health
assessment for strategic asset management and optimal VAr advisory for the optimization of reactive
power flow.
Traditionally, utilities gather asset health data from online and offline capabilities, as well as historical
SCADA information. The advent of centralized network model management (NMM) capability has
allowed utilities to not only streamline their network model usage across operation, planning, and
engineering but also to make the network connectivity information available for asset management
purposes. Given the business challenges that Dominion and more broadly the industry faces today and
some of the technology investments it has committed to, the asset and network model integration
solution architecture was developed to meet the business challenges. This architecture reflects the
industry best practices around the technology, standards, and requirements of utility asset and network
model management. The following diagram provides an overview of the architecture. It is believed that,
implemented correctly, it will provide the right foundation for Dominion’s short- and long-term asset
management and operational applications needs as business requirements grow and change over time.
Due to the complexity of power grids; the high volume, large variety, and fast velocity of utility big
data; and the lack of a strategic methodology and plan to alleviate or eliminate the resulting issues,
electric utilities are consistently confronted with the day-to-day difficulties listed below
in the operation of their grid and the equipment within it.
1) Data silos
2) No semantics layer on top of the data
3) Lack of cross system integration
4) Not all relevant data is shared
5) Difficult to share data and models
6) Excessive time used to validate data/models, not running studies
7) Data inaccuracy and inconsistency
8) Common data not in sync and up to date
9) Impossible to propagate data change to all pertinent data destinations
To cope with these challenges, it is first and foremost important for electric utility companies to
properly integrate and model the data from the variety of data sources that they rely on to carry
out the daily operations of the grid in a stable, dependable, reliable, and more efficient fashion. As
described in Section 5.2, utilities are moving toward data descriptions based on international standards
such as IEC 61850, CIM, or COSEM. These standards are likely to evolve, and others may emerge over the
next few years, but the important thing is to continue the efforts. Indeed, the standardization effort is crucial
because it makes it possible to improve the opportunities for interoperability of electrical systems, which
are increasingly interconnected both between countries and between upstream and downstream with
the arrival of new uses.
5.6 REFERENCES
[1]. PJM Operation Support Division, "PJM Manual 3A: Energy Management System (EMS) Model
Updates and Quality Assurance (QA)," August 25, 2016.
[2]. "IEEE Standard for Calculating the Current-Temperature Relationship of Bare Overhead
Conductors," IEEE Std 738-2012 (Revision of IEEE Std 738-2006 - Incorporates IEEE Std 738-
2012 Cor 1-2013), vol. no, pp. 1-72, 23 Dec 2013.
[3]. U.S. Department of Energy, Electricity Delivery & Energy Reliability, "Dynamic Line Rating
Systems for Transmission Lines," Smart Grid Demonstration Program, April 25, 2014.
[4]. Power Systems Engineering Research Center, "The Next Generation Energy Management
System Design," Sept 2013.
[5]. J. Giri, M. Parashar, J. Trehern and V. Madani, "The Situation Room: Control Center Analytics
for Enhanced Situational Awareness," IEEE Power and Energy Magazine, vol. 10, no. 5, pp. 24-
39, Sept.-Oct. 2012.
[6]. EPRI CIM Primer 3rd edition Technical Report – 2015
[7]. IEC 61850-7-1 : Basic communication structure – Principles and models – 2011
[8]. DLMS WEBSITE : http://dlms.com/index2.php
[9]. V. Madani, et al., "Advanced EMS Applications Using Synchrophasor Systems for Grid
Operation," T&D Conference and Exposition, 2014 IEEE PES, Pages: 1 - 5, DOI:
10.1109/TDC.2014.6863246.
[10]. Jampala, et al., "Practical Challenges of Integrating Synchrophasor Applications into an EMS,"
2013 IEEE PES Innovative Smart Grid Technologies Conference (ISGT), Pages: 1 - 6, DOI:
10.1109/ISGT.2013.6497847.
[11]. F. Albuyeh, "Integrating Variable Renewable Generation in Utility Operations," Power and
Energy Society General Meeting, 2010 IEEE, DOI: 10.1109/PES.2010.5590118.
[12]. Integration of Asset Information into Control Centers: Prioritization of Asset Information and
Concept Development. EPRI, Palo Alto, CA: 2012. 1024257.
[13]. Integration of Equipment Condition Information into Control Center Operations: Survey on
Equipment Condition Information for Transmission Operators. EPRI, Palo Alto, CA: 2014.
3002004614.
[14]. “Standard Based Integration Specification, Common Information Model Framework for Asset
Health Data Exchange”, EPRI 2014 Technical Update
Power data is characterized by large volume and many types. It is mainly derived from the
control system, the production system, and the management system, and includes data monitoring
information, smart meter collection, device maintenance information, SCADA, Internet of Things (IoT),
and energy management systems.
This data supports different kinds of applications, such as status sensing, load
forecasting, and user behavior analysis. In addition, many enterprise management systems (ERPs) are
established to record and produce enterprise data such as financial and human resources data.
However, all the data mentioned above is subject to a variety of data quality problems that may impact
data analysis. The main manifestations are incomplete, inaccurate, non-normative, and inconsistent
data, among others [5-7]. For example, the following are some common
data quality problems found in smart meters:
Incompleteness: The smart meter needs to collect multi-point data every day, including
positive and negative active power data, reactive power data, three-phase voltage data, etc.
However, some data collection points are missing.
Inaccuracy: The equipment operation time is not accurate. In particular, the operation time
information is not updated after the transmission line is replaced or broken. The customer
contact information (telephone number and address) is not accurate. The information is not
updated promptly after changes.
Non-normative: Equipment manufacturer data is not standardized; multiple names may exist
for the same manufacturer.
Inconsistency: The inherent association chain from transformer substation to line,
to transformer, to substation area, to key users is not consistent across data levels.
Non-uniqueness: The equipment master data is maintained by multiple sources in the
infrastructure, materials, production, and dispatch systems. For example, one piece of
equipment may have different names and encodings.
Another example of data quality problems relates to a material system. These problems mainly
involve non-standard data entry, incompleteness, and duplicate data entry:
Incompleteness: Fields such as material number, start time, production time, etc. are
empty.
Inaccuracy: The contract amount is less than 0 or more than 10 billion.
Non-normative: The encoding method does not conform to the specification. The data type is
not standardized.
Inconsistency: The same information, such as line loss rates, may differ among the statistics
of the financial, planning, and operational systems.
Non-uniqueness: The same information may be entered in multiple systems.
Because power systems are interconnected and networked, data inconsistency problems will
arise. Personnel negligence, database failure, communication interruption, and other
causes may lead to missing or mismatched data associations and to data quality problems such as
data loss and data errors. Differences in enterprise information level, differences in data models, and
other factors will also cause data interoperability problems. These data quality problems will have
direct impacts on the results of subsequent data-analytics applications in production operations.
Therefore, it is necessary to focus on improving power data quality.
6.3 DATA QUALITY ASSESSMENT
Data quality assessments are typically performed both bottom-up and top-down. The bottom-up
approach utilizes profiling tools and schema inspections to perform a generic and usage-agnostic
assessment. The bottom-up approach will reveal indicators of potential areas of data inconsistency.
However, due to the generic nature of the method, this approach is also prone to detect false
positives. The top-down approach will involve domain experts and actual usage scenarios to detect
inconsistencies. Although this method does not typically result in false positives, it will not lend itself
easily to automation and hence might not be as conclusive as the bottom-up approach. Normally, the
bottom-up assessment provides valuable input to the top-down assessment, and hence both are
required to perform an exhaustive evaluation.
Performing an initial iteration is recommended in order to validate input data flow, map data paths
and all transformations, identify enhancements and refinements on data, collect and use metadata
and schemas involved, and document this as part of the data quality assessment. The next iteration
will detect any relationships between data feeds. These could include: (i) a sensor measuring power
supply to a pump can be correlated to sensor data measuring performance of a pump, or (ii) starting
an engine, resulting in a rise in engine oil temperature being detected by sensors. Subsequent
iterations can assess whether the data quality is sufficient for the algorithms using them.
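One simple way to look for relationships between data feeds, such as the pump example above, is to compute pairwise correlation coefficients and flag strongly correlated pairs for review. The sketch below uses the Pearson coefficient; this choice, the 0.9 threshold, the feed names, and the sample values are assumptions made for illustration, and other measures (for example lagged or rank correlation) may suit real sensor data better.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length and at least two points")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def related_feeds(feeds, threshold=0.9):
    """Return pairs of feed names whose absolute correlation exceeds the threshold."""
    names = sorted(feeds)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(feeds[a], feeds[b])) > threshold
    ]


# Hypothetical, time-aligned sensor feeds.
feeds = {
    "pump_power_kw": [10.0, 12.0, 14.0, 16.0, 18.0],
    "pump_flow_m3h": [50.0, 61.0, 69.0, 80.0, 90.0],
    "ambient_noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}

pairs = related_feeds(feeds)
print("Related feeds:", pairs)
```

A detected relationship can then serve as a consistency check: if two feeds that normally move together diverge, one of them may have a data quality problem.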
If simulations and/or digital twins are used, these models should also be quality assured. For this
purpose a digital twin can be viewed as a dataset in the context of the data quality method. The first
assessment provides a baseline for measurements and the improvement cycle. ISO 8000-8 defines
three categories for data quality measurements: syntactic, semantic, and pragmatic. Information and
data quality are defined and measured according to these categories.
For organizations with well-defined requirements, the assessment will tend towards that of the
assessment model for ISO 8000-8. In this case, the data quality dimensions are categorized according
to ISO 8000-8, and the appropriate methods are employed:
Automatic syntax and integrity checks for syntactic quality.
Correlation with reference models and sampling techniques for semantic quality.
Algorithm sensitivity to data quality issues, user feedback, and focus groups for pragmatic
quality.
As described above, data quality can generally be understood as “the extent to which a set of intrinsic
properties of data meets the requirements.” The intrinsic properties can be decomposed into five
dimensions:
Timeliness: the extent to which data generation and transfer meet the requirements of
management and usage.
Integrality: the extent to which the data has or maintains its intrinsic information.
Compliance: the extent to which the type, format, dimension, and accuracy of the data meet
the normative design.
Accuracy: the degree to which the data truly reflects the actual information.
Consistency: the extent to which data associations obtained by different approaches agree.
Different methods are applied to assess data quality problems in the various levels of the logic
hierarchy model depicted in Figure 6-1. For example, information matching method is usually adopted
to discover data quality problems in control layer; data analysis method and rule checking method can
be used for the operation layer, management layer and analysis layer. Some of these methods are
briefly described below.
6.3.1 DATA INTERPOLATION
can be used to detect and repair the corresponding voltage data of SCADA system if the calculation of
the voltage amplitude is accurate.
6.3.2 DATA PROFILING
Data profiling is the systematic analysis process of data structure, data content, and data relationship
[11-13]. Through this method, it is possible to empirically examine the potential problems of data.
Figure 6-1 shows the implementation of data analysis.
[Figure 6-1: data profiling. Recoverable labels: data structure profile; column profile; minimum length;
maximum length; data value distribution; placeholder; format; data relationship profile; cross-table
profile; foreign key analysis; correlation analysis; table profile; dependency; conflicts; Integrality;
Consistency]
Through data profiling methods, the following abnormal values can be identified:
High-frequency value: its frequency is greater than the expected value.
Rare value: its frequency is lower than the expected value.
Completeness problem: the number or percentage of empty values is higher than expected.
Frequent pattern: its frequency is greater than expected for the pattern.
Rare pattern: its frequency is lower than expected for the pattern.
Value cardinality problem: the number of distinct values in a column is higher or lower than
expected.
Out-of-range value: a value that does not conform to the defined range constraint.
Defaults: a high-frequency value or the empty value used as the default value.
Orphan record: a record whose foreign key does not match any primary key.
Mapping problems: the consistency of the values between columns in a single table or across
tables does not conform to expectations.
Duplicate records.
Association relation: the association does not follow the defined mapping expectation (for
example, a primary key record is mapped to more than one foreign key record, but the
association requires one-to-one mapping).
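Several of the checks listed above can be sketched in a few lines of Python. The example below computes a simple column profile (share of empty values, cardinality, high-frequency values) and finds orphan and duplicate records. The record layout, the field names, and the 50% threshold for a "high-frequency" value are hypothetical choices for this illustration; a production profiling tool would compare each statistic against expectations defined per column.

```python
from collections import Counter


def profile_column(values, high_share=0.5):
    """Basic column profile: empty ratio, cardinality, and high-frequency values."""
    total = len(values)
    non_empty = [v for v in values if v not in (None, "")]
    counts = Counter(non_empty)
    return {
        "empty_ratio": (total - len(non_empty)) / total if total else 0.0,
        "cardinality": len(counts),
        "high_frequency": sorted(v for v, n in counts.items() if n / total > high_share),
    }


def orphan_records(child_rows, foreign_key, parent_keys):
    """Rows whose foreign key does not match any primary key of the parent table."""
    return [row for row in child_rows if row[foreign_key] not in parent_keys]


def duplicate_records(rows):
    """Rows that occur more than once with identical field values."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return [dict(key) for key, n in counts.items() if n > 1]


# Hypothetical meter master data and parent transformer keys.
meters = [
    {"meter_id": "M1", "manufacturer": "ACME", "transformer_id": "T1"},
    {"meter_id": "M2", "manufacturer": "ACME", "transformer_id": "T1"},
    {"meter_id": "M3", "manufacturer": "", "transformer_id": "T9"},
    {"meter_id": "M2", "manufacturer": "ACME", "transformer_id": "T1"},
]
transformer_ids = {"T1", "T2"}

manufacturer_profile = profile_column([m["manufacturer"] for m in meters])
orphans = orphan_records(meters, "transformer_id", transformer_ids)
duplicates = duplicate_records(meters)

print(manufacturer_profile)
print("Orphans:", [o["meter_id"] for o in orphans])
print("Duplicates:", duplicates)
```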
Based on data profiling, validation rules can be set up while locating data problems, and the quality
assessment model can be used to conduct comprehensive quality diagnosis for the analysis dataset.
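As a minimal sketch of how such validation rules might be expressed in code, the example below defines a non-blank rule and a range rule as small reusable functions and adds a uniqueness check on a key field. The rule names, the record fields, and the helper functions are hypothetical; the range limits reuse the contract-amount example given earlier in this chapter (less than 0 or more than 10 billion is treated as inaccurate).

```python
def non_blank(field):
    """Rule: the field must be present and not empty."""
    return lambda row: row.get(field) not in (None, "")


def in_range(field, low, high):
    """Rule: the field must be numeric and lie within [low, high]."""
    def check(row):
        value = row.get(field)
        return isinstance(value, (int, float)) and low <= value <= high
    return check


# Hypothetical rule set for a material or contract table.
RULES = {
    "material_number non-blank": non_blank("material_number"),
    "contract_amount in range": in_range("contract_amount", 0, 10_000_000_000),
}


def validate(rows, rules, key_field):
    """Return (row index, violated rule name) pairs, including key uniqueness."""
    violations = []
    seen_keys = set()
    for index, row in enumerate(rows):
        for name, rule in rules.items():
            if not rule(row):
                violations.append((index, name))
        key = row.get(key_field)
        if key in seen_keys:
            violations.append((index, f"{key_field} unique"))
        seen_keys.add(key)
    return violations


# Illustrative records only.
contracts = [
    {"contract_id": "C1", "material_number": "MAT-1", "contract_amount": 125_000.0},
    {"contract_id": "C2", "material_number": "", "contract_amount": 9_000.0},
    {"contract_id": "C2", "material_number": "MAT-3", "contract_amount": -50.0},
]

violations = validate(contracts, RULES, key_field="contract_id")
for index, rule_name in violations:
    print(f"row {index}: violates '{rule_name}'")
```

Keeping each rule as a named function makes it straightforward to report which rule failed for which record, and the list of violations can feed a comprehensive quality diagnosis.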
6.3.3 DATA QUALITY ASSESSMENT FRAMEWORK
In addition to data profiling results, data quality assessment requires knowledge of electrical business
logic (such as electricity charge requirements and the relation between meter and measuring point) to
determine whether the results are correct. Thus, based on data profiling results and business logic
knowledge, data quality can be analyzed and evaluated with tools or program code to provide a better
basis for data quality analysis. A data quality assessment framework can be developed from the
timeliness, integrality, compliance, accuracy, and consistency dimensions [6, 13, 15]. Each dimension
can be described by several rules, as shown in the figure below. Some rules are given in detail as follows.
[Figure: data quality assessment framework. Recoverable labels for the Compliance dimension:
Non-Blank Rule; Type Rule; Format Rule; Dimensional Rule; Precision Rule; Range Rule; Code Rule]
Data timeliness rule: the generation and circulation of data should (always, or when certain
conditions are met) meet the timeliness requirements of management and use.
Record integrity rule: the number of records in the tested dataset should (always, or when
certain conditions are met) meet the business expectations.
Non-blank input rule: the tested data of the dataset should (always, or when certain conditions
are met) not be null.
Primary key rule: when a field of the tested dataset is the primary key, the value of the field
should uniquely identify a record.
Foreign key rule: when a field of the tested dataset is the foreign key, that field should (always,
or when certain conditions are met) reference the primary key of another data table.
Type rule: the data type of the tested dataset should (always or under certain conditions) meet
the field type requirements predefined by the business system.
Format rule: the format of the tested data in the dataset should (always or under certain
conditions) meet the field format requirements predefined by the business system.
Dimension rule: the dimension of the data should (always or under certain conditions) meet the
dimensional requirements predefined by the business system.
Precision rule: the precision of numerical data should (always or under certain conditions) meet
the precision requirements predefined by the business system.
Value-domain rule: data values of the tested dataset should (always or under certain conditions)
fall within a certain range; the range can be determined by one or more means such as the
data dictionary, business knowledge, and the distribution and variation of historical data.
Equivalent function rule: within the same data table, one data item should (always or under
certain conditions) be calculable from one or more other data items, and this equivalence
relationship must be in line with the business characteristics.
Logical function dependence rule: within the same data table, one data item should (always or
under certain conditions) satisfy some logical relationship (greater than, less than, earlier
than, later than, etc.) with one or more other data items, and this logical relationship must
be in line with the business characteristics.
Code rule: the value of the data from the tested dataset should (always or under certain
conditions) conform to the constraints of the source business system's design.
Equivalency consistency dependence rule: across different data tables, one data item should
(always or under certain conditions) be calculable from one or more data items in other data
tables, and this equivalence relationship must be in line with the business characteristics.
Logical consistency dependence rule: across different data tables, one data item should (always
or under certain conditions) satisfy some logical relationship (greater than, less than, earlier
than, later than, etc.) with one or more data items from other data tables, and this logical
relationship must be in line with the business characteristics.
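The rules above lend themselves to being expressed as small, composable predicates whose pass rates can be reported per rule. The following is a minimal sketch under stated assumptions: the rule factories, the field names (`energy_kwh`, `price`, `charge`, `start`, `end`), the charge formula, and the sample records are hypothetical and are not taken from the cited standards.

```python
from datetime import datetime

# Each factory returns a predicate: record (dict) -> True if the rule passes.
def non_blank(field):
    return lambda rec: rec.get(field) not in (None, "")

def type_rule(field, expected_type):
    return lambda rec: isinstance(rec.get(field), expected_type)

def value_domain(field, low, high):
    def check(rec):
        value = rec.get(field)
        return isinstance(value, (int, float)) and low <= value <= high
    return check

def logical_dependence(earlier, later):
    # "Earlier than or equal" relationship between two fields of one record.
    return lambda rec: rec[earlier] <= rec[later]

def equivalent_function(target, factor_a, factor_b, tol=1e-6):
    # Assumed business relation: target == factor_a * factor_b (within tol).
    return lambda rec: abs(rec[target] - rec[factor_a] * rec[factor_b]) <= tol

def assess(records, rules):
    """Return the pass rate (0.0 to 1.0) of each rule over the records."""
    return {
        name: sum(1 for rec in records if rule(rec)) / len(records)
        for name, rule in rules.items()
    }

# Hypothetical billing records.
records = [
    {"meter_id": "M1", "energy_kwh": 10.0, "price": 0.5, "charge": 5.0,
     "start": datetime(2018, 1, 1), "end": datetime(2018, 1, 2)},
    {"meter_id": "", "energy_kwh": -3.0, "price": 0.5, "charge": 2.0,
     "start": datetime(2018, 1, 3), "end": datetime(2018, 1, 2)},
]

rules = {
    "non_blank_meter_id": non_blank("meter_id"),
    "type_energy": type_rule("energy_kwh", float),
    "domain_energy": value_domain("energy_kwh", 0.0, 1.0e6),
    "start_before_end": logical_dependence("start", "end"),
    "charge_equals_energy_times_price":
        equivalent_function("charge", "energy_kwh", "price"),
}

pass_rates = assess(records, rules)
```

Keeping each rule as an independent predicate means rules can be added per dimension without touching the assessment loop. The cross-table rules would need the same pattern applied to joined records.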
The data analysis and inspection process reveals many data quality problems. Once the data quality
problems that may adversely impact the reliability and validity of analytics applications have been
identified, the next step is to design and implement measures to solve them [16, 17].
Figure 6-3 shows the main steps to correct data quality problems. The tasks involved in each of these
steps are described next.
[Figure 6-3: Main steps to correct data quality problems: impact assessment; correction and cleaning; scavenging of essential causes; monitoring and prevention.]
First, consider the following characteristics of the data quality problem [15, 18]:
Scope of the influence. That is, the extent to which business processes are impaired by the
problem.
Feasibility of correction. That is, the possibility of correcting the data quality problem in question.
Feasibility of prevention. That is, the possibility of eliminating the root cause of the problem or
identifying problems through continuous monitoring.
6.4.2 CORRECTION AND CLEANING
Correction and cleaning take place in almost all stages of data collection and storage, integration,
analysis, and application. Figure 6-4 shows the entire flow of power system data from collection to
application. The methods and the emphasis of data correction and cleaning differ from stage to
stage [6, 15].
Abnormality judgment and deviation correction are usually based on multi-source information from
different sampling times and different monitoring sources at related topology nodes, together with
business common sense [18].
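One simple way to picture a multi-source check is to compare a measurement against the median of readings from related sources and flag it when the deviation exceeds a tolerance. The sketch below is an assumption for illustration only: the brochure does not specify the method, and the function name, the tolerance, the median-replacement choice, and the voltage values are hypothetical.

```python
from statistics import median

def cross_check(value, reference_values, abs_tol):
    """Compare a measurement with the median of related sources.

    Returns (is_abnormal, corrected_value). Replacing an abnormal value
    with the reference median is an illustrative choice, not a
    prescribed correction method.
    """
    reference = median(reference_values)
    is_abnormal = abs(value - reference) > abs_tol
    return is_abnormal, (reference if is_abnormal else value)

# Hypothetical 10 kV bus voltage (kV) with readings from three
# related sources, e.g. other monitoring systems or adjacent nodes.
related = [10.2, 10.3, 10.25]

flagged, corrected = cross_check(10.9, related, abs_tol=0.3)
accepted, unchanged = cross_check(10.3, related, abs_tol=0.3)
```

The first reading deviates from the reference median of 10.25 kV by more than the tolerance and is replaced. The second is within tolerance and is kept as measured.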
b. ETL process
Raw data is rarely perfect, and there is a gap between the raw data and the final result. The data
usually needs to be cleaned, converted, and sorted by an ETL (extract, transform, load) process. ETL
comprises three main steps. The first is data extraction, which reads data from the original business
systems; these may run on different operating platforms or different databases. The second is data
conversion, which transforms data under a pre-defined set of rules, including field merging and
splitting, sorting, default value assignment, data aggregation, and so on. The third is data loading,
during which the converted data is loaded into the data warehouse [15, 19].
In the data conversion process, operations for resolving data quality problems include:
Data integrality check and incomplete data filling
Incorrect data check and repair
Duplicate data inspection and handling
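The three conversion-stage operations above can be sketched as a single cleaning pass. This is a minimal illustration under stated assumptions: the function name, the field names, the valid range, and the choice to fill or repair a value with the last valid reading are hypothetical, and the sketch handles a single meter's time-ordered series only.

```python
def clean_rows(rows, key_fields, value_field, low, high):
    """Drop duplicates, then fill missing and repair out-of-range values.

    Missing or out-of-range values are replaced by the last valid value
    (an illustrative strategy). Rows are assumed to be time-ordered.
    """
    seen = set()
    cleaned = []
    last_valid = None
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key in seen:
            continue  # duplicate data handling: keep the first occurrence
        seen.add(key)
        value = row.get(value_field)
        if value is None or not (low <= value <= high):
            value = last_valid  # incomplete or incorrect data: fill/repair
        else:
            last_valid = value
        cleaned.append({**row, value_field: value})
    return cleaned

# Hypothetical readings for one meter.
raw = [
    {"meter_id": "M1", "ts": 1, "kwh": 1.0},
    {"meter_id": "M1", "ts": 1, "kwh": 1.0},   # duplicate record
    {"meter_id": "M1", "ts": 2, "kwh": None},  # incomplete
    {"meter_id": "M1", "ts": 3, "kwh": -5.0},  # outside the valid range
    {"meter_id": "M1", "ts": 4, "kwh": 1.4},
]

cleaned = clean_rows(raw, ["meter_id", "ts"], "kwh", low=0.0, high=100.0)
```

If the first value in a series is invalid, this sketch leaves it as `None`, since no earlier valid value exists. A production pipeline would need an explicit policy for that case.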
In the ETL process, the data is cleaned, converted, and collated, and then stored in the data
warehouse. However, the ETL operations alone do not satisfy the data quality requirements of the
subsequent data analysis or business intelligence (BI) topics. Some data quality problems, such as
data mismatches and logical errors, may remain in the different application topics.
These problems should be corrected to satisfy the power business requirements. Thus, data quality
problems should be treated differently in the different phases of the power data lifecycle, as
shown in Figure 6-5 [6, 20-22].
To solve a quality problem at its root, it is necessary to analyze where the problem comes from
and where it is best repaired. Once the source and the best place for repair have been
identified, the process can be assessed and corrected to eliminate the essential causes of the
quality problem.
The possible essential causes include:
Poor communication channel conditions
Parameter setting errors
System running failures
Extraction and conversion errors
Terminal failures
Human factors
Lack of verification in software systems
Because data sources are diverse, data models are large, and conversion processes are complex,
data source tracking technology based on metadata management can help identify where a
problem originates and the best place to repair it.
Assessing and eliminating the quality problem causes can be considered from the following points:
Assess the workload of every candidate program.
Choose one as the repair program.
Determine the repair time.
Design the development plan.
Design the test plan.
6.4.4 MONITORING AND PREVENTION
If the workload of eliminating the above root causes goes beyond the organization’s capabilities,
resources, or requirements, then monitoring procedures should be established for known data quality
issues. When the monitoring procedures detect an error, the appropriate person can be notified to
take action to contain or stop the error until data processing returns to normal.
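A monitoring procedure for known issues can be as simple as running a fixed set of checks on each incoming batch and notifying a responsible person when any check fails. The sketch below is illustrative only: the function name, the check names, the `kwh` field, and the use of a list as the notification channel are hypothetical stand-ins for whatever alerting mechanism an organization uses.

```python
def run_monitor(batch, checks, notify):
    """Run each known-issue check on a batch and report the failures.

    `checks` maps a check name to a predicate (record -> bool).
    `notify` is any callable that accepts a message string.
    Returns a dict of check name -> number of failing records.
    """
    failed = {}
    for name, check in checks.items():
        count = sum(1 for record in batch if not check(record))
        if count:
            failed[name] = count
            notify(f"data quality check '{name}' failed for {count} record(s)")
    return failed

# Hypothetical checks for two known issues in meter readings.
checks = {
    "kwh_not_null": lambda r: r.get("kwh") is not None,
    "kwh_non_negative": lambda r: r.get("kwh") is None or r["kwh"] >= 0,
}

alerts = []  # stand-in for e-mail, paging, or ticketing
batch = [{"kwh": 1.0}, {"kwh": None}, {"kwh": -2.0}]
failed = run_monitor(batch, checks, alerts.append)
```

A caller could hold or quarantine the batch whenever `failed` is non-empty, so that the error does not propagate while the issue is investigated.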
6.5 CONCLUSIONS
Data quality has not been fully considered in the initial design phase of system function
development for power control systems and enterprise management systems. As a result, in the
process of data integration, analysis, and application, a series of problems such as data
inconsistency, inaccuracy, and incompleteness must be faced and handled. Resolving these problems
is extremely important for the system to achieve the desired effect in applications of data analytics
in system operations.
For data quality measurement, methods such as multi-source information matching, data
analysis, and rule testing can be used. For data quality correction, a set of techniques (issue
influence analysis; correction and cleaning; scavenging of essential causes; and monitoring and
prevention) has proven feasible for improving data quality so that it satisfies the needs of data
analysis and application.
DNV GL proposed that the data quality assessment and improvement process include defining the
scope, data exploration and profiling, data quality assessment, organizational maturity assessment,
data quality risk assessment, and risk-based data quality improvement [6]. These measures have a
positive effect on the enterprise’s data quality, maturity, and risk management.
6.6 REFERENCES
[1]. ISO 8000-8 Information and data quality: Concepts and measuring
[2]. ISO 9000 Quality management
[3]. Data quality assessment framework, DNVGL-RP-0497, Jan. 2017
[4]. Y. W. Lee, et al., Journey to data quality, MIT Press, 2006
[5]. X. Chen, et al., Integration of IoT with smart grid, IET ICCTA 2011, Beijing, China, 2011.
[6]. T. Zhao, et al. Data quality assessment and improvement techniques for power system, SGCC
Technical Report, 2017.
[7]. G. Liu, et al., Evolving graph based power system EMS real time analysis framework, IEEE
ISCAS 2018, Italy, 2018
[8]. H. Hang, Q.-L. Zhu, Development analysis and prospect of data quality control in smart grid,
Science & Technology Information, pp. 92-93, Jul. 2012
[9]. K.-Y. Liu, et al, Detection and evaluation of SCADA voltage data quality in distribution network
based on multi temporal and spatial information of multi data sources, Power System
Technology, pp. 3169-3175, Nov. 2015
[10]. NASPI, PMU Data Quality: A framework for the attributes of PMU data quality and a
methodology for examining data quality impacts to Synchrophasor applications, Mar. 2017
[11]. DAMA United Kingdom, The six primary dimensions for data quality assessment, Report, Oct.
2013
[12]. S. Sadiq, Handbook of data quality: research and practice, Springer, 2013
[13]. C. Batini, and M. Scannapieco, Data and information quality: Dimensions, principles and
techniques, Springer, 2016
[14]. S. Keller, et al., The evolution of data quality: understanding the transdisciplinary origins of data
quality concepts and approaches, Annual Review of Statistics and Its Application, vol. 4, pp 85-
108, 2017
[15]. Q/GDW 11570-2016, The common criteria of data quality evaluation based on power grid
operation data, Enterprise Standard of SGCC, 2016
[16]. D. Loshin, The practitioner's guide to data quality improvement, 2011.
[17]. H. Liu, et al, Research on the advanced computing method for supporting large data quality
assessment and improvement, Advances in Computer Science Research, Jan. 2017
[18]. Y.-W. Cheah, and B. Plale, Provenance quality assessment methodology and framework,
Journal of Data and Information Quality, vol. 5 (3), Feb. 2015
[19]. X. Chen, N. Li, F. Wu, and X. Li, Research on hierarchical information aggregation technology in
the smart grid Internet of Things, Telecommunications for Electric Power System, vol.32 (230),
pp.73-77, Dec. 2011
[20]. ISO 8000-110-2009-Data quality - part 110: Master data: exchange of characteristic data:
syntax, semantic encoding, and conformance to data specification.
[21]. K. Xing, et al, Mutual privacy-preserving regression modeling in participatory sensing, IEEE
INFOCOM 2013, Turin, Italy, Apr. 2013
[22]. W.H. Inmon, and D. Linstedt, Data Architecture: A Primer for the Data Scientist: Big Data, Data
Warehouse and Data Vault, Morgan Kaufmann, Nov. 2014
7. CONCLUSION
With increasing complexity and interconnectivity of the grid, the scope and complexity of maintaining
and increasing situational awareness have grown. As a consequence, there is the need to furnish
system operators and operation engineers with better tools and visualizations for assessing system
conditions, and for providing effective and timely decision-making and remedial reactions to an
incident. It is not enough to just understand the current state. Situational awareness also implies the
ability to foresee and anticipate system changes and their impact on system security.
The large variety of internal and external data sources available today to electric utilities
makes it possible to implement advanced data analytics and visualization technologies that
improve the way the system is operated and controlled. Analytics algorithms capable of synthesizing
actionable information from the raw data can be used to provide tools that use real-time data streams
to support fast, accurate, and adaptable decisions for solving critical problems at the right moment, as
well as to plan mitigation actions in advance for anticipated system security issues.
Even though the use of data analytics for power system operation support is not new, its widespread
use remains low. Hence, there is a need to examine how advanced data analytics technologies can be
further used to solve the emerging critical challenges in electric system operations.
This technical brochure provides an insight into how advanced data-analytics techniques and tools
that integrate various data sources can be used to improve situational awareness of power system
operators and support various operation functions. The content of this technical brochure is broken
down into the major areas comprising the development and implementation of data analytics tools,
which are: data and information sources, data analytic techniques to interpret these data, applications
of these analytics in system operations, data integration and modelling to integrate data into
operations, and data quality and validation.
Some relevant observations and takeaways from this work are as follows:
Utilities have started to realize the value and benefits of data analytics tools that integrate data
from various data sources. Several software tools have been developed to serve a variety of
functions to support system operation, including tools for system events detection, faults
identification and analysis, wide-area monitoring, equipment health monitoring and analysis,
trending and forecast of load, renewables and system conditions, and recommendation for
operation. There is a recognized and growing need to improve situational awareness for system
operators, as our industry survey also revealed. Advanced data management and analytics
can help fill this need. One of the challenges for successful implementation of such tools is the
difficulty of integrating data that is collected by and resides in different enterprise systems. Hence,
effective implementation of advanced analytics tools greatly depends on operational data-
management policies and technologies. Proper data models allow the definitions and
characteristics of the data to be clearly understood. Even though significant advances have been
made to improve data interoperability, more effective and accurate data models and procedures
are needed for ensuring data integrity and availability of the right data in the right format.
Another aspect that may hinder the implementation of data analytics solutions in system
operation is the lack of understanding of the value and accuracy of these technologies.
Traditionally, most of the tools used in control centers and operation engineering are based on
system models and simulations. Engineers understand the capabilities and limitations of those
tools, as well as the considerations that need to be observed to develop and validate the simulation
models. Data-analytics techniques that attempt to recognize and validate data patterns and trends
and draw conclusions therefrom may be less understood. The advantages of both approaches
(model-based and data-based methods) can be combined in hybrid methodologies to develop superior
technical approaches and software tools for use in system control rooms and to support various
operation functions. Those tools will combine conventional analytics techniques based on physical
models with heuristic data analytics and decision-making methodologies. For instance, simulation
engines would perform contingency analysis across a number of scenarios, which in turn will be
built with the help of data collected and integrated from a variety of sources. Data-analytics
techniques will then be used to extract relevant patterns from the simulation results, assess
vulnerability/risk, and classify critical conditions based on given risk criteria.
Effective use of these data sources in operation support tools relies upon the real-time exchange
of this data through a high-performance, reliable, secure, and scalable communication network
infrastructure. The trend toward integrated network architectures that accompanies
expanding smart grid investments will enable effective and reliable use of analytics tools that
integrate various real-time data streams.
It is widely recognized that visual analytics is key to improving operators' ability to understand the
system situation and make effective decisions. Visualization technologies and techniques have
advanced significantly since the first developments in the early 1980s. Best practices from within the
visualization industry and lessons learned from other data-intensive areas should be used. Newer
visualization platforms include many advanced features such as geographic-based dynamic
visualization with user-friendly interfaces, and real-time measurements and analytical results from
measurement-based and model-based tools that populate the system map. Visual aid strategies
such as color contouring, 2D and 3D bubbles and cones, animation, geospatial representation,
display profiles, and integrated system views are widely used in newer visualization tools. The
most significant trend in new visualization is the integrated space-time concept, which is intended
to help operators assess current situations in a static fashion and understand and visualize
the conditions the system is evolving into, so that they are better prepared to implement effective
control actions.
There is a fundamental need to understand the challenges and requirements that system
operators are experiencing in terms of the main goals that drive their actions. Operators may
need to reconcile multiple objectives and answer questions that may sound conflicting, such as:
o Is my goal purely economic?
o Is my goal purely focused on maintaining system security at any cost?
It is not possible to utilize and deliver effective data-analysis tools and associated visualization
without challenging these two key drivers.
Another paradigm shift is also required in the approach for presenting data and information to the
operator. The common current approach is “show the user lots of data,” but the
rationale behind such a simplistic strategy has never been clear. Perhaps the concept is that by
providing the user all the data available there is less chance of filtering out important information, or
possibly analytics tools used so far to process raw data have not provided successful
results. Regardless of the cause, it is clear that the current approach will not improve situational
awareness of system operators. This shift in thinking needs to be mainly based on timescales:
o What can we tell from historic data? (have we been here before?)
o Now and near now
o Future (plus how long into the future do you need to understand)
Real-time operations need to move away from a simplistic deterministic approach towards a
decision-making process based on probabilistic scenario/contingency models, and by harnessing
the insights provided by effective data analytics combined with advanced visualization.
Currently, there is no single data-analytics solution for supporting operator decision-making
that will fit all possible scenarios and requirements of modern power systems, but there is a
potential for significant improvement as the data-analytics technologies continue to evolve.
We expect that this technical brochure will be a positive step towards enabling readers to
understand such potential and the complexities involved in the development and
implementation process.