Highlights
Reviews li-ion: voltage, arc-flash, fire, and vent gas combustion and toxicity.
Reviews Probabilistic Risk Assessment (PRA) for safety engineering li-ion systems.
Presents Systems-Theoretic Process Analysis (STPA) as alternative to PRA.
Presents research applying STPA to a li-ion grid energy storage system.
Concludes STPA may be more cost effective than PRA for li-ion systems.
Article info
Article history:
Received 1 July 2015
Accepted 16 September 2015
Available online xxx
Abstract
As grid energy storage systems become more complex, it grows more difficult to design them for safe
operation. This paper first reviews the properties of lithium-ion batteries that can produce hazards in
grid scale systems. Then the conventional safety engineering technique Probabilistic Risk Assessment
(PRA) is reviewed to identify its limitations in complex systems. To address this gap, new research is
presented on the application of Systems-Theoretic Process Analysis (STPA) to a lithium-ion battery based
grid energy storage system. STPA is anticipated to fill the gaps recognized in PRA for designing complex
systems and hence be more effective or less costly to use during safety engineering. It was observed that
STPA is able to capture causal scenarios for accidents not identified using PRA. Additionally, STPA enabled
a more rational assessment of uncertainty (all that is not known), thereby promoting a healthy skepticism
of design assumptions. We conclude that STPA may indeed be more cost effective than PRA for safety
engineering in lithium-ion battery systems. However, further research is needed to determine if this
approach actually reduces safety engineering costs in development, or improves industry safety
standards.
© 2015 Published by Elsevier B.V.
Keywords:
Energy storage
Battery
Safety
Lithium-ion
STAMP
STPA
1. Introduction
Controlling the potential hazards that lithium-ion batteries can
pose has been a challenge since their market introduction by Sony
in 1991 [1]. Lithium-ion batteries, while inert and non-hazardous in
most contexts, have the following properties that can develop
hazardous conditions: voltage [2], arc-flash/blast potential [2], fire
potential [1,3], vented gas combustibility potential [4], and vented
gas toxicity [3]. While this is not a comprehensive list (for example,
weight could also produce a hazard), these are properties that are
somewhat unique to lithium-ion batteries and become more
* Corresponding author.
E-mail addresses: dmrose@sandia.gov (D. Rosewater), adwill@mit.edu (A. Williams).
http://dx.doi.org/10.1016/j.jpowsour.2015.09.068
0378-7753/© 2015 Published by Elsevier B.V.
Nomenclature
Accident: an undesired or unplanned event that results in a loss
CESS: Community Energy Storage System
Hazard: a system state, or set of conditions that, together with a particular set of worst-case environmental conditions, will lead to an accident
Loss: any unacceptable outcome (loss of life or injury, damage to property, loss of mission, loss of data, loss of investment, damage to reputation, etc.)
PRA: Probabilistic Risk Assessment
Risk: the effect of uncertainty on outcomes
Safety: freedom from accidents (loss events)
STAMP: System-Theoretic Accident Model and Processes
STPA: System-Theoretic Process Analysis
System: a set of components, including mechanical; electrical; computer; human; organizational; and societal elements, along with the connections between components that together form a complex whole
but this requires that qualified workers have high-level work
authorization in addition to adequate shock Personal Protective
Equipment (PPE) and insulated tools.
1.1.2. Arc-flash/blast
High string voltage affects both the potential for shock and the
potential for arc-flash/blast. Equations (1) and (2) show the
maximum power point method for calculating the incident energy
in a DC arc-flash [2]. Incident energies calculated by these equations are
described as conservatively high [2], and other methods are being
explored for calculating and classifying the potentially harmful energy in a DC arc-flash [5]. Arc-blast results from explosive components of an electric arc (e.g., vaporized copper) and depends greatly
on the equipment and environment involved in the arc. Common
controls to prevent arc-flash include increasing separation between
positive and negative conductors, regular maintenance to prevent
equipment failure, and arc-rated PPE for electrical workers.
$$I_{arc} = 0.5 \, I_{bf} \qquad (1)$$
$$IE = \frac{0.01 \, V_{sys} \, I_{arc} \, T_{arc}}{D^{2}} \qquad (2)$$
Where:
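As a hedged numeric illustration of Equations (1) and (2), the following Python sketch evaluates the maximum power point method for one set of inputs. The symbol meanings and units are assumptions based on how this method is commonly stated (I_bf as bolted-fault current in amperes, V_sys in volts, T_arc as arcing time in seconds, D as working distance in centimeters, IE in cal/cm²). The function name and all input values are hypothetical and do not describe any system analyzed in this paper.

```python
def dc_arc_incident_energy(v_sys, i_bf, t_arc, distance_cm):
    """Estimate DC arc-flash incident energy with Equations (1) and (2).

    Assumed units: volts, amperes, seconds, centimeters; result in cal/cm^2.
    """
    i_arc = 0.5 * i_bf  # Equation (1): arcing current at the maximum power point
    return 0.01 * v_sys * i_arc * t_arc / distance_cm ** 2  # Equation (2)


# Hypothetical inputs: 600 V string, 10 kA bolted-fault current,
# 0.1 s arc duration, 45.72 cm (18 in) working distance.
ie = dc_arc_incident_energy(v_sys=600.0, i_bf=10_000.0, t_arc=0.1, distance_cm=45.72)
print(f"Estimated incident energy: {ie:.2f} cal/cm^2")
```

Because incident energy scales with the inverse square of distance in this model, doubling the working distance reduces the estimate by a factor of four, which is consistent with conductor separation and working distance being listed among the common controls.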
Analysis (STPA) and its application to a lithium-ion battery system.
We argue that framing hazard analyses to emphasize uncertainty
in the ways that component interactions violate safety constraints
can help to overcome costly systematic biases that are reinforced
by the conventional perspective. Systematically identifying and
eliminating the ways that hazards can develop allows safety to
be ensured more efficiently than trying to prove safety through the
collection and analysis of historical data. A brief discussion is also
included on how this perspective could impact the way safety is
represented, and therefore publicly perceived, promoting a better
understanding of uncertainty and a more rational approach to risk
management.
1.1.3. Fire
Thermal runaway is a chemical process where self-heating in a
battery exceeds the rate of cooling, causing high internal temperatures, melting, off-gassing/venting, and in some cases, fire or explosion. Causes of thermal runaway include mechanical, electrical,
and thermal abuse; internal short circuit from manufacturing defects; and the development of metallic dendrites that form an internal short over time [1,6,7]. Reactivity¹ level is measured on a
scale between 0 and 7, shown in Table 1. The reactivity¹ level in
thermal runaway can vary greatly depending on chemistry, concentrations, additives, cell design, cell conditions (such as state
of charge (SOC) or state of health (SOH)), and environmental conditions [1,6,8]. At very high reactivity¹ levels (5–7) the cells can
produce heat rapidly enough to catch fire, rupture, or explode.
Controls for lithium-ion battery fires can be divided into three
classes: abuse testing, battery management design, and emergency
systems. Abuse testing exposes a representative sample of cells to
the worst-case environmental conditions they would be expected to see
during both use and foreseeable misuse, thereby establishing the
limits of safe operation [8]. Many abuse testing standards exist
[9–17], each with different intended environments and use conditions. Designers then impose these limits in products, often
through the application of a Battery Management System (BMS).
There are many challenges in designing a BMS to detect and respond to
the violation of environmental or use limits [18]. When fires do
occur, emergency systems use warnings, alarms, fire suppression,
or other response mechanisms to mitigate the scope of damage
from the fire. Fire detection and suppression systems are used in
¹ The term "Reactivity" is used in place of "Hazard" because the source uses a conflicting
definition of hazard.
Table 1
Reactivity¹ levels and descriptions (adapted from Ref. [8]).
Reactivity¹ level    Description                     Classification criteria
0                    No effect
1                    Passive protection activated
2                    Defect/Damage
3
4
5
6
7
$$E_{LEL} = \frac{V_{room} \cdot LEL \cdot E_{cell}}{V_{runaway}} \qquad (3)$$
Where:
E_LEL = Minimum energy of lithium-ion batteries required to reach LEL (Wh)
V_room = Volume of the room (liters)
LEL = Lower explosive limit of vent gas (concentration by volume %)
E_cell = Energy of tested cell (Wh)
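To make Equation (3) concrete, the following Python sketch computes the battery energy needed to bring a room to the lower explosive limit. Two assumptions are made here that the text above does not state: V_runaway is taken to be the volume of vent gas (in liters) released by the tested cell during thermal runaway, and LEL is entered in percent and converted to a volume fraction. The function name and all numeric inputs are hypothetical.

```python
def energy_to_reach_lel(v_room_l, lel_percent, e_cell_wh, v_runaway_l):
    """Equation (3): minimum battery energy (Wh) whose vent gas reaches the LEL.

    v_runaway_l is ASSUMED to be the vent gas volume (liters) measured
    for one tested cell of energy e_cell_wh.
    """
    lel_fraction = lel_percent / 100.0           # percent by volume -> fraction
    gas_needed_l = v_room_l * lel_fraction       # vent gas volume that fills the room to LEL
    return gas_needed_l * e_cell_wh / v_runaway_l  # scale by energy per liter of vent gas


# Hypothetical inputs: 50 m^3 room (50,000 L), 6% LEL,
# a 10 Wh test cell that vents 5 L of gas.
e_lel = energy_to_reach_lel(v_room_l=50_000.0, lel_percent=6.0,
                            e_cell_wh=10.0, v_runaway_l=5.0)
print(f"Energy to reach LEL: {e_lel:.0f} Wh")
```

The estimate is linear in room volume, so halving the enclosure volume halves the stored energy at which vented gas could become combustible under these assumptions.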
Table 2
Estimated battery energy to reach the IET and FLET values for the NO, CO, HCl, SO2
and HF toxic gases (exposure time of 60 min, fire occurring in a 50 m³ room)
(adapted from Ref. [3]).
(Wh)    HF     CO      NO      SO2     HCl
IET     60     290     280     530     1320
FLET    110    1140    2080    4710    7880
$$OR = BAR \times B_{tonmiles} \qquad (4)$$
Where:
OR = Occurrence Rate (accidents per 10 year period 2012–2021)
BAR = Battery Accident Rate (1.99 × 10⁻⁹: historical battery accidents per ton shipped, per mile)
B_tonmiles = Battery Ton-Miles (2,063,769,370: estimated tons of batteries to be shipped × estimated miles through the air in 10 year period 2012–2021)
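The arithmetic behind Equation (4) can be checked with a short Python sketch. The exponent of the accident rate is read here as 10⁻⁹, which is an assumption: it is the only sign that yields a plausible number of accidents for roughly two billion ton-miles. The function name is hypothetical.

```python
def occurrence_rate(bar, b_tonmiles):
    """Equation (4): expected accidents over the period covered by b_tonmiles."""
    return bar * b_tonmiles


BAR = 1.99e-9               # accidents per ton-mile (exponent sign ASSUMED negative)
B_TONMILES = 2_063_769_370  # estimated battery ton-miles, 2012-2021

rate = occurrence_rate(BAR, B_TONMILES)
print(f"Estimated accidents in the 10 year period: {rate:.2f}")  # prints 4.11
```

The result is a single point estimate of about four accidents per decade. It carries no information about how uncertain either input is.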
While such calculations do produce an estimate of risk, this
mathematical reduction could be misleading given the magnitude
of the uncertainty in both probability and severity observations. As
the designs of lithium-ion batteries continue to evolve,
manufacturing processes change, and shipping regulations are
applied, the underlying risk will change and historical observations
may no longer predict future occurrence rates. Also, the costs
directly associated with an accident do not begin to capture the
damage to the reputation of, and confidence in, the airline, the battery
manufacturer, and the shipping and battery industries. The uncertainty inherent in these quantities is clouded by the apparent
clarity of calculated risk.
Second, the NTSB reported on the fire in the Boeing 787 APU
battery in 2013. According to the report, designers calculated that
the likelihood of occurrence of a cell venting was 1 in 10 million
flight hours. This estimate was based partly on the available data
from the battery supplier that 14,000 similar cells had been used in
industrial applications for significant time without incident [33]. At
the time of the 2013 fire, the actual occurrence of venting in the
APU design was 2 in 52,000 flight hours [33]. As previously discussed, the causes for thermal runaway fire can be a complex
combination of factors spanning materials, manufacturing, shipping, installation, and use environment. The effect of each of these
factors is difficult to account for in new applications as they can
combine in non-linear and unanticipated ways. Not only is the
calculation of risk made more difficult by the intricacies of thermal
runaway in lithium-ion batteries, but organizational complexities can also
make the mechanisms that compound risk in system design more
difficult to predict. Indeed, the NTSB cited manufacturing defects as
the root cause of the fire, but also identified the effects of integration, thermal management, testing, design reviews, and regulatory oversight [33]. Boeing's observation and subsequent
calculation of risk may have been fully accurate based on the
available information at the time, but this metric may not have
adequately represented the underlying effects of uncertainty on the
real-world system.
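The size of the gap between the predicted and observed venting rates can be worked out directly from the two figures quoted above. The Python sketch below is only that arithmetic; the variable names are hypothetical, and the observed rate rests on just two events, so it should be read as an order-of-magnitude indication rather than a precise estimate.

```python
# Predicted rate from the design analysis: 1 venting event per 10 million flight hours.
predicted_per_hour = 1 / 10_000_000

# Observed rate at the time of the 2013 fire: 2 venting events in 52,000 flight hours.
observed_per_hour = 2 / 52_000

# How many times higher the observed rate was than the prediction.
ratio = observed_per_hour / predicted_per_hour
print(f"Observed rate is roughly {ratio:.0f} times the predicted rate")
```

Under these figures the observed rate exceeds the prediction by more than two orders of magnitude.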
In these examples, contextual factors make some observations of
risk quantities inaccurate and the ways that risk compounds in systems difficult to predict. While PRA may be robust enough to
distinguish between these different types and qualities of risk, as
humans we are subject to the "anchoring" and "what you see is all
there is" heuristics of risk estimation described by the psychologist
and Nobel laureate Daniel Kahneman [50]. These cognitive biases
make invalidating or ignoring available but inaccurate data and
seemingly deterministic mechanisms especially difficult. This
happens because it is cognitively easier to trust available information than to ignore it or look for where the information is incomplete.
Confirmation bias also affects how risk data are interpreted, in that it is cognitively easier to accept new data that
confirms, rather than conflicts with, what we already believe [50].
As PRA is structured around available data and deterministic
mechanisms, it can be inferred that its use reinforces the biases that
may, in complex systems, lead to mischaracterization of
uncertainty.
Instead of focusing on the potential for mischaracterization, PRA
can perhaps be better assessed on the basis of its claim of cost-effective usefulness, that it improves safety even without accuracy, for which there is some support [51]. It could be argued,
however, that the cost-effective usefulness of risk management
generally could benefit from a lack of probability and severity calculations where any such figures would be more misleading than
helpful. Indeed, Aven and Zio write that "the motivation for the
qualitative analysis is the acknowledgment and belief that the full
scope of the risks and uncertainties cannot be transformed to a
mathematical formula" and in such systems "Numbers can be
generated but would not alone serve the purpose of the risk
assessment" [52]. This suggests that for lithium-ion energy storage
systems, where risk quantities are difficult to observe and compound in ways that are difficult to predict, a
robust and non-quantitative method for safety engineering could
be more useful or less costly than PRA. Such a method could
encourage a healthy skepticism of our knowledge, help us to
overcome our cognitive biases, and enable us to make design decisions informed by all that we do not know. This paper presents
the application of a non-quantitative safety engineering method to
a lithium-ion battery system in order to assess the plausibility of
this claim.
2. A systems perspective on safety
System-Theoretic Accident Model and Processes (STAMP) provides a new model of causality for analyzing (and designing
against) accidents, especially those involving complex, sociotechnical systems [47]. This model has been applied successfully
to complex problems in many high consequence industries
including aviation [53,54], space [55,56,54], automotive [57,58],
medical [59,60], security [61,62], and nuclear power [61,63]. STAMP
argues that in today's complex world safety is best understood in
terms of interrelated components needing to maintain dynamic
equilibrium. In order to prevent accidents, the system must enforce
safety constraints by adapting to changes in itself or its environment [47].
As such, safety can be described and analyzed as an emergent
system property that results from adequate system-wide enforcement of design constraints through control actions. In this causality
model, losses are considered the result of flawed interactions between physical components, engineering activities, operational
mission, organizational structures and social factors [64]. Further,
losses occur when the system enters a hazardous state (e.g.,
buildup of explosive vent gases) and experiences an additional
challenge external to the hardware system, such as an environmental or human event scenario (e.g., technician opens the door
and causes a spark) [47]. Rather than focus on estimating risk of
such external scenarios or events (like PRA-based analyses), STAMP
emphasizes identifying and manipulating the elements of a system's design that can be controlled. This model shifts the analytical
paradigm from preventing failures to enforcing safety control actions [47].
This paradigm shift has its origins in systems and control theory.
Systems theory introduces two concepts useful for battery system
safety: hierarchy and emergence. Hierarchy refers to understanding
the fundamental differences and relationships between levels of
complexity within a system, including identifying what generates,
separates and links each level. Emergence refers to the phenomenon by which behaviors at a given level of complexity are irreducible to the behavior or design of its component parts. Such
emergent properties act as constraints on the actions of
sections, a CESS will be prepared for STPA by establishing the system accidents, hazards, and control structure. Then one control
loop within this structure will be analyzed in detail to exemplify the
application of STPA Steps 1 and 2. The discussion section will then
provide an assessment of the positive and negative attributes of the
analysis approach.
3.1. System characterization
Before STPA can be performed, the system-level losses to prevent
must be clearly defined. Working with the vendor, three unacceptable losses were identified: injury or death, damage to property, and cost overruns and damage to reputation with customers.
Table 3 shows these accidents along with descriptions detailed
enough to establish pass/fail criteria. While these criteria are
helpful for the system in question, pass/fail criteria are not necessary for STPA. Unacceptable losses can often be unique to a specific
organization or project, so it is important to capture them each time
the analysis is performed.
With the accidents to prevent clearly defined, the system's
hazardous states can be derived. These hazards are developed by
tracing the losses in Table 3 to the five properties of lithium-ion
batteries that can develop a hazard as discussed in the introduction: voltage, arc-flash/blast potential, fire potential, vented gas
combustibility potential, and vented gas toxicity. This process
promotes a complete list of hazardous states but does not, by itself,
ensure comprehensiveness. It is vital to be skeptical of seemingly
firm knowledge of how each of these losses could come about and
Table 3
STAMP unacceptable losses for CESS safety.
Loss    Description
L1      Injury or death (Shock, electrocution, burn, smoke inhalation, or any other event related to life safety in excess of that expected for similar electrical equipment installed in one- or two-family dwellings)
L2      Damage to property (Fire that spreads, explosion, or any other event or condition that causes damage to property outside of the unit itself)
L3      Cost overruns and damage to reputation with customers (No more than 10% of the initial distribution require service by a technician within the first year)
Table 4
STAMP potential hazardous states of CESS.
Hazard
Description
H1
H2
H3
H4
H5
H6
Table 5
Safety responsibilities, constraints, and control actions.
Name
Responsibilities
Constraints
Control actions
EMS
Power commands must not drive battery module, Inverter, or other elements
into an unsafe state
Battery current must not push battery into voltage, temperature, SOC or other
condition exceeding present limits
Access to string current and each cell voltage and temperature must be
uninterrupted and allow for accurate data to be collected
Data on highest cell voltage, lowest cell voltage, highest cell temperature,
lowest cell temperature, string current, and the battery module warning/
alarm status must be accurate and timely
Inverter
Battery
module
BMS
Table 6
Example unsafe control actions.
Control action
Control needed
Control provided
and not provided
Voltage measurement
Voltage measurement stuck
delay
(2a) Voltage transmission Voltage transmission stuck
delay
Inaccurate voltage
Inaccurate highest cell
voltage
needed and not provided, provided, provided too early or too late,
or provided for too long or not long enough. The system's logical
unsafe control actions were then analyzed for the causal scenarios
involving other system components, control actions, and environmental factors that could contribute to their development.
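The four ways in which a control action can be unsafe lend themselves to a systematic enumeration. The Python sketch below pairs each control action with each of the four types named above to produce a checklist of candidate unsafe control actions for an analyst to review. It is an illustration of the bookkeeping only, not the authors' tooling; the function name and the two control action names are hypothetical.

```python
from itertools import product

# The four ways a control action can be unsafe, as listed in the text.
UCA_TYPES = (
    "needed and not provided",
    "provided",
    "provided too early or too late",
    "provided for too long or not long enough",
)


def enumerate_uca_prompts(control_actions):
    """Pair every control action with every unsafe-control-action type."""
    return [
        f"{action}: {uca_type}"
        for action, uca_type in product(control_actions, UCA_TYPES)
    ]


# Hypothetical control actions, loosely named after elements of the CESS.
prompts = enumerate_uca_prompts(["EMS power command", "BMS cell voltage report"])
for prompt in prompts:
    print(prompt)
```

Each generated line is a prompt, not a finding: the analyst still has to judge whether that combination can lead to a hazard in the context of the system and then develop the causal scenarios behind it.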
The causal scenarios developed from STPA provided holistic
insight into how accidents might happen in the CESS. For example,
the analysis showed how EMS software updates that change the
battery process model can cause the EMS to provide unsafe control
actions and possibly allow thermal runaway to develop in the
battery. It is important to recognize that nothing in this scenario
has a probabilistic mechanism of failure and so generally would not
be accounted for in a PRA-based analysis. Understanding the
complex causes of accidents promotes design changes that act
systematically. In the example above this means that designers can
choose the right combination of version control, database management, software testing, oversight, technician training, signage,
and informational/hardware redundancy to assure that a new
process model will not provide unsafe control. If scenarios are
identified early enough in the design process, then even architectural changes can be made to eliminate accident scenarios altogether. These insights provide designers with a more complete picture of how
to avoid accidents such that cost and performance optimization can
be performed without compromising safety.
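The software update scenario can be illustrated with a toy model in which no component fails, yet the controller issues an unsafe action because its process model no longer matches the battery. The following Python sketch is a deliberately simplified assumption-laden example: the class, the function, the voltage limits, and the measured value are all hypothetical and are not taken from the CESS analysis.

```python
from dataclasses import dataclass


@dataclass
class ProcessModel:
    """What the EMS believes about the battery (hypothetical, one parameter)."""
    max_cell_voltage: float  # volts


def ems_allows_charging(model, measured_cell_voltage):
    """The EMS permits charging while the cell is below its believed limit."""
    return measured_cell_voltage < model.max_cell_voltage


TRUE_MAX_CELL_VOLTAGE = 4.20  # hypothetical real limit of the installed cells

model_before_update = ProcessModel(max_cell_voltage=4.20)
model_after_update = ProcessModel(max_cell_voltage=4.35)  # update written for other cells

measured = 4.25  # accurate measurement; sensor and BMS work as designed

allowed_before = ems_allows_charging(model_before_update, measured)
allowed_after = ems_allows_charging(model_after_update, measured)

# Unsafe control action: charging is commanded although the true limit is exceeded.
unsafe_after_update = allowed_after and measured >= TRUE_MAX_CELL_VOLTAGE
print(allowed_before, allowed_after, unsafe_after_update)
```

Every element behaves exactly as specified, so there is no failure probability to assign. The hazard arises from the mismatch between the process model and the controlled process, which is why controls such as version control and software testing appear among the design responses.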
In addition to the benefits of applying these techniques to specific design challenges, a systems perspective on safety could have a
positive impact on the energy storage industry at large, where there
is a narrow focus on battery safety. The analysis in this paper has
demonstrated that the batteries themselves are only one small
piece of a much larger safety picture in a battery energy storage
system. While it is a semantic distinction, using the term "battery
safety" narrows the public's perspective on what design choices
affect safety. Shifting usage to "battery system safety" or equivalent
terminology more appropriately distributes the perceived responsibilities for safe design between battery development and
integration engineering. If language is the medium through which
humans provide control actions, then a breakdown in language
could lead to hazardous system states. Given this potential, one
could speculate that a concerted, industry-wide effort to better
communicate how lithium-ion battery systems can be designed
safely would help break down safety concerns that are now a barrier
to market growth.
Future work on these techniques will include further development and application of STPA to analyze safety in energy storage
systems, the application of Causal Analysis using Systems Theory
(CAST) to analyze accidents, and working to make these abstract
techniques more accessible to energy storage manufacturers, integrators, and customers. This effort is working toward, in the long
term, a large-scale cost/effectiveness comparison between PRA and
STPA for energy storage technologies.
Acknowledgments
This work was funded by the US DOE OE's Energy Storage Program. The authors would like to thank Dr. Imre Gyuk for his support
of research advancing safety in grid energy storage. Special thanks
to Dr. Katrina Groth, Dr. Summer Ferreira, and Dr. Josh Lamb for
reviewing and editing the content of this article.
Sandia National Laboratories is a multi-program laboratory
managed and operated by Sandia Corporation, a wholly owned
subsidiary of Lockheed Martin Corporation, for the U.S. Department
of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
References
[1] T. Reddy, D. Linden (Eds.), Linden's Handbook of Batteries, fourth ed., McGraw
Hill, 2011.
[2] NFPA70E Standard for Electrical Safety in the Workplace.
[3] P. Ribiere, S. Grugeon, M. Morcrette, S. Boyanov, S. Laruelle, G. Marlair,
Investigation on the fire-induced hazards of Li-ion battery cells by fire calorimetry, Energy Environ. Sci. 5 (2012) 5271–5280.
[4] Q. Horn, Failure modes unique to large format cells and battery systems, in:
NAATBatt Annual Meeting and Symposium, January 2014.
[5] L. Gordon, K. Carr, N. Graham, A Complete Electrical Arc Hazard Classification
System and its Application, Los Alamos National Laboratory, LA-UR-14-29516.
[6] L. Florence, White Paper on Safety Issues for Lithium-ion Batteries, Underwriters Laboratories, 2012. URL http://www.ul.com/global/documents/newscience/whitepapers/firesafety/FS_Safety%20Issues%20for%20Lithium-Ion%20Batteries_10-12.pdf.
[7] Z. Zhang, The main cause of li-ion safety and internal shorts, in: Battery Safety
Conference, The Knowledge Foundation, Washington DC, 2014.
[8] D.H. Doughty, C.C. Crafts, FreedomCAR Electrical Energy Storage System Abuse
Test Manual for Electric and Hybrid Electric Vehicle Applications, Tech. Rep.
SAND2005-3123, Sandia National Laboratories, 2005.
[9] UL1642 Standard Lithium-Ion Batteries.
[10] UL2054 Alkaline Cell or Lithium/Alkaline Packs.
[11] United Nations, Recommendations on the Transport of Dangerous Goods,
Manual of Tests and Criteria: 38.3 Lithium Metal and Lithium Ion Batteries.
[12] IEC62281 Safety of Primary and Secondary Lithium Cells and Batteries During
Transport.
[13] IEC62133 Secondary Cells and Batteries Containing Alkaline or Other Non-Acid Electrolytes – Safety Requirements for Portable Sealed Secondary Cells,
and for Batteries Made From Them, for Use in Portable Applications.
[14] ANSI C18.3M, Part 2-2011 American National Standard for Portable Lithium
Primary Cells and Batteries Safety Standard.
[15] SAE J2929 Electric and Hybrid Vehicle Propulsion Battery System Safety
Standard – Lithium-based Rechargeable Cells.
[16] IEEE1625 2008 Standard for Rechargeable Batteries for Multi-cell Mobile
Computing Devices.
[17] IEEE1725 2011 Standard for Rechargeable Batteries for Cellular Telephones.
[18] L. Lu, X. Han, J. Li, J. Hua, M. Ouyang, A review on the key issues for lithium-ion
battery management in electric vehicles, J. Power Sources 226 (2013) 272–288.
[19] Advancion: features and specs, (accessed 01.22.15.). URL http://www.aesenergystorage.com/advancion/features-specs/#safety-climate.
[20] NFPA11: Standard for Low-, Medium-, and High-expansion Foam, 2010.
[21] NFPA12: Standard on Carbon Dioxide Extinguishing Systems.
[22] NFPA 12a: Standard on Halon 1301 Fire Extinguishing Systems.
[23] NFPA 13: Standard for the Installation of Sprinkler Systems.
[24] NFPA 750: Standard on Water Mist Fire Protection Systems.
[25] NFPA 850: Recommended Practice for Fire Protection for Electric Generating
Plants and High Voltage Direct Current Converter Stations.
[26] NFPA 2001: Standard on Clean Agent Fire Extinguishing Systems.
[27] J. Jeevarajan, Can cell-to-cell thermal runaway propagation in lithium-ion
modules be prevented, in: Battery Safety Conference, The Knowledge Foundation, Washington DC, 2014.
[28] J.S. McLaughlin, Risks of lithium batteries in air transportation, in: Battery
Safety Conference, The Knowledge Foundation, Washington DC, 2014.
[29] J. Lamb, C. Orendorff, L. Steele, S. Spangler, Failure propagation in multi-cell
lithium ion batteries, J. Power Sources 283 (June 2015) 517e523.
[30] K. Marr, V. Somadepalli, Q. Horn, Explosion hazards due to failure lithium-ion
batteries, in: Global Congress on Process Safety, 2013.
[31] NFPA 68: Standard on Explosion Protection by Deflagration Venting.
[32] NFPA 69: Standard on Explosion Prevention Systems.
[33] Aircraft Incident Report, Auxiliary Power Unit Battery Fire Japan Airlines
Boeing 787-8, JA829J, National Transportation Safety Board, 2014. Tech. rep.,
URL http://www.ntsb.gov/investigations/AccidentReports/Reports/AIR1401.pdf.
[34] Q. Wang, P. Ping, X. Zhao, G. Chu, J. Sun, C. Chen, Thermal runaway caused fire
and explosion of lithium ion battery, J. Power Sources 208 (2012) 210–224.
[35] Battery Room Fire at Kahuku Wind-Energy Storage Farm, (accessed 01.22.15.).
URL http://www.greentechmedia.com/articles/read/Battery-Room-Fire-at-Kahuku-Wind-Energy-Storage-Farm.
[36] Questions and Answers Concerning the NAS Battery Fire, (accessed 01.22.15.).
URL http://www.ngk.co.jp/english/announce/111031_nas.html.