The U.S. electric grid, which was designed more than 100 years ago, consists of control systems and field equipment. The grid was originally designed with large central station generation – for example, coal, oil, nuclear, natural gas, hydro – with transmission and distribution substations to deliver electricity to the end customer. As central station power plants are being supplanted by renewable generation from sources such as solar and wind, control systems are becoming more complex and increasingly vulnerable to cyberthreats, especially as the control system networks are connected to the internet.
The grid operated effectively, though not necessarily as efficiently as possible, before Internet Protocol (IP) networks and Microsoft Windows-based human-machine interfaces (HMIs) existed; these technologies are only about 15 to 20 years old. The grid can operate without the internet, but the internet cannot operate without power.
IP networks and the associated HMI and supervisory control and data acquisition (SCADA) systems provide extended situational awareness, improved productivity, and the capability to control various forms of generation. However, they also bring a cybersecurity threat. For many people, grid cybersecurity means preventing a network compromise that can lead to short-term outages of hours to days. Short-term outages from natural disasters and equipment failures have been addressed by having redundant systems and equipment spares and by utilizing mutual aid from other utilities when needed. However, mutual aid may not be available after a cyberattack against the grid.
Control System Vulnerabilities
The control systems used to monitor and control power plants and substations do not have adequate cybersecurity, nor do the utilities have adequate control system cybersecurity policies and procedures. Additionally, physical security, IT security, and business continuity policies must be coordinated with control system cybersecurity policies to ensure safe operation and recovery. These concerns are not limited to the electric grid: chemical plants, pipelines, water systems, manufacturing, and transportation all use similar equipment from the same suppliers with the same cyber vulnerabilities.
Control system cyberthreats are important because cyberattacks could physically damage critical equipment (as if from explosives) – such as generators, transformers, and pumps – which can lead to long-term outages (months to years). Damaging this equipment could also lead to injuries to utility personnel and first responders. Control system cyberthreats include network vulnerabilities and engineering system vulnerabilities. Addressing network cyber vulnerabilities is necessary but not sufficient to protect the grid from cyberthreats.
The Aurora hardware demonstration in March 2007 at the Idaho National Laboratory is an example of a cyberattack that does not involve network malware but damages critical substation equipment and other hardware. This type of attack can bring down the grid – as well as any facilities that are connected to the affected substations – for 9 to 18 months. The damage occurs in milliseconds, and only specifically designed hardware can prevent it.
The details of the Aurora vulnerability were made public and have been on hacker websites for several years. The 2015 Ukrainian cyberattack carried out the first of Aurora's two steps. If the attackers had reclosed the substation breakers they had opened, the outage would have lasted not 6 hours but 6 months, as critical equipment could have been damaged. This level of damage could have been considered an act of war.
Network monitoring, however, can do little to identify these types of attacks. Other cyberattack scenarios target large, long lead-time equipment such as transformers, motors, and generators and can cause long-term damage. Like Aurora, these scenarios are not network-centric and require electrical engineers rather than network analysts to understand. The lack of cybersecurity and authentication of process sensors (for example, measurements of pressure, level, flow, temperature, voltage, and current), actuators, and drives can directly lead to a loss of safety that will not be identified by network monitoring.
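To make concrete what "authentication" of a process sensor measurement would mean, the following is a minimal sketch, not a description of any fielded product. It assumes a sensor capable of computing a keyed hash, which many legacy field devices are not, and the key, field names, and tag number shown are hypothetical. It shows only that a receiver can detect a measurement altered in transit; it cannot show that the physical measurement itself was correct.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret provisioned to both the sensor and the receiver.
SHARED_KEY = b"example-key-not-for-production"


def sign_reading(sensor_id: str, kind: str, value: float, seq: int,
                 key: bytes = SHARED_KEY) -> dict:
    """Return the reading with an HMAC-SHA256 tag computed over its fields."""
    payload = {"sensor_id": sensor_id, "kind": kind, "value": value, "seq": seq}
    # Serialize with sorted keys so both sides hash identical bytes.
    message = json.dumps(payload, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {**payload, "tag": tag}


def verify_reading(reading: dict, key: bytes = SHARED_KEY) -> bool:
    """Recompute the tag and compare it in constant time."""
    payload = {k: v for k, v in reading.items() if k != "tag"}
    message = json.dumps(payload, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, reading.get("tag", ""))


# A genuine pressure reading verifies; one altered in transit does not.
genuine = sign_reading("PT-101", "pressure", 152.4, seq=1)
tampered = {**genuine, "value": 15.2}
print(verify_reading(genuine))   # True
print(verify_reading(tampered))  # False
```

A sequence number is included in the signed fields so that a receiver could also reject replayed readings, although that check is not implemented in this sketch.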
Call to Action for Emergency Operations
First responders and recovery operations have well-developed policies, procedures, and training for recovery from major outages generally caused by natural disasters such as earthquakes, floods, and tornadoes, or equipment-caused outages such as the 2003 Northeast Outage. However, cyberthreats can cause issues beyond those caused by natural disasters or equipment failures. Control system cyber events can have the following impacts on first responders and recovery operations:
- Recovery programs can take weeks to months when a cyber-physical system is impacted. This already occurred in 2004, when a utility’s SCADA was compromised by a cyberattack. In that case, SCADA was unavailable for two weeks and it took four man-months to recover (see Protecting Industrial Control Systems from Electronic Threats). Capability for extended manual operation may not be available, yet the Ukrainian cyberattack showed that the ability to operate manually was essential for months afterward.
- Early information is unlikely to be correct and is likely to lead to conflicting statements, retractions, etc., which can translate into a public relations ordeal. Determining the root cause could take weeks or even months, and the cause may remain unknown during the crisis response to the physical impacts. An example was the 2008 Florida outage, when the Department of Homeland Security (DHS) did not understand the actual cause yet publicly stated that the event was not terrorist-related. Consequently, there is a need to understand what has already occurred to better understand the potential impacts and magnitude of the incidents.
- In most cases, it will be a difficult and lengthy process to prove that an event is a cyberattack and not an unintentional incident. In some cases, such as the 2008 Florida outage, the only difference between a malicious incident and an unintentional one was the motivation of the engineer, which cannot be identified by technology. The response to a cyberattack may differ from the response to an unintentional incident, yet decisions must be made with limited information. Little control system cyber forensics capability or training is generally available. Additionally, attackers might have left footholds in other systems that could take weeks to uncover and could pose safety threats to first responders and maintenance personnel. Consequently, the crisis management team (CMT) needs to involve control system cyber subject matter experts (SMEs), and third-party expertise is often essential.
- Estimating the recovery effort is harder because of the unique issues associated with cybersecurity. There are few documented control system cyber incidents even though there have been almost 1,100 actual cyber incidents to date, many of which have similar characteristics. Responding to a cyberattack requires an understanding of how the traditional recovery effort will be affected by cyber issues. It also requires identifying the appropriate cybersecurity people, technology, and physical resources in addition to those traditionally needed to recover from a non-cyber event.
- Mutual aid is an agreement through which other utilities offer their restoration services after natural disasters strike and cause widespread outages. The unwritten premise is that a natural disaster in one region will not affect other regions so that utilities in the unaffected regions can provide restoration support. There is an assumption that a mutual aid approach for natural disasters can be extended to include cyberattacks. However, the premise of cyber mutual aid is flawed for many reasons:
What is mutual aid for a cyberattack?
- Is it providing technical resources to identify and remediate the cyberattacks? (This has been unsuccessful for IT networks.)
- Is it providing replacement equipment if transformers, capacitor bank switches, valves, motors, etc. are damaged by cyberattacks?
- What happens if the replacement equipment is damaged by recurring cyberattacks?
- Where will the replacement equipment come from if it is not still manufactured in the United States?
- Will the new equipment coming from “overseas” already be infected?
If a utility in one region suffers a cyberattack against its operational systems, other utilities may be ready to respond with mutual aid. However, if a utility in a different region is attacked the next day, every utility will dedicate all available resources to protecting itself, because the vulnerabilities that were exploited against one utility can potentially be exploited against other utilities using the same equipment. There is a need to understand what has already occurred to better understand the potential impacts and magnitude of the incidents.
- The cyber/reliability interdependencies of control system equipment are not being addressed. Since the same control system equipment is used in multiple industries worldwide, an attack in one industry can have repercussions in other industries and regions. Stuxnet is an example. The same Siemens controllers that were compromised in the attack on the centrifuges in Iran in 2010 are used in power plants, water systems, railroads, breweries, and even amusement park rides worldwide. Consequently, there is a need to understand what has occurred to control system equipment used by any industry to better understand the potential impacts and magnitude of the incidents that could affect the utilities.
With all this in mind, it is imperative for CMTs, first responders, and recovery teams to understand the unique issues that can occur following control system cyberattacks that affect the equipment used in the grid.