One Year of Top 20 Secure PLC Coding Practices
What they are, what they aren’t, and what you should do with them
It’s already been a year since the Top 20 Secure PLC Coding Practices were published (see the original article here). The project has stirred up some dust, so now is a good time to give an update and to clarify a few things that may have led to misunderstandings: what the Top 20 are, what they are not, and what you should do with them (as an asset owner, integrator, or vendor).
What do you regard as a security decision?
I’ve asked this question before, and of course there is no single correct answer. But below, based on a typical OT (operational technology) network, I have marked a few potential security decisions in red.
There’s no need to discuss each of them in detail. But I would like to draw your attention to three things.
First, security decisions are not one big monolithic thing, but rather a hodgepodge of many small configurations.
Second, security decisions don’t always look like they are security-relevant at first sight. They do not always have “security” written on them.
Third, you’re making security decisions all the time, whether you intend to or not. It’s impossible not to make security decisions. What we need to do is to turn them into visible, conscious decisions.
That’s what we had in mind when starting our Top 20 project: making security decisions visible, not for the entire OT network, but only for the small part of it that is the PLC.
Top 20 scope and project setup
The Top 20 have a limited scope: “anything that involves changes made directly to a PLC”. We did this in order not to water down the Top 20 to resemble all the OT best practice guides that already exist.
Why is everyone so excited about the Top 20 Secure PLC Coding Practices?
PLCs are the most typical device that differentiates OT from IT. You have the physical process on the right, the digital network on the left, and PLCs in between.
In a way, PLCs are the quintessential OT. PLCs are the devices that are deterministic, that have real-time capabilities, that operate in rough environments, that often run decades without being substituted — so anyone caring about ICS security necessarily cares about PLCs.
At the same time, we know that PLCs are vulnerable.
They have been called “insecure by design”. There have been many publications exposing their vulnerabilities, starting with Project Basecamp in 2012. And a decade later, until the Top 20 project, we still didn’t have anything at our fingertips to address the security issues of PLCs.
We all knew PLCs are vulnerable, but we had nothing to improve the situation. For IT, there are IT secure coding practices to do that. That’s why people became so excited when the Top 20 Secure PLC Coding Practices were published about a year ago: we finally had something to do about PLCs’ security issues, something to make the security decisions all engineers make when using PLCs more conscious.
Top 20 facts and stats
Right from the beginning, the Top 20 project has stirred up some dust. Here are some facts and stats:
There was a global community effort with about 1000 registered people to create the first version of the Top 20. After we had published the first version, we created a “small team”, meeting monthly, to maintain the Top 20 and the many surrounding projects.
The “small team” has since grown to about 75 people.
And there also is a small ecosystem of projects surrounding the Top 20 Secure PLC Coding Practices. Here are a few examples:
- Translations of the Top 20 into 15 languages are being created (5 are finished already).
- There have been more than 20 presentations, and Top 20 trainings have been set up too, including virtual machines for hands-on training.
- We have been approached by asset owners, vendors and integrators who are looking into implementing the Top 20; we’ll get to that later.
- MITRE CWE wants to integrate the Top 20 into their CWE database.
- The Top 20 were included in NATO’s guide for protecting industrial automation and control systems against cyber incidents.
- ISA has created a video in their new Micro Learning Modules explaining the Top 20.
- The Singapore Cybersecurity Agency (CSA) is including parts of the Top 20 in their cybersecurity code of practice for OT.
It’s awesome that there are so many initiatives around the Top 20 Secure PLC Coding Practices. But I feel with the increasing attention, we also need to clarify a few things. During the rest of this article, we are going to look at the Top 20 from four perspectives: project setup, security capabilities, threats, and implementation. We’ll clarify what the Top 20 are, what they are not, and how you should be using them.
We already finished the project setup perspective. To summarize:
- The Top 20 have a limited scope: they’re restricted to what can be changed directly in PLCs.
- The Top 20 are not the result of scientific work. They are neither complete nor validated.
- Also, they are not a standard and they are not consensus-based. They are a community project “written by engineers for engineers” and a first draft. There is ample room for improvement.
Top 20 Secure PLC Coding Practices from a security capabilities perspective
It’s time to take a closer look at the Top 20’s contents, and we’ll do that by looking at the security goals the Top 20 fulfill:
There are six security goals, and they are integrity, integrity, integrity, monitoring, hardening, and resilience. Obviously, that’s somewhat imbalanced, but you will see why that is when we go through each category.
The integrity goal is actually split up into three: Integrity of PLC logic, variables, and I/O values.
Integrity of PLC logic
The first practice in this category is the most obvious, because it’s how integrity is usually approached in IT: cryptographic and/or checksum integrity checks.
All other practices are workarounds or additional measures for the likely case that the first cannot be applied: using PLC flags as integrity checks, tracking operating modes, leaving operational logic in the PLC so it’s harder to manipulate, and modularizing PLC code to make it easier to spot manipulations.
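To make the checksum idea concrete, here is a minimal sketch in Python rather than in a PLC language. It assumes you can export the PLC logic as bytes and have recorded a digest of the approved version; the sample program text and the function names are made up for illustration, and real exports are vendor-specific.

```python
import hashlib
import hmac


def logic_digest(logic_image: bytes) -> str:
    """Return the SHA-256 hex digest of an exported PLC logic image."""
    return hashlib.sha256(logic_image).hexdigest()


def logic_unchanged(logic_image: bytes, baseline_digest: str) -> bool:
    """Compare the current digest with the recorded baseline digest."""
    return hmac.compare_digest(logic_digest(logic_image), baseline_digest)


# Hypothetical stand-in for a program export (assumption, not a real format).
approved_image = b"NETWORK 1: LD I0.0; AND I0.1; ST Q0.0"
baseline = logic_digest(approved_image)

# A single changed instruction yields a different digest.
tampered_image = approved_image.replace(b"AND", b"OR ")

print(logic_unchanged(approved_image, baseline))
print(logic_unchanged(tampered_image, baseline))
```

The comparison only tells you that something changed, not what changed, which is one reason modular code helps when you then have to find the manipulation.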
Integrity of PLC variables
Since PLC variables are often set from external sources, there’s a lot of input validation in this category. But since we have a very narrow scope, we can be very specific on what exactly to validate, namely: timers and counters, paired inputs and outputs, indirections, and HMI input variables. Also, it helps to assign designated register blocks by function (read/write/validate) to facilitate input validation.
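As an illustration of two of these checks, the following Python sketch validates an HMI-written setpoint against limits and tests a pair of limit switches for consistency. The names, limits, and the "keep the current value on rejection" policy are assumptions chosen for the example, not wording from the Top 20.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SetpointLimits:
    low: float
    high: float


def validate_hmi_setpoint(requested: float, current: float, limits: SetpointLimits) -> float:
    """Accept an HMI-written setpoint only if it is a number within limits.

    On rejection the current setpoint is kept (an assumed policy for this sketch).
    """
    if math.isnan(requested):
        return current
    if limits.low <= requested <= limits.high:
        return requested
    return current


def paired_inputs_consistent(valve_open: bool, valve_closed: bool) -> bool:
    """The 'open' and 'closed' limit switches must never be asserted together."""
    return not (valve_open and valve_closed)


limits = SetpointLimits(low=0.0, high=100.0)
print(validate_hmi_setpoint(75.0, 50.0, limits))   # in range, accepted
print(validate_hmi_setpoint(150.0, 50.0, limits))  # out of range, rejected
print(paired_inputs_consistent(True, True))        # contradictory pair
```

In a real PLC the same checks would live in the logic itself, close to where the value is consumed, so that every writer of the variable passes through them.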
Integrity of I/O values
I/O is the interface to the physical process. Therefore, for I/O values, your best integrity checks are those that make use of process knowledge: validating inputs based on physical plausibility. Because you can only measure what you have a sensor for, it’s important to instrument for plausibility checks as well — and not always rely solely on one way of measuring.
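A plausibility check can be as simple as asking whether a value could physically have changed that fast, or whether two independent measurements agree. The Python sketch below assumes a tank level in percent, a maximum fill rate, and a second level sensor; all numbers and names are hypothetical.

```python
def level_change_plausible(previous_level: float, current_level: float,
                           elapsed_s: float, max_rate_per_s: float) -> bool:
    """A level cannot change faster than the process physically allows."""
    if elapsed_s <= 0:
        return False
    return abs(current_level - previous_level) / elapsed_s <= max_rate_per_s


def sensors_agree(primary: float, secondary: float, tolerance: float) -> bool:
    """Two independent measurements of the same quantity should be close."""
    return abs(primary - secondary) <= tolerance


# Assumed process limit: the level can move at most 0.5 % per second.
print(level_change_plausible(40.0, 41.0, elapsed_s=10.0, max_rate_per_s=0.5))
print(level_change_plausible(40.0, 90.0, elapsed_s=1.0, max_rate_per_s=0.5))
print(sensors_agree(41.0, 45.0, tolerance=1.0))
```

The second function only works if the second sensor exists, which is the point about instrumenting for plausibility checks.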
Monitoring
It’s obvious that you can do monitoring, and there are many dedicated security monitoring tools out there. But we collected a few things to monitor on the PLC that may not look security-relevant, but can reveal details about your PLC’s security status anyway: PLC cycle times, PLC uptime, PLC hard stops, and PLC memory usage. And for all monitoring, you should be trapping false negatives and false positives.
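For cycle times, a rough sketch of such a check could look like this in Python. It assumes you already know a baseline cycle time for the unchanged logic and have picked a tolerance; the sample values are invented.

```python
def cycle_time_alarm(cycle_ms: float, baseline_ms: float, tolerance_ms: float) -> bool:
    """Flag a cycle whose duration deviates from the known baseline."""
    return abs(cycle_ms - baseline_ms) > tolerance_ms


# Hypothetical measurements in milliseconds; the fourth one stands out.
samples = [10.1, 10.0, 10.2, 14.7, 10.1]
alarms = [index for index, sample in enumerate(samples)
          if cycle_time_alarm(sample, baseline_ms=10.0, tolerance_ms=0.5)]
print(alarms)
```

A deviation does not prove an attack, but it is a cheap hint that the logic or the load on the PLC has changed and is worth a look.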
Hardening and Resilience
The last two security goals the Top 20 cover are hardening and resilience.
First, hardening: Since the typical PLC is a pretty specialized device anyway, there is not that much room for hardening, but these two practices are doable and important: restricting communication ports and protocols, and restricting third-party data interfaces.
Second, resilience: This is the only overall resilience control that made it into the Top 20, since crashing PLCs is a likely threat scenario: Set a safe process state for PLC restarts.
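The idea behind the safe restart state can be sketched as follows, again in Python for readability. The output names and the safe values are assumptions for a made-up process; which state is actually safe depends entirely on your plant.

```python
# Assumed safe state for a hypothetical process (not taken from the Top 20).
SAFE_STATE = {
    "pump_run": False,
    "heater_on": False,
    "inlet_valve_open": False,
    "drain_valve_open": True,
}


def compute_outputs(first_scan: bool, requested: dict) -> dict:
    """On the first scan after a restart, ignore requests and force the safe state.

    Afterwards, outputs that nobody requested fall back to their safe value.
    """
    if first_scan:
        return dict(SAFE_STATE)
    return {name: requested.get(name, safe) for name, safe in SAFE_STATE.items()}


print(compute_outputs(first_scan=True, requested={"pump_run": True}))
print(compute_outputs(first_scan=False, requested={"pump_run": True}))
```

The design choice here is that the safe state is defined explicitly in one place instead of being whatever values happen to survive the restart.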
So yes, the Top 20 DO focus on integrity. That makes sense because integrity is arguably the most important security goal for PLCs, and also the one to which PLCs are best equipped to contribute.
Secure Coding vs Secure PLC Coding
Next, let’s emphasize what the Top 20 Secure PLC Coding Practices DO NOT focus on. This becomes clear once we compare the Top 20 to typical IT secure coding practices, which are available from Microsoft, the Carnegie Mellon Software Engineering Institute, or OWASP.
But first, we recall a typical PLC architecture using this simple model:
The PLC is in the middle, a sensor to the left, an actuator to the right, and an HMI or programming device on top. Now, a PLC’s CPU works in cycles. Each cycle looks the same: read input, then execute logic, then communicate with external components, then write output, and each task has a dedicated time slot allocated. This is how PLCs ensure real time. Remember that real time does not necessarily mean reacting fast, but guaranteeing to react within a given time frame, regardless of whether it’s one millisecond or one hour.
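The cycle described above can be modeled in a few lines of Python. This is a simplified sketch: the four phase functions and the motor example are hypothetical, and the code only checks one overall time budget per cycle instead of a dedicated slot per task as a real PLC runtime would.

```python
import time


def run_scan(read_inputs, execute_logic, communicate, write_outputs, budget_s):
    """Run one scan in a fixed order and report whether it stayed within budget."""
    start = time.perf_counter()
    inputs = read_inputs()               # 1. read input
    outputs = execute_logic(inputs)      # 2. execute logic
    communicate(inputs, outputs)         # 3. communicate with external components
    write_outputs(outputs)               # 4. write output
    elapsed_s = time.perf_counter() - start
    return outputs, elapsed_s <= budget_s


# Hypothetical process: run a motor while the start button is pressed and no fault is present.
phases = []
field_outputs = {}


def read_inputs():
    phases.append("read")
    return {"start_button": True, "fault": False}


def execute_logic(inputs):
    phases.append("logic")
    return {"motor": inputs["start_button"] and not inputs["fault"]}


def communicate(inputs, outputs):
    phases.append("comm")


def write_outputs(outputs):
    phases.append("write")
    field_outputs.update(outputs)


outputs, within_budget = run_scan(read_inputs, execute_logic, communicate,
                                  write_outputs, budget_s=1.0)
print(phases, outputs, within_budget)
```

Because the order never changes and the logic sees a snapshot of the inputs, the same inputs lead to the same outputs, which is what later makes cycle times and outputs worth monitoring.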
Let’s keep these PLC architecture characteristics at the back of our minds when we divide the secure coding practices we know from IT into four groups.
The first group, marked green in the image above, is the easiest: IT secure coding practices that are applicable to PLCs as well. Because of the PLC architecture, they can be made even more specific, tailored to PLCs’ capabilities.
Take monitoring: Since we know PLC cycle times are exactly the same as long as logic doesn’t change, it makes sense to monitor cycle times. This works similarly for input validation, hardening and memory management practices.
But there are also three reasons why a secure IT coding practice cannot be used for PLCs, and they are in the three grey areas:
First, some IT practices just don’t apply to PLCs, and the reason can again be found in a PLC’s specific architecture.
For example, the principle of least privilege applied to processes does not make sense because a PLC prioritizes processes differently than a normal OS — in fixed time slots.
Output encoding makes no sense because the output is restricted to values for actuators anyway.
And practices covering sessions, databases, and file management make no sense if you don’t have those in your architecture.
Then, there’s “not in scope”.
There are practices that are often listed under “secure coding” for IT, but frankly, are not really coding. Those are not in our Top 20 scope (“anything that involves changes made directly to the PLC”). Examples include asset management, separation of development and production, and any quality assurance techniques like pentests, source code audits, fuzz testing, and the like.
Lastly, here’s the most important reason why some IT secure coding practices are not useful for PLCs: they are not technically feasible for PLCs.
The three practices in this category — authentication on the PLC, authentication on the protocols the PLC uses, and cryptography — would without doubt be effective security improvements on PLCs.
BUT: PLCs on the market today mostly do not have the capabilities for authentication or the resources for cryptography. That’s not something secure PLC coding can fix. And that’s an important message here: the Top 20 are written for the imperfect PLCs on the market today. It’s not realistic to create practices that the majority of today’s PLCs cannot implement, because the imperfect PLCs we have today will likely be in plants for years to come.
Lesson learned from the security capabilities perspective
- The Top 20 mostly improve integrity.
- They make use of PLC-specific architecture and characteristics like real-time capabilities and process knowledge.
- And they are written for the PLCs that are on the market today, which are imperfect from a security perspective.
Top 20 Secure PLC Coding Practices from a threats perspective
Which problems do the Top 20 solve? We’ll take a few typical PLC threats and vulnerabilities and see what the Top 20 do or do not address.
Again, we begin with the easiest case: Threats we already address.
This is everything that involves sending unexpected packets to the PLC. There is the famous “ping of death”, where a PLC cannot handle ICMP requests. Other examples are inputs that cause overflows in memory or denial of service.
These types of threats can be addressed properly by the Top 20 practices covering input validation and restricting PLC interfaces, as well as overall resilience practices alleviating the impact of a crash.
Next up are two groups of threat scenarios that the Top 20 do address but could do better:
Causing indeterministic results: One of the most important characteristics of a PLC is that it is deterministic. If you feed it a certain input, you can predict the outcome, and it’s 100% the same every time. There are some conditions in the code that may cause indeterministic results.
Examples are race conditions, where the outcome depends on which of two branches is being calculated faster, bypassed instructions, where there is an empty branch parallel to an instruction, or duplicate instructions, where there are instructions for the same value in two different places, making it unpredictable which one is carried out.
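Duplicate instructions are the easiest of these to look for mechanically. The Python sketch below assumes the program has already been reduced to a list of (rung number, written address) pairs; that representation and the addresses are made up for illustration, and extracting such a list from a real project depends on the vendor’s tooling.

```python
from collections import defaultdict


def find_duplicate_writes(assignments):
    """Return every target address that is written from more than one rung."""
    writers = defaultdict(list)
    for rung, target in assignments:
        writers[target].append(rung)
    return {target: rungs for target, rungs in writers.items() if len(rungs) > 1}


# Hypothetical program summary: rungs 1 and 7 both write output Q0.0.
program = [(1, "Q0.0"), (2, "Q0.1"), (7, "Q0.0")]
print(find_duplicate_writes(program))
```

A hit is not automatically a manipulation, but it marks a place where it is hard to predict which instruction wins.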
The second threat group in this category is (re)writing undefined or hard-coded parts of the logic. Both leave room for the attacker to define them in a malicious way, while likely flying under the radar. Examples are undefined variables or outputs or hard-coded parameters.
Frankly, for both threat categories, I think the Top 20 can (and should!) address the causes of these threats more explicitly, not just with generic integrity and monitoring controls. There are ideas on how that could be done, for example in this paper. Something to put on our list for the Top 20 version 2.0.
Now for the most delicate category: Threats that we don’t address.
Unfortunately, this is the most important group of PLC threats: Those that abuse normal, legitimate features that lack authentication. That’s what we would call “malicious insider attacks” — except you don’t have to be an insider to carry them out, because there’s no authentication required. These threats are well-known, and have been given names like “insecure by design” or “foreverdays”.
A good example of abusing these insecure-by-design features is Incontroller / Pipedream: For the most part, the malware does not exploit vulnerabilities, but uses legitimate features. Pipedream almost looks like someone had taken all the proprietary ICS engineering tools and created a more efficient “open source” one, unfortunately for malicious purposes.
Threat scenarios for this category are as endless and powerful as any engineering tool’s capabilities: You can upload and download PLC logic, change firmware, force output values, set input values, change thresholds, silence alarms… everything you’d wish for as an attacker. And there’s not much that can be done by secure PLC coding while today’s PLCs don’t have authentication mechanisms.
Lesson learned from the threat perspective
The Top 20 are good at addressing threats that involve unexpected input, and they could do better on threats that cause indeterministic or undefined output. But the biggest threat to PLCs is abusing their legitimate features. The remedy, authentication, can only be provided by vendors. Even if they begin now, it will take decades until all “old and imperfect” PLCs are gone.
Let that sink in: The Top 20 Secure PLC Coding Practices cannot address the most important threat category for PLCs.
Top 20 Secure PLC Coding Practices from an implementation perspective
For each practice, we estimated who probably has the knowledge and access to implement them: vendors, integrators, or asset owners. Looking at these numbers, we clearly have an issue here: 10 practices can likely be implemented by PLC vendors, 19 by integrators, and only 6 by asset owners (multiple answers were possible).
Asset owners are the most enthusiastic about PLC security, but they cannot implement most of the Top 20 themselves: they need integrators and vendors for that. Here are two things we created that empower them to use the Top 20 anyway.
First: a template for integrating the Top 20 into their vendor policies when purchasing. That’ll hopefully bring the Top 20 to the integrators’ and vendors’ attention.
Second: a template for so-called application notes that vendors and integrators can easily fill out to showcase if and how they’re following the Top 20, and to be transparent about how they approach PLC security. As an example, here’s how the first integrator used the template: Grantek has provided Top 20 application notes for a North American Pharmaceutical Manufacturer Use Case.
Timeline of PLC security
Now we’ve learned what the Top 20 are and what they aren’t. Last, we’ll cover what you should be doing with them. Here’s a rough timeline of securing PLCs, divided into short, mid, and long term measures.
The first two steps are called “compensating controls” because they assume PLCs are still as insecure by design as they are today. The third is bringing security capabilities to PLCs.
The lowest-hanging fruit, and the easiest measures to implement, are short-term: compensating controls around the PLC like checkups, audits, training, isolation of PLCs, monitoring, locking ports and cabinets, creating backups and implementing version management.
Mid-term measures are compensating controls within the PLCs. These are compensating because you still work with “imperfect PLCs”, but use the programming features and specialties PLCs already have. These are the Top 20! The Top 20 Secure PLC Coding Practices are neither your most urgent measure, nor are they the ideal long-term goal. They are mid-term measures for the imperfect PLCs we have today.
Long-term measures are those that would solve most of our PLC security problems, but that are not easy to implement: Creating additional, dedicated security features within the PLCs, like authentication in PLCs, authentication in PLC protocols, secure boot, signed firmware, logging, syslog support, access control lists, and ultimately, self-aware PLCs that “know” what is running inside them and can identify malicious code or otherwise suspicious behavior.
Now what does the timeline of PLC security look like for you? Well, that depends on what role you have.
PLC Security for asset owners
If you’re an asset owner and not responsible for most of your PLC coding, the Top 20 are definitely not your most important building block.
Other measures are easier for you to implement and more effective too: these are the short-term measures around the PLCs. Implement them now. You can use just about any OT security best practice guide.
Regarding the Top 20: Implement what you can, which will likely not be that much. Instead, use our asset owner empowerment documents. Request your vendors and integrators to implement the Top 20 by including them in your RFPs and vendor policies.
For the long-term measures, you can mainly talk, talk, and talk some more. Tell your vendor you want to see authentication etc. Participate in user groups and bug them about authentication. This wouldn’t be the first case where customer pressure drives more secure products.
PLC Security for integrators
If you’re an integrator: For the short-term measures, help your asset owner where you can.
But more importantly, take the Top 20 and implement them: that’s your job.
Also, talk about that. Even more so if you’ve been doing all that before the Top 20 came out, or if you’re actually doing much more. Since there are so many asset owners excited about the Top 20, they will be excited for you to follow them. It’s excellent marketing, really. The Top 20 project team will help you; there’s even the simple “application notes” template you can use for free.
PLC Security for vendors
If you’re a vendor: Do all the things I just recommended for integrators, plus begin integrating basic security features into your PLCs and protocols. If you don’t know where to start, start with authentication. The security community will love you even more.
Lesson learned from the implementation perspective
The Top 20 Secure PLC Coding Practices are neither your most urgent measure, nor are they the ideal long-term goal. Likewise, they cannot solve all PLC problems. They are an intermediate step on the PLC security timeline. The Top 20 are mid-term compensating measures for the imperfect PLCs on the market today.
Most of the Top 20 need PLC vendors and integrators for implementation. Little can be done by asset owners, but they can request and control Top 20 implementation from their vendors and integrators. Asset owners are likely the ones that need to drive the change towards more secure PLCs, and for that they can use the Top 20.
So yes, the Top 20 are novel: there was nothing comparable before, though we’ve known about PLC security problems for a long time. BUT: The Top 20 are not a standard and not validated; they are a conversation starter. Also, hopefully, a conversation ender for the tiresome “PLCs are vulnerable and cannot contribute to security” discussions.
Anyway, the Top 20 Secure PLC Coding Practices need to keep evolving. We need volunteers. We need asset owners to demand secure PLCs. And we need vendors and integrators to build and program secure PLCs.
Top 20 Secure PLC Coding Practices Project Website: https://plc-security.com/
This article is based on a presentation held in Singapore at the OT Cybersecurity Expert Panel Forum on July 12, 2022.
A recording will be available at https://www.otcep.gov.sg/.