Let's automate things securely!

The chase for a shorter time to market almost always involves automation in some form, but the rush to implement it can introduce security issues. Organisations might gain speed, but at what cost to security? In this blog post, and several more to come, we will talk about how automation can increase security when done right, but also how it can introduce new security issues when rushed and decisions are made on the fly.

What we at TRUESEC do.

We help companies in many different ways. For instance, we assess environments and workflows, we perform penetration tests to gauge their current security posture, we handle incidents and investigations after security breaches, and we assist development teams and organisations in producing more secure code and applications.

Release faster than humanly possible.

As releases get more frequent and time to market needs to be as short as possible, the amount of automation in organisations has risen dramatically over the last few years. Automation is often a good thing and we usually encourage it. We humans are thinking beings, and our brains don't like doing repetitive tasks; we are way too smart for that!

Automation also has the advantage of doing repetitive tasks a lot faster than us humans, which increases efficiency and cuts costs. But there are other upsides to introducing automation. One advantage that many don't think about is that it can boost the morale of company employees. Employees often complain about how boring and monotonous their job is and that they never get to show their full potential. Automation is a way to remove that feeling and help employees feel like they can add value where value is needed.

Automation to gain speed

But one of the biggest upsides of automating monotonous tasks is the reduced risk of human error. We humans make mistakes; it is a given fact. Sometimes they are small and insignificant, but other times they can threaten an entire organisation. By automating certain tasks, we can guarantee that the task will always be done exactly as specified. Over time we can evolve the automation to do more and more complicated tasks, and we can even test and version control automated tasks. Try version controlling a human!

There is usually an upfront cost to introducing automation and setting it up, with the hope that it will pay off massively in the long run. But to reduce these upfront costs, teams are often given very short deadlines, or no extra time at all, to introduce said automation.

But this post isn't going to be solely about the advantages of automation, which are many. Instead, it is going to address the fact that setting up automation takes time the first time it is done, and that if automation is rushed it can introduce new security issues into an otherwise stable environment.

I have recently spent a lot of time at quite large customers, assessing the security of development teams, build pipelines, and other CI/CD setups. A common denominator among these companies is that they all come from a long history of manual releases. What does not make anything easier is that their applications are often very large, highly complicated, and not built to be released automatically.

But all these companies wanted to become agile, they wanted automated builds and releases, and they wanted it fast.

Human rights vs computer rights.

One of the most common things we see when we are brought in to investigate incidents and ransomware attacks is that account privileges are usually a lot higher than necessary. Malicious actors often start out with access to a low-level account, but can escalate to local admin and then domain admin in almost no time at all, because unnecessarily high account privileges are everywhere throughout the domain.

Due to tight deadlines, fast development cycles, and a lot of releases, developers often tell us things like: “they (the org.) wanted this to work as fast as possible”, “pressure was high”, and “they didn't give us enough time”. This often leaves developers with the feeling, and sometimes even the need, to cut corners to meet set deadlines.

One of those corner-cutting measures is to give accounts local admin rights so that they are never hindered by permissions, even when those rights are not necessary.

Would you give anyone your personal password, just in case they need it someday?

Let's compare it to a real-world scenario. In real life you might give a caretaker a key to your house or apartment, but you would never also give them the key to your personal safe and the password to your email account just because it might be easier if they one day need access for some unforeseen reason. You know… just in case. We should think the exact same way about computer accounts in domains.

Limiting account privileges is one of the most effective ways of hardening your domain. We limit the exposure if an account is compromised, and we slow down, or even outright stop, users with malicious intent.

Remember: the more time attackers need to spend hiding in a domain, the bigger the risks they need to take to elevate themselves, and the bigger the chance that they will make noise in the domain and get caught.

This hardening mentality is commonly called the principle of least privilege and has been around for decades, but we see this principle getting circumvented on a daily basis for the sake of meeting deadlines.

When visiting customers.

Many customers have the same setup: they are large, have many employees, and run a lot of servers. At the center of their development organisation they have some kind of CI/CD pipeline system, for instance Microsoft Azure DevOps Server, or Jenkins combined with GitHub/GitLab hosting the codebase. If we look specifically at Azure DevOps Server, a large number of test and production servers often have a Microsoft build agent or release agent installed, and all of these are connected to the Azure DevOps instance. When developers want to release an application to a test or production environment, these installed agents fetch the new version and replace the old application with it.

A common setup in any typical company today.

This is a very common setup, and most companies have built it in quite recent years. It is a setup we usually endorse, and it usually works quite well if done right. But sometimes you need to look a little closer at how it was set up.
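As an illustration, the setup described above might be sketched as a minimal pipeline definition like the following (the pool name, artifact name, and swap script are hypothetical, not taken from any real customer environment):

```yaml
# Hypothetical minimal azure-pipelines.yml: self-hosted agents in the
# "Production" pool fetch the build artifact and swap in the new version.
trigger:
  - main

stages:
  - stage: Deploy
    jobs:
      - deployment: DeployWeb
        pool: Production          # self-hosted agents installed on the servers
        environment: production
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: webapp
                - script: ./replace-app.sh   # hypothetical script that swaps the app
```

The important point is not the exact YAML, but that every step in the `deploy` section runs on the target servers themselves, under whatever account the agent was installed with.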

You get admin, you get admin, everyone gets admin!

When a build/release agent is installed on a server, you have the option to select which account the agent should run as. Everything the agent later performs is done using this account and its permissions. Microsoft states in their official documentation:

The choice of agent account depends solely on the needs of the tasks running in your build and deployment jobs.

docs.microsoft.com – About agents and agent pools

And in the last line of the same paragraph they add a small recommendation:

consider using a service account such as Network Service or Local Service. These accounts have restricted permissions

docs.microsoft.com – About agents and agent pools

Microsoft documentation on agent accounts.

And more often than not, we find these agents running with full admin rights. The stories are usually the same: “there was pressure from upper management to meet deadlines” and “they wanted to implement CI/CD pipelines with automated releases as quickly as possible”. So the developers felt forced to script the installation of all release agents to use local admin accounts. This was due to “problems with permissions” during their initial proof of concept. Their releases just wouldn't work without admin rights, and there was no time to investigate why; everyone just wanted it done!
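As a sketch of the alternative, the Azure Pipelines agent's configuration script lets you point the service at a restricted account instead of an admin account. The organisation URL, pool name, and token below are placeholders:

```shell
# Configure a self-hosted Windows agent to run as the restricted
# NETWORK SERVICE account instead of a local admin
# (placeholder URL, pool, and PAT values).
.\config.cmd --unattended ^
  --url https://dev.azure.com/yourorg ^
  --auth pat --token <PAT> ^
  --pool Production ^
  --runAsService ^
  --windowsLogonAccount "NT AUTHORITY\NETWORK SERVICE"
```

If a release then fails on a permission error, that error points at the one specific right the deployment actually needs, which can be granted explicitly rather than papered over with admin rights.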

The introduced risk.

Microsoft recently posted a blog series showing different ways of “hacking” pipelines. It is a good introductory read on how pipelines can be tampered with. The first post has an excellent example of how to inject arbitrary code into arguments that will later be executed by any running agent.

Possibility to control multiple servers from one place.

A very common scenario is that developers copy and paste release definitions, which means potential vulnerabilities can be reproduced across release pipelines quite easily. Another common scenario is that developers have a custom-built release template that every build inherits from. If this common template has an exploitable vulnerability (for instance a command injection flaw), then suddenly every deployable server could be at risk.
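The command-injection class of bug mentioned above can be illustrated in a few lines of Python. This is a generic sketch, not code from any real pipeline: a deployment step that interpolates an untrusted value into a shell string executes whatever the attacker appends, while passing arguments as a list keeps shell metacharacters inert.

```python
import subprocess

def release_unsafe(version: str) -> str:
    # Vulnerable: the version string is spliced into a shell command,
    # so a value like "1.0; <command>" runs the extra command too.
    result = subprocess.run(f"echo deploying {version}",
                            shell=True, capture_output=True, text=True)
    return result.stdout

def release_safe(version: str) -> str:
    # Safer: arguments are passed as a list and never parsed by a shell,
    # so metacharacters in the input stay literal text.
    result = subprocess.run(["echo", "deploying", version],
                            capture_output=True, text=True)
    return result.stdout

payload = "1.0; echo INJECTED"
print(release_unsafe(payload))  # the injected echo actually executes
print(release_safe(payload))    # the payload is just a literal argument
```

The same pattern applies to pipeline task arguments: any variable a user can influence must be treated as data, never concatenated into a command line that an agent (especially one running as admin) will execute.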

Having agents that can execute code with full local admin rights on all servers, even when they don't need that power, is quite a big risk if an exploitable flaw exists.

Another potential risk is someone managing to push malicious deployment code to a code repository; it would then be executed on the release servers using the local admin account.

Threat model, mitigate and dedicate time.

These types of risks are almost always introduced because of set time constraints.

Developers and system operators usually start out with good intentions, but when they hit privilege problems and deadlines draw closer, they often feel pressured to solve everything as quickly as possible. Organisations need to understand what risks they are potentially introducing when they put high pressure on development teams. It is also worth mentioning that automation is quite new in the world of development: common practices have not always been established yet, and many developers don't have deep experience writing automation.

These types of risks can be mitigated by, for instance, introducing threat modeling into the organisation to locate them early and make active decisions on whether and how to mitigate them. Decision makers and product owners need to define their risk appetite and actively prioritize issues to be able to meet release dates with the expected quality.

Threat modeling helps raise security awareness in organisations so that they can make active decisions about what should be done before the deadline, what can be pushed to a later time, and what is an acceptable long-term risk.

In future posts we will look at how to harden a Microsoft IIS release, and also talk a little about why organisations need to put pressure on large suppliers and platform producers like Microsoft to make automation tools simpler without sacrificing security.