Everyone is frustrated with traditional IT processes that rely on a ticketing system such as Jira to request access to systems. They are cumbersome, and it's hard to verify that the person made the change exactly as requested in the ticket. Travis talks us through how Teleport has solved this problem by using infrastructure as code to manage access to all internal systems. He also explains how you can go about migrating to infrastructure as code and which gotchas you need to watch out for.
Travis Gary runs the IT department at Teleport. The access management tools that Teleport provides have been really important for companies going remote, which now have to embrace zero trust and change how they approach security.
Travis recently spoke at Conf42. His talk "Using Infra-as-code, not Jira tickets to pass audits" is all about moving off Jira-ticket-driven workflows to infrastructure as code. He shares more of his thoughts on this important topic with us in this episode.
As a system administrator, begin by thinking about your current workflows. You might receive a ticket, log into the environment that needs changing, and then make that change manually in the console. This is sometimes known as ClickOps.
This process doesn't scale to a large number of changes or to large systems, as you might see in fast-growing organisations. Furthermore, making manual changes through the console is neither secure nor repeatable. Changes are not reviewed, so there are no guarantees that they are made correctly or that they won't break anything.
Infrastructure as code (IaC) allows you to describe processes, including errors and rollbacks, in code. It is generally not written in a general-purpose programming language, but in a simple language that allows you to describe your resources. Under the hood, it calls backend APIs and endpoints to create and manage these resources.
It is important to remember that IaC is stateful. It brings resources back to their described configuration and state, regardless of what has previously been done to them manually or otherwise. IaC is very well suited to IT processes - what's defined in the code is what will exist. This is really easy to review and audit, removing the disconnect between what a ticket prescribes and reality.
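The idea of bringing resources back to their described state can be illustrated with a small sketch. This is not how any particular IaC tool is implemented; the group names, user names and the `reconcile` function are hypothetical, and the "actual" state is just an in-memory dictionary standing in for a real system.

```python
from typing import Dict, List, Set, Tuple

# Desired access as it might be declared in a repository (hypothetical data).
DESIRED: Dict[str, Set[str]] = {
    "db-admins": {"alice", "bob"},
    "auditors": {"carol"},
}


def reconcile(
    desired: Dict[str, Set[str]], actual: Dict[str, Set[str]]
) -> Tuple[Dict[str, Set[str]], List[str]]:
    """Return a state equal to `desired`, plus the corrections that were needed."""
    corrections: List[str] = []
    for group in sorted(set(desired) | set(actual)):
        want = desired.get(group, set())
        have = actual.get(group, set())
        # Anything declared but missing is added.
        for user in sorted(want - have):
            corrections.append(f"add {user} to {group}")
        # Anything present but not declared (for example a manual change) is removed.
        for user in sorted(have - want):
            corrections.append(f"remove {user} from {group}")
    new_state = {group: set(users) for group, users in desired.items()}
    return new_state, corrections


# Someone added "mallory" by hand and created an undeclared "temp" group.
drifted = {"db-admins": {"alice", "mallory"}, "temp": {"dave"}}
state, corrections = reconcile(DESIRED, drifted)
for line in corrections:
    print(line)
```

After the run, `state` matches `DESIRED` again and the manual additions are gone, which is the property that makes the code the thing to audit rather than the console.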
In the IaC world, changes are made using pull requests and branches. This opens up the opportunity for automated testing of our changes, reducing the need for manual QA verification. Anyone can propose a change and submit it for review, and the necessary experts can then review it. The ability to propose changes without any gatekeeping is great for the agility of distributed teams.
Terraform is an example of an infrastructure as code tool. It allows a remote service to apply your changes. It can also generate a plan for an open pull request, so you can see the changes that will be made before they take place. This is another powerful mechanism for verifying the current state against the proposed state.
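To make the "current state vs proposed state" comparison concrete, here is a minimal sketch of a plan step that only reports what would change and applies nothing. It is an illustration of the idea, not Terraform's implementation or output format; the `plan` function, the resource names and the `max_session_ttl` attribute are all assumptions made for the example.

```python
from typing import Any, Dict, List

Resources = Dict[str, Dict[str, Any]]


def plan(current: Resources, proposed: Resources) -> List[str]:
    """Describe the changes needed to move from `current` to `proposed`."""
    lines: List[str] = []
    for name in sorted(set(current) | set(proposed)):
        if name not in current:
            lines.append(f"+ create {name}")
        elif name not in proposed:
            lines.append(f"- destroy {name}")
        elif current[name] != proposed[name]:
            # List only the attributes whose values differ.
            keys = set(current[name]) | set(proposed[name])
            changed = sorted(
                k for k in keys if current[name].get(k) != proposed[name].get(k)
            )
            lines.append(f"~ update {name} ({', '.join(changed)})")
    return lines


# Hypothetical state before and after a pull request.
current = {
    "role.dev": {"max_session_ttl": "8h"},
    "role.legacy": {"max_session_ttl": "24h"},
}
proposed = {
    "role.dev": {"max_session_ttl": "4h"},
    "role.sre": {"max_session_ttl": "8h"},
}

for line in plan(current, proposed):
    print(line)
```

A reviewer reading this output on a pull request sees every creation, update and removal before anything is applied.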
Keeping your infrastructure definitions in code also allows you to check when changes were made and who made them, as well as giving you the ability to search through changes. This makes it easier to work asynchronously, across locations and timezones.
The migration process requires a cultural shift - you have to have full buy-in from your developers and consider the developer experience. A lot of the benefits of IaC migrations come at the tail end of the process, so you really need support from your developers to go on the journey.
Making a single change will often be faster using ClickOps, but auditing and security are much better with IaC. Initially, there might be some friction, but the gains will come as we shift left, bringing the process closer to the development cycle. Platform stability will also improve with tighter control, making for quieter on-call rotas.
Compared with the cultural change, the technical changes are quite simple. Terraform uses a declarative language that is relatively easy for engineers to pick up. From a security perspective, the burden moves from one space to another: Terraform still needs access to powerful credentials to be able to make changes to infrastructure.
At Teleport, the engineers have a "hack yourself" mentality, so they have had engineers play capture the flag games against their infrastructure as code repositories. As you migrate to IaC, you need to consider that you are shifting security concerns from humans to the pipeline.
Generally, the aim of IaC is to remove all admin users from the system. This can make recovery really difficult when something goes wrong, because there is no obvious "break glass" procedure. One way Teleport has handled this is with an alerting pattern and admin roles. When something goes wrong, an incident is created and a limited number of users can instantly take on the admin role to fix the platform.
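The break glass pattern described above can be sketched as a small model. This is not Teleport's API or implementation: the `BreakGlass` class, its method names and the user names are hypothetical, and the one-hour expiry on a grant is an added assumption rather than something stated in the episode.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, Optional, Set


@dataclass
class BreakGlass:
    responders: Set[str]  # the limited set of users allowed to escalate
    grant_ttl: timedelta = timedelta(hours=1)  # assumption: grants expire
    open_incident: Optional[str] = None
    grants: Dict[str, datetime] = field(default_factory=dict)  # user -> expiry

    def declare_incident(self, incident_id: str) -> None:
        self.open_incident = incident_id

    def resolve_incident(self) -> None:
        # Closing the incident revokes every outstanding admin grant.
        self.open_incident = None
        self.grants.clear()

    def assume_admin(self, user: str, now: datetime) -> bool:
        # Escalation needs both an open incident and an approved responder.
        if self.open_incident is None or user not in self.responders:
            return False
        self.grants[user] = now + self.grant_ttl
        return True

    def is_admin(self, user: str, now: datetime) -> bool:
        expiry = self.grants.get(user)
        return expiry is not None and now < expiry


bg = BreakGlass(responders={"alice"})
t0 = datetime(2023, 1, 1, 12, 0)
denied_without_incident = bg.assume_admin("alice", t0)
bg.declare_incident("INC-1")  # hypothetical incident identifier
granted = bg.assume_admin("alice", t0)
print(denied_without_incident, granted)
```

In normal operation nobody holds the admin role. Access exists only while an incident is open, and only for the named responders.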
Teleport run their own podcast titled "Access control podcast". It has some great episodes about IaC and security, so make sure to give it a listen if you liked this episode.
Written by
Adelina is a polyglot engineer and developer relations professional, with a decade of technical experience at multiple startups in London. She started her career as a Java backend engineer, later converted to Go, and then transitioned to a full-time developer relations role. She has published multiple online courses about Go on the LinkedIn Learning platform, helping thousands of developers up-skill with Go. She has a passion for public speaking, having presented on cloud architectures at major European conferences. Adelina holds an MSc in Mathematical Modelling and Computing.