How we work
After their six-month anniversary, our senior engineers take on the extra responsibility of joining the oncall rota. We know this is a crucial aspect of engineering life, and it is essential that everyone on the engineering team has good oncall experiences! This page gives you some detailed insight into our approach to oncall at Form3.
We operate under a true DevOps culture, so our engineers are responsible for building, maintaining and supporting our high-volume platform. We firmly believe that supporting a platform oncall is a great way to learn how to debug and fix a complex system at scale.
In general, the responsibilities of the engineer oncall are:
Responding to incidents on our production and staging environments.
Identifying and fixing errors from the backlog to contribute to platform stability.
Supporting customer queries regarding setup or functionality.
Our engineers are in charge of maintaining the monitoring and alerting of their services, ensuring their services are not noisy and that their alerts are meaningful and actionable.
First off, let's cover some of the tooling we use for our oncall rotas. All of our engineers have company phones (currently iPhones). All necessary company software and accounts are available on these devices.
The main tools that engineers use oncall are:
We are always improving our documentation and tooling to ensure that they are up to the highest standards.
It's important to us that our engineers never feel alone, pressured or unsupported when they're oncall. We have dedicated incident managers to support the engineer oncall throughout the lifecycle of an incident.
Typically, our incident process consists of:
The incident manager and the relevant engineer oncall get paged. They start the initial investigation, assess customer impact and coordinate any customer communication that might be required.
The incident manager identifies next steps to take in the runbooks. With support from the incident manager, the engineer begins to investigate the issue. The incident manager is in charge of the required communications.
If needed, other engineers are brought in to support incident resolution. In particular, production fixes are only done using timeboxed credentials and in pairs, so the engineer oncall does not have to make changes in production alone.
Finally, once the incident is resolved, the postmortem process begins. A detailed, no-blame investigation of the incident's cause and the response to it is conducted. Changes to services, alerting and runbooks are identified and prioritised.
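To make the "timeboxed credentials and in pairs" rule from the steps above more concrete, here is a minimal sketch in Python. It is purely illustrative and does not describe the actual tooling: the `ProductionAccess` type, the `grant_access` helper and the one-hour default lifetime are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ProductionAccess:
    """A hypothetical, short-lived grant of production access."""
    engineer: str
    pair: str
    granted_at: datetime
    ttl: timedelta

    def is_valid(self, now: datetime) -> bool:
        # The grant is only usable inside its time box.
        return self.granted_at <= now < self.granted_at + self.ttl


def grant_access(engineer: str, pair: str, now: datetime,
                 ttl: timedelta = timedelta(hours=1)) -> ProductionAccess:
    """Issue a grant only when a second, different engineer is named.

    The one-hour default is an assumption chosen for illustration.
    """
    if not pair or pair == engineer:
        raise ValueError("production access requires a second engineer")
    return ProductionAccess(engineer, pair, now, ttl)
```

The point of the sketch is that both constraints are checked mechanically: a grant cannot be created without a pair, and it stops being valid once its time box has elapsed.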
Our platform runs critical payments for our clients, so we provide round-the-clock oncall support. Everyone from the engineering group takes part in the rota - including our leads, heads of engineering and the executive team!
We divide oncall shifts between office hours and out of hours shifts. During office hours, the engineer oncall has the support of the entire team, but is the first point of contact in the case of an incident or error. Out of hours shifts run on a daily rotation, while office hours shifts can be longer, depending on the team.
As out of hours shifts are less convenient than office hours shifts, they are remunerated on top of the engineer's base salary. Engineers are paid for making themselves available for a shift, even if they are not paged. We also give our engineers time in lieu if they are called out for a longer time to fix a challenging issue.
The frequency of the oncall shifts for each engineer varies with team size. Our heads of engineering are constantly reviewing schedules and teams to ensure that we hit the correct balance of team size and oncall frequency. This is a constant work in progress.
Most engineers currently take on a shift every 2 weeks. Team members support each other to accommodate annual leave and personal schedules. Swapping shifts or a part of a shift is normal and easy to do with PagerDuty.
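The relationship between team size and shift frequency can be illustrated with a small sketch. It assumes a simple round-robin daily rotation, which may not match how each team actually schedules its shifts, and it is not a model of PagerDuty itself. The function names and the engineer names are hypothetical.

```python
from datetime import date, timedelta


def build_rota(engineers, start, days):
    """Assign one engineer per day, cycling through the list in order."""
    return {
        start + timedelta(days=i): engineers[i % len(engineers)]
        for i in range(days)
    }


def swap(rota, day_a, day_b):
    """Return a new rota with the assignments of two days exchanged."""
    swapped = dict(rota)
    swapped[day_a], swapped[day_b] = rota[day_b], rota[day_a]
    return swapped


# Hypothetical team of 14: with a daily rotation, each engineer's
# turn comes round every 14 days, i.e. once every 2 weeks.
team = [f"engineer-{n}" for n in range(1, 15)]
rota = build_rota(team, date(2023, 3, 1), days=28)
```

Under this assumption, the gap between an engineer's shifts equals the team size in days, which is why changing the size of a team directly changes how often each person is oncall.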
We provide support and training for engineers going on call. While the process varies across our teams, the oncall onboarding generally follows this structure:
New oncall engineers shadow their experienced colleagues when they finish onboarding to the team. This gives them the opportunity to get used to being on call as well as learn more about our platform.
After getting plenty of shadowing experience, the engineers usually join the oncall rota by taking on office hours oncall shifts. This gives them the experience of being the primary point of contact in the case of an incident, while still getting the support of their team members.
When they feel ready, the engineers begin to take on out of hours oncall shifts. However, as mentioned previously, engineers are never alone on their shifts.
Adelina is a polyglot engineer and developer relations professional, with a decade of technical experience at multiple startups in London. She started her career as a Java backend engineer, later converted to Go, and then transitioned to a full-time developer relations role. She has published multiple online courses about Go on the LinkedIn Learning platform, helping thousands of developers up-skill with Go. She has a passion for public speaking, having presented on cloud architectures at major European conferences. Adelina holds an MSc in Mathematical Modelling and Computing.