Platform engineering is the practice of creating and constructing toolchains and workflows that empower software engineering organizations to have self-service capabilities in the era of cloud-native technology. Platform engineers develop an integrated product known as an “Internal Developer Platform” that addresses the operational requirements throughout the entire lifespan of an application.
An IDP spans a range of technologies and tools, integrated in a way that reduces the cognitive burden on developers while preserving important context and the underlying technologies. It helps operations structure their setup and enables developer self-service. Effective platform engineering entails providing clear paths and well-established frameworks that match each developer’s preferred level of abstraction when interacting with the Internal Developer Platform (IDP).
This article will delve into the origins of platform engineering and discuss the primary areas of focus for platform engineers. Additionally, we will explore how this discipline fits into the framework of modern engineering organizations and serves as a crucial component in advanced development teams.
From the ashes of DevOps: the rise of Internal Developer Platforms
Let’s go back a few decades. In the late 90s and early 2000s, most setups relied on a single gatekeeper, the SysAdmin, who was also a single point of failure. Developers had to go through them to get anything done for their applications, leading to the well-known “throw over the fence” workflow and poor experiences on both sides. As an industry, we collectively agreed that this was not the ideal we should aim for.
As cloud technology began to rise, with AWS launching in 2006, the concept of DevOps emerged as the new standard for engineering teams. While cloud native solutions brought significant improvements in scalability, availability, and operability, setups became more complex. The days of deploying a monolithic application with a single script were gone.
Suddenly, engineers had to familiarize themselves with a variety of tools, such as Helm charts and Terraform modules, just to deploy and test a simple code change across multiple environments in a multi-cluster microservice setup. Although the division of labor (here, Ops and Devs) has proven successful in other sectors of the economy, the industry shifted towards championing the DevOps paradigm as the key to high-performance setups.
Under this paradigm, developers should be able to deploy and run their applications and services end to end. The principle of “You build it, you run it” embodies the essence of true DevOps.
The challenge with this strategy lies in its impracticality for the majority of companies. While it may work for highly advanced organizations such as Google, Amazon, or Airbnb, implementing true DevOps in practice is a significant hurdle for most other teams. This is primarily due to the lack of access to a similar talent pool and resources required to optimize developer workflows effectively.
When a typical engineering organization attempts to adopt true DevOps practices, various anti-patterns often emerge. The Team Topologies authors, Matthew Skelton and Manuel Pais (speakers at one of our Platform Engineers meetups), have extensively documented these DevOps anti-types, offering valuable insights for those seeking a deeper understanding of these dynamics. For instance, removing a formal Ops role or team and having developers (typically senior ones) take over responsibility for environments and infrastructure can have detrimental consequences. These engineers end up doing “shadow operations,” which diverts their focus from coding and product development. Everyone involved suffers: the senior engineer is burdened with additional tasks, and the organization misuses its top resources, resulting in slower and less reliable feature delivery.
Several studies, including Puppet’s State of DevOps report and Humanitec’s Benchmarking study, have documented this antipattern. In the latter study, organizations were categorized as top or low performing based on standard DevOps metrics such as lead time, deployment frequency, and MTTR. The data shows that 44% of low-performing organizations exhibit this antipattern, where some developers take on DevOps tasks on their own and assist less experienced team members. In contrast, all top-performing organizations have effectively adopted a genuine “you build it, you run it” approach.
What sets top-performing organizations apart from low-performing ones? How do leading teams empower their developers to run their applications and services independently, minimizing reliance on senior colleagues for assistance? The answer lies in a platform team dedicated to building an Internal Developer Platform. Puppet’s State of DevOps Report 2020 highlights a strong connection between the adoption of internal platforms and the level of DevOps maturity within organizations.
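For readers unfamiliar with the metrics named above, here is a minimal sketch of how lead time, deployment frequency, and MTTR might be computed from deployment and incident records. The record types and field names (`Deployment`, `Incident`, `committed_at`, and so on) are assumptions made for illustration; they are not taken from either study, and each study may define these metrics somewhat differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Deployment:
    committed_at: datetime  # hypothetical field: when the change was committed
    deployed_at: datetime   # hypothetical field: when it reached production


@dataclass
class Incident:
    started_at: datetime    # hypothetical field: when the outage began
    resolved_at: datetime   # hypothetical field: when service was restored


def lead_time_hours(deployments):
    """Mean time from commit to production, in hours."""
    return mean(
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deployments
    )


def deployment_frequency_per_day(deployments, window_days):
    """Average number of deployments per day over the observation window."""
    return len(deployments) / window_days


def mttr_hours(incidents):
    """Mean time to recovery: average incident duration, in hours."""
    return mean(
        (i.resolved_at - i.started_at).total_seconds() / 3600
        for i in incidents
    )
```

Tracking these values per team over time is one way to gather the kind of quantitative signal such benchmarks rely on.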
The top engineering companies follow this practice. They establish internal platform teams that create IDPs. By utilizing these IDPs, developers have the flexibility to choose the appropriate level of abstraction for running their applications and services based on their preferences. For instance, if they enjoy working with Helm charts, YAML files, and Terraform modules, they can do so. On the other hand, if they are junior frontend developers who are not concerned about the app running on EKS, they can easily access a pre-configured environment that includes all the necessary tools for deploying and testing their code, without having to be concerned about the underlying infrastructure.
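The idea of choosing a level of abstraction can be illustrated with a small sketch. It assumes a platform that holds sensible defaults and lets developers override only what they care about. Everything here (`PLATFORM_DEFAULTS`, `render_workload`, the config keys, the example registry URL) is hypothetical and not the API of any particular IDP.

```python
import copy

# Hypothetical defaults maintained by the platform team.
PLATFORM_DEFAULTS = {
    "replicas": 2,
    "resources": {"cpu": "250m", "memory": "256Mi"},
    "env": {},
}


def deep_merge(base, overrides):
    """Return a new dict where nested keys in overrides replace those in base."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def render_workload(name, image, overrides=None):
    """Build a full workload config from a minimal request plus optional overrides."""
    config = deep_merge(PLATFORM_DEFAULTS, overrides or {})
    config["name"] = name
    config["image"] = image
    return config


# High abstraction: a developer supplies only a name and an image.
simple = render_workload("web", "registry.example.com/web:1.0")

# Lower abstraction: another developer tunes memory and leaves the rest alone.
tuned = render_workload(
    "api",
    "registry.example.com/api:2.3",
    overrides={"resources": {"memory": "1Gi"}},
)
```

The design choice is that defaults are never mutated and overrides are merged key by key, so a developer who changes one setting still inherits every other platform default.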
Golden paths and paved roads
What are golden paths and paved roads? In today’s CI/CD setups, the main focus is on updating images: CI builds them, updates the image path in configs, and the job is done. This covers most deployment use cases. However, things become more intricate and time-consuming when a task goes beyond this basic workflow, for example:
- Refactoring
- Enforcing RBAC
- Rolling back and debugging
- Adding/changing resources
- Spinning up a new environment
- Adding services and dependencies
- Adding environment variables and changing configurations
The scope extends further. Platform engineering involves integrating all these elements into a well-structured paved road. Instead of requiring everyone to manage everything and understand the entire toolchain, platform engineers create the connections to offer a consistent self-service experience.
This connection is the internal platform. According to Evan Bottcher from Thoughtworks, platforms consist of “a foundation of self-service APIs, tools, services, knowledge, and support, organized as a compelling internal product. Autonomous delivery teams can utilize the platform to accelerate the delivery of product features with less coordination.”
Expanding on this concept, Kaspar von Gruenberg from Humanitec defines an Internal Developer Platform as “the collection of all the technology and tools that a platform engineering team integrates to create golden paths for developers.”
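To make the basic workflow from the start of this section concrete, here is a minimal sketch of the “update the image path in configs” step that CI typically performs. The config layout (`services` mapping to an `image` string) and the function names are assumptions for illustration only. Real setups usually edit Helm values or Kubernetes manifests with dedicated tooling.

```python
import copy


def bump_image_tag(image_ref, new_tag):
    """Return image_ref with its tag (or digest) replaced by new_tag."""
    # Split on the last "/" so a registry port such as ":5000" is left alone.
    repo, sep, last = image_ref.rpartition("/")
    # Drop any existing digest ("@sha256:...") or tag (":1.4.2").
    name = last.split("@", 1)[0].split(":", 1)[0]
    return f"{repo}{sep}{name}:{new_tag}"


def update_service_image(config, service, new_tag):
    """Return a copy of config with one service pointing at the new image tag."""
    updated = copy.deepcopy(config)
    entry = updated["services"][service]
    entry["image"] = bump_image_tag(entry["image"], new_tag)
    return updated


# Hypothetical config as CI might read it from a repository.
config = {
    "services": {
        "web": {"image": "registry.example.com:5000/team/web:1.4.2"},
    }
}
new_config = update_service_image(config, "web", "1.5.0")
```

Everything in the list above (rollbacks, new environments, added resources, RBAC) falls outside this small step, which is where a paved road earns its keep.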
Principles of platform engineering
Numerous organizations are recognizing the advantages of Internal Developer Platforms and developer self-service. According to Puppet’s State of DevOps Report 2021, “The mere presence of a platform team does not automatically lead to higher evolution DevOps; however, exceptional platform teams amplify the benefits of DevOps initiatives.”
Recruiting the appropriate talent to construct such platforms and workflows can pose a challenge. Ensuring that they consistently deliver a dependable product to the rest of the engineering organization, while integrating their feedback into the IDP, is even more challenging.
Here are several guiding principles that I observe as a common denominator among successful platform teams and self-service oriented organizations.
Clear mission and role
It is crucial for the platform team to have a clearly defined mission. For instance: “Build reliable workflows that let engineers interact with our setup independently and self-serve the infrastructure they need to run their apps and services.” Establish this mission from the beginning. It is equally important to clarify the role of the platform team, which should not be perceived as just another help desk that sets up environments on demand. Instead, it should be recognized as a dedicated product team that serves internal customers.
Treat your platform as a product
Building on the product-oriented approach, the platform team should adopt a product mindset. Their focus should be on delivering real value to their internal customers, the app developers, based on the feedback they receive. It is crucial for them to prioritize shipping features that address the needs identified through this feedback loop, rather than getting distracted by the allure of new technologies.
Focus on common problems
Platform teams play a vital role in preventing other teams from reinventing the wheel and repeatedly tackling shared problems. To achieve this, it is important to identify these common issues by understanding the pain points and friction areas experienced by developers, which can be gathered through qualitative feedback and quantitative analysis of engineering KPIs.
Glue is valuable
Often, platform teams are perceived as a mere cost center since they do not directly deliver product features to end users. However, this perspective is dangerous as the glue they provide is immensely valuable. Platform engineers should actively promote their value proposition within the organization and embrace their role. Once the optimal paths and infrastructure are designed for teams, the main value created by the platform team lies in being the cohesive force that brings the toolchain together and ensures a seamless self-service workflow for engineers.
Don’t reinvent the wheel
Platform teams should not only keep other teams in the organization from reinventing the wheel; they should avoid doing so themselves. Even if their homegrown CI/CD solution is currently ahead, commercial vendors will in all likelihood close the gap. Platform teams should therefore identify where they add unique value. Rather than building their own CI system or metrics dashboard and going head-to-head with companies that have significantly larger resources, they should concentrate on tailoring off-the-shelf solutions to their organization’s specific needs. Commercial vendors, in any case, tend to cater to the broader requirements of the industry rather than to any one organization.
When should you look at this?
Many people mistakenly believe that platform engineering is only relevant for large teams. However, once your organization surpasses 20-30 developers, it’s advisable to start considering an Internal Developer Platform (IDP) sooner rather than later. Countless stories have been shared about teams that delayed implementing an IDP and faced unnecessary challenges, such as being unable to deploy for weeks when their only DevOps hire left. Investing in IDPs and hiring platform engineers is something worth considering for your organization today.
How to get started
Congratulations on being halfway there by reading this! To further enhance your knowledge and engagement as a platform engineer, we encourage you to attend our events, join our Slack channel, and connect with other platform engineers and enthusiasts. Follow the journeys of successful teams like Adevinta and Flywire to gain insights and inspiration. Share your team’s struggles with the community and explore the potential benefits of self-service solutions. Take a look at existing Internal Developer Platform offerings to kickstart your journey, starting with lightweight implementations and continuously iterating on use-cases.