Achieve a smaller blast radius with Highly Available Kubernetes clusters

Kubernetes is integral to our operations, allowing us to serve 60,000 requests per second to 20 customers across 30 clusters. To deliver an optimised service, we built SCHIP, a specialised multi-tenant Kubernetes distribution that extends and enhances the vanilla distribution.

By Tanat Paul Lokejaroenlarb

At the same time, we wanted to increase the resilience and redundancy of our platform, so that we could easily redirect traffic between clusters and maintain service quality should one cluster degrade or should we need to perform risky maintenance and upgrades. We called this ‘reducing the blast radius’.

Read our latest article to discover:

  • How to use weighted records in AWS Route 53 to control traffic flow between clusters.
  • How to build a traffic controller that automatically directs traffic to clusters based on those weighted records.
  • How the traffic controller also monitors Kubernetes pod health, updating the weighted records to automate traffic redirection.
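
To make the idea in the list above concrete, here is a minimal Python sketch of how pod health could be turned into Route 53 weighted records. It is an illustration under stated assumptions, not the SCHIP traffic controller itself: the `ClusterHealth` type, the `weight_for` rule, the 50% readiness threshold, and the example host names are all hypothetical. Only the shape of the change batch (`Action`, `SetIdentifier`, `Weight`, and so on) follows the Route 53 API.

```python
from dataclasses import dataclass

MAX_WEIGHT = 255  # Route 53 accepts record weights from 0 to 255


@dataclass
class ClusterHealth:
    """Hypothetical health snapshot for one cluster (names are assumptions)."""
    name: str          # used as the record's SetIdentifier
    ingress_dns: str   # DNS name of the cluster's ingress endpoint
    ready_pods: int
    desired_pods: int


def weight_for(cluster, min_ready_ratio=0.5):
    """Assumed rule: scale weight with pod readiness, drain below a threshold."""
    if cluster.desired_pods <= 0:
        return 0
    ratio = min(cluster.ready_pods / cluster.desired_pods, 1.0)
    if ratio < min_ready_ratio:
        return 0  # too unhealthy: stop sending traffic to this cluster
    return round(ratio * MAX_WEIGHT)


def build_change_batch(record_name, clusters, ttl=60):
    """Build one UPSERT per cluster, all sharing the same record name."""
    changes = []
    for c in clusters:
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": c.name,
                "Weight": weight_for(c),
                "TTL": ttl,
                "ResourceRecords": [{"Value": c.ingress_dns}],
            },
        })
    return {"Changes": changes}


def traffic_share(batch):
    """Expected share per cluster: its weight divided by the sum of weights."""
    weights = {
        ch["ResourceRecordSet"]["SetIdentifier"]: ch["ResourceRecordSet"]["Weight"]
        for ch in batch["Changes"]
    }
    total = sum(weights.values())
    if total == 0:
        # Route 53 routes with equal probability when every weight is 0.
        return {name: 1 / len(weights) for name in weights}
    return {name: w / total for name, w in weights.items()}


clusters = [
    ClusterHealth("cluster-a", "a.ingress.example.com", ready_pods=10, desired_pods=10),
    ClusterHealth("cluster-b", "b.ingress.example.com", ready_pods=2, desired_pods=10),
]
batch = build_change_batch("app.example.com", clusters)
shares = traffic_share(batch)
```

In this example cluster-b falls below the assumed readiness threshold, so its weight drops to 0 and all traffic shifts to cluster-a. A real controller would then submit `batch` through the Route 53 API, for instance with boto3's `change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)`, and repeat the loop as pod health changes.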

The new traffic operator allows us to grow and evolve the SCHIP platform for future success – without affecting service availability or performance. Read the Adevinta blog for full details of what we did and what was achieved.
