How does Pinterest's In-House Kubernetes Platform 'PinCompute' work?
A system design case study.
PinCompute is a regional Platform-as-a-Service (PaaS) built on top of Kubernetes, designed to manage diverse workloads at Pinterest.
Read on for an overview of PinCompute's architecture, primitives, APIs, resource management, operation practices, and its approach to scalability and service level objectives (SLOs).
Key Architectural Components:
Host cluster: Manages the regional federation control plane and tracks workloads.
Member clusters: Zonal clusters used for actual workload execution.
PinCompute primitives: Provide building blocks for various workload types, including:
PinPod: A Pod with additional capabilities such as per-container updates and data persistence.
PinApp: Designed to run and manage long-running applications.
PinScaler: Enables application auto-scaling.
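To make the primitives concrete, here is a minimal sketch of what a PinApp manifest could look like, expressed as a Python dict. The real CRD schema is internal to Pinterest, so the API group, field names, and the `scaler` block are all illustrative assumptions, not the actual spec.

```python
def make_pinapp(name: str, image: str, replicas: int,
                min_replicas: int = None, max_replicas: int = None) -> dict:
    """Build a hypothetical PinApp manifest for a long-running application;
    optionally attach a PinScaler-style block for auto-scaling."""
    app = {
        "apiVersion": "pincompute.pinterest.com/v1",  # assumed group/version
        "kind": "PinApp",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # Each replica is a PinPod, which adds per-container updates and
            # data persistence on top of a vanilla Kubernetes Pod.
            "template": {"containers": [{"name": "main", "image": image}]},
        },
    }
    if min_replicas is not None and max_replicas is not None:
        # A PinScaler drives auto-scaling between these bounds (field names assumed).
        app["spec"]["scaler"] = {"minReplicas": min_replicas,
                                 "maxReplicas": max_replicas}
    return app

manifest = make_pinapp("search-api", "registry.example/search:v42",
                       replicas=3, min_replicas=3, max_replicas=10)
print(manifest["kind"])  # PinApp
```

The point of the sketch is the layering: a PinApp owns a set of PinPods, and a PinScaler is attached to the PinApp rather than to individual Pods.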
User Access and APIs:
Users access PinCompute primitives through APIs that offer:
CRUD operations on workloads
Debugging tasks like log streaming and container shell access
Gathering runtime information about workloads
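A rough sketch of what such an API surface could look like as a thin client. The actual PinCompute API is internal, so every route, resource name, and method here is an assumption about how the described functionality (CRUD, log streaming, shell access, runtime info) might be organized.

```python
class PinComputeAPI:
    """Hypothetical client that maps the documented capabilities to routes."""

    def __init__(self, base_url: str):
        self.base = base_url.rstrip("/")

    def _route(self, method: str, *parts: str) -> tuple:
        # Return (HTTP method, URL) instead of issuing a real request,
        # so the sketch stays self-contained.
        return method, "/".join([self.base, "v1", *parts])

    # CRUD operations on workloads
    def create(self, ns, kind):        return self._route("POST", "namespaces", ns, kind)
    def get(self, ns, kind, name):     return self._route("GET", "namespaces", ns, kind, name)
    def update(self, ns, kind, name):  return self._route("PUT", "namespaces", ns, kind, name)
    def delete(self, ns, kind, name):  return self._route("DELETE", "namespaces", ns, kind, name)

    # Debugging: log streaming and container shell access
    def stream_logs(self, ns, pod):
        return self._route("GET", "namespaces", ns, "pinpods", pod, "logs")

    def exec_shell(self, ns, pod, container):
        return self._route("POST", "namespaces", ns, "pinpods", pod,
                           "containers", container, "exec")

    # Runtime information about a workload
    def status(self, ns, kind, name):
        return self._route("GET", "namespaces", ns, kind, name, "status")

api = PinComputeAPI("https://pincompute.example")
print(api.stream_logs("search", "search-api-0"))
```

Grouping CRUD, debugging, and status behind one workload-oriented API is what lets users stay at the primitive level (PinApp, PinPod) without touching raw Kubernetes objects.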
Resource Management:
PinCompute supports three resource tiers: Reserved, OnDemand, and Preemptible.
Scheduling decisions are made through a two-layer system:
Cluster-level scheduling: Selects member clusters for workload execution.
Node-level scheduling: Places Pods onto nodes within member clusters.
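The two layers can be sketched as follows. The tier names match the article, but the selection heuristics (most-free-capacity at the cluster layer, best-fit at the node layer) are simplified assumptions, not Pinterest's actual scheduling policy.

```python
TIERS = ("Reserved", "OnDemand", "Preemptible")

def pick_member_cluster(clusters, tier, cpu_needed):
    """Cluster-level scheduling: the federation control plane chooses the
    member cluster with the most free capacity in the requested tier."""
    candidates = [c for c in clusters if c["free"][tier] >= cpu_needed]
    if not candidates:
        raise RuntimeError(f"no member cluster has {cpu_needed} free {tier} CPU")
    return max(candidates, key=lambda c: c["free"][tier])

def pick_node(nodes, cpu_needed):
    """Node-level scheduling inside the chosen cluster: simple best-fit,
    i.e. the tightest node that still fits the Pod."""
    fitting = [n for n in nodes if n["free_cpu"] >= cpu_needed]
    if not fitting:
        raise RuntimeError("no node fits the Pod")
    return min(fitting, key=lambda n: n["free_cpu"])

clusters = [
    {"name": "zone-a-1", "free": {"Reserved": 40, "OnDemand": 10, "Preemptible": 100}},
    {"name": "zone-b-1", "free": {"Reserved": 16, "OnDemand": 64, "Preemptible": 20}},
]
chosen = pick_member_cluster(clusters, "OnDemand", cpu_needed=32)
print(chosen["name"])  # zone-b-1
```

Splitting the decision this way keeps the federation layer coarse-grained (which zonal cluster) while each member cluster's own scheduler handles Pod-to-node placement.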
Cost Efficiency:
PinCompute pursues cost efficiency by:
Promoting multi-tenancy and shared resource pools.
Collaborating with users to optimize workload patterns.
Utilizing cost-effective alternatives for hardware resources.
PinCompute Node Runtime:
Nodes are designed to securely, reliably, and efficiently run containerized workloads.
Key features of PinCompute nodes include:
Pod-level isolation for security and resource management.
Support for various container networking options.
Integration points with Pinterest's infrastructure services like logging and security.
Enhanced Operability:
PinCompute offers features that improve node operability, including:
Node-level health probes for comprehensive monitoring.
Enhanced quality of service for reserved tier workloads.
Runtime APIs for troubleshooting workloads.
Managing PinCompute Infrastructure:
Daily operations rely heavily on automation, including:
Automatic remediation of node health issues.
Application-aware cluster rotation with user-defined configurations.
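A hedged sketch of the automatic-remediation idea: node-level health probes feed per-node signals, and nodes that fail repeatedly are cordoned and recycled. The signal names and threshold are illustrative, not Pinterest's actual remediation rules.

```python
def remediate(nodes, max_failed_probes=3):
    """Decide a remediation action per node based on accumulated probe
    failures (hypothetical signal; real systems would track many signals)."""
    actions = []
    for node in nodes:
        if node["failed_probes"] >= max_failed_probes:
            # Cordon so no new Pods land, then recycle the node from its AMI.
            actions.append((node["name"], "cordon-and-recycle"))
        else:
            actions.append((node["name"], "none"))
    return actions

fleet = [{"name": "n1", "failed_probes": 5},
         {"name": "n2", "failed_probes": 0}]
print(remediate(fleet))
```

The same loop, run continuously, is what turns health probes into hands-off operations; application-aware cluster rotation layers user-defined constraints (e.g. disruption budgets) on top of decisions like these.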
Release and Management:
PinCompute has a multi-stage release pipeline with end-to-end testing.
Machine images (AMIs) are used for node bootstrapping and managed through the upgrade service.
User-facing tools are provided for debugging, project management, and cluster management.
Scalability and SLOs:
PinCompute is designed to scale horizontally by adding more member clusters.
It defines SLOs for API availability, control plane latency, and workload launch speed.
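As a worked example of one of these SLOs, API availability can be measured as the fraction of successful requests over a window and compared against a target. The 99.9% target below is illustrative; the article does not quote PinCompute's actual numbers.

```python
def availability(success: int, total: int) -> float:
    """Fraction of API requests that succeeded in the measurement window."""
    return success / total if total else 1.0

def meets_slo(success: int, total: int, target: float = 0.999) -> bool:
    """True if measured availability meets the (assumed) 99.9% target."""
    return availability(success, total) >= target

print(meets_slo(999_500, 1_000_000))  # True: 99.95% >= 99.9%
```

Control plane latency and workload launch speed would be tracked the same way, just with percentile latency thresholds instead of a success ratio.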