[Remote] Senior Site Reliability Engineer (Remote Build)
Note: This is a remote position open to candidates in the USA.
Remote is solving modern organizations' biggest challenge: navigating global employment compliantly and with ease. As a Senior Site Reliability Engineer for Remote Build, you'll own the operational excellence and infrastructure strategy that makes Build's platform reliable, performant, and safe for customers.
Responsibilities
- Design, implement, and maintain infrastructure-as-code patterns using Terraform and Kubernetes that support both standard connectors and custom builds
- Build and maintain comprehensive monitoring, logging, and alerting systems
- Lead incident response efforts, conduct post-mortems, and drive continuous improvement in system reliability
- Work with our Security team to embed security into every layer of Build infrastructure
- Ensure we meet compliance requirements across 100+ jurisdictions without creating friction for developers or customers
- Continuously optimize system performance, resource utilization, and cloud costs
- Make recommendations that improve both reliability and unit economics
- Identify manual operational toil and systematically eliminate it
- Build tools and processes that let teams operate efficiently without scaling headcount
- Partner with platform teams to ensure APIs, MCP, and CLI are resilient and observable
- Give infrastructure feedback that shapes how the platform evolves
Skills
- Senior-level SRE experience: demonstrated experience in a Site Reliability Engineering, DevOps Engineering, or SysOps role. You have stood up and operated production systems at scale
- Kubernetes and AWS: deep, hands-on experience running Kubernetes in production. Solid AWS fundamentals across compute, networking, storage, and managed services
- Infrastructure-as-code: proficiency with Terraform or similar IaC tools. You write code to define infrastructure; you don't click buttons in the console
- CI/CD and deployment automation: real experience setting up and operating GitLab, GitHub Actions, Jenkins, or similar. You understand deployment strategies, rollback mechanisms, and safety nets
- Scripting and systems knowledge: strong bash scripting. Comfortable debugging system-level issues, reading logs, and understanding Linux kernel basics
- Great communication: you explain complex infrastructure decisions clearly to both engineers and non-technical stakeholders. You write clear runbooks and documentation
- Experience with at least one backend programming language (Elixir, Python, Go, Java, Node.js, etc.)
- Experience in consultancy settings
- Container registry and artifact management (ECR, Docker Hub, etc.)
- Observability stack depth (Datadog, Prometheus, ELK, Grafana, or similar)
- Experience working with or scaling multi-tenant platforms
Benefits
- Work from anywhere
- Flexible paid time off
- Flexible working hours (we are async)
- 16 weeks paid parental leave
- Mental health support services
- Stock options
- Learning budget
- Home office budget & IT equipment
- Budget for local in-person social events or co-working spaces