Principal Site Reliability Engineer - ARINCDirect (Remote)
Note: This is a remote position open to candidates in the USA. Raytheon, specifically the ARINCDirect team within Collins Aerospace, is seeking a Principal Site Reliability Engineer to enhance its infrastructure automation and continuous delivery processes. The role involves ensuring service reliability, monitoring performance, and collaborating with cross-functional teams to design scalable solutions for flight operations services.
Responsibilities
- Automate and improve reliability, and continue to push the ARINCDirect infrastructure forward, ensuring it is resilient and reproducible
- Be responsible for service availability, performance, monitoring, incident response, and capacity planning
- Create, improve, and manage environments so that decisions on resource allocation, problem identification, and capacity planning are based on accurate, data-driven insights
- Maintain physical infrastructure running Linux
- Help facilitate a push towards Kubernetes and declarative infrastructure
- Influence technology decisions and direction to grow and support the ARINCDirect platform
- Collaborate closely with fellow SREs on your team and extend your collaboration across other teams and disciplines to design dependable and scalable solutions and services
- Identify, implement, and champion process improvements to enhance productivity, collaboration, and delivery efficiency, while ensuring alignment with company goals and industry best practices
Skills
- Typically requires a degree in Science, Technology, Engineering or Mathematics (STEM) and a minimum of 8 years of prior relevant experience; or an advanced degree in a related field and a minimum of 5 years of experience; or, in the absence of a degree, 12 years of relevant experience
- Must be authorized to work in the U.S. without sponsorship now or in the future. RTX will not offer sponsorship for this position
- Experience as an SRE, Platform Engineer, or in a related position within a Linux or UNIX environment, working on large, complex infrastructures and/or projects using Docker and Kubernetes solutions
- Experience automating configuration and infrastructure with tools such as SaltStack, Ansible, Terraform, or other declarative languages
- Experience with hardware, including servers, network switches, and cabling
- Experience managing infrastructure using GitOps with continuous delivery (CD) pipelines
- Established proficiency in at least one (ideally more) of the following: Python, Linux Shell (bash, awk, sed)
- Experience with PostgreSQL or an equivalent RDBMS, and with SQL in general
- Familiarity with cloud infrastructure, ideally AWS
- Understanding of SRE principles, including building observability solutions and exposing metrics to inform SLOs and KPIs
- Understanding of how IT infrastructure services work, including: DNS, DHCP, LDAP, NFS
- Understanding of network segmentation, routing, and VPNs
Benefits
- Medical, dental, and vision insurance
- Three weeks of vacation for newly hired employees
- Generous 401(k) plan that includes employer matching funds and separate employer retirement contribution, including a Lifetime Income Strategy option
- Tuition reimbursement program
- Student Loan Repayment Program
- Life insurance and disability coverage
- Optional coverages you can buy: pet insurance, home and auto insurance, additional life and accident insurance