
Production Systems Engineer, AI Systems

Remote · USA Full-time New today

Meta is seeking a Systems Engineer to join our Release to Production (RTP) team, working on the Meta Training and Inference Accelerator (MTIA) program as part of the AI/ML initiatives supporting large-scale AI training and inference. Our servers and data centers are the foundation upon which our rapidly scaling infrastructure operates efficiently to deliver our innovative services.

The RTP team is responsible for the end-to-end hardware lifecycle of all Meta servers, including prototyping experimental hardware, hands-on pre-production system and hardware debugging, stress testing, and enabling production-ready system monitoring, automated provisioning, and automated remediation of issues. The RTP team also helps explore, develop, and productize high-performance software and hardware technologies for AI at datacenter scale. RTP Engineers work closely with a wide range of cross-functional partners, including HW/SW co-design teams, hardware designers, networking teams, system manufacturers, component vendors, capacity engineering, production engineering, production services, and data center operations teams, to enable new systems that will be deployed in our production data centers.

We are looking for a candidate to work on scale-up and scale-out network technologies (e.g., RDMA NICs) for the Meta Training and Inference Accelerator (MTIA) systems that are powering Meta’s tremendous leaps in the AI space. The ideal candidate is knowledgeable about network protocols (TCP/IP, RDMA) and has hands-on experience driving post-silicon validation for networking platforms, all the way to mass production and deployment.
