We are seeking a highly skilled Senior Site Reliability Engineer to join our team in the Infrastructure Engineering group.
As we deliver real-time player tracking, sports analytics, and broadcast augmentation to customers worldwide, we need to scale from hundreds of sports arenas to thousands.
The ideal candidate will design and code end-to-end processes enabling untrained staff to autonomously prepare, install, and monitor all Linux servers and networking devices installed in 100+ sports venues.
Main Responsibilities
* Design and code automation processes for infrastructure deployment and monitoring.
* Take ownership of long-term technical efforts and articulate design choices to technical and non-technical people.
* Collaborate closely with teammates to solve problems, share knowledge, and provide actionable feedback.
* Participate in an on-call rotation that emphasizes eliminating repeating escalations.
* Visit the Lausanne office regularly.
Required Skills and Qualifications
* 5+ years of experience in SRE with Linux.
* Strong understanding of the entire Linux server stack: OS boot and installation, systemd, networking, container deployment, logging, metrics & monitoring, out-of-band management, etc.
* Strong experience designing robust automation processes for a large inventory of on-premises servers.
* Strong experience contributing to a large automation code base using Ansible or a similar platform.
* Proficient with Bash scripting and Python programming.
Benefits
* 25 days of holiday per year, plus extra holidays between Christmas and New Year's Eve.
* Improved insurance plan, including accident cover, loss-of-earnings cover in case of sickness, and an occupational plan.
* Office located only a 2-minute walk from the train station.
Our Stack
* Languages and frameworks: Ansible, Bash, Python.