About the job
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services, both our internally critical and our externally visible systems, have reliability and uptime appropriate to customers' needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you’ll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE culture emphasizes intellectual curiosity, problem solving and openness. Our organization values collaboration, big thinking and risk-taking in a blame-free environment, with support and mentorship to learn and grow. We help keep Google’s product portfolio running from development to deployment in data centers and the next generation of Google platforms.
Responsibilities
* Lead the design and implementation of Artificial Intelligence (AI)-powered systems, with a focus on reliability, scalability, and efficiency.
* Lead and influence cross-organisational collaborations, ensuring adherence to product goals and initiatives to move work forward. Drive technical discussions and decision-making within the team and across the organisation.
* Provide guidance to other team members on managing availability and performance of mission critical services, building automation to prevent problem recurrence, and building automated responses for non-exceptional service conditions.
* Mentor and train other team members on design techniques and coding standards, and cultivate innovation and collaboration across multiple teams.
* Manage priorities, deadlines, and deliverables across separate projects.
Minimum qualifications
* Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
* 8 years of experience with programming in one or more programming languages.
* 4 years of experience leading projects.
* 3 years of experience building and developing large-scale infrastructure or distributed systems.
Preferred qualifications
* Master's degree in Computer Science or Engineering.
* Experience with Machine Learning Infrastructure.
Equal employment opportunity
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google’s EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.
Seniority level
* Mid-Senior level
Employment type
* Full-time
Job function
* Information Technology and Engineering
Industries
* Information Services and Technology
* Information and Internet