Gelato is an enterprise-grade Rollup-as-a-Service (RaaS) platform that helps you build scalable, fast, custom rollups with Gelato's native Web3 modules. Today, over 50 projects rely on our Rollup Platform, which processes over 4.5M daily transactions and secures over $600M in TVL. We are proud to collaborate with teams such as Kraken’s Ink, Fox News, Reya, Lisk, and Open Campus to bring millions of users onchain.
Our team is passionate about bridging the gap between blockchain's current capabilities and its potential. We foster an environment that encourages innovation, collaboration, research, and in-depth discussion.
We are seeking a Senior Site Reliability Engineer to play a key role within our team and help elevate Gelato!
Responsibilities
1. Maintain and operate Gelato infrastructure in a multi-cloud environment.
2. Improve our incident management lifecycle for overall reliability.
3. Enhance our postmortem processes.
4. Strengthen our DevOps culture.
5. Deploy and maintain RaaS core components and observability stacks.
6. Modernize infrastructure and deployment strategies to industry standards.
7. Maintain and improve our CI/CD pipeline and governance.
8. Participate in on-call rotations to support operations and service availability.
9. Conduct regular team meetings and provide insights on system design, scalability, security, and efficiency in a Web3 context.
10. Promote cost-effective, innovative solutions and the adoption of industry standards.
Minimum Qualifications
1. At least 4 years of experience maintaining cloud infrastructure with modern technologies.
2. At least 1 year of experience with Web3-related infrastructure.
3. Strong understanding of GitOps principles.
4. Leadership skills to influence peers positively.
5. Ability to perform accurately in dynamic environments.
6. Experience with at least one major cloud provider (GCP, AWS, Azure).
7. Proficiency with Docker, Unix systems, and Kubernetes.
8. Experience with Git, Helm, Terraform, kubectl, and similar tools.
9. Knowledge of networking, CDNs, gateways, and deployment strategies.
10. Experience operating highly available infrastructure.
11. Understanding of microservice architectures.
12. Skills in debugging, logging, monitoring, and alerting, with tools such as Prometheus, Grafana, Splunk, and Datadog.
13. Experience designing and operating cost-optimized solutions.
14. Proficiency in at least one programming language (Go, Python, Rust, PHP, TypeScript).
15. Understanding of Web3 technologies and challenges, including RaaS.
16. Enthusiasm for learning and professional growth.