Senior Site Reliability Engineer (SRE)
Company
Authlete Inc
Date Posted
23-10-2025
Location
Remote
We’re a globally distributed team building a secure, standards-based OAuth/OIDC engine used by global businesses, digital banks, and regulated industries. Our API-first approach enables organizations to implement OAuth 2.0 and OpenID Connect with ease. Authlete powers mission-critical authorization services for financial institutions and enterprises worldwide.
We’re hiring a Senior Site Reliability Engineer (SRE) to strengthen the reliability, scalability, and performance of our cloud and self-managed offerings. You’ll work closely with our platform engineering and security teams to design and operate robust systems that deliver high availability across multiple environments.
What You’ll Do
This is a hands-on engineering role. You’ll:
- Design, maintain, and optimize Kubernetes-based deployments across Shared Cloud, Dedicated Cloud, and Self-Managed deployment models.
- Develop and improve Helm charts as the standard deployment method across all supported environments.
- Manage and automate GitLab CI/CD pipelines, including container image packaging and release processes.
- Enhance monitoring, alerting, and observability using Google Cloud Monitoring, Prometheus, and Grafana.
- Review and improve cloud functions and internal tooling written in Go, Ruby, and Bash.
- Troubleshoot infrastructure issues and performance bottlenecks.
- Contribute to product reliability by investigating and resolving issues in our Java-based servers.
- Participate in on-call rotations to maintain uptime and ensure rapid incident response.
- Lead post-incident reviews and drive long-term reliability improvements.
- Collaborate with Engineering and Support teams to diagnose customer issues and optimize service quality.
What We’re Looking For
- Strong experience operating Kubernetes in production (preferably on GKE).
- Deep understanding of Kubernetes networking, security, Helm charts, and storage management.
- Proficiency in one or more programming languages such as Go, Java, Bash, or Ruby.
- Experience managing GitLab CI/CD pipelines and container image workflows.
- Ability to write PromQL alerting rules and interpret key reliability metrics.
- Familiarity with Redis, Liquibase, and TLS/mTLS certificate management.
- Strong analytical skills for diagnosing performance bottlenecks across network, cache, and database layers.
- Experience with observability, incident management, and performance testing.
- Clear communication skills in English; Japanese language proficiency is a plus.
- Comfortable working independently in a distributed team across time zones.
Why Join Us
- Work closely with experienced engineers building a high-security, standards-compliant OAuth/OIDC engine.
- Solve complex reliability challenges across multi-cloud and self-managed environments.
- Be part of a lean global team where your contributions have direct product impact.
- Enjoy flexibility, autonomy, and the opportunity to shape infrastructure best practices.
- Receive competitive compensation, collaborate globally, and take on meaningful technical challenges.