Senior Site Reliability Engineer (SRE)

Company

Authlete Inc

Date Posted

23-10-2025

Location

Remote

We’re a globally distributed team building a secure, standards-based OAuth/OIDC engine used by global businesses, digital banks, and regulated industries. Our API-first approach enables organizations to implement OAuth 2.0 and OpenID Connect with ease. Authlete powers mission-critical authorization services for financial institutions and enterprises worldwide.

We’re hiring a Senior Site Reliability Engineer (SRE) to strengthen the reliability, scalability, and performance of our cloud and self-managed offerings. You’ll work closely with our platform engineering and security teams to design and operate robust systems that deliver high availability across multiple environments.


What You’ll Do

This is a hands-on engineering role. You’ll:

  • Design, maintain, and optimize Kubernetes-based deployments across Shared Cloud, Dedicated Cloud, and Self-Managed deployment models.
  • Develop and improve Helm charts as the standard deployment method across all supported environments.
  • Manage and automate GitLab CI/CD pipelines, including container image packaging and release processes.
  • Enhance monitoring, alerting, and observability using Google Cloud Monitoring, Prometheus, and Grafana.
  • Review and improve cloud functions and internal tooling written in Go, Ruby, and Bash.
  • Troubleshoot infrastructure issues and performance bottlenecks.
  • Contribute to product reliability by investigating and resolving issues in our Java-based servers.
  • Participate in on-call rotations to maintain uptime and ensure rapid incident response.
  • Lead post-incident reviews and drive long-term reliability improvements.
  • Collaborate with Engineering and Support teams to diagnose customer issues and optimize service quality.


What We’re Looking For

  • Strong experience operating Kubernetes in production (preferably on GKE).
  • Deep understanding of Kubernetes networking, security, Helm charts, and storage management.
  • Proficiency in one or more programming languages such as Go, Java, Bash, or Ruby.
  • Experience managing GitLab CI/CD pipelines and container image workflows.
  • Ability to write PromQL alerting rules and interpret key reliability metrics.
  • Familiarity with Redis, Liquibase, and TLS/mTLS certificate management.
  • Strong analytical skills for diagnosing performance bottlenecks across network, cache, and database layers.
  • Experience with observability, incident management, and performance testing.
  • Clear communication skills in English; Japanese language proficiency is a plus.
  • Comfortable working independently in a distributed team across time zones.


Why Join Us

  • Work closely with experienced engineers building a high-security, standards-compliant OAuth/OIDC engine.
  • Solve complex reliability challenges across multi-cloud and self-managed environments.
  • Be part of a lean global team where your contributions have direct product impact.
  • Enjoy flexibility, autonomy, and the opportunity to shape infrastructure best practices.
  • Receive competitive compensation, collaborate globally, and take on meaningful technical challenges.