Ceph Storage Engineer (English)

Company

VEXXHOST

Date Posted

29-11-2025

Location

Remote

About VEXXHOST

VEXXHOST is a leading provider of cloud infrastructure solutions, offering four core services: OpenStack, Kubernetes, Zuul, and MigrateKit. We're committed to delivering high-performance, reliable, and secure cloud services to businesses worldwide. Our team is passionate about open-source technology and building cutting-edge infrastructure solutions.

Position Overview

We are looking for a Ceph Storage Engineer to design, deploy, operate, and optimize large-scale distributed storage clusters that power our global cloud platform. The ideal candidate has deep hands-on experience with Ceph, a strong understanding of distributed systems, and a passion for building reliable and scalable infrastructure. You will play a key role in shaping the performance, resiliency, and growth of our storage services.

Responsibilities

  • Design, deploy, and maintain Ceph clusters that support large-scale production environments
  • Monitor and optimize cluster performance, capacity, and reliability
  • Implement best practices for data durability, replication, and recovery
  • Troubleshoot storage issues at scale
  • Manage lifecycle operations including upgrades, expansions, and migrations
  • Collaborate with engineering teams to integrate Ceph with OpenStack and Kubernetes platforms
  • Automate operational workflows to ensure efficient and repeatable processes
  • Participate in on-call rotations to support critical production systems
  • Work directly with clients to diagnose and resolve their technical problems
  • Contribute to documentation


Qualifications

  • Strong hands-on experience operating Ceph in production environments
  • Solid understanding of distributed storage concepts and architecture
  • Proficiency with Linux systems administration and networking fundamentals
  • Experience with OpenStack Cinder, Glance, or Nova storage backends is a strong plus
  • Familiarity with Kubernetes persistent storage concepts is an asset
  • Scripting experience in a language such as Python or Go
  • Knowledge of monitoring tools such as Prometheus, Grafana, or the ELK stack
  • Ability to diagnose performance bottlenecks and conduct root cause analysis
  • Strong problem-solving skills and attention to detail
  • Excellent communication and collaboration skills


Nice to Have

  • Contributions to Ceph or other open-source projects
  • Experience with automation tools such as Ansible or Terraform
  • Background in large-scale cloud infrastructure environments
  • Understanding of object storage protocols such as S3 or Swift


What We Offer

  • Remote-first work environment
  • Professional development opportunities and conference attendance
  • Access to cutting-edge technology and infrastructure
  • Collaborative and inclusive team culture
  • Opportunity to work with open-source technologies
  • Opportunity to help build the foundation of a growing company