Site Reliability Engineer

Company

Cognits

Date Posted

25-05-2025

Location

Remote


Education & Certifications
Secondary-school diploma required; a university degree in Computer Science, Engineering, or a related field is preferred

Professional Experience

  • 5+ years of hands-on experience designing, building, testing, and maintaining production-grade software systems
  • Proven track record of delivering scalable, maintainable, and high-performance software within Agile development environments
  • Experience collaborating in globally distributed engineering teams and contributing to cross-functional technical initiatives

Core Engineering Competencies

System Architecture & Design

  • Defines and implements software components, systems, and services with consideration for scalability, maintainability, and performance
  • Makes thoughtful architectural decisions aligned with business goals and technical best practices

Agile Delivery & Engineering Practices

  • Actively participates in Agile ceremonies (daily stand-ups, sprint planning, retrospectives, reviews)
  • Supports continuous delivery practices, source control strategies, and iterative development workflows

Quality, Testing & Documentation

  • Writes modular, reusable, and testable code
  • Designs and maintains automated test coverage (unit, integration, and/or end-to-end tests)
  • Produces clear and concise technical documentation for both implementation and processes

Collaboration & Communication

  • Works effectively in a cross-functional environment with designers, product managers, QA, and fellow engineers
  • Provides technical mentorship and supports knowledge sharing within the team
  • Engages with stakeholders (including client-side engineers) to drive clarity and shared understanding of technical solutions

Release, Risk, and Change Management

  • Supports and/or leads release management, ensuring smooth deployment cycles
  • Identifies and mitigates technical risks early in the development process
  • Participates in onboarding and offboarding processes to ensure knowledge continuity and team stability
  • Embraces change management best practices during feature rollouts and system upgrades

Soft Skills & Leadership

  • Autonomous problem-solver with a strong ownership mindset
  • Clear, confident communicator in English
  • Advocates for clean code, performance, security, and accessibility

Key Qualifications

  • 5+ years of hands-on experience in infrastructure engineering
  • Adopt and apply SRE best practices to services you support
  • Keep users, key stakeholders, and leadership informed through regular reporting and communications
  • Expertise with Amazon Web Services and/or other cloud platforms
  • Experience with Infrastructure-as-Code tools like Terraform and Pulumi
  • Enterprise networking experience: NGINX, Load Balancers, Firewalls, DNS, VPCs, AWS Security Groups, ACLs, Virtual IPs
  • Experience with containerized applications and Docker
  • Experience with Kubernetes, Helm, and Spinnaker
  • Application and/or tooling development in Java, Python, or Go.
  • In-depth knowledge of build/release systems and processes
  • Experience with observability tools such as Prometheus, Splunk, Grafana, and Datadog; observability is imperative to success
  • Ambiguity doesn’t scare you. You see it as an opportunity to define the future.
  • You automate things rather than doing them twice.
  • Strong attention to detail and excellent analytical capabilities
  • Comfortable working in a fast-paced, dynamic environment
  • Periodic on-call duties are required for this role
  • Experience building observability and monitoring for distributed systems
  • Experience with configuration management tools such as Terraform, Ansible, or Puppet
  • Passion for designing and building reliable systems
  • Familiarity with microservices architecture and container orchestration with Kubernetes
  • Demonstrated ability to deliver results on time with high quality
  • Automation advocate: you truly believe in removing operational load with software
  • Experience with deploying, supporting and monitoring new and existing services, platforms, and application stacks
  • Experience coding in programming languages such as Go, Python, or Bash
  • Experience in, and passion for, a reliability engineering, DevOps, or infrastructure-focused role
  • Deep systems and infrastructure knowledge
  • Excellent troubleshooting and problem-solving skills