Site Reliability Engineer
Company
Cognits
Date Posted
25-05-2025
Location
Remote
SRE Engineer
Education & Certifications
Secondary degree diploma, preferably a university degree in Computer Science, Engineering, or a related field
Professional Experience
- 5+ years of hands-on experience designing, building, testing, and maintaining production-grade software systems
- Proven track record of delivering scalable, maintainable, and high-performance software within Agile development environments
- Experience collaborating in globally distributed engineering teams and contributing to cross-functional technical initiatives
Core Engineering Competencies
System Architecture & Design
- Defines and implements software components, systems, and services with consideration for scalability, maintainability, and performance
- Makes thoughtful architectural decisions aligned with business goals and technical best practices
Agile Delivery & Engineering Practices
- Actively participates in Agile ceremonies (daily stand-ups, sprint planning, retrospectives, reviews)
- Supports continuous delivery practices, source control strategies, and iterative development workflows
Quality, Testing & Documentation
- Writes modular, reusable, and testable code
- Designs and maintains automated test coverage (unit, integration, and/or end-to-end tests)
- Produces clear and concise technical documentation for both implementation and processes
Collaboration & Communication
- Works effectively in a cross-functional environment with designers, product managers, QA, and fellow engineers
- Provides technical mentorship and supports knowledge sharing within the team
- Engages with stakeholders (including client-side engineers) to drive clarity and shared understanding of technical solutions
Release, Risk, and Change Management
- Supports and/or leads release management, ensuring smooth deployment cycles
- Identifies and mitigates technical risks early in the development process
- Participates in onboarding and offboarding processes to ensure knowledge continuity and team stability
- Embraces change management best practices during feature rollouts and system upgrades
Soft Skills & Leadership
- Autonomous problem-solver with a strong ownership mindset
- Clear, confident communicator in English
- Advocates for clean code, performance, security, and accessibility
Key Qualifications
- 5+ years of hands-on experience in infrastructure engineering
- Adopt and apply SRE best practices to services you support
- Keep users, key stakeholders, and leadership informed through regular reporting and communications
- Expertise with Amazon Web Services and/or other cloud platforms
- Experience with Infrastructure-as-Code tools like Terraform and Pulumi
- Enterprise networking experience: NGINX, Load Balancers, Firewalls, DNS, VPCs, AWS Security Groups, ACLs, Virtual IPs
- Experience with containerized applications and Docker
- Experience with Kubernetes, Helm, and Spinnaker
- Application and/or tooling development in Java, Python, or Go
- In-depth knowledge of build/release systems and processes
- Experience with observability tools such as Prometheus, Splunk, Grafana, or Datadog; observability is imperative to success
- Ambiguity doesn’t scare you. You see it as an opportunity to define the future.
- You automate things rather than doing them twice.
- Strong attention to detail and excellent analytical capabilities
- Comfortable working in a fast-paced, dynamic environment
- Periodic on-call duties required for this role
- Experience building observability and monitoring for distributed systems
- Experience with configuration management tools such as Terraform, Ansible or Puppet
- Passion for designing and building reliable systems
- Familiarity with microservices architecture and container orchestration with Kubernetes
- Demonstrated ability to deliver results on time with high quality
- Automation advocate: you truly believe in reducing operational load with software
- Experience with deploying, supporting and monitoring new and existing services, platforms, and application stacks
- Experience coding in programming languages such as Go, Python, or Bash
- Experience in, and passion for, a reliability engineering, DevOps, or infrastructure-focused role
- Deep systems and infrastructure knowledge
- Excellent troubleshooting and problem-solving skills