Site Reliability Engineer (SRE)
We're looking for a skilled Site Reliability Engineer to join a high-impact engineering team focused on building a secure, scalable, and observable Kubernetes-based SaaS platform on Amazon EKS. You'll play a critical role in ensuring infrastructure reliability and enabling fast, secure delivery across services.
Key Responsibilities:
* Design, build, and maintain Kubernetes-native infrastructure on Amazon EKS.
* Implement and manage observability tools (Prometheus, Grafana, OpenTelemetry).
* Own CI/CD pipelines using GitHub Actions, Argo CD, and Terraform.
* Drive security best practices, performance tuning, and cost optimization.
* Partner with development teams (Golang) to embed SRE practices early.
* Lead incident response and continuous service reliability improvements.
Experience:
* Strong experience in SRE, DevOps, or Platform Engineering.
* Hands-on expertise with Kubernetes (especially EKS) and IaC tools (Terraform).
* Deep knowledge of observability stacks and cloud networking/security.
* Familiarity with serverless (AWS Lambda), multi-cloud, and FinOps is a plus.
* Experience with advanced CI/CD and resilience testing tools is beneficial.
Perks & Benefits:
* Fully remote working (Northern Ireland-wide)
* Unlimited PTO
* Standard pension
* Private health insurance for you & family
* Life assurance
* Learning budget, flexible working, and more
Please apply now if you meet the above criteria. For a confidential conversation, feel free to contact Andrew Harrison directly.