Lead Engineer - Site Reliability (Metrics and Monitoring)
Xero
- New Zealand
- Permanent
- Full-time
- Design systems to improve adoption of Xero's observability tools with a strong focus on reducing toil in managing our monitoring and logging platforms.
- Have a strong focus on developing and growing engineers through technical mentoring and coaching.
- Provide leadership around observability standards and practices.
- Create systems that support and enable our product teams to uplift their observability practices.
- Improve the implementation of system instrumentation as and when required.
- Be a key member of the pod leadership, contributing to technical strategy, feasibility, backlog management and enabling delivery.
- Participate in the wider SRE team's on-call roster, responding to Xero-wide incidents.
- Empower other engineering teams at Xero to achieve a high standard of system awareness so they can create efficient, scalable and reliable applications for Xero's customers.
- Experience with agile software development methodology including continuous integration and delivery.
- An understanding of how solutions architecture or architecture design works in a large software delivery organisation.
- Experience building and implementing observability in large distributed cloud environments (ideally AWS).
- Excellent knowledge of reliability and observability concepts and practices.
- An understanding of OpenTelemetry and how it works.
- Experience being on call and helping to resolve production incidents in a complex environment.
- Experience in instrumenting applications and integrating with monitoring solutions like New Relic, Datadog, Dynatrace, SignalFx, Scalyr, Sumo Logic or Splunk (ideally New Relic).
- Proficiency in one or more object-oriented programming languages such as C#, JavaScript, Golang or Python.
- Experience with DevOps tooling, e.g. Linux, Docker, Kubernetes, IaC and CI/CD tools.
- The ability to help structure work to make optimal use of the team's resources.
- The ability to set quarterly and annual objectives for the team in collaboration with the Product Manager and Team Lead.
- Proven ability to engage, influence and build relationships with internal stakeholders.
- Experience in managing and maintaining healthy observability platforms for a large user base.