My client is seeking a Site Reliability Engineer to join their growing SRE team, playing an essential role in maturing the company’s approach to service reliability and continuity.
You and your team will be directly responsible for solutions around the reliability of the platform, including availability, latency, performance, efficiency, capacity planning, and incident response.
Things you will do
- Help us measure and improve reliability across the product line by working with product owners and engineering teams.
- Improve and maintain site reliability, availability, scalability, and system performance.
- Investigate system performance, errors, and problems.
- Make wise decisions that balance availability and delivery, and communicate those decisions clearly.
- With our team, be responsible for the systems you build.
- Work with engineering teams as a subject matter expert on operating software and systems at scale, teaching them from your experience or know-how, and helping them reach their goals.
Your background and skills will include
- A minimum of 5 years of relevant working experience.
- Excellent knowledge and experience in Systems Engineering, Administration, and Operations.
- Strong Software Engineering experience and the ability to work in multiple programming languages.
- Experience with Distributed Systems and operating them as they scale.
- Experience operating services running in the cloud (AWS primarily) or virtualized API-driven platforms.
- Articulate and personable with strong spoken and written English language abilities.
- Demonstrated ability to work independently and collaboratively as part of a specialized team.
Would be great if you
- Have experience automating datastore operations or datastores as a service.
- Are well versed in PostgreSQL database management.
- Have experience analyzing system-wide performance: latency, throughput, and efficiency.
- Have experience working as part of a distributed or partially distributed team and thrive in a highly collaborative and communicative work environment.
- Pride yourself on giving back to your community: open source contributions, speaking, teaching, mentoring, helping others.