Our client is a remote-first company with team members across the globe, offering a SaaS-based Learning Management System that powers the world's leading education programs. Our client helps large brands and fast-moving companies increase revenue, improve customer retention, and decrease support costs through external education. The platform includes all the tools an organization needs to create, manage, track, and improve highly personalized learning experiences for customers, partners, and employees.
The Lead Site Reliability Engineer holds a pivotal role at the forefront of our engineering operations, responsible for guiding the Platform Team toward exceptional standards of reliability, performance, and stability across all our applications. The successful candidate will possess deep expertise in these core areas and will be instrumental in defining and implementing industry-leading practices. As a key leader, this role will not only shape the strategic direction of our platform operations but also establish the benchmarks and processes by which our engineering excellence is measured.
8+ years of experience as a software engineer
5+ years of experience working with Ruby on Rails
Proven experience leading SRE teams
3+ years of experience working in infrastructure and operations
Expertise with SQL databases such as PostgreSQL
Experience with cloud computing on Amazon Web Services and/or Google Cloud
Ability to dig into unfamiliar code bases
Ability to document solutions and train operational teams on supportability
A sense of comfort working in a team-oriented and collaborative environment
Ability to communicate clearly and to seek help and support proactively
Ability to take ownership of tasks and lead them to completion
Experience developing solutions with server automation tools such as Ansible
Experience writing and maintaining CI/CD pipelines and services
Bachelor’s degree in Computer Science or related technical field