We are seeking a Senior Operations Engineer to help operate and maintain our SaaS products. As a member of our operations team, you’ll collaborate on developing and maintaining tooling that improves our infrastructure automation. This position requires deep knowledge of Linux systems as well as a solid understanding of what it takes to scale large customer-facing services reliably.
- Develop tooling to enable product development teams to build and deploy software as quickly and efficiently as possible
- Partner with other engineers to identify the optimal cloud infrastructure, networking, and storage design for each solution
- Collaborate daily with fellow team members to drive best practices and identify innovative strategies for successful delivery
- Own all information compliance and security policies, SOC 2 testing, and remediation
- Manage credentials and permissions, and ensure the security of our systems by monitoring and preventing vulnerabilities
- Identify, recommend, and implement system improvements to reduce infrastructure costs
- Participate in the on-call rotation to ensure 100% availability of production systems
- Plan appropriately to keep infrastructure costs as low as possible
Our Ideal Candidate
- 5-7+ years of experience operating large-scale customer-facing production environments with a focus on uptime and service quality
- Experience designing and building tooling and infrastructure for cloud platforms
- Experience with UNIX/Linux operating system internals and administration
- Experience with orchestration and configuration management technologies like Terraform, SaltStack, and Ansible
- Familiarity with AWS and/or other cloud-based providers
- In-depth knowledge of service monitoring, log aggregation, metrics collection, and infrastructure security
- Experience working with interpreted languages like Python, Ruby, and shell
- Experience managing distributed database systems like MongoDB, Riak, and CockroachDB
- Experience working with continuous delivery tools and automated testing frameworks
- A natural inclination toward strong security, redundancy, scalability, and performance
- Strong communication skills and high-level problem-solving skills, coupled with a strong sense of ownership and drive