Site Reliability Developer at Medic Mobile

Medic Mobile designs, delivers, and supports world-class software for health workers who provide care in the hardest-to-reach communities. Our software is free, open-source, and deployed at scale in the last mile of healthcare. Evidence-based workflows come together in the software to support health workers and families, helping to ensure safe deliveries, track outbreaks, treat illnesses at the doorstep, communicate about emergencies, and more. We envision a more just world in which health workers are supported as they provide care for their neighbors, universal health coverage is a reality, and health is secured as a human right.
By the end of 2017, Medic Mobile was supporting a network of more than 20,000 frontline health workers. In our next phase of growth, we are scaling through large partnerships and reaching out to health organizations with minimal resources. We aim to support health workers serving 100 million people between 2017 and 2021.
Our global team supports more than 60 partners across 23 countries, with offices in San Francisco, Nairobi, and Kathmandu. Medic Mobile brings a wide range of experience and creative solutions to complex problems, building and delivering technology that works for people who have been marginalized.
We're seeking a talented and dedicated Site Reliability Developer to join our distributed product team.

Are comfortable in a UNIX-like environment, enjoy automation, script efficiently, and produce checklists and documentation for processes and systems.
Have coursework or experience equivalent to an undergraduate computer science degree.
Have knowledge of at least some of the following APIs: AWS Identity and Access Management (IAM) policies, Elastic Compute Cloud (EC2), Virtual Private Cloud (VPC).
Can use Docker Machine, Amazon ECS, or a higher-level orchestration tool to deploy a container-based application in test or production.
Are comfortable with basic Linux system administration, monitoring, security best practices, networking, and logging.
Are familiar with at least some of the core web technologies: HTTP, SSL/TLS, REST, JSON, HTML.
Thrive working as part of a distributed team with a flexible schedule.
Enjoy working remotely with opportunities to travel to project sites (e.g. India, Uganda, Senegal) or to work with teammates (e.g. San Francisco, Nepal, Kenya).
Want to help build software that improves lives in a real and significant way.

Position Details
The Site Reliability Developer works closely with Medic Mobile's software development and product teams to ensure high-quality deployments of mobile health software and hardware.

Cloud infrastructure development: 30%

Work with the development and product teams to help guide the design of recommended compute, network, container, and storage resource layouts.
Document resource layouts and network designs; produce checklists and automated processes to deploy new instances and containerized applications.

Proactive monitoring: 30%

Proactively monitor performance and reliability of production Medic Mobile systems.
Produce status pages consumable by non-technical users.
Be available (subject to time zones of team members) to respond to, troubleshoot, remediate, and document expected or unexpected outages, incidents, or problems in production.

System image engineering and deployment support: 40%

Work with the software development team to improve and optimize production system images (AMIs and Docker containers).
Manage upgrades and upgrade processes on production instances.
Automate deployments to increase testability and reliability.