Azure DevOps (regular)
Kubernetes (advanced)
Python (advanced)
PostgreSQL (advanced)
Networking (master)
Monitoring Tools (master)
Monitoring systems (master)
About us
Pass the Keys is a leading tech-enabled short let management platform. Over the past 6 years we have hosted 65k+ stays and have some of the highest customer ratings in our industry.
We are Professional Co-Hosts for Airbnb operating in over 60 locations across the UK and expanding rapidly through franchising.
Our vision is to become the biggest short let management company in the world by creating unparalleled opportunities for entrepreneurs, hosts and guests.
Currently focusing on the UK market, we plan to expand internationally later in 2022.
We are a cloud-native organisation, with no on-premise infrastructure and a preference for SaaS or fully managed infrastructure offerings.
Who we are looking for
We are seeking a systematic and adaptable Operability Engineer to work alongside our Product Development and Operational Delivery teams.
As our first dedicated Operability Engineer, you will work both proactively (improving DevOps processes and cloud infrastructure) and reactively (monitoring and responding to anomalies).
Additionally, you will assist with investigating customer issues and handling privacy requests.
Objectives of the role
Proactively ensure availability, latency and efficiency of bespoke systems for our users
Improve observability and promote best practices among software engineers (logging, metrics, use of tools etc.)
Respond to incidents in a timely and controlled manner, keeping stakeholders updated on status
Enable smooth and efficient throughput for the development team by improving DevOps practices (CI/CD pipelines etc.)
Promote use of automation to reduce human effort and error
Shield product engineers from operational concerns to minimise ad hoc interruptions, whilst enabling them to share responsibility for production environments
Responsibilities
Identify and implement security improvements
Implement monitoring, metric dashboards and alarms, and review these to pre-empt problems before they arise
Define Service Level Objectives and Service Level Indicators for key services
Document Standard Operating Procedures, Runbooks and Playbooks for common system administration tasks
Plan, enable and support the transition of our environments to cost-efficient and scalable Kubernetes clusters
Optimise build and release pipelines
Support Customer Success teams with investigation and troubleshooting
Handle data-privacy requests (deletion and extraction)
Requirements
Skills and experience
In-depth understanding of UNIX concepts and Debian-based Linux distributions
Sound knowledge of networking fundamentals, and experience with configuring virtualised networks
Track record of supporting and administering mission-critical production systems
Experience with incident response including detection, communication, troubleshooting and postmortem investigation
Configuration and / or implementation of structured logging and log-insights (indexing / querying) tools
Familiarity with Application Performance Monitoring tools and using them to identify bottlenecks in system performance
Proficiency with Python both for scripting and application development
Experience with infrastructure-as-code tooling
Knowledge of configuring, administering, and monitoring Kubernetes clusters
Strong troubleshooting abilities, with the confidence to trace through logs, configuration and source code to identify root causes
Our tools and technologies
Whilst we recognise that conceptual understanding is transferable, hands-on experience with the following tools is a big plus:
AWS including IAM, VPC, Route 53, multi-account setups, EKS, Fargate, RDS, ElastiCache (Redis), OpenSearch (Elasticsearch), DynamoDB
Pulumi (using Python), especially advanced integrations with AWS and Kubernetes
Flux (v2)
Azure DevOps Services (Repos, Pipelines)
PostgreSQL
Django, Celery, Boto3
New Relic, Sentry, InsightOps
Benefits
Fully remote with optional co-working (we have access to offices worldwide)
30 days holiday + your birthday off
Quarterly social meetup
NB: whilst the role is fully remote, we will only be able to accept candidates located in time zones within ±3 hours of London.
UK-based candidates will be required to be employed as full-time salaried individuals. Non-UK-based candidates will need to work as contractors.