Site Reliability Engineer

Oak National Academy

Employment Type Full time
Location Remote · UK
Salary £65,460 (GBP)
Team Engineering
Seniority Mid-level
Closing 11:59pm, 7th Apr 2024 BST

Perks and benefits

Flexible working hours
Work from home option
Retirement benefits
Life Insurance
Employee Assistance Programme
Additional parental leave
Enhanced maternity and paternity leave
Paid emergency leave
Extra holiday
Professional development
Mentoring/coaching
Payroll giving
Team social events
Equipment allowance

Job Description

Please note that whilst we capture CVs as part of the application process, initial sifting of applications occurs 'blind' and is based purely on the responses to the admin questions and sift questions, so please include sufficient detail as we won't have visibility of your CV at this stage.

Description

We are looking for a Site Reliability Engineer to join our Product and Engineering team. Oak’s websites provide free educational resources to teachers and pupils, and we want to ensure we are providing them with the high level of service they deserve.

You will have a passion for SRE principles and will help define organisational strategy. You will lead on bringing our monitoring up to a high, consistent standard across all our key applications. At the same time, you will work closely with our product squads to help them improve the stability and security of their applications, providing mentoring and promoting a sense of ownership of operational concerns in the squads.

You will also contribute to CI/CD best practice and automation, and work alongside colleagues to help improve the usability, security and performance of our platform, infrastructure and tooling.

Candidates must have a good understanding of SRE principles and the value they bring to an organisation. While a good grounding in development practices, security fundamentals and infrastructure operation is key, specific technical skills are less important than a passion for automation, an ability to understand complex systems and a keenness to learn.

Site Reliability Engineer

Responsible to: Principal Engineer

Team: Product and Engineering

Term: Permanent

Location: Remote (with some occasional in-person UK meetings)

Pay Point: D, Market Supplement 4. Salary: £65,460

Hours: 36 hours per week if full-time (flexible arrangements will be considered). Our core working days are Tuesday, Wednesday and Thursday, to allow effective collaboration time with colleagues.

Line management responsibility: none

Budget responsibility: none

Key external relationships: none

Responsibilities

● Lead the continuous improvement of the performance, reliability and security of our applications and infrastructure, promoting and nurturing a culture of quality across the product and engineering department.

● Take ownership of our monitoring and logging solutions, to ensure they are easy to use and provide development teams with the information they need to understand service quality, resolve problems quickly and get meaningful insights into application behaviour.

● Design and implement automated systems to speed up development processes, secure systems and enable the team to both maintain and improve quality standards.

● As a member of the Oak Team, contribute to the wider success of the organisation and support and role model our culture of inclusion, freedom, responsibility, and continuous improvement.

● Work in cross-functional and product-oriented squads with colleagues from across the organisation, as required.

● Deputise for the Principal Engineer and take on other general responsibilities as required.

Knowledge, skills, and experience

● Considerable professional experience leading the continuous improvement of web service stability in a Site Reliability Engineer role (or similar).

● Proven success managing a suite of monitoring tools.

● Competent in scripting or coding tasks, ideally TypeScript/JavaScript.

● Experience working with Cloud computing platforms and a familiarity with Infrastructure as Code tools.

● Comfortable liaising with a range of stakeholders and embracing a spirit of collaboration.

The successful candidate will have a desire to contribute in all areas to ensure Oak is successful. You will be comfortable working at pace, with a range of digital systems (including proprietary ones as required), and you will continuously look for ways that the team can keep getting better. You will be excellent at working as part of a remote team, building relationships and managing your time effectively.

Application Process

You’ll answer some questions that are related to your day-to-day job. After the job closes, your answers will go through our sift process: all answers will be anonymised, randomised and then reviewed by a panel of reviewers.

If you are shortlisted, we’ll invite you to the next steps, all carried out over Zoom. These involve a one-hour session of questions with engineering colleagues, immediately followed by a simple one-hour coding test in JavaScript (a basic framework of code will already be in place), then a second one-hour interview stage with colleagues from the wider organisation. At the end of the application process, we will provide you with feedback.

We are aiming to start interviews in the second week of April 2024.

We are receiving a strong response to our job adverts. This may lead us to close the role early, so if you are considering applying, please get your application in early to avoid missing out.

Removing bias from the hiring process

Applications closed Sun 7th Apr 2024

  • Your application will be anonymously reviewed by our hiring team to ensure fairness
  • You’ll need a CV/résumé, but it’ll only be considered if you score well on the anonymous review
