Early Career Health Data Scientist - BHF DSC

Health Data Research UK

Employment Type: Full time, Permanent
Location: Remote (UK). This role is remote, but you must reside in the UK.
Salary: £43,000 - £47,000 (GBP)
Seniority: Junior
Closing: 5:30pm, 24th Apr 2026 BST

Perks and benefits

Flexible working hours
Work from home option
Wellness programs
Employee Assistance Programme
Enhanced maternity and paternity leave
Extra holiday
Professional development
Mentoring/coaching
Team social events

Job Description

Purpose of the post

The Early Career Health Data Scientist will join our Health Data Science team and contribute to the development of scalable, reusable resources that support researchers during the data curation phase of their research projects, helping them produce high-quality, analysis-ready data. These resources may include:

  • Data dictionaries, dataset summaries and shared exploratory analyses and insights that inform researchers about datasets and how they can be used on research projects.

  • Coding tutorials, guidance notes, and worked examples to help researchers develop the technical skills needed to curate data within Secure Data Environments.

  • Reusable code, functions, and data curation pipelines that researchers can adapt for their own projects, reducing duplication and accelerating the data curation phase of their project. An example data curation pipeline for research projects undertaken within the NHS England SDE is available on the Centre’s GitHub.

  • Curated data methods: methods that produce cleaned and enhanced views of datasets, designed to integrate with our data curation pipelines so that equivalent logic does not have to be reimplemented repeatedly.

The post-holder will also provide direct, hands-on support to researchers, either by offering guidance and signposting to existing data curation resources relevant to their project, or by developing targeted, bespoke data curation pipelines that generate analysis-ready data. The post-holder will also be required to analyse data for quality control purposes and to help build a better understanding of the utility of the data and how it can be used appropriately for research.

This post is an attractive career development opportunity that would suit a health data scientist, data analyst, or data engineer with previous experience of wrangling and curating health data for research projects, who wishes to expand and deepen their expertise in large-scale linked health data and collaborative research environments.

Main responsibilities

  • Providing data engineering and data curation support in secure data environments (SDEs) and trusted research environments (TREs) to produce robust, analysis-ready datasets.

  • Contributing to the development, testing, and maintenance of data curation pipelines and shared resources under the supervision of senior colleagues.

  • Developing and applying expertise in assessing the quality, completeness, and utility of the various routinely collected health datasets across the four UK nations, including contributing to early feasibility and exploratory assessments to inform study design.

  • Summarising and disseminating findings and lessons from data quality and data utility assessments to inform research design and appropriate use of routinely collected data.

  • Under the supervision of senior colleagues, writing, organising, and maintaining support documentation for linked data resources (e.g. data dictionaries, variable mapping tables, data access process documentation, and Git repositories).

  • Carrying out technical validation checks on linked data sources (e.g. duplicates, linkage errors, temporal inconsistencies) and developing reusable functions to check these data rigorously for errors and inconsistencies.

  • Working with relevant researchers to identify and apply appropriate existing and novel phenotype definitions and algorithms from linked national health data.

  • Preparing clear numerical summaries and visualisations to communicate findings (e.g. data characteristics, quality, and decision making) to researchers when curating data.

  • Preparing and presenting results in oral and written reports, technical notes, and academic publications.

  • Attending and actively participating in regular Centre and project meetings, reporting on progress and presenting analytical results.

  • Demonstrating a strong commitment to open-source, transparent, and reproducible research, as the post will involve releasing tools, code, and documentation under an open-source licence.

To view full JD, please click here.

Removing bias from the hiring process

  • Your application will be anonymously reviewed by our hiring team to ensure fairness
  • You’ll need a CV/résumé, but it’ll only be considered if you score well on the anonymous review

Start your de-biased application