
The processing cycle and HES data quality

HES data comes from the routine exchanges of information between providers and commissioners of healthcare for NHS patients in England. Healthcare providers collect administrative and clinical information locally to support the care of the patient. The data is submitted to the Secondary Uses Service (SUS), which, as well as making it available to the commissioners, also copies the information to a database.

At pre-arranged dates during the year, SUS takes an extract from its database and sends it to HES. We then validate and clean the extract, before deriving new items and making the information available in the data warehouse. Data quality reports and checks are completed at various stages in the cleaning and processing cycle.

The HES processing cycle and HES data quality [PDF, 345kb]

Data quality checks performed on SUS and HES data [PDF, 430kb]


Automatic data cleaning and derivation rules

Read how we clean the data to improve the value and quality of HES data. These rules are used to:

  • clean common and obvious data quality errors
  • derive additional data items to populate the HES data set

Inpatient cleaning rules [PDF, 243kb]

Outpatient cleaning rules [PDF, 99kb]

A & E cleaning rules [PDF, 86kb]


Duplicate methodology

Information on how we identify and handle duplicate records within the HES dataset.

HES Duplicate Identification and Removal Methodology [PDF, 142kb]


HES patient ID

The HES Patient ID (HES ID) provides a way of tracking patients through the HES database without identifying them. It is central to many HES outputs including spell construction, emergency readmissions and linkage to other data sets, such as mortality.

Read about the HES ID and its methodology [PDF, 387kb]


Examples of how we use automatic data cleaning and derivation rules:

To clean common and obvious data quality errors

Rule #0150 looks for evidence that a Birth Episode (CDS type 120) has been incorrectly submitted to SUS as a General Episode (CDS type 130). If evidence is found, the Episode Type of the record is changed to mark it as a birth record.

Without this clean, the number of birth records (Episode Type 3) in HES would tend to be lower than the actual number of births taking place.
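
As an illustration only, a clean of this kind could be sketched in Python as below. The published cleaning rules define what counts as evidence; the test used here (date of birth equal to the episode start date) is a hypothetical stand-in, and the `Episode` fields, the function names and the code 1 for a general episode are assumptions made for the example.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Callable, Optional

GENERAL_EPISODE = 1  # assumed code for a general episode
BIRTH_EPISODE = 3    # Episode Type 3 = birth record, as stated in the text


@dataclass(frozen=True)
class Episode:
    """Hypothetical, heavily simplified episode record."""
    cds_type: str                   # "120" = birth, "130" = general
    episode_type: int
    date_of_birth: Optional[date]
    episode_start: Optional[date]


def looks_like_birth(ep: Episode) -> bool:
    # Hypothetical evidence test, NOT the real Rule #0150 criteria:
    # the patient was born on the day the episode started.
    return ep.date_of_birth is not None and ep.date_of_birth == ep.episode_start


def apply_birth_clean(
    ep: Episode, evidence: Callable[[Episode], bool] = looks_like_birth
) -> Episode:
    """Re-type a general episode as a birth episode when evidence is found."""
    if ep.cds_type == "130" and evidence(ep):
        return replace(ep, episode_type=BIRTH_EPISODE)
    return ep  # records without evidence are left unchanged


newborn = Episode("130", GENERAL_EPISODE, date(2010, 5, 1), date(2010, 5, 1))
adult = Episode("130", GENERAL_EPISODE, date(1980, 1, 1), date(2010, 5, 1))
cleaned = [apply_birth_clean(ep) for ep in (newborn, adult)]
```

Passing the evidence test in as a function keeps the re-typing step separate from the criteria, so the placeholder test can be swapped for the documented one.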

To derive additional data items to populate the HES data set

Rule #1200 uses the postcode from each submitted CDS record to derive additional geographical data items relating to the episode of care. HES uses reference data from the ONS Postcode Directory to derive data items such as Parliamentary Constituency or Strategic Health Authority of the patient's residence. This allows record-level data to be aggregated easily for spatial analysis.
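
A minimal sketch of a postcode-based derivation of this kind is shown below. It is not the actual Rule #1200 logic: the lookup table stands in for the ONS Postcode Directory and holds made-up placeholder values, and the field names and helper functions are assumptions made for the example.

```python
from collections import Counter

# Placeholder stand-in for the ONS Postcode Directory (values are invented).
POSTCODE_DIRECTORY = {
    "XX11AA": {"parliamentary_constituency": "Constituency A",
               "strategic_health_authority": "SHA North"},
    "XX22BB": {"parliamentary_constituency": "Constituency B",
               "strategic_health_authority": "SHA South"},
}

EMPTY_GEOGRAPHY = {"parliamentary_constituency": None,
                   "strategic_health_authority": None}


def normalise_postcode(raw: str) -> str:
    """Strip all whitespace and upper-case so lookups are format-insensitive."""
    return "".join(raw.split()).upper()


def derive_geography(record: dict) -> dict:
    """Return a copy of the record with derived geographical items added."""
    key = normalise_postcode(record.get("postcode") or "")
    derived = dict(record)
    # Unknown or missing postcodes get empty derived items, not an error.
    derived.update(POSTCODE_DIRECTORY.get(key, EMPTY_GEOGRAPHY))
    return derived


records = [
    {"episode_id": 1, "postcode": "xx1 1aa"},
    {"episode_id": 2, "postcode": "XX2 2BB"},
    {"episode_id": 3, "postcode": "XX1 1AA"},
    {"episode_id": 4, "postcode": None},
]
enriched = [derive_geography(r) for r in records]

# Record-level data can then be aggregated by a derived geography.
episodes_by_sha = Counter(r["strategic_health_authority"] for r in enriched)
```

Once each record carries its derived geography, counting or grouping episodes by area needs no further reference to the postcode itself.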
