Introduction
The NHS Oversight Framework describes a consistent and transparent approach to assessing NHS trusts and foundation trusts (referred to in this publication as “providers”) and integrated care boards (ICBs).
Each provider and ICB will be scored against a focused set of metrics that align with the priorities set out in the Medium Term Planning Framework and 10 Year Health Plan, and allocated to a segment based on their performance against these metrics, from segment 1 (high performer with narrow range of challenges) to segment 4 (widespread poor performance and likely to require co-ordinated intervention).
To support consistency and transparency in segmentation, we have developed an automated process that follows a set of rules and metrics to derive an organisation’s segment. The allocated segment will guide a range of decisions, including:
- the intensity of our oversight and scrutiny of each provider
- the support they need
- our need to intervene or consider using enforcement powers
- the granting of additional freedoms
This manual gives transparency to our decision-making by describing the segmentation process. All underlying data used to derive a segment is available to NHS staff through the Model Health System, as well as a publicly accessible dashboard. The delivery metrics technical annex gives the detailed specifications, data sources and scoring methodologies for all metrics that will be scored to inform segmentation. Please read this manual alongside that annex.
Overview of the segmentation process
The automated segmentation process follows 4 sequential steps:
- Each metric is scored on a scale of 1 to 4 (some with discrete scores and some continuous) with 1.00 being the highest score and 4.00 the lowest.
- All individual metric scores are consolidated and averaged to provide an average metric score.
- Average metric scores are then translated into a single overall segment of 1, 2, 3 or 4 using defined thresholds.
- An adjustment ensures that any organisation with an underlying financial deficit cannot be allocated to a segment higher than 3.
The following sections describe how each of these steps operates.
Step 1: individual metric scoring
Selecting metrics
We have selected the delivery metrics that underpin segmentation to give a headline view of delivery across a range of domains. They cover core NHS operating objectives, aligned to the Medium Term Planning Framework and 10 Year Health Plan.
The list of metrics is published in annex B of the NHS Oversight Framework and is reviewed annually. Each metric is intended to, wherever possible, meet 6 core criteria:
- aligns to a published objective or duty
- is comparable between organisations (of a similar type)
- is based on publicly available data
- can be interpreted without excessive caveats
- provides a clear indicator of ‘good’ performance
- results in a score that has consistency over time
Frequency and timeliness
Metrics cover different time periods and have varying lag times; for example, staff survey data is published annually while performance metrics are published monthly. We will run the segmentation process quarterly, as soon as possible after all official operating statistics for the previous quarter have been published. In some cases, an organisation’s internal information may be more up to date than the data we use but, to allow for meaningful comparison, we will use verified official statistics wherever possible, even if they lag behind the current internal performance position.
All metrics will be refreshed quarterly to ensure that each quarter we publish segmentation relating to the verified official position for the previous quarter; that is, segmentation based on quarter 1 data will take place before the end of quarter 2. Where official data is updated monthly, the metric will only be updated when the final quarterly data is produced and not within a quarter. However, this will not preclude NHS England from intervening if data indicates declining performance within a quarter. When publishing data for each metric, we will clearly display the time period to which it relates.
Weighting
We have not applied weighting to individual metrics, with one exception. As the rates for different infections cannot be combined into a single measure of healthcare associated infection, the 3 measured rates of infection are individually scored, and the scores are then weighted at one-third value to ensure that their combined score has the same impact as other single metrics.
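As an illustration only, the one-third weighting can be sketched in Python as a weighted average. The metric names and scores below are hypothetical, and the function is an assumption about how the weighting feeds into the average, not the published scoring code.

```python
def weighted_average(scores, weights=None):
    """Weighted mean of metric scores; a metric with no explicit weight counts as 1."""
    weights = weights or {}
    total = sum(score * weights.get(name, 1.0) for name, score in scores.items())
    weight_sum = sum(weights.get(name, 1.0) for name in scores)
    return total / weight_sum


# Hypothetical metric names and scores, for illustration only.
scores = {
    "infection_a": 1.50,
    "infection_b": 2.50,
    "infection_c": 3.50,
    "other_metric_1": 2.00,
    "other_metric_2": 4.00,
}
# Each infection rate carries one-third weight, so together they count as one metric.
weights = {"infection_a": 1 / 3, "infection_b": 1 / 3, "infection_c": 1 / 3}

average = weighted_average(scores, weights)
```

In this sketch the 3 infection scores contribute the same as a single metric scoring 2.50, so the result equals the plain average of 2.50, 2.00 and 4.00.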
In other circumstances, a score may be derived from a combination of measures, for example, for finance we provide a single score based on a combination of planned and actual position. In these circumstances, an individual metric may be weighted more highly than another; any such weighting is specified in the detailed information for the metric available in the technical annex.
We will also apply a financial override to ensure that no organisation with a financial deficit or in receipt of deficit support is allocated to a segment higher than 3 overall (see step 4).
Missing data
Where an organisation submits data for a metric, it will ordinarily receive a score for that metric. Where an organisation does not submit data, it will receive a score of 0 and the metric will not contribute to the organisation’s overall score.
In some specific circumstances, organisations who submit data will be excluded from scoring metrics. For example, some mental health and community trusts have data on 4-hour A&E performance due to providing lower acuity urgent care services, but these providers are excluded from scoring as their service is not comparable with acute sites due to the mix of cases they see. Where certain trusts or trust types are excluded from scoring, this is clearly articulated within the methodology set out in the technical annex.
Sometimes data will stop or begin being reported, such as when commissioning pathways change or due to operational issues such as IT incidents. In these circumstances, automated scoring will continue to apply as set out in this section, that is, where data is not reported the organisation will not be scored for that metric. There will, however, be an internal exception reported that will be escalated for investigation and if changes in reporting lead to meaningful changes in a trust’s score this will be set out publicly in our quarterly summary report.
Disclosure
We have tried to ensure that metrics are derived from publicly accessible data sources, so people can recreate the results from the raw data if they wish. Our scoring code is also publicly available, as are CSV extracts and all segmentation datapoints.
In rare cases, we may base a metric on a data source that is not publicly available, for example, where the metric is in development. In these situations, the data will be released as a supplementary statistic on our website with associated methodology.
While we provide links wherever possible to proprietary methodologies from other organisations, such as the Care Quality Commission (CQC) or national clinical audits, we do not provide the ability to replicate these measures. Any queries about how these organisations have produced their results should be directed to them (for example, to the CQC to understand their methodology behind their survey results).
Standard scoring models
Each metric receives a score between 1 and 4 (with 1.00 being the highest performance rating), in line with 1 of 4 basic models:
- target and spread: a defined target or benchmark is set and every organisation achieving that level scores 1.00. Remaining organisations are evenly scored between 2.00 and 4.00 based on their relative performance
- banded scores: specific criteria are defined and converted to a performance band, for example a score of 1.00 would apply to an organisation achieving between 80% and 100%, a score of 2.00 would apply to an organisation achieving between 60% and 79.9%, etc
- floor and spread: a defined lower acceptable limit is set and every organisation failing to achieve that limit scores 4.00. Remaining organisations are evenly scored between 1.00 and 3.00 based on their relative performance
- even spread: there is no defined performance threshold; organisations are ranked evenly between 1.00 and 4.00 based on their relative performance
The methodology selected for each metric is, to some extent, dictated by the data landscape: narrow distributions lend themselves more to an even spread model, while wide distributions lend themselves more to a target and spread model. Wherever possible, we have used target and spread or banded scores as our preferred scoring models, as they are most amenable to demonstrating improvement.
Each scoring model is illustrated in the examples below.
Example 1: a target and spread metric with a performance standard of 80%
How scores are applied:
- all organisations that meet or exceed the 80% performance standard score 1.00
- the highest‑performing organisation that does not meet the 80% standard scores 2.00
- remaining organisations below the standard are scored evenly between 2.01 and 3.99, based on their relative performance using a percentile rank. Where organisations have the same level of performance, they receive the same score, aligned to the highest‑performing organisation in that group
- the lowest‑performing organisation scores 4.00
Worked example
If a group of organisations with the same level of performance would otherwise receive calculated scores between 2.50 and 2.80, all organisations in that group are scored 2.50.
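The rules in example 1 can be sketched in Python as follows. This is a minimal illustration, not the published scoring code: the organisation labels and values are hypothetical, higher values are assumed to be better, and the rank-based spread is one plausible reading of “percentile rank”.

```python
def target_and_spread(values, target):
    """Score organisations from 1.00 (best) to 4.00 (worst); higher values are better."""
    below = {org: v for org, v in values.items() if v < target}
    n = len(below)
    scores = {}
    for org, value in values.items():
        if value >= target:
            # Meeting or exceeding the standard always scores 1.00.
            scores[org] = 1.00
        elif n == 1:
            # Assumption: a single organisation below the standard scores 2.00.
            scores[org] = 2.00
        else:
            # Count organisations below the standard that performed strictly better.
            # Tied organisations share this count, so they share the best score in their group.
            better = sum(1 for v in below.values() if v > value)
            scores[org] = round(2.00 + 2.00 * better / (n - 1), 2)
    return scores


# Hypothetical performance values (%), with a standard of 80%.
example = {"A": 85, "B": 80, "C": 78, "D": 70, "E": 70, "F": 60, "G": 50}
result = target_and_spread(example, target=80)
```

Here organisations D and E have the same performance and so receive the same score. If several organisations tie for the lowest performance, this sketch gives them the best score in their group rather than 4.00; the technical annex should be consulted for how such cases are treated in practice.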
Example 2: a banded score model using example thresholds
How scores are applied:
- organisations that exceed 90% score 1.00
- organisations with a value between 75.01% and 90% score 2.00
- organisations with a value between 50.01% and 75% score 3.00
- organisations with a value of 50% or less score 4.00
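A banded model of this kind can be sketched in Python as a simple lookup. The thresholds are the example values above; treating a value that falls exactly on a boundary (such as 75% or 50%) as belonging to the lower band is an assumption of this sketch.

```python
# (lower bound that must be exceeded, score), checked from best to worst.
BANDS = [(90.0, 1.00), (75.0, 2.00), (50.0, 3.00)]


def banded_score(value, bands=BANDS, lowest=4.00):
    """Return the score of the first band whose lower bound the value exceeds."""
    for lower_bound, score in bands:
        if value > lower_bound:
            return score
    # Anything not above the last bound falls into the lowest band.
    return lowest
```

For example, `banded_score(82.5)` returns 2.00 because the value exceeds 75% but not 90%.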
Example 3: a floor and spread metric based on a requirement to improve from a baseline level
How scores are applied:
- the organisation with the greatest improvement from baseline scores 1.00
- organisations with improvement between the highest and lowest values are scored evenly between 1.01 and 2.99, based on their relative performance using a percentile rank. Where organisations have the same level of improvement, they receive the same score, aligned to the highest‑scoring organisation in that group
- the organisation with the lowest improvement from baseline scores 3.00
- organisations that do not show improvement from baseline score 4.00
Worked example
If a group of organisations with the same level of improvement would otherwise receive calculated scores between 2.50 and 2.80, all organisations in that group are scored 2.50.
Example 4: an even spread metric
How scores are applied:
- the highest‑performing organisation scores 1.00
- organisations with performance between the highest and lowest values are scored evenly between 1.01 and 3.99, based on their relative performance using a percentile rank. Where organisations have the same level of performance, they receive the same score, aligned to the highest‑scoring organisation in that group
- the lowest‑performing organisation scores 4.00
Worked example
If a group of organisations with the same level of performance would otherwise receive calculated scores between 2.50 and 2.80, all organisations in that group are scored 2.50.
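Examples 3 and 4 share the same spreading logic and differ only in the score range and in how organisations below the floor are treated. The Python sketch below illustrates both under stated assumptions: organisation labels and values are hypothetical, higher values are better, “no improvement” is taken to mean a change of zero or less, and the rank-based spread is one plausible reading of “percentile rank”. It is not the published scoring code.

```python
def spread(values, best=1.00, worst=4.00):
    """Spread scores evenly from `best` to `worst` by rank; higher values are better."""
    n = len(values)
    if n == 1:
        return {org: best for org in values}
    scores = {}
    for org, value in values.items():
        # Ties share the count of strictly better organisations,
        # so they share the best score in their group.
        better = sum(1 for v in values.values() if v > value)
        scores[org] = round(best + (worst - best) * better / (n - 1), 2)
    return scores


def even_spread(values):
    """Example 4: no threshold; scores run from 1.00 to 4.00."""
    return spread(values, 1.00, 4.00)


def floor_and_spread(improvement):
    """Example 3: no improvement scores 4.00; the rest run from 1.00 to 3.00."""
    improved = {org: v for org, v in improvement.items() if v > 0}
    scores = spread(improved, 1.00, 3.00)
    for org, value in improvement.items():
        if value <= 0:
            scores[org] = 4.00
    return scores


# Hypothetical improvement from baseline (percentage points).
improvement = {"A": 6.0, "B": 4.0, "C": 4.0, "D": 2.0, "E": 1.0, "F": 0.0, "G": -3.0}
floor_result = floor_and_spread(improvement)
even_result = even_spread({"A": 6.0, "B": 4.0, "C": 4.0, "D": 2.0, "E": 1.0})
```

In both results organisations B and C are tied and receive the same score, aligned to the best position in their group.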
Bespoke scoring
In some cases, a metric will have pre-defined thresholds, such as the CQC survey scores or neonatal mortality rate. Where such thresholds exist, a banded score model will be applied in which performance brackets are directly translated to a 1 to 4 score.
Publication of scoring methodology
The scoring methodology for each individual metric is fully explained in both the metadata and the technical annex.
Demonstrating improvement
Metric scores are based on a mixture of absolute values and relative performance. They do not consider levels of improvement unless these are specifically built into the metric, for example, percentage increase measures.
We also do not recommend using changes in segment as a barometer of improvement or deterioration, as there is unlikely to be a statistically significant difference between the bottom of one segment and the top of another. To support understanding of meaningful aggregate performance change, we apply a quarterly process that uses funnel plots to consider the number of metrics that apply to each organisation and the organisation’s average metric score. This analysis identifies organisations whose average metric score is meaningfully different from the previous quarter, with 95% and 99.8% confidence that the change is not coincidental. We publish the organisations where such change has taken place as part of our quarterly release of segmentation data, along with an explanation of what has driven the change. This does not, however, mean that other changes are necessarily insignificant.
Step 2: producing an average score
The average metric score for each organisation will be calculated by adding together all individual metric scores and dividing this by the number of metrics for which a score is recorded for the organisation. Metrics with a score of 0 are excluded. The result is then rounded to 2 decimal places to give an average score.
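The calculation can be sketched in Python as follows. This is an illustration rather than the published scoring code: the metric names are hypothetical, and rounding halves upwards is an assumption, as the rounding rule is not specified here.

```python
from decimal import Decimal, ROUND_HALF_UP


def average_metric_score(metric_scores):
    """Mean of all recorded scores, ignoring metrics scored 0, rounded to 2 decimal places."""
    # A score of 0 marks missing data, so it is left out of the average.
    scored = [Decimal(str(s)) for s in metric_scores.values() if s != 0]
    if not scored:
        return None  # no scored metrics, so no average can be produced
    mean = sum(scored) / len(scored)
    return float(mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


# Hypothetical metric scores; "m3" has no submitted data.
example_average = average_metric_score({"m1": 1.00, "m2": 2.50, "m3": 0, "m4": 3.25})
```

In this example the 3 recorded scores sum to 6.75, giving an average metric score of 2.25; the metric scored 0 does not count towards the divisor.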
Step 3: translating an average score to a segment
Determining the unadjusted segment
In 2025/26, we used a quartile-based model to translate average metric scores into segments. This was always intended to be a baselining methodology, which would adapt in future years. We have used actual segmentation data for 2025/26 to determine where appropriate fixed thresholds for segmentation in 2026/27 should be drawn.
Across all organisations in 2025/26, the mean average metric score was 2.34. This value therefore acts as the threshold between segment 2 and segment 3 for 2026/27.
We have then used the standard deviation – how much values typically differ from the average – to set the thresholds for segments 1 and 4. An average metric score of 1.94 would be one standard deviation lower than the mean. Therefore, any organisation below this value would be considered to have a significantly better metric score than average. Conversely, an average score of 2.77 would be one standard deviation higher than the mean and considered a significantly weaker score than average. We have used these values to set the proposed segment thresholds for 2026/27.
As these thresholds have been set using 2025/26 data, it is possible that actual 2026/27 performance may show substantial differences and require these thresholds to be adjusted. Should it be necessary to move thresholds in year we will ensure this is clearly communicated in advance.
Table 1: segment thresholds for 2026/27
| Segment | Average metric score |
| --- | --- |
| 1 | below 1.94 |
| 2 | 1.94 to 2.33 |
| 3 | 2.34 to 2.77 |
| 4 | above 2.77 |
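The thresholds in table 1 can be expressed as a short Python function. This sketch assumes the average metric score has already been rounded to 2 decimal places, as described in step 2; the function name is illustrative.

```python
def unadjusted_segment(average_score):
    """Translate an average metric score (2 decimal places) into a segment from 1 to 4."""
    if average_score < 1.94:
        return 1
    if average_score < 2.34:  # 1.94 to 2.33
        return 2
    if average_score <= 2.77:  # 2.34 to 2.77
        return 3
    return 4  # above 2.77
```

For example, an average metric score of 2.33 gives segment 2, while 2.34 gives segment 3.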
Step 4: adjusting for financial deficit
Determining the presence of a financial deficit
It is critical that organisations maintain strong financial grip and control and operate within their allocations. Segments 1 and 2 will continue to be reserved for organisations that do so.
Any organisation with a segment of 1 or 2 based on its average metric score that is reporting a deficit or is in receipt of deficit support funding will automatically have its segment downrated to 3. This will be based on organisational, not system-wide, financial performance.
The full definition of how deficit is calculated can be found within the technical annex, but in simple terms, 3 sequential tests are applied:
- Is the organisation in receipt of deficit support funding?
- Does the organisation have a planned deficit for the current financial year?
- Is the organisation currently performing worse than its financial plan and is the value of the variance greater than any overall planned surplus?
Where the answer to any of these tests is “yes” the organisation is considered to be in deficit.
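In simplified form, the 3 tests could be sketched in Python as below. The parameter names and sign conventions are assumptions made for illustration; the full definition in the technical annex takes precedence.

```python
def in_financial_deficit(receives_deficit_support, planned_position, variance_to_plan):
    """Apply the 3 sequential tests; any "yes" means the organisation is in deficit.

    Assumed conventions (hypothetical, for illustration):
    planned_position: planned surplus (+) or deficit (-) for the current financial year.
    variance_to_plan: actual position minus plan; negative means worse than plan.
    """
    # Test 1: in receipt of deficit support funding.
    if receives_deficit_support:
        return True
    # Test 2: a planned deficit for the current financial year.
    if planned_position < 0:
        return True
    # Test 3: performing worse than plan by more than any planned surplus.
    return variance_to_plan < 0 and -variance_to_plan > planned_position
```

For example, an organisation with a planned surplus of 2.0 that is 3.0 worse than plan is flagged as in deficit, whereas one that is 1.0 worse than plan is not.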
Applying an adjustment for financial deficit and finalising segmentation
If the unadjusted segment, calculated at step 3, is 1 or 2 (that is, the average metric score is below 2.34), we check whether the organisation is considered to be in financial deficit.
Where the organisational deficit flag value, calculated at the previous step, is set to “yes” and the provisional segment is 1 or 2, the segment is automatically downrated to 3. Where the organisational deficit flag value is set to “yes” and the provisional segment is 3 or 4, or where the organisational deficit flag is set to “no”, no adjustment is made to the segment.
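This adjustment can be sketched in Python as follows; the function and parameter names are illustrative.

```python
def final_segment(provisional_segment, in_deficit):
    """Downrate segments 1 and 2 to 3 where the deficit flag is set; otherwise leave unchanged."""
    if in_deficit and provisional_segment in (1, 2):
        return 3
    return provisional_segment
```

For example, a provisional segment of 1 with the deficit flag set to “yes” gives a final segment of 3, while a provisional segment of 4 is unchanged whatever the flag value.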
For full transparency, the public NHS Oversight Framework dashboard contains both the provisional and final segment value for each organisation.
Additional performance information
Oversight domains
While the segmentation process derives the overall segment for each organisation based on an average metric score, this cannot by itself support improvement or change, as organisations have areas of individual strength and weakness that can be masked by aggregating performance information.
To support segment interpretation and determination of practical next steps, we also provide a series of focused domain scores. These scores are calculated by taking the average score of a subset of metrics, rather than the full range, and dividing the results into 4 equally sized comparative bands of 1, 2, 3 or 4. We provide domain scores covering the following areas:
- population health and inequalities
- access to services (providers only)
- allocating resources (ICBs only)
- effectiveness of care
- experience of care
- patient safety
- finance, productivity and innovation
- people
Domain scores cannot be aggregated to calculate an organisation’s overall score as they are based on varying numbers of metrics and an average calculation.
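As a rough illustration of dividing domain averages into 4 equally sized bands, the Python sketch below ranks organisations by their domain average and assigns a quarter of them to each band. The organisation labels and values are hypothetical, and the handling of ties and of group sizes not divisible by 4 is an assumption of this sketch.

```python
def domain_bands(domain_averages):
    """Assign bands 1 to 4 so that each band holds roughly a quarter of organisations."""
    # Lowest (best) domain average first; ties keep their input order.
    ordered = sorted(domain_averages, key=domain_averages.get)
    n = len(ordered)
    return {org: 1 + (4 * position) // n for position, org in enumerate(ordered)}


# Hypothetical domain averages for 8 organisations.
averages = {"A": 1.20, "B": 1.85, "C": 2.10, "D": 2.30,
            "E": 2.45, "F": 2.90, "G": 3.10, "H": 3.60}
bands = domain_bands(averages)
```

With 8 organisations, each band contains 2 organisations: A and B fall in band 1 and G and H in band 4.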
How to use a domain score
Domain scores are a device for easily identifying potential areas for further investigation or targeted diagnostics. Domain scores could, for example, show that while an organisation is broadly delivering against the NHS operational objectives (a high access-to-services score), it is doing so in a way that may not be financially sustainable (a low finance, productivity and innovation score). This may indicate the need for targeted review of finances to determine what is driving the potential for a deficit and how this can be addressed while maintaining operational delivery.
Contextual metrics
As set out in step 1, scoring metrics are generally required to meet 6 core criteria. However, some metrics do not meet these criteria but do provide helpful context and information. For example, some patient safety metrics are not appropriate to score, as results could simply reflect a more open reporting culture rather than more unsafe practice. These metrics are made available alongside scoring metrics but are defined as contextual and so are not scored and do not contribute to segmentation. These metrics should be routinely considered and discussed as part of wider system planning and accountability conversations.
The NHS Oversight Framework metrics list (annex B) lists both the scoring and contextual metrics.
League tables
Flowing directly from our segmentation process, we publish provider performance league tables for NHS acute, non-acute hospital and ambulance trusts. These tables rank each organisation relative to its peers for individual metrics and in overall performance to show variation between providers and potential areas for improvement.
League table data is publicly available, as is a dedicated methodology document that describes how leagues are calculated, including how we have calculated confidence intervals and what league tables can and cannot tell you.
From 2026/27 we will also introduce ICB league tables using the same approach as we have for providers.
Ensuring continuous improvement
NHS England is committed to ensuring that our methodology remains balanced and effective.
We will keep the methodology under constant review to identify areas where it can be improved. We will also review the list of underlying metrics annually to ensure it continues to align with NHS priorities.
If you have any queries regarding the methodology or ideas for how it could be improved, please contact us at nhs.oversightandassessment@nhs.net
Publication of data
To support transparency, NHS England publishes a dashboard which underpins segmentation.
Each provider’s individual metric data, domain scores, average metric score, provisional and final segmentation are fully available and comparable between organisations of the same broad type. ICBs were not segmented in 2025/26, but from quarter 1 of 2026/27 a dedicated dashboard will show their results.
Publication reference: PRN02437