Variability in thyroid function test requests across general practices in south-west England

Bijay Vaidya1*, Obioha C Ukoumunne2, Joanna Shuttleworth2, Alan Bromley3, Aled Lewis4, Chris Hyde5, Anthea Patterson6, Simon Fleming and Julie Tomlinson7

1Department of Endocrinology, Royal Devon & Exeter Hospital, Exeter, UK

2Peninsula Collaboration for Leadership in Applied Health Research and Care, University of Exeter Medical School, University of Exeter, Exeter, UK

3Department of Clinical Chemistry, Royal Cornwall Hospital, Truro, UK

4Department of Biochemistry, Royal Devon & Exeter Hospital, Exeter, UK

5Peninsula Technology Assessment Group, University of Exeter Medical School, University of Exeter, Exeter, UK

6Department of Clinical Chemistry, Royal Cornwall Hospital, Truro, UK

7Department of Public Health, Cornwall & Isles of Scilly Primary Care Trust & University of Plymouth, Truro, UK

Corresponding Author:
Dr Bijay Vaidya
Department of Endocrinology
Royal Devon & Exeter Hospital Barrack Road Exeter EX2 5DW UK
Email: [email protected]

Received: 29 January 2013; Accepted date: 13 March 2013

Visit for more related articles at Quality in Primary Care


Background: The number of thyroid function tests (TFTs) performed in the UK and other countries has increased considerably in recent years. Inconsistent clinical practice associated with inappropriate requests for tests is thought to be an important cause for this increase.

Aim: To study the extent of variability in requests for TFTs from general practices.

Methods: We analysed routine data on all TFTs on patients aged 16 years and over carried out by two hospitals in south-west England (Royal Cornwall Hospital and Royal Devon & Exeter Hospital) during 2010 at the request of 107 general practices.

Results: A total of 195 309 TFT requests were made for 148 412 patients (63% female). The total requests included 192 108 tests for thyroid-stimulating hormone (TSH), 43 069 for free thyroxine (FT4) and 1972 for free tri-iodothyronine (FT3). The number of TSH tests per 1000 list size varied widely across the practices, ranging from 84 to 482. Most of the variation was due to heterogeneity across practices and only 24% of this was accounted for by prevalence of hypothyroidism and socio-economic deprivation.

Conclusions: There is wide variation in TFT requests from general practice and scope to reduce both unnecessary TFTs and the variability in the clinical practice. Further studies are required to understand the causes for the variability in testing thyroid function.


audit, clinical practice variation, general practice, laboratory tests, primary care, thyroid


Thyroid disorders are common in the community, with hypothyroidism alone affecting about 2% of women.[1] Because symptoms of thyroid disorders are often non-specific and highly prevalent in the general population, biochemical tests are necessary for their diagnosis and monitoring. Thyroid function tests (TFTs) include serum thyroid-stimulating hormone (TSH), free thyroxine (FT4) and free tri-iodothyronine (FT3). In primary care, the first point of investigation for thyroid disorders, TSH is the first-line test for most patients, followed by an FT4 test if TSH is outside the reference range. FT3 testing is generally reserved for endocrinologists or done at the discretion of the laboratory.

The number of TFTs performed in the UK and other countries has increased considerably in recent years,[2] and it has been suggested that inappropriate requests for tests are an important cause.[3,4] National guidelines to help clinicians use TFTs appropriately have been published.[2] Although it is difficult to measure the degree of ‘appropriateness’ of requests for TFTs at the population level, the presence of variation in clinical practice is an indicator.[4] Our study aims to establish the extent of variability in TFT requesting across general practices referring to two hospitals in south-west England.


The analyses are based on routine data on all TFTs carried out by the biochemistry laboratories at the Royal Cornwall Hospitals NHS Trust (RCHT) and the Royal Devon & Exeter Hospital NHS Foundation Trust (RDE) and requested by general practices during 2010. The catchment areas of these hospital laboratories are the predominantly rural counties of Cornwall and Devon in the south-west of England, with a combined population of 1.7 million persons. Together, the two hospitals serve a catchment population of approximately 800 000. The datasets included patients’ gender and age; type (TSH, FT4 or FT3) and result of TFTs; and the name of the requesting general practice. Both laboratories use chemiluminescent immunoassay (Roche Modular Analytics E170 analyser) for analysing TFTs.

We also obtained data on the following practice characteristics from the Network of Public Health Observatories[5] for the year 2010: list size, percentage of patients with hypothyroidism recorded on practice disease registers, Index of Multiple Deprivation (IMD) score as a measure of socio-economic deprivation, and percentage of patients aged over 65 years.

Eligible records were TFT requests made by NHS general practices for patients aged 16 and over. We excluded records relating to practices outside the usual catchment areas of the hospital laboratories. Data for two separate practices in the RDE dataset were excluded because their records were combined under a common name and there was insufficient information to disaggregate them. In addition, records were excluded for a practice for which characteristics were not available. We analysed records from 107 practices (RCHT, 57; RDE, 50).

Statistical analysis

We summarised the distribution of tests per 1000 list size at general practice level within each hospital site and overall. Both means (with standard deviations) and medians (with interquartile ranges) are reported as the former reflects the volume of tests and the latter is appropriate for quantifying the average of skewed distributions. Random effects Poisson regression models were fitted to TSH test request rate (outcome) in order to quantify the extent to which prevalence of hypothyroidism, deprivation score and percentage of patients aged over 65 years (predictors) account for variability in practice test request behaviour. This model explicitly recognises the variation across clusters beyond chance. Analyses were carried out using Stata Statistical Software (Release 12.1, 2011; Stata Corporation, College Station, TX, USA).
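The practice-level summaries described above can be sketched in a few lines. This is a minimal illustration in Python rather than the Stata code used for the study; the function names and the example test counts and list sizes are hypothetical.

```python
from statistics import mean, median, quantiles, stdev


def rate_per_1000(tests: int, list_size: int) -> float:
    """Tests requested per 1000 registered patients for one practice."""
    return 1000 * tests / list_size


def summarise(rates: list[float]) -> dict[str, float]:
    """Mean (SD) and median (IQR bounds) of practice-level rates."""
    # Quartile cut points; statistics.quantiles defaults to the 'exclusive' method.
    q1, _, q3 = quantiles(rates, n=4)
    return {
        "mean": mean(rates),
        "sd": stdev(rates),  # sample standard deviation
        "median": median(rates),
        "q1": q1,
        "q3": q3,
    }


# Hypothetical (tests, list size) pairs for five practices
practices = [(900, 6000), (2100, 7000), (1600, 8000), (4000, 10000), (1250, 5000)]
rates = [rate_per_1000(t, n) for t, n in practices]
summary = summarise(rates)
```

Reporting both pairs of statistics matters when a few high-requesting practices pull the mean above the median, as would be expected for a right-skewed distribution.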


A total of 195 309 test requests for 148 412 patients [mean (SD) age 60 years (19); age range 16–105 years; 63.1% female] were made to the two hospital laboratories (Table 1). Of these, 46 897 (24%) were repeat tests on patients who had already been tested earlier in the year. The total requests included 192 108 tests for TSH, 43 069 for FT4 and 1972 for FT3; 15.5% of the TSH results were outside the laboratory reference range (0.35–4.5 mIU/L).


Table 1: Demographic characteristics of patients for whom thyroid function tests were requested

The number of TSH tests per 1000 list size varied widely across practices, ranging from 84 to 482 (Table 2 and Figure 1). Using the formula from Hayes and Bennett,[6] the standard deviation of the test rate across practices that would be expected given the overall test rate and the list sizes is 7. The observed standard deviation of 83 was considerably greater than this, indicating that most of the variation across practices (over 99%) is due to between-practice heterogeneity as opposed to mere sampling variability. Marked variation was also seen in the rates for FT4 and FT3 (Table 2). The inclusion of the prevalence of hypothyroidism (P < 0.001) and deprivation score (P = 0.02) as predictors in the random effects Poisson regression model together accounted for only 23.8% of the between-cluster variance component, indicating that they explain relatively little of the marked differences in TSH test rate. The percentage of patients aged over 65 years did not explain any extra variability. When the model was further adjusted for the proportion of TSH tests that were outside the laboratory reference range, it accounted for 53.8% (an extra 30%) of the between-cluster variance component. The higher the proportion of tests in the normal range, a crude proxy for unnecessary testing, the greater the TSH test rate (P < 0.001).
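The comparison between observed and expected variation can be illustrated with a short calculation. The sketch below assumes Poisson-distributed test counts, under which the sampling variance of a practice's rate per 1000 is 1000² × λ / n (λ being the overall per-patient rate and n the list size). This is one common way to derive the expected spread and may differ in detail from the formula the authors applied. The function names are hypothetical; the observed standard deviation of 83 is the value reported above, and the expected standard deviation is taken as 7.

```python
from math import sqrt


def expected_sd_per_1000(overall_rate: float, list_sizes: list[int]) -> float:
    """SD of practice rates per 1000 expected from Poisson sampling alone.

    overall_rate is tests per patient (not per 1000). Each practice's rate
    per 1000 has variance 1000**2 * overall_rate / n; these are averaged
    across practices (an assumption of this sketch).
    """
    mean_var = sum(1000**2 * overall_rate / n for n in list_sizes) / len(list_sizes)
    return sqrt(mean_var)


def heterogeneity_share(observed_sd: float, expected_sd: float) -> float:
    """Fraction of the observed variance not attributable to sampling variability."""
    return 1 - (expected_sd / observed_sd) ** 2


# Observed SD 83, expected SD taken as 7
share = heterogeneity_share(83, 7)  # about 0.993, i.e. over 99%
```

Because variances rather than standard deviations are compared, an expected standard deviation of 7 against an observed 83 leaves less than 1% of the variance to chance.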


Table 2: Distribution of number of tests per 1000 list size across practices in 2010


Figure 1: Number of TSH tests requested per 1000 list size in 2010 (all practices; n = 107)


This study shows wide variation in the TFT request rate across general practices in south-west England. Three quarters of this variability was not explained by practice-level prevalence of hypothyroidism, socio-economic deprivation and percentage of patients aged over 65 years. O’Kane and colleagues also found wide variation in the rates of requests for different biochemistry tests, including TFTs, from general practices in Northern Ireland.[7]

The recent increases in many tests ordered, unmatched by obvious change in level of disease, alongside wide variation in practice strongly suggest that some test ordering is inappropriate. Our result, that high levels of test ordering are related to high proportions of results in the normal range for a practice, provides support for this view. There is thus potential for both health gain and savings to be made in NHS expenditure. The latter will be important as the UK Department of Health has set a target of £20 billion efficiency savings to be made through the Quality, Innovation, Productivity and Prevention (QIPP) programme.[8] Our data show that if the TSH test rates in the practices were all reduced to 229 per 1000 list size (the current median) there would be a 7% reduction in tests ordered. In Cornwall and Exeter this would equate to approximately 13 000 TSH tests per annum, costing £91 000 if priced at £7 per test. Furthermore, inappropriate tests may also harm individual patients by increasing the risk of false-positive results leading to increased anxiety, a cascade of further investigations and unnecessary treatments.
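The savings estimate is simple arithmetic and can be checked directly. The sketch below uses only figures quoted in the text (192 108 TSH tests, a 7% reduction, £7 per test); the variable names are illustrative, and the rounding to the nearest thousand mirrors the "approximately 13 000" in the text.

```python
TOTAL_TSH_TESTS = 192_108   # TSH tests in 2010, as reported in the results
REDUCTION = 0.07            # 7% reduction quoted in the text
PRICE_PER_TEST_GBP = 7      # illustrative price per test used in the text

tests_avoided = TOTAL_TSH_TESTS * REDUCTION        # unrounded estimate
tests_avoided_rounded = round(tests_avoided, -3)   # nearest thousand
saving_gbp = tests_avoided_rounded * PRICE_PER_TEST_GBP
```

The unrounded figure is a little under 13 500 tests, so the £91 000 quoted is a slightly conservative estimate under these assumptions.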

A variety of educational and administrative strategies have been trialled to reduce unnecessary TFTs,[4,9–11] but the results are sometimes disappointing. This may be because the interventions were chosen because they were standard approaches thought to be worthy of evaluation, rather than interventions appropriate to the context and the particular behaviours needing to be tackled to achieve reduced testing. Current behaviour-change science indicates that the barriers to change must be clearly identified.[12]

A limitation of our study is the lack of data on why the tests were requested, as many of the requests had incomplete or unclear indications for the tests. This precluded us from further exploring the potential causes of the variability in TFT requests seen in the study. Although we have gained some insights into the reasons for variability in test ordering in primary care,[9,13,14] this knowledge is incomplete. We believe that understanding the causes of variation between high- and low-ordering practices using mixed methods research is the next step. As well as informing our ultimate behaviour change strategy, this will also lead to better definition of the most appropriate level of test ordering: any assumption that this is the median rate needs to be explored.


This report presents independent research funded by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care (CLAHRC) for the South West Peninsula. The views expressed in this publication are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health in England.

Ethical Approval

As this was an audit of routine clinical practice, formal ethical approval was not necessary.


The research question originated from the Peninsula CLAHRC question generation workshop in Cornwall led by JT and SF. SF, CH, AP, BV, JS and OU were involved in the development and design of the study. AB, AL and JS acquired the data, and OU did the statistical analysis. BV, JT, OU and JS prepared the draft of the manuscript. All authors were involved in the critical revision of the draft and approved the final version of the manuscript. BV is guarantor for the study.

Peer Review

Not commissioned; externally peer reviewed.

Conflicts of Interest


