Statistics explained

Statistics

If the very word 'statistics' fills you with horror, read on… these pages are written with you in mind! This section is included to help you appraise the statistics in the rest of the handbook. I do not offer it in any way as a comprehensive guide to statistics – just a basic guide from a non-statistician to help you along the way!

The following terms are covered and explained in alphabetical order below:

  • Absolute risk reduction (ARR)
  • Composite endpoints
  • Confidence intervals (CI)
  • Hazard ratios (HR)
  • Interquartile range (see 'mean')
  • Likelihood ratios
  • Mean and median
  • Non-inferiority trials
  • Number needed to treat (or harm) (NNT/NNH)
  • Odds ratios (OR)
  • Positive & negative predictive value (PPV & NPV)
  • Pre-test probability
  • Rate ratio
  • Relative risk (RR)
  • Relative risk reduction (RRR)
  • Sensitivity and specificity
  • Systematic reviews & meta-analysis

Absolute risk reduction (ARR)

 

Absolute risk reduction is the difference in the rate of events between the two groups.
ARR = risk of event in control group – risk of event in treatment (Rx) group
If ARR = 0 there is no difference between the two groups (no treatment effect).

Example: Death rate in control group: 15% or 0.15
Death rate in treatment group: 10% or 0.10
ARR = Risk in control group – risk in treatment group = 0.15 – 0.10 = 0.05 or 5%
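
If you want to check this arithmetic yourself, here is a minimal sketch in Python using the figures from the example. The function name is purely illustrative, not part of any standard library.

```python
def absolute_risk_reduction(control_risk: float, treatment_risk: float) -> float:
    """ARR = risk of event in control group - risk of event in treatment group."""
    return control_risk - treatment_risk


# Figures from the worked example: 15% of controls and 10% of treated patients die.
arr = absolute_risk_reduction(control_risk=0.15, treatment_risk=0.10)
print(f"ARR = {arr:.2f} ({arr:.0%})")
```

A result of 0 would mean no difference between the groups; a negative result would mean the event was more common in the treatment group.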

Comparing absolute and relative risk reductions
Note the differences between relative risk reduction and absolute risk reduction in this study:

'You could take this extra tablet, dipyridamole, twice daily for the next year and you could reduce your risk of having a further event by up to 20% compared to not taking it'. (Relative risk reduction 20%).

'You could take this extra tablet, dipyridamole, twice daily for the next year and at the end of the year you are 1% less likely to have had an event'. (Absolute risk reduction 1%)

Both come from exactly the same data but you can see why pharmaceutical companies prefer to use relative risk reductions!

Composite endpoints: beware!

 

Many research trials look at composite endpoints – for example in cardiovascular research a composite endpoint might include MI, need for revascularisation and cardiovascular death. However, a paper in the BMJ points out that for patients these endpoints are not equal – how can you compare dying with any other non-fatal endpoint? Dying is an altogether different magnitude event to needing a revascularisation (BMJ 2007;334:786–8). However, the paper pointed out that most cardiovascular research in recent years has been into composite endpoints. And by combining endpoints you can present skewed data to patients.

For example, one study they looked at reduced the relative risk of death by 8% and the relative risk of other more minor events by 33%. Most patients would be most interested in the reduction in their risk of death, yet this benefit is much smaller than the benefit for the other events. In addition, composite endpoints are likely to exaggerate the benefit that patients perceive they are getting from any given treatment.

Confidence intervals

 

Confidence intervals (CI) allow you to assess statistical significance. All confidence intervals in this book are 95% confidence intervals – that is, if the study were repeated many times, you would expect 95% of the intervals calculated in this way to contain the true value.

If, for example, smokers got cancer 5 × more often than non-smokers (N.B. I have made this figure up!) the relative risk for smoking would be 5. If the 95% confidence interval for this were 3.5–8 then, because the whole interval is greater than 1, you could assume this result is unlikely to have arisen by chance. However, if the 95% confidence interval were 0.5–7 then, because it crosses 1, smoking may not actually increase the risk of cancer.
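
The 'does it cross 1?' check can be written down as a small rule. The sketch below is illustrative only (the function name is made up) and assumes you are looking at a ratio measure such as relative risk, where 1 means no effect.

```python
def interval_excludes_no_effect(lower: float, upper: float, no_effect: float = 1.0) -> bool:
    """Return True if the whole confidence interval lies on one side of the no-effect value."""
    return lower > no_effect or upper < no_effect


# The two made-up smoking intervals from the text.
print(interval_excludes_no_effect(3.5, 8))  # whole interval above 1
print(interval_excludes_no_effect(0.5, 7))  # interval crosses 1
```

For a difference measure such as absolute risk reduction, the no-effect value would be 0 rather than 1.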

Hazard ratios

 

Hazard ratios are a form of relative risk (see that section). A hazard ratio of greater than 1 means an event is more likely to happen in the treatment group than in the placebo group.

Likelihood ratios

 

Likelihood ratios are useful because they incorporate both sensitivity and specificity in a single figure. You don't need to understand how they are calculated to use them, but if you want to know, see below.

The likelihood ratio of a positive test is: sensitivity/(1 – specificity)

The likelihood ratio of a negative test is: (1 – sensitivity)/specificity
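
As a rough illustration, the two formulas can be sketched in Python. The sensitivity and specificity used here (90% and 80%) are hypothetical figures chosen for the example, not taken from any study, and the function names are my own.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1 - sensitivity) / specificity


# Hypothetical test: 90% sensitive, 80% specific.
lr_pos = positive_likelihood_ratio(0.90, 0.80)
lr_neg = negative_likelihood_ratio(0.90, 0.80)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```

The further a likelihood ratio is from 1 (above 1 for a positive test, below 1 for a negative test), the more the result shifts the probability of disease.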

Mean, median and interquartile range

 

This may sound basic but it is important, so I have included it here. As an example let's take resting pulse rates of 7 people (65, 68, 72, 75, 78, 83, 108 bpm).

Mean: add all the results up and divide by the number of results you had (= 549/7 = 78.4)

Median: line up all the numbers in order, and the median is the middle number (in this case the 4th number = 75)

Interquartile range: the range between the 25th and 75th percentiles (the lower and upper quartiles), i.e. the middle 50% of the data. In this case the 25th percentile is 68 and the 75th percentile is 83, so the interquartile range is 68–83. The interquartile range is important because, like the median, it ignores outliers that may skew the data (such as the person with the pulse of 108 bpm who may well have AF).

Medians can also be quoted for interquartile ranges. Once again, this is useful to avoid skewing of data, although in small data sets such as the example I have given you here, the result is the same as the median for the full data set.
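
The pulse rate example can be reproduced with Python's built-in statistics module. This is only a sketch: there are several accepted ways of calculating quartiles, and the default ("exclusive") method used here happens to give the same 68 and 83 as the example, whereas other methods can give slightly different values on such a small data set.

```python
import statistics

pulse_rates = [65, 68, 72, 75, 78, 83, 108]  # bpm, from the example above

mean = statistics.mean(pulse_rates)
median = statistics.median(pulse_rates)

# quantiles(..., n=4) returns the three cut points that split the data into quarters.
lower_quartile, middle, upper_quartile = statistics.quantiles(pulse_rates, n=4)

print(f"Mean: {mean:.1f}")
print(f"Median: {median}")
print(f"Interquartile range: {lower_quartile:.0f}-{upper_quartile:.0f}")
```

Notice that the single reading of 108 pulls the mean above the median, while the quartiles are unaffected by it.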

Non-inferiority trials

 

Most trials are superiority trials: is this new drug better than this other drug or this placebo? However, sometimes non-inferiority trials are run. This is often the case when it would be unethical to offer placebo, for example if someone has H. pylori, ethically you can't really enrol them in a trial of a new drug versus placebo. However, you could offer them a non-inferiority trial, testing out this new drug versus standard H. pylori eradication therapy.

A non-inferiority trial will tell you whether your new drug is no worse than the control treatment BUT it can't tell you if it is any better (although you can run a non-inferiority trial that tests for non-inferiority but is also sufficiently powered to detect superiority!)

There are several inherent weaknesses in non-inferiority trials, in particular that the margin for proving effect can be set nice and wide, making almost anything look effective. Also, non-inferiority trials assume the standard control therapy is effective (it may not be!). In addition, intention-to-treat analysis (deemed good in superiority trials) may blur the difference between new and old treatments still further.

Numbers needed to treat (NNT) or harm (NNH)

 

NNTs tell us how many people have to be treated for 1 person to benefit. An ideal NNT is 1: everyone treated gets better, and no one in the placebo group gets better. NNHs are numbers needed to harm. NNTs and NNHs should, but don't always, quote a time frame.

An NNT (or H) of 40 over 2 years means that 40 people have to be treated for one to get a benefit (or harm) over a 2-year period.

NNTs are easy to calculate: NNT = 1/ARR (absolute risk reduction)

If risk of event in placebo group: 4%, & risk of event in treatment group: 1%

ARR is 4 – 1 = 3% (or 0.03)

NNT = 1/ARR = 1/0.03 = 33
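
Here is a minimal sketch of the same calculation in Python, starting from the 3% absolute risk reduction in the example. The function name is illustrative. The text rounds the answer to 33; be aware that some sources prefer to round NNTs up to the next whole person.

```python
def number_needed_to_treat(absolute_risk_reduction: float) -> float:
    """NNT = 1 / ARR, with ARR given as a proportion (3% is 0.03)."""
    if absolute_risk_reduction == 0:
        raise ValueError("ARR of 0 means no treatment effect, so the NNT is undefined")
    return 1 / absolute_risk_reduction


nnt = number_needed_to_treat(0.03)
print(f"NNT = {nnt:.1f}")
```

The same formula gives a number needed to harm when the difference in risk is for an adverse event.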

Odds ratios

 

An odds ratio is a way of expressing probability or relative risk – an odds ratio of greater than 1 means an event is more likely to happen in the treatment group than in the placebo group.

Positive & negative predictive values

 

The positive predictive value of a symptom or test is the proportion of the people who test positive who actually have the disease.

The negative predictive value of a symptom or test is the proportion of people told they don't have the disease that really don't have it.

Using a 2 × 2 table:

                 Disease present         Disease absent
Test positive    True positives (TP)     False positives (FP)
Test negative    False negatives (FN)    True negatives (TN)

Positive predictive value (PPV) = TP/(TP + FP)

Negative predictive value (NPV) = TN/(TN + FN)

The higher the PPV of a symptom or test, the more likely the patient sitting in front of you really does have that disease.

The higher the NPV of a test, the more likely it is that the patient who has tested negative, really doesn't have the disease.

A PPV of 10% means 10% of people with that symptom will, after investigations, actually have cancer. That means 90% of people with that symptom will not.
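
To see how the two formulas work, here is a short Python sketch. The counts in the 2 × 2 table are hypothetical, invented so that the PPV comes out at 10% as in the example above; the function names are my own.

```python
def positive_predictive_value(tp: int, fp: int) -> float:
    """PPV = TP / (TP + FP): the proportion of positive tests that are true positives."""
    return tp / (tp + fp)


def negative_predictive_value(tn: int, fn: int) -> float:
    """NPV = TN / (TN + FN): the proportion of negative tests that are true negatives."""
    return tn / (tn + fn)


# Hypothetical counts for 1000 patients (not from any real study).
tp, fp, fn, tn = 10, 90, 5, 895

ppv = positive_predictive_value(tp, fp)
npv = negative_predictive_value(tn, fn)
print(f"PPV = {ppv:.0%}, NPV = {npv:.1%}")
```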

In the Cancer chapter we discuss how low some PPVs are for classic 'red flags' for cancer and what this means in terms of our ability to detect cancers.

Pre-test probability

 

This is the probability of having the disease before a diagnostic test is done. For example a 56-year-old man who smokes comes to the surgery looking pale and clammy and complaining of a severe chest pain. The pre-test probability of him having an MI is quite high. ECGs and troponins will make this diagnosis more or less likely (they turn the pre-test probability into a post-test probability).

The pre-test probability can be calculated from a two by two table (as above) like this:

Pre-test probability = (TP + FN)/(TP + FP + FN + TN)

Or: all those with the disease divided by all patients with the symptoms (both those with and without the disease).
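
As a sketch, the calculation looks like this in Python. The counts are hypothetical figures for 1000 patients, made up for illustration only, and the function name is my own.

```python
def pre_test_probability(tp: int, fp: int, fn: int, tn: int) -> float:
    """Everyone with the disease (TP + FN) divided by everyone in the table."""
    return (tp + fn) / (tp + fp + fn + tn)


# Hypothetical 2 x 2 table: 15 of 1000 patients have the disease.
probability = pre_test_probability(tp=10, fp=90, fn=5, tn=895)
print(f"Pre-test probability = {probability:.1%}")
```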

Rate ratio

 

Rate ratio is simply the rate of something in one population divided by the rate in another population. It is often used for comparing the incidence of a disease in a group of people exposed to something with the incidence in an unexposed population. For example, the rate of cancer in a population exposed to a carcinogen may be 10/100 person years. The rate of cancer in an unexposed population might be 3/100 person years. The rate ratio would be 10/3 = 3.33, suggesting that those exposed to the carcinogen were just over 3 times more likely to get cancer than the unexposed population.
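
The carcinogen example can be checked with a few lines of Python. This is a minimal sketch with an illustrative function name; it assumes both rates are expressed per the same number of person years.

```python
def rate_ratio(exposed_rate: float, unexposed_rate: float) -> float:
    """Rate in the exposed population divided by the rate in the unexposed population."""
    return exposed_rate / unexposed_rate


# 10 versus 3 cancers per 100 person years, as in the example.
ratio = rate_ratio(exposed_rate=10 / 100, unexposed_rate=3 / 100)
print(f"Rate ratio = {ratio:.2f}")
```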

Relative risk (RR)

 

How many times more likely is it that an event will occur in the treatment group compared to control group?

RR is risk in treatment group/risk in control group

RR of 1 = no difference
RR <1 means treatment reduces risk of outcome
RR >1 means treatment increases risk of outcome

Example: Death rate in control group: 15% or 0.15
Death rate in treatment group: 10% or 0.10
RR = risk in treatment group/risk in control group = 0.10/0.15 = 0.67
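
Here is the same example as a short Python sketch (the function name is illustrative only).

```python
def relative_risk(treatment_risk: float, control_risk: float) -> float:
    """RR = risk in treatment group / risk in control group."""
    return treatment_risk / control_risk


rr = relative_risk(treatment_risk=0.10, control_risk=0.15)
print(f"RR = {rr:.2f}")

if rr < 1:
    print("Treatment reduces the risk of the outcome")
elif rr > 1:
    print("Treatment increases the risk of the outcome")
else:
    print("No difference between the groups")
```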

Relative risk reduction (RRR)

 

Tells us the reduction in the rate of the outcome in the treatment group relative to the control group.

RRR = ARR/risk of outcome in control group OR RRR = 1 – RR

Example: Death rate in control group: 15% or 0.15
Death rate in treatment group: 10% or 0.10
RRR = ARR/risk of outcome in control group = 0.05/0.15 = 0.33 or 33%
Or RRR = 1 – RR = 1 – 0.67 = 0.33 or 33%
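
The two routes to the RRR give the same answer, which you can confirm with this small Python sketch (variable names are my own, figures are from the example).

```python
import math

control_risk = 0.15    # death rate in control group
treatment_risk = 0.10  # death rate in treatment group

# Route 1: RRR = ARR / risk of outcome in control group
arr = control_risk - treatment_risk
rrr_from_arr = arr / control_risk

# Route 2: RRR = 1 - RR
rr = treatment_risk / control_risk
rrr_from_rr = 1 - rr

print(f"RRR via ARR: {rrr_from_arr:.0%}")
print(f"RRR via RR:  {rrr_from_rr:.0%}")
print("Same answer:", math.isclose(rrr_from_arr, rrr_from_rr))
```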

Comparing absolute and relative risk reductions
Note the differences between relative risk reduction and absolute risk reduction in this study:

'You could take this extra tablet, dipyridamole, twice daily for the next year and you could reduce your risk of having a further event by up to 20% compared to not taking it'. (Relative risk reduction 20%).

'You could take this extra tablet, dipyridamole, twice daily for the next year and at the end of the year you are 1% less likely to have had an event'. (Absolute risk reduction 1%)

Both come from exactly the same data but you can see why pharmaceutical companies prefer to use relative risk reductions!

Sensitivity and specificity

 

Sensitivity is the proportion of people with a disease who are detected by the test.

Specificity is the proportion of people without the disease who test negative. Using the 2 × 2 chart from before:

Sensitivity = TP/(TP + FN). E.g. you work out the cancers detected as a proportion of all the cancers. High sensitivity = few cancers missed (few false negatives).
Specificity = TN/(TN + FP). E.g. you work out the people who haven't got cancer and test negative for cancer as a proportion of all those without cancer. High specificity = few false positives.
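
As a rough sketch, here are both formulas in Python. The counts are hypothetical figures for 1000 patients, invented for illustration, and the function names are my own.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of people with the disease who test positive: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of people without the disease who test negative: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical 2 x 2 table (not from any real study).
tp, fp, fn, tn = 10, 90, 5, 895

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"Sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```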

What is the difference between a systematic review and a meta-analysis?

 

Both systematically look for all the relevant literature on a subject.

A systematic review will draw together all the literature and may come to its conclusions without pooling numerical data to prove an effect.

A meta-analysis is a systematic review that uses quantitative methods to summarise the results – basically the results from a number of different studies are pooled to produce a large enough sample size to reduce the risk of any finding being down to chance alone.

Of course, for both systematic reviews and meta-analyses the studies need to be similar and good quality… rubbish in equals rubbish out!

For more on statistics these two websites are really good: 
Bandolier: www.medicine.ox.ac.uk/bandolier 
Centre for Evidence-Based Medicine: www.cebm.net

 
