2.6 The mean corn size (in mm) for this sample of 10 patients is: 1.0; 2.0; 3.0; 4.0; 4.4
2.7 The median corn size (in mm) for this sample of 10 patients is: 1.0; 2.0; 3.0; 4.0; 4.4
2.8 The modal corn size (in mm) for this sample of 10 patients is: 1.0; 2.0; 3.0; 4.0; 4.4
2.9 The range of corn sizes (in mm) for this sample of 10 patients is: 1 to 10; 2 to 5; 2 to 10; 3 to 11; 3 to 12
2.10 The interquartile range (IQR) of corn size (in mm) for this sample of 10 patients is: 2 to 10; 2 to 7; 2 to 5; 3 to 7; 3 to 10
2.11 The variance in corn size (in mm²) for this sample of 10 patients is: 0.8; 2.5; 3.5; 4.4; 6.5
2.12 The standard deviation of corn size (in mm) for this sample of 10 patients is: 0.8; 2.5; 3.5; 4.4; 6.5
3 Summary Measures for Binary Data
3.1 Summarising Binary and Categorical Data
3.2 Points When Reading the Literature
3.3 Exercises
This chapter illustrates methods of summarising binary and categorical data. It covers proportions, risk, rates, relative risk, and odds ratios. The importance of considering the absolute risk difference (ARD) as well as the relative risk is emphasised.
3.1 Summarising Binary and Categorical Data
Categorical data are simply data which can be put into categories. Binary data are the simplest type of categorical data: each individual has a label which takes one of two values. A simple summary would be to count the different types of label. However, a raw count is rarely useful. For example, there were 45 656 new cases of breast cancer registered in England in 2016. On its own this sounds like a large number, but there were 303 135 new cases of all cancers registered in 2016. Thus breast cancer accounts for 15.1% (45 656/303 135) of all new cancer registrations in England. Proportions are a special example of a ratio. When time is also involved (as in counts per year) the ratio is known as a rate. The mid‐year population of England in 2016 was estimated as 55 268 067. Thus, the breast cancer registration rate was 0.0008 per person per year (45 656/55 268 067), or about 83 per 100 000 per year.
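The arithmetic in this paragraph can be checked with a few lines of Python. This is only a sketch using the counts quoted in the text; the variable names are illustrative and not part of the original.

```python
# Counts quoted in the text (England, 2016).
breast_cancer_cases = 45_656
all_cancer_cases = 303_135
mid_year_population = 55_268_067

# Proportion: the numerator is a subset of the denominator.
proportion_of_cancers = breast_cancer_cases / all_cancer_cases

# Rate: new registrations per person over the one-year period,
# rescaled to "per 100 000" so the number is easier to read.
rate_per_person_year = breast_cancer_cases / mid_year_population
rate_per_100k = rate_per_person_year * 100_000

print(f"Share of all new cancer registrations: {proportion_of_cancers:.1%}")
print(f"Registration rate: {rate_per_100k:.1f} per 100 000 per year")
```

Rescaling the rate to a convenient base (per 1000 or per 100 000) does not change its meaning; it only avoids very small decimals.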
Ratios, Proportions, Percentages, Risk and Rates
A ratio is simply one number divided by another. If we measure how far a car travels in a given time then the ratio of the distance travelled to the time taken to cover this distance is the speed.
Proportions are ratios of counts where the numerator (the top number) is a subset of the denominator (the bottom number). Thus if 30 of the 50 patients in a study are depressed, the proportion is 30/50 or 0.6. It is usually easier to express this as a percentage (%), so we multiply the proportion by 100, and state that 60% of the patients are depressed. Clearly proportions must lie between 0 and 1 and percentages between 0 and 100%.
A proportion is known as a risk if the numerator counts events which happen prospectively. Hence if 100 students start an introductory statistics course and 15 drop out before the final course examination, the risk of dropping out is 15/100 = 0.15 or 15%.
Rates always have a time period attached. In the UK, 597 206 deaths were recorded in 2016 out of a population of 65 648 100; a death rate of 597 206/65 648 100 or 0.009 deaths per person per year. This is known as the crude death rate (crude because it makes no allowance for important factors such as age). Crude death rates are often expressed as deaths per thousand per year, so the crude death rate is nine deaths per thousand per year, since it is much easier to imagine 1000 people, of whom 9 die, than it is 0.009 deaths per person!
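As a minimal sketch, the crude death rate can be computed from the two UK figures quoted above. The helper name `rate_per` is hypothetical and introduced only for illustration.

```python
def rate_per(events: int, population: int, per: int = 1000) -> float:
    """Events per `per` people over the period in which they were counted."""
    return events / population * per

# UK figures for 2016 as quoted in the text.
deaths = 597_206
population = 65_648_100

crude_rate_per_person = rate_per(deaths, population, per=1)
crude_rate_per_thousand = rate_per(deaths, population, per=1000)

print(f"{crude_rate_per_person:.3f} deaths per person per year")
print(f"{crude_rate_per_thousand:.0f} deaths per thousand per year")
```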
Illustrative Example – RCT of Salicylic Acid Plasters for Treatment of Foot Corns
Farndon et al. (2013) report a randomised controlled trial which investigated the effectiveness of salicylic acid plasters compared with usual scalpel debridement for treatment of foot corns. One categorical variable recorded was the location or anatomical site of the corn on the foot in six categories as displayed in Table 3.1. The first column shows category names, the second shows the number of individuals in each category, and the third shows each category's percentage contribution to the total. We can see that corns are most commonly located on the metatarsal head (59% of patients).
Table 3.1 Anatomical site of foot corn of 201 patients with corns
(Source: data from Farndon et al. 2013).
Anatomical site of index corn on foot | Frequency | (%) |
Apex (end of toe) | 13 | 7 |
Proximal interphalangeal joint (middle part of toe – on the top) | 27 | 13 |
Interdigital (between the toes) | 16 | 8 |
Metatarsal head (ball of the foot – on the bottom) | 119 | 59 |
Plantar calcaneus (heel) | 6 | 3 |
Other part of foot | 20 | 10 |
Total | 201 | 100 |
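The percentage column of Table 3.1 can be recomputed from the frequencies, as in the sketch below. The dictionary keys are shortened category labels chosen for illustration. Recomputed values are rounded to whole numbers and may differ by one percentage point from the published column (the apex, for instance, works out at about 6.5%).

```python
# Frequencies from Table 3.1 (Farndon et al. 2013).
site_counts = {
    "Apex": 13,
    "Proximal interphalangeal joint": 27,
    "Interdigital": 16,
    "Metatarsal head": 119,
    "Plantar calcaneus": 6,
    "Other part of foot": 20,
}

total = sum(site_counts.values())

# Each category's percentage contribution to the total.
percentages = {site: 100 * n / total for site, n in site_counts.items()}

for site, n in site_counts.items():
    print(f"{site:<32}{n:>5}{percentages[site]:>7.1f}")
print(f"{'Total':<32}{total:>5}{sum(percentages.values()):>7.1f}")

# The most common site is the category with the largest count.
most_common_site = max(site_counts, key=site_counts.get)
```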
We might be interested in whether the corn site is related to the gender of the patient. Table 3.2 shows the distribution of corn site by gender; in this case it can be said that corn site has been cross‐tabulated with gender. We can see that the distribution of sites for corns is similar for males and females. Table 3.2 is an example of a contingency table with six rows (representing corn site) and two columns (gender). Note that we are interested in the distribution of the site where the corn is located on the foot within gender, and so the percentages add up to 100 down each column, rather than across the rows.
Table 3.2 Cross‐tabulation of anatomical site of corn by gender for 201 patients with corns on feet
(Source: data from Farndon et al. 2013).
Anatomical site of index corn on foot | Male n | Male (%) | Female n | Female (%) |
Apex (end of toe) | 4 | (5) | 9 | (8) |
Proximal interphalangeal joint (middle part of toe – on the top) | 8 | (10) | 19 | (16) |
Interdigital (between the toes) | 5 | (6) | 11 | (9) |
Metatarsal head (ball of the foot – on the bottom) | 54 | (64) | 65 | (56) |
Plantar calcaneus (heel) | 3 | (4) | 3 | (3) |
Other part of foot | 10 | (12) | 10 | (9) |
Total | 84 | (100) | 117 | (100) |
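Column percentages in a contingency table such as Table 3.2 divide each cell by its column total, not by the row total or the grand total. The sketch below recomputes them from the counts in the table; the data structure and names are assumptions made for illustration.

```python
# Counts from Table 3.2 as (male, female) pairs.
counts = {
    "Apex": (4, 9),
    "Proximal interphalangeal joint": (8, 19),
    "Interdigital": (5, 11),
    "Metatarsal head": (54, 65),
    "Plantar calcaneus": (3, 3),
    "Other part of foot": (10, 10),
}

# Column totals: number of males and number of females.
male_total = sum(m for m, _ in counts.values())
female_total = sum(f for _, f in counts.values())

# Percentages within gender, so each column sums to 100.
column_pct = {
    site: (100 * m / male_total, 100 * f / female_total)
    for site, (m, f) in counts.items()
}

for site, (male_pct, female_pct) in column_pct.items():
    print(f"{site:<32}{male_pct:>6.0f}{female_pct:>6.0f}")
```

Comparing the two percentage columns side by side is what supports the statement that the distribution of corn sites is similar for males and females.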
Furness et al. (2003) illustrate why relative proportions matter. They reported that, over a one‐year period in Auckland, New Zealand, 25.6% of road accidents involved white cars. On this figure alone, a New Zealander might think twice about buying a white car! However, white was also the most prevalent car colour on the roads, with a proportion of 25.9%. So about a quarter of cars on the road are white, which matches the proportion of road accidents that involved white cars; thus white cars are not more dangerous than cars of other colours.
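One way to make this comparison explicit is to divide the share of accidents by the share of cars on the road. This is only an illustrative sketch: the ratio and its name are not taken from Furness et al. (2003), and a value close to 1 simply means a colour is involved in accidents about as often as its prevalence would suggest.

```python
# Percentages quoted in the text for white cars in Auckland.
share_of_accidents = 25.6  # % of road accidents involving white cars
share_of_cars = 25.9       # % of cars on the road that are white

# Hypothetical summary: accident share relative to share of cars on the road.
relative_involvement = share_of_accidents / share_of_cars

print(f"Accident share relative to road share: {relative_involvement:.2f}")
```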
Labelling Binary Outcomes
For binary data it is common to call the outcomes ‘an event’ or ‘a non‐event’. So having a car accident in Auckland, New Zealand may be an ‘event’. We often score an ‘event’ as 1 and a ‘non‐event’ as 0. These may also be referred to as a ‘positive’ or ‘negative’ outcome, or ‘success’ and ‘failure’. It is important to realise that these terms are merely labels and the main outcome of interest might be a success in one context and a failure in another. Thus in a study of a potentially lethal disease the outcome might be death, whereas in a disease that can be cured it might be being alive.
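Scoring an event as 1 and a non‐event as 0 has a practical benefit: the mean of the scores is the proportion of events. The sketch below uses the earlier example of 100 students, 15 of whom drop out, and shows that swapping the labels only changes which proportion is reported. Variable names are illustrative.

```python
# 100 students: 1 = dropped out (the 'event'), 0 = completed the course.
dropped_out = [1] * 15 + [0] * 85

# The mean of a 0/1 variable is the proportion of events.
risk_of_dropping_out = sum(dropped_out) / len(dropped_out)

# Relabelling so that completing the course is the 'event' flips every score.
completed = [1 - score for score in dropped_out]
proportion_completing = sum(completed) / len(completed)

print(f"Risk of dropping out: {risk_of_dropping_out:.2f}")
print(f"Proportion completing: {proportion_completing:.2f}")
```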