Because of the interval sampling, the age at diagnosis X is doubly truncated by the pair (U, V), where the right‐truncation variable V is the time in years from birth (the date of onset) to 31 December 2003, and U = V − 5. The triplets (X_i, U_i, V_i), i = 1, …, n, with the values observed for (X, U, V), were reported in Moreira and de Uña‐Álvarez (2010), while de Uña‐Álvarez (2020) included the cancer group in the statistical analysis. Ordinary descriptive statistics can be applied to the information gathered along this 5‐year window to compute, for instance, the average age at cancer diagnosis. However, if the goal is to describe the population of children eventually developing cancer, the double truncation issue should be acknowledged and properly corrected, so that potential biases are avoided.
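The sampling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the four example children are hypothetical, and the only quantity taken from the text is the 5‐year length of the observation window. Writing x for the age at diagnosis and v for the time from birth to the end of the window, a child enters the sample only when v − 5 ≤ x ≤ v.

```python
WINDOW_YEARS = 5.0  # length of the observation window, as stated in the text


def truncation_limits(v, window=WINDOW_YEARS):
    """Return (u, v): left- and right-truncation limits, in years.

    v is the time from birth to the end of the window; u is v minus the
    window length, so u is negative for children born inside the window.
    """
    return v - window, v


def is_observed(x, v, window=WINDOW_YEARS):
    """True when a diagnosis at age x falls inside the observation window."""
    u, v = truncation_limits(v, window)
    return u <= x <= v


# Hypothetical (age at diagnosis, time from birth to end of window) pairs.
examples = [(10.0, 14.0), (5.0, 14.0), (1.0, 2.0), (3.0, 2.0)]
results = [is_observed(x, v) for x, v in examples]
print(results)
```

The second child is diagnosed before the window opens and the fourth is still undiagnosed when it closes, so neither can appear in the sample; this is the selection mechanism that a correction for double truncation has to account for.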
Interestingly, the observed values for U range between −4.5 and 14.5 (years); equivalently, the observed values for V range between 0.5 and 19.5. This means that the lower endpoint of the support of U does not exceed that of X, and the upper endpoint of the support of V is not below that of X. Thus, in this case, the target variable X is observable on its whole support, and there are no identification issues for the cdf of X. Information on X is summarized in Table 1.1.
Table 1.1 Descriptive statistics for Childhood Cancer Data: sample size n and mean (and standard deviation, SD) for the age at diagnosis (years).

Group         |                  | n   | Mean (SD)
All           |                  | 406 | 6.47 (4.50)
By gender     | Female           | 178 | 6.43 (4.51)
              | Male             | 228 | 6.51 (4.51)
By ICCC Group | Leukemia         | 107 | 6.30 (4.15)
              | Lymphoma         | 57  | 8.66 (4.39)
              | N. System Tumour | 94  | 6.38 (4.29)
              | Neuroblastoma    | 38  | 3.16 (3.47)
              | Other            | 105 | 6.87 (4.70)
              | Missing          | 5   | 3.92 (5.18)
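As a quick consistency check on Table 1.1, the subgroup sizes and means can be pooled and compared with the "All" row. The sketch below is an illustration only: the numbers are copied from the table, while the helper name `pooled` is hypothetical.

```python
# (sample size, mean age at diagnosis) pairs copied from Table 1.1
overall = (406, 6.47)
by_gender = [(178, 6.43), (228, 6.51)]
by_iccc = [(107, 6.30), (57, 8.66), (94, 6.38),
           (38, 3.16), (105, 6.87), (5, 3.92)]


def pooled(groups):
    """Return the total size and the size-weighted mean of the groups."""
    total_n = sum(n for n, _ in groups)
    pooled_mean = sum(n * m for n, m in groups) / total_n
    return total_n, pooled_mean


for label, groups in (("gender", by_gender), ("ICCC group", by_iccc)):
    n, mean = pooled(groups)
    print(label, n, round(mean, 2))
```

Both splits recover 406 children and a pooled mean of 6.47 years, in agreement with the first row of the table. Note that these are ordinary (uncorrected) descriptive statistics of the observed sample.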
This dataset is used in Chapters 2, 3 and 5 and is accessible in the DTDA package as ChildCancer.
1.4.2 AIDS Blood Transfusion Data
Kalbfleisch and Lawless (1989) reported 494 cases of transfusion‐related AIDS, corresponding to individuals diagnosed prior to 1 July 1986. The variable of ultimate interest X is the induction or incubation time, which is the time elapsed from HIV infection to AIDS. Importantly, HIV was unknown before 1982; this implies that cases developing AIDS prior to this date were not reported. Let V denote the time from HIV infection to 1 July 1986 (in months), and introduce U = V − 54; then, due to the interval sampling, only triplets (X, U, V) satisfying U ≤ X ≤ V were observed (Bilker and Wang, 1996). We restrict our analysis to the cases with consistent data, for which the infection could be attributed to a single transfusion or a short series of transfusions. This dataset is fully reported in Kalbfleisch and Lawless (1989), p. 361.
The observed values of X range from 0.5 to 89 (months), while U
ranges from
to 45.5. This suggests that the lower limit of the support of U
is about
, while the upper limit of the support of V is about 99.5. As discussed in Chapter 2, in such a case the distribution of the incubation time X
is identifiable on the interval
(months). The AIDS Blood Transfusion Data also includes information on the age of the individual at infection; see Table 1.2.
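The length of the observation window for these data can be checked from the calendar dates. The sketch below is a hedged illustration: it assumes that "before 1982" means the window opens on 1 January 1982, and it assumes that the left‐truncation limit equals the right‐truncation limit minus the window length; the function and variable names are hypothetical.

```python
from datetime import date


def months_between(start, end):
    """Whole calendar months from start to end (both on the 1st of a month)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


window_start = date(1982, 1, 1)  # assumption: "before 1982" read as 1 January 1982
window_end = date(1986, 7, 1)    # diagnosis cut-off stated in the text

window_months = months_between(window_start, window_end)

# Largest left-truncation value quoted in the text, shifted by the window length.
max_left_limit = 45.5
implied_max_right_limit = max_left_limit + window_months
print(window_months, implied_max_right_limit)
```

Under these assumptions the window is 54 months long, and shifting the quoted value 45.5 by that amount gives 99.5, which matches the upper support limit mentioned in the text.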