Relationships between the number of stem cells per tissue (s), the lifetime number of replications per stem cell in that tissue (d) and lifetime cancer risk, across 31 cancer types (data from electronic supplementary material, table S1, in ). (a) Relationship between stem cell replication (d) and cancer risk, after statistically correcting for the effect of stem cell number (s). This correction was done by regressing cancer risk on s, and then performing the partial regression of residual cancer risk on d. The r² value is the square of the partial correlation coefficient and quantifies the amount of variation in residual cancer risk explained by stem cell replication (d). (b) Relationship between stem cell number (s) and cancer risk, after statistically correcting for the effect of stem cell replication (d). The partial r² quantifies the variation in residual cancer risk explained by stem cell number. (c) Illustration of the combined positive effects of stem cell number (s) and stem cell replication (d) on predicted cancer risk. Predicted values were obtained from the multiple regression of cancer risk on d and s (see text). In 0.5 log-intervals, we assigned a colour gradient to the predicted values, ranging from light orange (low predicted risk) to dark red (high predicted risk). Thus, predicted cancer risk increases with increasing values of both s and d. All analyses and figures use log-transformation of s, d and cancer risk. The black lines in (a,b) represent regression lines, and the shaded areas the 95% confidence intervals around the regression. Colour-coding of points is based on fig. 2 in , denoting deterministic D-tumours (blue) and replicative R-tumours (green).
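The statistical correction described in (a,b) can be sketched as follows. This is a minimal illustration, not the authors' code: it computes the standard squared partial correlation by removing the linear effect of the control variable from both the response and the predictor, which is an assumption about the exact procedure. The function names and the example numbers are hypothetical and are not data from the study.

```python
import numpy as np

def residuals(x, y):
    """Residuals of the least-squares line y = a + b*x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (intercept + slope * x)

def partial_r2(y, x, z):
    """Squared partial correlation between y and x, controlling for z.

    Both y and x are first residualised on z; the squared correlation of
    the two residual vectors is the partial r^2.
    """
    ry = residuals(z, y)
    rx = residuals(z, x)
    r = np.corrcoef(rx, ry)[0, 1]
    return r ** 2

# Illustrative values only (NOT taken from the study): log10-transformed
# stem cell number (s), replications per stem cell (d) and cancer risk.
log_s = np.array([6.0, 7.5, 8.0, 9.2, 10.1, 11.0])
log_d = np.array([1.5, 3.0, 2.2, 3.4, 2.9, 3.8])
log_risk = np.array([-4.1, -2.9, -3.3, -2.2, -2.6, -1.7])

r2_d_given_s = partial_r2(log_risk, log_d, log_s)  # analogue of panel (a)
r2_s_given_d = partial_r2(log_risk, log_s, log_d)  # analogue of panel (b)
```

Each returned value lies between 0 and 1 and is read as the share of residual variation in log risk explained by one variable once the other has been accounted for.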
Residual lifetime risk of 31 cancer types, calculated as the difference between observed values and predictions of our multiple regression model (figure 1a,b). Most of the cancers that T&V classed as deterministic D-tumours (blue bars) also have high residual risks according to our alternative metric, compared to the cancers that T&V classed as replicative R-tumours (green and red bars). Many cancers with high residual risks are associated with known causative factors (oncoviruses, chemical carcinogens or inherited cancer susceptibility genes). The additional identification of cancers with very low residual lifetime risks (red bars) suggests that some tissue types may be differentially resistant to tumours.
Relationship between cancer risk per stem cell division (risk/lscd) and lscd in 31 cancer types. The negative correlation contradicts the hypothesis that the cancer risk per stem cell division is the same for all tissues (in which case the line would be flat, with gradient zero). Dashed lines show 95% confidence intervals for the linear regression (R² = 0.58, p < 10⁻⁶). (Online version in colour.)
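The reason a flat line is expected under a constant per-division risk can be written out explicitly. The symbols a (intercept) and b (gradient) are introduced here for illustration only and do not appear in the source.

```latex
\begin{align}
\log_{10}(\mathrm{risk}) &= a + b\,\log_{10}(\mathrm{lscd}) \\
\log_{10}\!\left(\frac{\mathrm{risk}}{\mathrm{lscd}}\right) &= a + (b - 1)\,\log_{10}(\mathrm{lscd})
\end{align}
```

If every stem cell division carried the same risk in all tissues, then b = 1 and the second line would have gradient zero. A negative gradient for risk/lscd against lscd therefore corresponds to b < 1 on the log-log scale.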
A hypothetical one-factor, nonlinear model of the relationship between cancer risk and lscd. If each stem cell division has the same probability of causing cancer, then there should be a linear relationship between cancer risk and lscd with a gradient of unity. However, we argue that the risk cannot rise indefinitely, but must be bounded by a maximum limit, owing to the primacy of other causes of mortality, to differential cancer prevention in tissues with larger total numbers of stem cell divisions, or to both. (Online version in colour.)
(a) Relationship between cancer risk and lscd in 34 cancer types, described by a model that assumes that the gradient of the correlation is unity for small lscd and is zero for large lscd. Data from T&V are shown as crosses or hollow circles; additional data are shown as filled circles. The model asymptotes are included as dashed lines. (b) The same model fitted to the set of 27 cancer types not associated with a high-risk subpopulation (the excluded data points are shown as hollow circles in (a)). Dotted lines indicate the approximate 95% confidence interval of the regression curve. (Online version in colour.)
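One functional form with the two stated asymptotes (gradient of unity for small lscd, gradient of zero for large lscd on a log-log scale) is sketched below. The specific formula, the parameter names k and r_max and the numerical values are assumptions chosen for illustration. The captions do not give the model equation, so this should not be read as the fitted model.

```python
import numpy as np

def saturating_risk(lscd, k, r_max):
    """Hypothetical bounded model of cancer risk.

    Behaves like k * lscd when k * lscd << r_max (log-log gradient ~ 1)
    and approaches r_max when k * lscd >> r_max (log-log gradient ~ 0).
    """
    linear = k * lscd
    return linear * r_max / (linear + r_max)

def log_log_slope(lscd, k, r_max, eps=1e-4):
    """Central-difference estimate of d log10(risk) / d log10(lscd)."""
    lo = np.log10(saturating_risk(lscd * 10.0 ** (-eps), k, r_max))
    hi = np.log10(saturating_risk(lscd * 10.0 ** eps, k, r_max))
    return (hi - lo) / (2.0 * eps)

# Illustrative parameter values only (not estimates from the study).
k_example = 1e-12      # assumed risk per stem cell division
r_max_example = 0.1    # assumed maximum lifetime risk

slope_small = log_log_slope(1e3, k_example, r_max_example)
slope_large = log_log_slope(1e15, k_example, r_max_example)
```

With these values the local gradient is close to one at the small lscd and close to zero at the large lscd, matching the two dashed asymptotes described in (a).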
A two-factor, linear model of the relationship between cancer risk and lscd. In this case, the cancer types are divided into subsets according to tissue type. The subsetting partitions variation into within-subset variation (due to lscd) and between-subset variation (due to tissue type). The gradient of the correlation within subsets is expected to be close to unity. The dashed line indicates a hypothetical maximum risk threshold. For each tissue type, the cancer risk per stem cell division can be estimated by extrapolating the regression line to the point where lscd = 1 (i.e. log lscd = 0). (Online version in colour.)
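A two-factor model of this kind, with one shared gradient and a separate intercept for each tissue subset, can be fitted by ordinary least squares with indicator columns. The sketch below is illustrative: the function name, the subset labels and the numbers are hypothetical, and the authors' own fitting procedure may differ.

```python
import numpy as np

def fit_common_slope(log_lscd, log_risk, subset):
    """Fit log_risk = intercept[subset] + slope * log_lscd by least squares.

    Returns the shared slope and a dict mapping each subset label to its
    intercept, i.e. the predicted log risk at log lscd = 0.
    """
    log_lscd = np.asarray(log_lscd, dtype=float)
    log_risk = np.asarray(log_risk, dtype=float)
    labels, idx = np.unique(np.asarray(subset), return_inverse=True)

    # Design matrix: one indicator column per subset, then the shared slope.
    X = np.zeros((log_lscd.size, labels.size + 1))
    X[np.arange(log_lscd.size), idx] = 1.0
    X[:, -1] = log_lscd

    coef, *_ = np.linalg.lstsq(X, log_risk, rcond=None)
    intercepts = dict(zip(labels.tolist(), coef[:-1].tolist()))
    return float(coef[-1]), intercepts

# Illustrative values only (not data from the study).
example_lscd = [8.0, 9.0, 10.0, 9.0, 11.0, 12.0]
example_subset = ["A", "A", "A", "B", "B", "B"]
example_risk = [-4.0, -3.0, -2.0, -5.0, -3.0, -2.0]

slope, intercepts = fit_common_slope(example_lscd, example_risk, example_subset)
per_division_risk = {name: 10.0 ** b for name, b in intercepts.items()}
```

Because the intercept is the fitted value at log lscd = 0, raising 10 to that power gives the estimated risk per stem cell division for each subset, as described in the caption.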
(a) Relationship between cancer risk and lscd in nine topographically defined subsets of 24 cancer types (electronic supplementary material, table S1). The model assumes that the risk per stem cell division may differ between subsets but that the slope of the correlation is the same for each subset. Data from T&V are shown as crosses; additional data are shown as filled circles. (b) Relationship between cancer risk and lscd in five morphologically defined subsets of 17 cancer types (electronic supplementary material, table S2). (c) Cancer risk per stem cell division for 28 cancer types, calculated by dividing risk by lscd. This formula assumes that the correlation between risk and lscd has a gradient of unity for each tissue type, which is supported by the results of the regression model (equation (2.8)). Cancer types are coloured by topographic subset, according to the scheme shown in (a). Four types that belong to topographic subsets with only one member (and so were excluded from the analysis shown in (a)) are shown in grey.