This evidence report on vitamin D and calcium in relation to health outcomes was prepared for consideration by the Committee on Dietary Reference Intakes for Vitamin D and Calcium at the request of AHRQ on behalf of the various sponsors. This report does not make, nor was it intended to make, recommendations for DRI values concerning vitamin D or calcium. Responsibility for setting DRI values lies with the Committee. Evidence from systematic reviews is one of several types of information available to the Committee for use in its deliberations to establish DRI values. This is the first time that an independent systematic review has been commissioned to support the DRI process. Thus, it is important for users of this report to fully appreciate the nuances of the methodologies employed, as well as the strengths and limitations of this approach. In particular, it should be noted that total vitamin D exposure was not evaluated in this report because there is no valid method to quantify the contribution of endogenous vitamin D synthesis resulting from sun exposure, and it is also the TEP’s consensus that vitamin D intake, as estimated by current food frequency questionnaires, is too inaccurate to be of value.
For this report, we identified 165 primary articles that met the eligibility criteria established by the TEP. In addition, we included 11 published systematic reviews that incorporated over 200 additional primary articles. Despite the relatively large number of studies included, with the following few exceptions, it is difficult to make any substantive and concise statements on the basis of the available evidence concerning the association of serum 25(OH)D concentration, supplemental vitamin D, dietary calcium intake, or the combination of both nutrients with the various health outcomes. Summarizing the evidence proved challenging because many of the studies were substantially heterogeneous and their findings were inconsistent for the health outcomes examined.
In general, among RCTs of hypertensive adults, calcium supplementation (400 to 2000 mg/d) lowered systolic, but not diastolic, blood pressure by a small but statistically significant amount (2 to 4 mm Hg).
For body weight, despite a wide range of calcium intakes (from supplements or from dairy and nondairy sources) across the calcium trials, the RCTs were fairly consistent in finding no significant effect of increased calcium intake on body weight.
For growth, a meta-analysis of 17 RCTs did not find a significant effect on weight and height gain attributable to calcium supplementation in children ranging from 3 to 18 years of age.
For bone health, one well-conducted systematic review of RCTs found that vitamin D3 (up to 800 IU/d) plus calcium (~500 mg/d) supplementation resulted in small increases in BMD of the spine, total body, femoral neck, and total hip in populations consisting predominantly of women in late menopause.
For breast cancer, subgroup analyses in four cohort studies consistently found that calcium intake in the range of 780 to 1750 mg/d in premenopausal women was associated with a decreased risk for breast cancer. However, no RCTs of calcium supplementation to prevent breast cancer in premenopausal women have been published. In contrast, cohort studies of postmenopausal women are consistent in showing no association of calcium intake with the risk of breast cancer.
For prostate cancer, three of four cohort studies found significant associations between higher calcium intake (>1500 or >2000 mg/day) and increased risk of prostate cancer, compared to men consuming lower amounts of calcium (500-1000 mg/day).
For cardiovascular events, a cohort study and a nested case-control study found associations between lower serum 25(OH)D concentrations (less than either about 50 or 75 nmol/L) and increased risk of total cardiovascular events; however, an RCT found no effect of supplementation, and studies of specific cardiovascular events were too sparse to reach conclusions. Taken together, six cohort studies of calcium intake suggest that in populations at relatively increased risk of stroke and with relatively low dietary calcium intake (i.e., in East Asia), calcium intakes under about 700 mg/day are associated with higher risk of stroke. This association, however, was not replicated in Europe or the US, and one Finnish study found a possible association of increased risk of stroke in men with calcium intakes above 1000 mg.
Studies on the association between either serum 25(OH)D concentration or calcium intake and other forms of cancer (colorectum, pancreas, prostate, all-cause); incidence of hypertension or specific cardiovascular disease events; immunologic disorders; and pregnancy-related outcomes including preeclampsia were either few in number or reported inconsistent findings. Too few studies of combined vitamin D and calcium supplementation have been conducted to allow adequate conclusions about its possible effects on health. The WHI trial was commonly the only evidence available for a given outcome.
Strengths of This Report
The strengths of this report lie in the wide range of topics covered, critical appraisal, detailed documentation, transparent methods to assess the scientific literature, and an unbiased selection of studies. A team of evidence-based methodologists not previously directly involved in research related to vitamin D and calcium worked with nutrient experts to refine the key questions (initially defined by AHRQ with input from various sponsors), analytic framework, and review criteria for the systematic review. After defining the questions and eligibility criteria with input from content experts and the sponsoring agencies, the Tufts EPC reviewed the published evidence on the topic. The intent was to perform a thorough and unbiased systematic review of the literature base on available evidence as defined by prespecified criteria. Once the review process began, input from experts in the field was sought to clarify technical questions during the literature review process. These individuals did not participate in study selection or detailed data extraction from the included studies nor were any members serving on the IOM committee on vitamin D and calcium involved in the review of this document. A quality rating as detailed in Chapter 2 (Methods section) was assigned for each primary study and systematic review, and incorporated into the data summaries section of the report. On the basis of this work, a sound foundation has been created which will facilitate rapid and efficient future updates as needed.
Details concerning the process of question formulation, selection of health outcomes of interest, justification for study selection criteria, methods used for critical appraisals of studies and quality rating, and summary of results are described fully in the Methods chapter. This approach is critical to the establishment of a transparent and reproducible process. Furthermore, important variables that affect vitamin D status such as life stages, latitude of the study locale, background diet and skin pigmentation are documented in this review.
This evidence report was carried out under the AHRQ EPC program, which has a 12-year history of producing over 175 evidence reports and numerous technology assessments for various users, including many federal agencies. EPCs are staffed by experienced methodologists who continuously refine approaches to conducting systematic reviews and develop new methods on the basis of accumulated experience encompassing a wide range of topics. In addition, the Tufts EPC has conducted a number of nutrition-related evidence reports19-22,241 and has also conducted the mock exercise on the vitamin A panel.3 This report drew on these experiences, the expertise of the TEP, and the support of federal agencies.
DRI and the Literature on Vitamin D and Calcium
It should be emphasized that none of the studies reviewed were designed to address issues specifically relevant for establishing DRI values (i.e., to ascertain the optimal dose in a particular life stage to promote growth and tissue maintenance, and prevent chronic disease throughout the lifecycle). In general, the studies did not enroll subjects with ages that could be easily mapped to specific life stages as defined within the DRI framework (with the exception of postmenopausal women and pregnant or lactating women) and did not evaluate health outcomes on the basis of what doses will lower risk for a particular disease in prespecified life stages. Therefore, data will need to be extrapolated from these studies to craft a set of DRI values for vitamin D and calcium. This extrapolation may prove challenging.
Certain issues concerning the studies of vitamin D must be noted. As mentioned previously, it is difficult to evaluate nutritional adequacy because there are no methods currently available to quantify the contribution of endogenous vitamin D synthesis resulting from sun exposure on an individual or group level. In addition, it is generally accepted that intake estimated by dietary assessment is not a valid indicator of vitamin D status, because nutrient databases are incomplete for the vitamin D content of both foods and dietary supplements, and because the rapidly changing landscape of vitamin D food fortification has not yet been captured in either the instruments used to assess intake or the databases used to analyze the data. For example, vitamin D values are available for only about 600 out of 1400 foods in the USDA National Nutrient Database for Standard Reference (http://www.ars.usda.gov/nutrientdata) and notably missing are foods recently fortified with vitamin D.25 Given the recent trend towards increased nutrient fortification of the North American food supply, the lag in updating food composition tables, and the inability to distinguish between fortified and unfortified foods when using most dietary assessment tools, it is difficult to accurately estimate dietary intakes of vitamin D, especially for a given year. Shifts in methodological approaches to measuring serum 25(OH)D concentrations and the heterogeneous nature of the available data with respect to study location (i.e., latitude) and time of year (i.e., season) hamper our ability to succinctly summarize dose-response relationships. We did not perform a dose-response meta-analysis of the relationship between serum 25(OH)D concentrations and health outcomes because limited and inconsistent data would result in a meta-analysis that is difficult to interpret and results that may be misleading.
Furthermore, many of the large cohorts analyzed for associations of vitamin D with health outcomes enrolled mostly white participants aged approximately 40 to 70 years, and much of the data on intake dose-response and serum 25(OH)D concentration were derived from studies designed to measure bone health in postmenopausal women. These factors limit the applicability of the findings to other life stages and other racial groups.
Unlike serum 25(OH)D concentrations for vitamin D, there is no equivalent serum biomarker to indicate calcium status. Relying on dietary assessment to gauge calcium intake is limited by the confounding effect of vitamin D status on the efficiency of calcium absorption and by uncertainties in the calcium content of many foods. These uncertainties stem from the recent trend in nutrient fortification of food, the limited ability of current dietary assessment tools to distinguish between fortified and unfortified foods, and the lag in updating nutrient databases with current nutrient information.
Limitations of Our Methodological Approach
The number of potentially relevant vitamin D studies (English-language articles on humans, excluding reviews) indexed in MEDLINE is very large (~15,000) and the number of calcium studies is even larger (~110,000). Without unlimited time and resources, the systematic review conducted in this report had to focus on selected key questions predefined by our federal sponsors with input from the IOM, and capitalize on existing systematic reviews. Using previous systematic reviews risks propagating deficiencies and errors242 introduced in those reviews (e.g., errors in data abstraction, flawed assumptions in quantitative synthesis). Although we have assessed the quality of these systematic reviews using the AMSTAR26 checklist, we cannot reliably know the validity of the reported summary data without knowing the details of the primary studies. It should also be stressed that a well-performed systematic review does not necessarily imply that the body of evidence for a particular outcome of interest is of high quality. While some systematic reviews assessed the quality of the individual studies, the methods used varied. Any systematic review is limited by the quality of the primary studies included in the review. Unless the methods used to assess the quality of the primary studies are transparent and the details made available for examination, it would be difficult to reliably determine the validity of the conclusions. Also, relying on existing systematic reviews alone could have potentially precluded us from identifying all relevant studies because those systematic reviews might have addressed somewhat different questions and had a different scope from this review. For example, for the growth outcome in children, we principally relied on the findings from a meta-analysis of RCTs of calcium originally designed to evaluate bone density outcomes. If there were RCTs of calcium intake specifically designed to measure growth outcomes such as weight and height gain, but not bone density, then those studies would not have been identified. In addition, as per the task order from AHRQ, we relied on the Ottawa report for bone health outcomes and we did not examine specific studies included in that report. As a consequence, if those studies had reported outcomes other than bone health that were of interest, those studies would not have been included in this review.
As there is no consensus on how to assess the quality of nutrition observational studies, we created a quality checklist based on a newly published reporting standard for observational studies32 and nutrition reporting items that we believe should be considered in quality assessment. This checklist, however, has not been calibrated and the intra- and interrater variability have not been assessed. We should also remind readers that impeccable study reporting does not equate to study validity. However, transparent, comprehensive, and accurate reporting does help in evaluating a study’s validity.
Also, studies on vitamin D and calcium were not specifically targeted at the life stages specified for the determination of DRI (except for children, pregnant women, and postmenopausal women). We, therefore, were unable to structure our report strictly according to prespecified life stages. When a study enrolled populations that spanned multiple life stages, we provided our best estimates as to which life stage(s) the study’s findings would be of most relevance.
Comments on the Observational Studies
All the included observational studies were designed to generate hypotheses of potential associations of multiple factors with vitamin D or calcium. Therefore, a finding of a significant association in these studies, after exploratory analyses, should not be considered equivalent to the result of studies that were designed to confirm this relationship. Many of the nested case-control studies excluded a substantial portion of participants (some as high as 60 to 70 percent) in the original cohorts because blood samples or completed dietary questionnaires were not available. How this selection bias would affect the reported association is unclear. In addition, several of the studies might have suffered from outcome misclassification; for example, when cancer cases were identified from registries without histopathology verification. The effect of outcome misclassification is unpredictable. Furthermore, many of the studies did not report a power calculation. Even though many of the studies included cohorts with relatively large numbers of subjects (tens of thousands), it is plausible that the included studies were underpowered to detect the true effect sizes. If that were the case, the significant effect reported may, in fact, be spurious. Furthermore, many of the reported effect sizes were small to moderate (with ORs ranging from 1.03 to 2.0). When the effect size is small, the possibility of residual confounding by unmeasured variables must be considered.
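The concern about underpowered studies can be made concrete with a standard back-of-the-envelope calculation, shown here as a minimal Python sketch. It computes the probability that a statistically significant association reflects a true one, given a study's power, the significance threshold, and the prior probability that the tested association is real. The function name and all input values are illustrative assumptions and are not quantities estimated in this report.

```python
def positive_predictive_value(power: float, alpha: float, prior: float) -> float:
    """Probability that a statistically significant finding is a true association.

    power: chance the study detects a real association (1 - type II error rate)
    alpha: significance threshold (type I error rate)
    prior: assumed fraction of tested associations that are truly non-null
    """
    for name, value in (("power", power), ("alpha", alpha), ("prior", prior)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    true_positives = power * prior            # real associations that reach significance
    false_positives = alpha * (1.0 - prior)   # null associations that reach significance
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, chosen only to illustrate the direction of the effect.
well_powered = positive_predictive_value(power=0.80, alpha=0.05, prior=0.10)
underpowered = positive_predictive_value(power=0.20, alpha=0.05, prior=0.10)
print(f"power 0.80: {well_powered:.2f}")
print(f"power 0.20: {underpowered:.2f}")
```

Under these assumed inputs, lowering power while holding the other inputs fixed lowers the share of significant findings that are true. This is the sense in which a significant result from an underpowered exploratory analysis may be spurious.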
Sources of Heterogeneity and Potential Biases
As mentioned previously, most of the findings reported in this review were inconsistent for each of the outcomes of interest. Many studies showed substantial heterogeneity. Some studies adjusted the serum 25(OH)D concentration by season of serum collection; some did not. While the majority of the studies used some form of RIA to measure the serum 25(OH)D concentration, a minority used a competitive protein-binding assay. Some studies reported that a substantial proportion of the frozen serum samples had been accidentally thawed, which limited the analyses that could be performed. It is unclear how this would alter the overall results. Many studies suffered from potentially inadequate outcome ascertainment (e.g., reliance on self-reported calcium intake and hypertension diagnosis). Time between measurement of serum 25(OH)D concentration and the diagnosis of interest varied. For prostate and colorectal cancer, it ranged from 1 to more than 16 years. Factors potentially relevant to the outcomes of interest, such as family history (in colorectal cancer), were not consistently reported and accounted for in the studies. Also, the blinding of case assessors to the risk factor of interest (e.g., serum 25(OH)D concentrations), as well as the blinding to outcomes of the investigators who measured the risk factor, was rarely reported.
For studies on calcium supplementation, intake compliance, information on the bioavailability of the calcium source, the role of background sun exposure, and associated vitamin D effects were not consistently available across all studies. Thus, it is difficult to interpret those findings on an absolute level and among studies.
Finally, all systematic reviews, including this report, may suffer from potential publication and reporting biases since currently there is no reliable way to detect and correct these biases. However, there is an underlying suspicion of publication bias against studies having either null or negative outcomes and reporting bias toward “significant” outcomes in the literature.243,244 Thus, it is important to consider these biases when reviewing the overall findings of any systematic review.
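To illustrate why publication bias matters for any pooled summary, the following minimal Python sketch simulates a set of studies and compares the average estimate across all of them with the average across only those that would be "published" under a deliberately crude rule. The selection rule (publish only significantly positive results), the function name, and every numeric input are hypothetical assumptions made for illustration; they do not describe the literature reviewed here.

```python
import random
import statistics


def simulate_published_mean(true_effect, std_error, n_studies, seed=0):
    """Return (mean of all estimates, mean of 'published' estimates).

    Assumption: a study is published only if its estimate is significantly
    positive (z > 1.96). Real publication processes are far less clear-cut.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    published = [e for e in estimates if e / std_error > 1.96]
    if not published:
        raise ValueError("no simulated study met the publication rule")
    return statistics.mean(estimates), statistics.mean(published)


# Hypothetical true effect and standard error, in arbitrary units.
all_mean, published_mean = simulate_published_mean(
    true_effect=0.10, std_error=0.10, n_studies=5000
)
print(f"all studies: {all_mean:.3f}; published only: {published_mean:.3f}")
```

Because every published estimate must exceed 1.96 standard errors under this rule, the published-only average necessarily overstates the assumed true effect, whereas the average over all simulated studies stays close to it.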
Vitamin D Intake and Response in Serum 25(OH)D Concentration
The findings of this review on the association between vitamin D intake dose and change in serum 25(OH)D concentration were primarily derived from RCTs reviewed in a systematic review of bone health in postmenopausal women. This limits the applicability of the findings to other life stages. Though we did not find any reason to consider these trials to be biased, they are nonetheless an arbitrary sample of all studies that have reported the association between vitamin D intake dose and change in serum 25(OH)D concentration. We did not perform a quantitative synthesis (e.g., meta-regression) to examine the relationship between vitamin D intake dose and serum 25(OH)D concentration due to the heterogeneity across studies. Studies had varied compliance rates for vitamin D intake; limited or no adjustment for skin pigmentation, calcium intake, or background sun exposure; and different vitamin D assay methodologies and measurement (both intra- and interassay) variability. All these factors increase the heterogeneity and limit the usefulness of an overall summary estimate for an intake dose response in serum 25(OH)D concentration. Nonetheless, overall, there appeared to be a trend for higher vitamin D supplementation dose resulting in higher net change in serum 25(OH)D concentration.
Considerations for Future DRI Committees
Formulating the appropriate key questions is the most important aspect of conducting a systematic review to ensure the final product will meet the intended purpose. Ideally, this should be an iterative process involving the sponsors, EPC, TEP and targeted end-users. The questions should be reviewed and potentially refined once the “state” of the literature has been systematically appraised, with the understanding that any modifications to the key questions after the review process has started will likely extend the literature review and synthesis processes. In addition, developing relevant study selection criteria for the systematic review is critical to finding pertinent data to answer the key questions; the TEP should be engaged early in this process. Crafting a framework of the entire review process depicting the explicit roles of the sponsors, TEP, and targeted end-users could also be helpful for future reviews.
While the process of conducting the actual systematic review of a nutrient or group of nutrients on an agreed-upon set of key questions concerning specific health outcomes is carefully laid out and could be replicated without undue difficulty, the process of selecting which health outcomes would be important for inclusion in a systematic review could not be easily replicated. The health outcomes selected were decided after much deliberation by the TEP with input from the various partners. As the nature of the deliberation hinged largely on the expertise reflected by the particular composition of the TEP, it is conceivable that a different TEP composed of members with different expertise may have recommended a different set of health outcomes for inclusion. To minimize this variability, an a priori designed set of instructions to weigh each outcome (taking into account such factors as population attributable risk, morbidity, and others) for possible inclusion would be valuable.