April 29 - 30, 2004 Charge/Questions to the Panel

FIFRA SCIENTIFIC ADVISORY PANEL

Revised Charge Questions

SESSION TITLE: A MODEL COMPARISON: DIETARY AND AGGREGATE EXPOSURE IN CALENDEX, CARES, AND LIFELINE

FIFRA SAP MEETING DATE: April 29 - 30, 2004, Holiday Inn Hotel, Arlington, Virginia.

SUGGESTED QUESTIONS FOR THE SAP:

1. General Approach of Approximating Models:

A. While the three probabilistic risk assessment models described in this SAP presentation each project pesticide exposure for the 'US population', they differ in their basic design in a number of ways. EPA has identified and investigated four model design features associated with these models, as follows:

• 'Reference Population':

• DEEM-Calendex is based on the CSFII survey design

• CARES is based on the US Census PUMS

• Lifeline is based on the NCHS Natality database

• 'Binning food diaries' to generate longitudinal consumption profiles:

• DEEM-Calendex draws from the individuals' two-day diaries

• CARES uses the Gower dissimilarity index

• Lifeline 'bins' food diaries based on age and season

• 'Model weights' to project simulated exposure days up to the modeled (US) population

• DEEM-Calendex uses the CSFII survey weights.

• CARES uses weights developed from its stratified sampling design, and

• Lifeline weights each person equally.

• 'Body weight'

• DEEM-Calendex assigns food consumption values to each individual on the basis of the grams food/kg body weight as reported by the CSFII respondents.

• CARES also assigns food consumption values to each individual on the basis of the grams food/kg body weight as reported by the CSFII respondents. However, since the CARES body weights differ from the CSFII body weights, CARES adjusts the amount of food consumed to reflect the CARES body weight.

• Lifeline uses a reported consumption value for each individual (on the basis of the grams of food as reported in CSFII) and estimates the individual's body weight based on physiometric growth models developed for various demographic groups (based on gender, race and ethnicity) using NHANES data.
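The three body-weight treatments listed above can be contrasted in a minimal sketch. The function names and numbers are hypothetical, and the proportional rescaling shown for CARES is an assumption about what "adjusts the amount of food consumed" means; the charge document does not state the formula.

```python
def deem_intake_per_kg(food_g, csfii_bw_kg):
    """Grams of food per kg body weight, both taken as reported in CSFII."""
    return food_g / csfii_bw_kg


def cares_adjusted_food_g(food_g, csfii_bw_kg, cares_bw_kg):
    """Assumed CARES-style adjustment: keep the CSFII g/kg ratio fixed and
    rescale the grams consumed to the CARES body weight."""
    return food_g / csfii_bw_kg * cares_bw_kg


def lifeline_intake_per_kg(food_g, modeled_bw_kg):
    """Reported CSFII grams divided by a model-estimated body weight
    (the growth model itself is not sketched here)."""
    return food_g / modeled_bw_kg


# Hypothetical diary: 300 g of food reported by a 60 kg CSFII respondent.
per_kg_deem = deem_intake_per_kg(300, 60)
grams_cares = cares_adjusted_food_g(300, 60, 75)   # CARES person weighs 75 kg
per_kg_lifeline = lifeline_intake_per_kg(300, 50)  # modeled person weighs 50 kg
```

Under these assumptions the first two treatments preserve the same g/kg intake, while the third changes it whenever the modeled body weight differs from the diary respondent's.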

Question 1.1 The SAP is asked to please comment on whether the above-cited model design features reflect those most likely to result in differences in dietary [food and water] exposure estimates based on identical data sets. If not, what other model design features are likely to cause different dietary exposure estimates?

B. In an attempt to further elucidate differences between predicted exposures among the three models (DEEM-Calendex, CARES and LifeLine), OPP developed SAS approximation models. These SAS approximation models permit the isolation of factors related to the Reference Population, Binning Procedures, Sampling Weights, and individual Body Weights, which cannot be isolated by running the individual models. Section IV of the background document, provided to the SAP, describes the development of these SAS approximation models and some analyses performed by the Agency using them to compare and contrast model design features of DEEM-Calendex, CARES, and LifeLine. Based on these analyses, the Agency concluded that the SAS approximation models track actual model results very closely for single Raw Agricultural Commodity (RAC) analyses, and reasonably well for the multi-RAC analyses.

Question 1.2 The SAP is asked to please comment on the approach taken by the Agency to develop and use SAS approximation models (see Section IV of the background document) to attribute differences in model predictions to differences in model designs. Please suggest possible improvements or refinements to these SAS approximation models, as well as alternative methods for comparing model predictions.

2. Reference Population & Model Weights:

The DEEM-Calendex program uses the CSFII survey respondents as its Reference Population; as such, the DEEM-Calendex model estimates use the CSFII-specific sample (or model) weights to estimate exposures. Each simulated day is weighted so that the exposure day represents a group of similar individuals from the U.S. population. CARES and Lifeline use alternative data sources (i.e., U.S. Census PUMS and NCHS Natality, respectively) to generate their Reference Populations. The CARES model developed its Reference Population by taking a stratified random sample of 100,000 persons from the U.S. Census PUMS. The stratified sampling design enabled CARES to over-represent sub-populations of interest (e.g., 20,003 infants) in its Reference Population; these sub-populations are subsequently downweighted to permit projection to the U.S. population. The Lifeline model uses the Natality data to generate its Reference Population. Lifeline provides the option of using CSFII survey weights to affect the probability of selecting diaries from each of the dietary bins. If this option is not selected, Lifeline will weight each modeled individual equally, since these modeled lives are drawn randomly from the Natality statistics.
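The effect of model weights on an upper-percentile estimate can be illustrated with a minimal sketch of a weighted percentile. The function name, the exposures, and the weights are hypothetical, and the cumulative-weight definition used here is one common convention rather than the method documented for any of the three models.

```python
def weighted_percentile(exposures, weights, pct):
    """Return the smallest exposure whose cumulative weight share
    reaches pct (0-100) of the total weight."""
    pairs = sorted(zip(exposures, weights))
    total = sum(w for _, w in pairs)
    target = total * pct / 100.0
    running = 0.0
    for exposure, weight in pairs:
        running += weight
        if running >= target:
            return exposure
    return pairs[-1][0]


# Hypothetical simulated exposure days; the last comes from an
# over-sampled stratum.
exposures = [1.0, 2.0, 3.0, 10.0]

# Every simulated day weighted equally.
p95_equal = weighted_percentile(exposures, [1, 1, 1, 1], 95)

# The over-sampled day downweighted to its assumed population share.
p95_weighted = weighted_percentile(exposures, [1, 1, 1, 0.1], 95)
```

With these made-up inputs, the equally weighted 95th percentile is driven by the over-sampled day, while the downweighted version is not, which is the kind of difference the question on model weights is probing.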

Question 2.1 The SAP is asked to please comment on the different approaches used by the three models in developing their Reference Populations and model weights.

3. Binning Design & Frequencies of Using CSFII Diaries:

These models differ in the expected (or actual) frequencies with which each CSFII diary is used in the probabilistic risk assessment. DEEM-Calendex uses only the individuals that provided two days of food diaries in its reference population, and sets aside approximately 1,000 one-day food diaries in estimating dietary exposure. CARES employs a Gower dissimilarity index in its algorithm to generate longitudinal consumption profiles for its Reference Population. As a result, some CSFII diaries are used much more often than others in simulating exposure (as much as 4,000 times for certain diaries versus once for others). Approximately 1,000 CSFII diaries are not included in the CARES Food Match table. The Lifeline model uses a very general bin based on age and season, such that all food diaries within a particular bin have the same expected frequency of being used in its exposure assessment. In order to evaluate the effect of these differing frequencies and modeling weights, EPA approximated all three models using the Lifeline recipes (i.e., keeping recipes constant).
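For readers unfamiliar with the Gower dissimilarity index, the following minimal sketch shows its general form for mixed numeric and categorical variables: range-scaled absolute differences for numeric variables, 0/1 mismatches for categorical ones, averaged with equal weight. The matching variables (age, sex, region), the age range, and the records are hypothetical; the variables and weights CARES actually uses are not given in this document.

```python
def gower_dissimilarity(a, b, numeric_ranges, categorical_keys):
    """Equal-weight Gower dissimilarity between two records (dicts).

    numeric_ranges maps each numeric variable to its range in the data;
    categorical_keys lists variables compared by exact match."""
    parts = []
    for key, value_range in numeric_ranges.items():
        parts.append(abs(a[key] - b[key]) / value_range if value_range else 0.0)
    for key in categorical_keys:
        parts.append(0.0 if a[key] == b[key] else 1.0)
    return sum(parts) / len(parts)


# Hypothetical reference-population member and candidate food diaries.
person = {"age": 30, "sex": "F", "region": "South"}
diaries = {
    "diary_1": {"age": 34, "sex": "F", "region": "South"},
    "diary_2": {"age": 30, "sex": "M", "region": "South"},
}
ranges = {"age": 80}          # assumed age range in the data
categorical = ["sex", "region"]

scores = {
    name: gower_dissimilarity(person, diary, ranges, categorical)
    for name, diary in diaries.items()
}
best_match = min(scores, key=scores.get)
```

Because the closest diaries are chosen repeatedly for similar individuals, a matching rule of this kind can reuse some diaries far more often than others, consistent with the usage frequencies described above.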

Question 3.1 The SAP is asked to please comment on the frequency with which CSFII diaries are used by the various models. Are there any potential biases that may arise in the respective dietary exposure estimates for these models as a result of how they use CSFII records? Considering Lifeline's current dietary bin design (age, season), please comment on the use of the CSFII survey weight option. Is either Lifeline option (CSFII-weighted or not) generally more appropriate than the other, or are there circumstances in which one might be preferable to the other?

4. Commodity Exposure Contribution Analyses:

An important aspect of any dietary risk assessment is the ability to identify significant contributors at the upper percentiles of exposure. The CARES and DEEM models both include an output report option known as the Critical Exposure Contribution (CEC) analysis. A comparable report option is expected to be developed for the LifeLine model in the near future. These CEC reports quantify the contribution of specific food commodities (RAC-FF) to the total exposure at the upper percentiles (e.g., top 0.2%) of the exposure distribution. An alternate or complementary approach (frequency-exceeded), also used by various model developers, tabulates the frequency with which a particular commodity (RAC-FF) causes exposure to exceed some level of concern. As was the case with predictive exposure estimates, model design can affect the outcome of commodity exposure contribution analyses. Section IV.G of the background document describes the CEC and 'frequency-exceeded' approaches for identifying significant contributors at the upper end of the exposure distribution. Tables 13 and 14 show CEC reports and 'frequency of occurrence' data for DEEM-FCID and CARES analyses for 3 - 5 year olds and 20 - 49 year olds, respectively. Tables 15 and 16 show SAS approximations for the model CEC reports and 'number of occurrences > aPAD' for these same age groups. Although there is certainly a degree of similarity between model results and between the model results and the SAS approximation results, differences do occur.
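The two approaches can be compared in a minimal, unweighted sketch. The function names, commodities, and exposure values are hypothetical. In particular, attributing each exceedance to the single largest-contributing commodity is an assumption about how "causes exposure to exceed" is operationalized, and the sketch does not reproduce the report logic of CARES or DEEM.

```python
from collections import Counter


def cec_shares(days, top_fraction):
    """Share of total exposure contributed by each commodity across the
    highest-exposure fraction of simulated person-days."""
    ranked = sorted(days, key=lambda day: sum(day.values()), reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    totals = Counter()
    for day in ranked[:n_top]:
        totals.update(day)  # adds each commodity's exposure to its total
    grand_total = sum(totals.values())
    return {rac: amount / grand_total for rac, amount in totals.items()}


def frequency_exceeded(days, level_of_concern):
    """Count, per commodity, the person-days on which total exposure exceeds
    the level of concern and that commodity is the largest contributor."""
    counts = Counter()
    for day in days:
        if sum(day.values()) > level_of_concern:
            counts[max(day, key=day.get)] += 1
    return dict(counts)


# Hypothetical per-commodity exposures on four simulated person-days.
days = [
    {"apple": 0.75, "grape": 0.25},
    {"apple": 0.25, "grape": 1.75},
    {"apple": 0.25, "grape": 0.125},
    {"apple": 1.0, "grape": 0.5},
]

shares = cec_shares(days, top_fraction=0.5)
exceed_counts = frequency_exceeded(days, level_of_concern=0.9)
```

In this made-up data the CEC-style shares point to one commodity as the larger contributor at the top of the distribution, while the exceedance counts point to the other, illustrating how the two summaries can rank commodities differently.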

Question 4.1 The SAP is asked to please comment on the relative merits of the two approaches described above (CEC and frequency-exceeded) for identifying significant contributors (RAC-FF) to exposure at the upper percentiles of exposure. Are there other methods or techniques which the Panel might recommend for accomplishing this important part of the dietary exposure assessment?