Components of Uncertainty

Note: EPA no longer updates this information, but it may be useful as a reference or resource.

Which Components of Uncertainty did the National-Scale Assessment Include?

The approach taken in the national-scale assessment uncertainty analysis was to rely ultimately on expert judgments of the uncertainty associated with each step of the calculations of ambient air concentration, exposure, and risk (these steps were reviewed in the Methods section on variability). These judgments are needed because some of the aspects of uncertainty discussed below are qualitative in nature. The judgments, however, were informed by a series of quantitative uncertainty analyses performed wherever quantification was possible and justified. In addition, broad rules were defined for combining judgments. The resulting judgments of confidence, therefore, are not purely subjective: they are guided by objective, quantitative, statistical analyses; based on consistent application of rules for combining subjective judgments; and arrived at by experts with extensive experience making such judgments.

The uncertainties have been divided into three primary areas based on the three steps leading from emissions to risk. There is uncertainty in ambient air concentration, which is due to uncertainty in both the emissions estimates and the ASPEN model. There is uncertainty in exposure, which is due to uncertainty in the activity patterns, the locations of individuals within a census tract, and the microenvironment concentrations as reflected in the HAPEM4 model. And there is uncertainty in risk, which is due to uncertainty in the shape of the relationship between exposure and risk, the Unit Risk Estimate, and the Reference Concentration. These three sources of uncertainty are discussed below.

Ambient Air Concentration

Considering first the predictions of ambient air concentration, the specific sources of uncertainty considered in the uncertainty analysis were:

Uncertainties due to the emissions parameters. Emission rates and locations of sources were taken from the NTI database, which is a composite of estimates produced by State and local regulatory agencies, industry, and the EPA. The quality and uncertainty of specific emission rates and locations found in this database (e.g., industrial emissions from a specific census tract in North Carolina) have not been fully assessed. Some of the parameter values may be out of date, errors may have been introduced in transcribing raw data to a computer file, and so on. This database is being updated continuously. In some cases, the location was unknown and the source was placed at the centroid of a census tract. Overall, slightly less than 10% of the point source sites were assigned a default location.

There are also uncertainties inherent in the emission models used to develop inventory estimates. For example, county-level nonroad equipment toxic air pollutant emissions are estimated by applying toxic fractions of total hydrocarbons (HC) to county-level HC estimates for gaseous air toxics, and toxic fractions of particulate matter (PM) to county-level PM estimates for metals. The toxic fractions are derived from speciation data based on limited testing of a few equipment types. The county-level total organic gas (TOG) and PM estimates used come from EPA's draft NONROAD model. In NONROAD, there are uncertainties associated with emission factors, activity, and spatial allocation surrogates. National-level emissions in NONROAD are allocated to the county level using surrogates, such as construction costs (to allocate emissions of construction equipment) and employees in manufacturing (to allocate emissions of industrial equipment). Use of more specific local data on equipment populations and usage will result in more accurate inventory estimates. EPA strongly recommends that states undertake data collection to provide local data, as is routinely done for highway motor vehicle activity and population.

For both mobile and area sources, the emissions rates usually were allocated from the county level to a specific census tract through a surrogate such as population or land use. This introduces an additional uncertainty, since the data on the surrogates carry their own uncertainties.
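The surrogate-based allocation described above can be sketched as a simple proportional split. This is a minimal illustration, not the procedure used in the assessment: the function name, tract labels, emission total, and population figures are all hypothetical, and population is assumed to be the surrogate.

```python
def allocate_to_tracts(county_emissions, surrogate_by_tract):
    """Split a county emission total across tracts in proportion to a surrogate.

    Hypothetical helper: the surrogate could be population, land use, etc.
    """
    total = sum(surrogate_by_tract.values())
    if total <= 0:
        raise ValueError("surrogate total must be positive")
    return {
        tract: county_emissions * value / total
        for tract, value in surrogate_by_tract.items()
    }


# Made-up county: 120 tons/year spread over three tracts by population.
population = {"tract_A": 2000, "tract_B": 3000, "tract_C": 5000}
allocated = allocate_to_tracts(120.0, population)

# The same county total with a slightly different population count for one
# tract: every tract's allocated share shifts, which is how error in the
# surrogate data carries into the tract-level emission estimates.
population_alt = {"tract_A": 2500, "tract_B": 3000, "tract_C": 5000}
allocated_alt = allocate_to_tracts(120.0, population_alt)
```

Because the shares always sum to the county total, an error in one tract's surrogate value changes the estimate for every other tract in the county as well.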

Uncertainties due to stack parameters. The ASPEN model requires information on stack height, gas temperature, gas velocity, etc., to estimate dispersion of an air toxics compound in the atmosphere. Again, the NTI database supplied these values, and in some cases default values were used, either because the necessary data were not available or because they were judged unreliable (e.g., physically unrealistic values). If data on stack parameters were missing, they were supplied by making reasoned guesses from similar facilities. Of the 97,365 unique vertical stacks, 63,292 (about 65%) contained at least one stack parameter that was a default value.

Uncertainties due to particle size and reactivity parameters. The ASPEN model requires information on the physical properties of the pollutant, that is, the fraction in gaseous, fine particulate, or coarse particulate form (which affects the extent to which the pollutant is removed from the air by settling to the earth), and on chemical reactivity (some air toxics compounds are subject to chemical reactions in the atmosphere). These parameters were not available in the NTI database, and so representative values were assigned. In addition, representative values of the deposition velocities for particles (the speed at which they settle to the ground) were used. Any one source, however, may actually have different values than the ones assumed.

Uncertainties due to chemical speciation parameters. The health effects of an air toxics compound depend on the chemical form of the compound when it is inhaled. The NTI database did not include information on speciation for many sources, but only on the total rate of emission of a compound in all of its forms. EPA staff, therefore, made assumptions about chemical speciation based on representative values at such sources. Any one source, however, may actually have different values than the ones assumed.

Uncertainties due to terrain parameters. The dispersion, or movement, of air toxics compounds in the atmosphere depends on the degree to which the terrain surrounding a source is flat or hilly. The ASPEN model, however, does not take into account variations in local terrain. This can lead to uncertainties in predictions of ambient air concentration, particularly in areas with hills or mountains.

Uncertainties due to background concentration parameters. The estimates of ambient air concentration use a background value that is added to the modeled concentration in all census tracts to reflect sources other than the ones modeled in the national-scale assessment. These contributions might, for example, come from long-range transport of compounds from other counties and states. For 13 of the air toxics compounds modeled, the same nationally averaged value was used for the background concentration. In reality, this value probably varies between census tracts, which introduces uncertainty in the estimate of ambient air concentration in any one census tract. For diesel PM, instead of using monitored air quality data to estimate background concentrations, EPA used a modeling-based approach. For more details, see background concentrations.

Uncertainties due to meteorological parameters. The ASPEN model requires parameters on the direction and speed of airflow, and on the stability of the atmosphere (which affects how high gases rise once released). The national-scale assessment used meteorological data from the nearest available monitoring station, collected in 1996. This introduces one source of uncertainty, since the data usually were not for the precise location of a source. In addition, other sources of these data were available, and so the uncertainty due to selection of a database for a specific source was included in the uncertainty analysis.

Uncertainty due to the ASPEN dispersion model equations. The ASPEN model uses a Gaussian dispersion equation to calculate ambient air concentration, taken from Version 2 of the Industrial Source Complex Long-Term (ISCLT2) computer model. The uncertainty in the ISCLT2 model has been studied extensively, and this uncertainty was used in the uncertainty analysis for the national-scale assessment.
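For readers unfamiliar with Gaussian dispersion, the sketch below evaluates the generic textbook Gaussian plume equation for a continuous point source with ground reflection. It is an assumption-laden illustration only: it is not the sector-averaged long-term formulation used in ISCLT2 or ASPEN, and the function name and all input values are hypothetical.

```python
import math


def gaussian_plume(q, u, sigma_y, sigma_z, y, z, h):
    """Textbook steady-state Gaussian plume concentration (g/m^3).

    q        emission rate (g/s)
    u        wind speed at release height (m/s)
    sigma_y  crosswind dispersion at the receptor's downwind distance (m)
    sigma_z  vertical dispersion at the receptor's downwind distance (m)
    y        crosswind offset of the receptor from the plume centerline (m)
    z        receptor height above ground (m)
    h        effective release height (m)
    """
    lateral = math.exp(-(y ** 2) / (2 * sigma_y ** 2))
    # The second exponential is the "image source" term that models
    # reflection of the plume at the ground.
    vertical = (
        math.exp(-((z - h) ** 2) / (2 * sigma_z ** 2))
        + math.exp(-((z + h) ** 2) / (2 * sigma_z ** 2))
    )
    return q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


# Hypothetical inputs: a 10 g/s source, 4 m/s wind, 30 m effective height.
centerline = gaussian_plume(q=10.0, u=4.0, sigma_y=80.0, sigma_z=40.0,
                            y=0.0, z=0.0, h=30.0)
off_axis = gaussian_plume(q=10.0, u=4.0, sigma_y=80.0, sigma_z=40.0,
                          y=100.0, z=0.0, h=30.0)
```

Each input here (wind speed, release height, dispersion parameters) corresponds to one of the parameter uncertainties listed above, while the functional form itself is the model uncertainty discussed in this paragraph.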

Uncertainty due to the ASPEN chemical transformation equations. For some of the air toxics compounds, the chemical reactions in which they participate in the atmosphere are complex and non-linear. The ASPEN model, however, can treat only simpler, linear reactions. For predicting the secondary formation of formaldehyde, the results of this model were compared with the results of a more detailed model (OZIPR) to estimate the uncertainty introduced by the simplified treatment of secondary formation in the national-scale assessment.

Exposure

Considering next the predictions of exposure, the specific sources of uncertainty considered in the uncertainty analysis due to the relationship between ambient air concentration and exposure (in addition to those considered for ambient air concentration) are:

Uncertainty due to microenvironment factor parameters. The HAPEM4 exposure model calculates the concentration in specific microenvironments (such as in a home or in a car) based on the ambient air concentration predicted by ASPEN. Parameter values needed in these relationships are not well developed for many of the air toxics compounds. As a result, representative values were used in many cases based on measured values of similar compounds in similar situations. This introduces uncertainty into the analysis of exposures for compounds in which these representative values were used. In addition, the same values were applied to all census tracts. Any one census tract, however, may actually have different values than the representative ones assumed.
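The role of microenvironment factors can be illustrated with a deliberately simplified time-weighted calculation. This is a sketch under stated assumptions, not HAPEM4's actual formulation: it assumes each microenvironment concentration is a single multiplicative factor times the ambient concentration, and the function name, microenvironment labels, time fractions, and factor values are all hypothetical.

```python
def exposure_concentration(ambient, time_fractions, me_factors):
    """Time-weighted exposure concentration across microenvironments.

    ambient         ambient air concentration (any concentration unit)
    time_fractions  fraction of time spent in each microenvironment
    me_factors      assumed ratio of microenvironment to ambient concentration
    """
    if abs(sum(time_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(
        time_fractions[name] * me_factors[name] * ambient
        for name in time_fractions
    )


# Made-up activity pattern and factors for one cohort.
time_fractions = {"home": 0.70, "vehicle": 0.05, "outdoors": 0.25}
me_factors = {"home": 0.6, "vehicle": 1.2, "outdoors": 1.0}
exposure = exposure_concentration(2.0, time_fractions, me_factors)
```

Because the same factors are applied in every census tract, any tract whose true factors differ from the representative ones inherits that difference directly in its exposure estimate.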

Uncertainty due to population cohort parameters. Each receptor population or cohort (10 cohorts in all) was assigned a representative activity pattern based on the EPA’s CHAD database. The same activity pattern was assigned to that cohort in all census tracts. Parameter uncertainty is introduced due both to limitations in the CHAD database and to the assignment of a national average to all census tracts.

Uncertainty due to the activity pattern sequence for an individual. Annual average exposure for a typical individual in a receptor population or cohort was calculated by selecting a single day from the CHAD database for that cohort. The same value was used for a given individual on all 365 days. This process was repeated 30 times, to produce 30 different estimates of annual exposure for individuals in a receptor population or cohort. This process does not reflect the fact that an individual's activity pattern varies throughout a year. In addition, there is uncertainty introduced by using a sample size of 30 (rather than a much larger number of samples). This introduces uncertainty as to whether the resulting typical activity pattern represents the average activity pattern for an actual group of individuals.
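The sampling scheme described above can be mimicked with a small simulation. This is only a sketch: the pool of daily exposure values is invented, the function name is hypothetical, and uniform random selection of a day is an assumption rather than a documented detail of the assessment.

```python
import random
import statistics


def simulate_annual_estimates(daily_pool, n_replicates=30, seed=0):
    """Draw one day per replicate and treat it as the whole year.

    Because the sampled day is reused for all 365 days, each replicate's
    annual average is simply that single day's value.
    """
    rng = random.Random(seed)
    return [rng.choice(daily_pool) for _ in range(n_replicates)]


# Hypothetical daily exposure values for one cohort (arbitrary units).
daily_pool = [0.8, 1.0, 1.1, 1.3, 0.9, 1.6, 0.7, 1.2]
estimates = simulate_annual_estimates(daily_pool)

mean_estimate = statistics.mean(estimates)
# Standard error of the mean across the 30 replicates; it shrinks only
# with the square root of the number of replicates.
standard_error = statistics.stdev(estimates) / len(estimates) ** 0.5
```

The spread among the 30 estimates reflects day-to-day differences between sampled days, not the within-year variation of any one person, which is the limitation the paragraph above points out.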

Risk

Considering finally the predictions of risk, the specific sources of uncertainty considered in the uncertainty analysis due to dose-response relationships (in addition to those considered for ambient air concentration and exposure) are:

Uncertainty in hazard identification. The risk estimates are based on the assumption that a compound is a carcinogen or produces a noncancer effect. This judgment was made based on the results of a hazard identification stage, in which the evidence that an air toxics compound produces cancer, a noncancer effect, or both was assessed. Since the evidence for either of these judgments is never perfect, there is always the possibility that a compound labeled a carcinogen, or deemed to produce noncancer effects, might in fact produce no such effect in humans. This introduces uncertainty into the calculation of risk, since there is a possibility that the risk is zero. This possibility decreases as the evidence for the original claim (i.e., that the compound produces the effect) increases.

Uncertainty in dose-response models for carcinogens. The cancer risk estimates are based on an assumption of linearity in the relationship between exposure and probability of cancer. In other words, the probability of cancer is assumed to be proportional to the exposure (equal to the exposure times a Unit Risk Estimate). This linear model is used routinely in regulatory risk assessment because it is believed to be conservative; i.e., if the model is incorrect, it is likely to lead to an overestimate of the risk rather than to an underestimate. Other scientifically valid, biologically based models are available, and these produce estimates of cancer risk different from those obtained from the linear model. Uncertainty in risk estimates is, therefore, introduced by the inability to completely justify use of one model over another (since there is at least some scientific support for each of many models). It is important to note here that this uncertainty is to some extent one-sided. In other words, it provides more confidence in the claim that the true risk is less than that predicted than in the claim that the risk is greater than that predicted.
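The linear model described above amounts to a single multiplication. The sketch below assumes the exposure and the Unit Risk Estimate are expressed in matching units; the function name and numeric values are hypothetical and do not correspond to any particular compound.

```python
def lifetime_cancer_risk(exposure, unit_risk_estimate):
    """Linear dose-response: risk is proportional to exposure.

    exposure            long-term exposure concentration (e.g., ug/m^3)
    unit_risk_estimate  risk per unit concentration (e.g., per ug/m^3)
    """
    return exposure * unit_risk_estimate


# Made-up values: doubling the exposure doubles the predicted risk.
risk_low = lifetime_cancer_risk(0.5, 2e-6)
risk_high = lifetime_cancer_risk(1.0, 2e-6)
```

Under this model there is no exposure threshold below which the predicted risk is zero, which is one reason the linear form is regarded as conservative.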

Uncertainty in Unit Risk Estimate parameters. The linear cancer dose-response model uses a Unit Risk Estimate (URE) specific to each air toxics compound. In some cases, these UREs are based on best (maximum likelihood) estimates of the slope of the dose-response relationship based on reliable data, and in other cases these estimates are based on “upper bound” estimates (i.e. the slope is not the best estimate, but is a conservative value which is likely to lead to overestimates of risk) based on less reliable data. For some compounds, the data are from human exposures, while for others they are from animal exposures. These issues cause the URE values for a specific compound to be uncertain.

Uncertainty in Reference Concentration parameters. The noncancer effects model uses a Reference Concentration (RfC) to calculate a Hazard Quotient (HQ) for a specific compound. The HQ is the ratio of the actual exposure to the RfC. The RfC, in turn, is uncertain. As a result, the value of HQ also is uncertain. It is important to bear in mind that the uncertainty in the RfC is to some extent one-sided. In other words, it provides more confidence in the claim that the true noncancer risk is less than that predicted than in the claim that the risk is greater than that predicted.
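The Hazard Quotient calculation, and the way uncertainty in the RfC carries into it, can be sketched as follows. The function name and all values are hypothetical, and exposure and RfC are assumed to be in the same concentration units.

```python
def hazard_quotient(exposure, reference_concentration):
    """HQ = exposure / RfC, with both in the same concentration units."""
    if reference_concentration <= 0:
        raise ValueError("RfC must be positive")
    return exposure / reference_concentration


# Made-up exposure evaluated against a nominal RfC and against an
# alternative RfC ten times higher: the HQ scales inversely with the RfC.
hq_nominal = hazard_quotient(0.002, 0.01)
hq_alternative = hazard_quotient(0.002, 0.1)
```

Since HQ is inversely proportional to the RfC, any relative uncertainty in the RfC appears as the same relative uncertainty in the HQ.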

All of the above sources of uncertainty, including uncertainty in ambient air concentration, exposure, and dose-response, were considered in the uncertainty analysis for the national-scale assessment. This includes both model uncertainty and parameter uncertainty as described on the Components page. The methods by which they were characterized qualitatively and/or quantitatively are described on the Methods page.
