Comparison of 1999 Model-Predicted Concentrations to Monitored Data

Note: EPA no longer updates this information, but it may be useful as a reference or resource.

As part of the 1999 National-Scale Air Toxics Assessment, EPA compared ASPEN-modeled concentrations with available, but geographically limited, ambient air quality monitoring data for 1999. For each monitor-pollutant combination, EPA compared the annual average concentration estimated by the ASPEN model at the exact geographical coordinates of the monitor location with the annual average monitored value to get a point-to-point comparison between the model and monitor concentrations. EPA used the same approach as that used for comparing the ASPEN model to monitor data for the 1996 national-scale assessment except that EPA used updated emissions, meteorological, and monitor input data for the 1999 assessment; there were no major changes to model formulation.

Table 1 (PDF 1 p., 16 KB) shows the number of monitoring sites used in the 1999 comparison and the median ratio of model to monitor annual average concentrations by pollutant, on a point-to-point basis. The number of sites is the number of monitors with valid data. A larger number of monitors means that more data are available, which in turn makes it easier to assess the degree of agreement between model and monitor data. Lead and benzene have the highest number of monitors. In general, the number of available sites is about the same as in the 1996 evaluation, and the sites are likewise geographically limited. The median of ratios is the median of the model/monitor ratios for a given pollutant. A median close to 1 suggests that the model overestimates the monitors about as often as it underestimates them. The agreement for benzene is the best, with a median ratio of 0.95. The "percent of sites estimated within a factor of 2" is the percent of sites for which the model estimate is between half and double the monitor average. The "percent of sites estimated within 30%" is the percent of sites for which the model/monitor ratio is between 0.7 and 1.3. The "percent of sites underestimated" is the percent of sites for which the model/monitor ratio is below 1.
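
The Table 1 statistics described above can be illustrated with a short Python sketch. This is not EPA's code: the function name summarize_ratios and its output keys are hypothetical, and treating the interval endpoints (0.5, 2, 0.7, 1.3) as inclusive is an assumption, since the text does not say how boundary values are handled.

```python
from statistics import median


def summarize_ratios(modeled, monitored):
    """Summarize point-to-point model/monitor agreement for one pollutant.

    modeled, monitored: equal-length sequences of annual average
    concentrations in the same units, one pair per monitoring site.
    (Hypothetical helper; names and endpoint handling are assumptions.)
    """
    if len(modeled) != len(monitored) or len(modeled) == 0:
        raise ValueError("need one modeled value per monitored value")

    # One model/monitor ratio per site.
    ratios = [m / o for m, o in zip(modeled, monitored)]
    n = len(ratios)

    def pct(predicate):
        # Percent of sites whose ratio satisfies the predicate.
        return 100.0 * sum(1 for r in ratios if predicate(r)) / n

    return {
        "n_sites": n,
        "median_ratio": median(ratios),
        # Model estimate between half and double the monitor average.
        "pct_within_factor_2": pct(lambda r: 0.5 <= r <= 2.0),
        # Model/monitor ratio between 0.7 and 1.3.
        "pct_within_30": pct(lambda r: 0.7 <= r <= 1.3),
        # Model/monitor ratio below 1.
        "pct_underestimated": pct(lambda r: r < 1.0),
    }
```

A median ratio below 1 together with a high percent underestimated would indicate that the model tends to fall short of the monitored values for that pollutant.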

The degree of agreement between model and monitor data can be attributed to the following five uncertainties (the same ones identified in the 1996 model-to-monitor comparison):

emission characterization uncertainties (e.g., specification of source locations, emission rates, and release characterization);
meteorological characterization uncertainties (e.g., representativeness);
model formulation and methodology uncertainties (e.g., characterization of dispersion, plume rise, and deposition);
monitoring uncertainties; and
uncertainties in background concentrations.

ASPEN's limited ability to address the complex chemical transformation mechanisms needed to estimate ambient concentrations for highly reactive pollutants results in additional uncertainty for acetaldehyde and formaldehyde concentrations.

Figure 1 (PDF 1 p., 73 KB) is a box plot showing the distribution of the model-to-monitor ratios summarized in Table 1. For example, if there are 115 monitors for benzene, there are 115 model/monitor ratios to compute. EPA then computed the median of these 115 ratios, as well as the percentiles, to create the plot. The bottom of the box is the 25th percentile, the top of the box is the 75th percentile, and the horizontal line in the middle of the box is the median. If the model consistently agrees well with the monitored data for a pollutant, the box will be short and centered at 1. Pollutants are organized alphabetically in two separate groups according to whether they are gases or metals. This side-by-side display makes it easier to see which pollutants are overestimated, which are underestimated, and which are estimated consistently. As in the 1996 comparison, the box plots do not show extreme percentiles (e.g., 10th and 90th) of the ratios because the extreme percentiles were far from the center of the distribution.
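
The three values that define each box can be computed as in the following minimal sketch. The function name box_stats is hypothetical, and the choice of the "inclusive" interpolation method is an assumption; the text does not state which percentile definition EPA used, and different definitions give slightly different results for small samples.

```python
from statistics import quantiles


def box_stats(ratios):
    """Return the box-plot values for one pollutant's model/monitor ratios.

    Hypothetical helper: the percentile method ("inclusive") is an
    assumption, not a documented EPA choice.
    """
    if len(ratios) < 2:
        raise ValueError("need at least two ratios")

    # n=4 yields the three quartile cut points: 25th, 50th, 75th percentiles.
    p25, med, p75 = quantiles(ratios, n=4, method="inclusive")

    return {
        "p25": p25,                # bottom of the box
        "median": med,             # horizontal line inside the box
        "p75": p75,                # top of the box
        "box_height": p75 - p25,   # a short box means consistent ratios
    }
```

A short box whose median lies near 1 corresponds to the consistent agreement described above.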

In this comparison, EPA made several assumptions about the monitoring data. Pollutants with fewer than 30 monitors and limited geographical coverage (monitors located in only one state) were excluded from the comparisons, because any assessment of model-to-monitor agreement would then be limited to that state or geographical area and would not extend nationwide. If monitor data were found suspect (i.e., little variation in daily average concentrations over long periods), EPA also excluded the pollutant.
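
The first exclusion rule can be expressed as a simple screening check. This sketch is illustrative only: the function name keep_pollutant and its input format are hypothetical, and it assumes that a pollutant is excluded only when both conditions hold (fewer than 30 monitors and monitors in a single state), which is one reading of the wording above. The check for suspect monitor data is not sketched because the text gives no numeric criterion for it.

```python
def keep_pollutant(monitor_states, min_monitors=30):
    """Decide whether a pollutant stays in the comparison.

    monitor_states: one state code per monitor with valid data.
    Assumption: exclusion requires BOTH too few monitors AND
    coverage limited to a single state.
    """
    too_few = len(monitor_states) < min_monitors
    single_state = len(set(monitor_states)) <= 1
    return not (too_few and single_state)
```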

These results show that only for benzene is there relatively good agreement between modeled and monitored values on a point-to-point basis. The remaining pollutants show varying degrees of agreement. These results are about the same as those found in the 1996 national-scale assessment comparison: for most pollutants, and especially for metals, the ASPEN model tended to underestimate the monitored values at the exact location of the monitors. There are four possible reasons for ASPEN to underestimate pollutant concentrations (which also applied to the 1996 assessment):

The National Emissions Inventory (NEI) may be missing specific emissions sources (for many of the sources in the NEI, some of the emissions parameters are defaulted or missing).
Emission rates may be underestimated in many locations. EPA believes the ASPEN model itself contributes in only a minor way to the underestimation, mainly because output from the predecessor of the ASPEN model compared favorably to monitoring data in cases where the emissions and meteorology were accurately characterized and the monitors took more frequent readings.
There may be problems in monitor siting. Sites are normally situated to find peak pollutant concentrations, which implies that errors in the characterization of sources would tend to make the model underestimate the monitor values.
The monitor averages may themselves be inaccurate, since they have their own sources of uncertainty.

The results suggest that the model estimates are uncertain on a local scale (i.e., at the census tract level). EPA believes that the model estimates are more reliably interpreted as being a value likely to be found within 30 km of the census tract location.
