
MODTOMON Executive Summary

Information provided for informational purposes only.  Note: EPA no longer updates this information, but it may be useful as a reference or resource.

National-Scale Air Toxics Assessment
Comparison of ASPEN Modeling System Results to Monitored Concentration

The complete paper is available at https://www.epa.gov/airtoxics/nata/draft6.html.

Executive Summary

Summary and Conclusions

As part of the National Air Toxics Assessment (NATA) national-scale assessment, the ASPEN model is used with the 1996 National Toxics Inventory (NTI) emission estimates and estimates of background concentrations to predict annual average ambient concentrations throughout the U.S.  Estimates are generated for the 33 priority urban hazardous air pollutants (HAPs) plus diesel particulate matter at approximately 60,000 census tract locations nationwide.  This appendix describes a comparison of these modeled air quality estimates with currently available but geographically limited ambient air monitoring data.  A representative subset of seven HAPs (benzene, perchloroethylene, formaldehyde, acetaldehyde, cadmium, chromium and lead) was selected for this evaluation.  These pollutants were selected because they represent impacts by different combinations of mobile, area and point sources; include reactive and non-reactive compounds; and include those with both primary emissions and secondary formation in the atmosphere.  They also include those HAPs with the largest number of monitoring sites.

In general, the modeled estimates for most of the pollutants examined are lower than the measured ambient annual average concentrations when evaluated at the exact location of the monitors.  However, when the maximum modeled estimate for distances up to 10-20 km from the monitoring location is compared to the measured concentration, the modeled estimates are closer to the monitored concentrations.  This result can be attributed, in part, to spatial uncertainty in the underlying emission and meteorological data, and to the tendency of current air toxics monitoring networks to characterize the higher, if not the highest, air pollution impact areas.  It also shows that the model estimates are more uncertain at the census tract level but are more reliable for larger geographic scales, such as a county or State.  Nevertheless, there are many locations for several of the studied HAPs (including the aldehydes and the metals) for which the model estimates are still significantly lower than the measured concentrations, even at distances up to 50 km.  In these instances, the difference between modeled and monitored concentrations may be attributed to underestimated or missing emissions data and to uncertainty in chemical transformation (for the aldehydes).  The limitations of modeled concentrations resulting from isolated point sources using a geographically sparse ASPEN receptor network in rural areas may be another contributing factor.  A more detailed discussion of the uncertainties of the ASPEN modeling system and the monitoring data can be found elsewhere in this document.

The remainder of this report is organized into six sections: I. Introduction; II. Raw Materials; III. Model-to-Monitor Comparison Analysis Methods; IV. Uncertainties; V. General Results; and VI. Conclusions.  A roadmap to the content and findings of the report is presented below.

The basic raw materials are the monitoring and the modeling system data.  For the monitoring data and their computed annual average concentrations, Section II.B discusses temporal completeness, the treatment of concentrations below the method detection limit, and the monitoring data excluded from these analyses.  Next, Section II.C describes the underlying emissions data used in the modeling system.  This includes the National Toxics Inventory (NTI) information at the point and county level, as well as the processed, "model-ready" emissions provided by the Emissions Modeling System for Hazardous Air Pollutants (EMS-HAP).  Section II.D briefly describes the types of model estimates used for these model-to-monitor comparisons.  The data analysis techniques are described in Section III.  These include graphical and statistical methods that reflect a review by, and suggestions from, EPA's Science Advisory Board.  Next, Section IV describes the temporal and spatial uncertainties of the modeling and emissions data.  It starts with a history of model-to-monitor comparisons with ASPEN, which shows that ASPEN typically agrees with monitoring data within 30% half the time and within a factor of 2 most of the time.  These ranges are used as a guide for judging good agreement between the model estimates and the monitoring data.  For many of these studies, however, it is important to note that the emission inputs were prepared specifically for modeling purposes and therefore may have been of a higher quality than the overall NTI.  This section also covers the specific spatial and temporal uncertainties regarding monitoring, emissions, and ASPEN and other dispersion model estimates as they relate to the 1996 model-to-monitor comparison.

Finally, the results of the various model-to-monitor data analyses are presented in Section V.  Table 8 summarizes the comparisons on a point-to-point basis.  The best agreement is observed for benzene: the results are within a factor of 2 for 89% of the cases and within 30% for 59% of the cases, and the median model-to-monitor ratio is 0.92.  The lack of agreement for the other HAPs on a point-to-point basis can be seen in Table 8, which shows median ratios ranging from 0.65 for formaldehyde down to 0.17 for lead.  The percentages of points that agree within a factor of 2 or within 30% are correspondingly lower.  These results can also be observed in the ratio box plot graphs.
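The point-to-point statistics summarized above (median ratio, share of sites within a factor of 2, share within 30%) can be illustrated with a minimal sketch.  The function name, the input values, and the reading of "within 30%" as a model-to-monitor ratio between 0.7 and 1.3 are assumptions made for illustration; they are not taken from the report.

```python
from statistics import median

def agreement_summary(modeled, monitored):
    """Summarize point-to-point agreement for paired annual averages."""
    # Model-to-monitor ratio at each site; sites with a non-positive
    # monitored average are skipped because the ratio is undefined.
    ratios = [m / o for m, o in zip(modeled, monitored) if o > 0]
    n = len(ratios)
    n_factor_2 = sum(0.5 <= r <= 2.0 for r in ratios)
    # Assumption: "within 30%" means a ratio between 0.7 and 1.3.
    n_30_pct = sum(0.7 <= r <= 1.3 for r in ratios)
    return {
        "median_ratio": median(ratios),
        "pct_within_factor_2": 100.0 * n_factor_2 / n,
        "pct_within_30_pct": 100.0 * n_30_pct / n,
    }

# Hypothetical paired concentrations, not data from the report.
modeled = [2.0, 0.9, 2.5, 0.4, 1.6]
monitored = [2.0, 1.0, 1.0, 1.0, 1.0]
summary = agreement_summary(modeled, monitored)
```

A median ratio below 1 indicates that the model tends to fall below the monitor averages, which is the pattern reported for every HAP other than benzene.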

While all HAPs except benzene show relatively poor agreement on a point-to-point basis (with the model estimates systematically lower than the monitor averages), they compare more favorably when the maximum modeled concentration within 30 km of the monitoring site is examined.  These data are presented using the MAXTOMON statistic described in Section III.  The improved comparisons can be attributed to two reasons, as discussed in Section V.A:

  • Many emissions sources are not precisely located.  (EMS-HAP defaults locations when they are not provided or when total emissions exist for the county.)
  • Many of the monitors were likely sited to find peak concentrations.  For the point source situations with elevated emission releases, the monitors frequently represent hot spot locations where the ambient concentration falls off quickly around the peak concentration area.
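A comparison of this kind can be sketched as follows.  The sketch assumes that the statistic is the largest modeled concentration among receptors within a given radius of the monitor, divided by the monitored average; the report's exact definition is given in its Section III and may differ.  The function names, coordinates, and concentrations are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points."""
    earth_radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def max_to_monitor(monitor, receptors, radius_km):
    """Ratio of the highest nearby modeled value to the monitored value.

    monitor:   (lat, lon, monitored_concentration)
    receptors: list of (lat, lon, modeled_concentration)
    Returns None when no receptor lies within the radius.
    """
    lat, lon, observed = monitor
    nearby = [c for rlat, rlon, c in receptors
              if haversine_km(lat, lon, rlat, rlon) <= radius_km]
    if not nearby or observed <= 0:
        return None
    return max(nearby) / observed

# Hypothetical monitor and receptors (for example, tract centroids).
monitor = (40.0, -75.0, 2.0)
receptors = [
    (40.0, -75.0, 1.0),   # co-located with the monitor
    (40.1, -75.0, 3.0),   # roughly 11 km away
    (41.0, -75.0, 10.0),  # roughly 111 km away
]
ratio_10km = max_to_monitor(monitor, receptors, 10.0)
ratio_30km = max_to_monitor(monitor, receptors, 30.0)
```

In this made-up case the ratio rises from below 1 at 10 km to above 1 at 30 km, the kind of shift that would point to a misplaced peak rather than a missing source.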

The model-to-monitor comparisons are discussed in detail for benzene, other gases, and the metals.

Benzene

The relationship between model estimates and monitored values for benzene is well described by the scatter plot, which shows the point-to-point comparison of modeled and monitored annual average concentrations.  As also shown in the ratio box plot and Table 8, most of the points in the scatter plot fall between the 2:1 and 1:2 lines, which indicates good agreement.

This is consistent with greater confidence in the monitoring and emissions data for this ubiquitous pollutant.

Perchloroethylene, Formaldehyde, and Acetaldehyde

The model-to-monitor relationship on a point-to-point basis is similar for the three other studied gases (perchloroethylene, formaldehyde, and acetaldehyde).  In the ratio box plots in Section V.A, however, we can see that the model's estimates tend to be lower than the monitor averages.  The typical values agree well, with the median ratios all within a factor of 2.  Nevertheless, a large percentage of the modeled estimates are less than the monitored concentrations for these gases on a point-to-point basis, as seen in Table 8 and the ratio box plots.  This can be attributed, in part, to spatial uncertainty in the underlying emissions for these pollutants.

To examine the spatial uncertainty in the modeling system for the gases, the monitored concentration is compared to the maximum modeled estimate in its vicinity. The results for the gases are presented in Table 9.  This table shows nearby modeled concentrations which are greater than the measured average concentration for many of the monitors.

This is especially true for perchloroethylene.  In the close vicinity of most monitors, higher modeled concentrations are observed.  This result for perchloroethylene suggests that uncertainties in the magnitude and location of the nearby area sources may at least be partly responsible for the underestimation on a point-to-point basis.  In other words, the model is not necessarily systematically underestimating ambient concentrations: it may just be finding the peak concentration in the wrong place.  More details on the MAXTOMON comparison are presented elsewhere.

For the two aldehydes, many monitors also have nearby modeled concentrations which are greater than their measured values.  However, a large fraction of the aldehyde monitors cannot be associated with larger modeled values, even within 50 km.  This suggests systematic underestimation by the modeling system for the aldehydes, at least for some areas.  This may be attributed, in part, to the nature and treatment of these HAPs.  The two aldehydes are mobile source dominated, but a large fraction of their ambient concentrations is secondarily formed.  The chemical reactions resulting in their formation are simulated in ASPEN.  This adds a further source of uncertainty to the modeling system and distinguishes them from the other HAPs in this comparison.

Metals

For the metals, the monitored concentrations are typically much higher than the modeled concentrations when compared at the same location.  The difference is most dramatic for source-oriented monitors.  A detailed discussion is provided for the model-to-monitor comparisons for lead.  Based on the median ratio, the source-oriented lead monitors are typically underestimated by a factor of 7.5, and the others are underestimated by a factor of 4.9.  Only 17% of the source-oriented monitors and 18% of the other monitors are estimated within a factor of 2 at the exact location of the model estimate.
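The underestimation factors quoted above are the reciprocals of the median model-to-monitor ratios.  The short sketch below makes the conversion explicit; the function names are illustrative and not from the report.

```python
def underestimation_factor(median_ratio):
    """Factor by which the model falls short, given a model/monitor ratio."""
    if median_ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / median_ratio

def within_factor_of_two(ratio):
    """True when modeled and monitored values differ by at most a factor of 2."""
    return 0.5 <= ratio <= 2.0

# Factors quoted in the text, converted back to median model/monitor ratios.
source_oriented_ratio = 1.0 / 7.5   # about 0.13
other_ratio = 1.0 / 4.9             # about 0.20
```

The overall lead median ratio of 0.17 cited from Table 8 lies between these two values, and both fall well outside the 0.5 to 2.0 range used as the guide for good agreement.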

A combination of several factors may be responsible for these discrepancies:

  • Missing emissions from the inventory (e.g., missing point sources, lack of treatment for possible re-entrainment effects).
  • Spatial uncertainty in emission locations due to defaulted locations for point sources using procedures described in EMS-HAP.
  • Spatial uncertainty of nearby impacts from elevated point sources (i.e., narrow plume impact) together with a small number of receptors.
  • High coarse particle deposition velocities.

The effects of missing emissions and EMS-HAP defaulted emissions locations were explored.  This analysis focused on 30 of the 42 monitors which were underestimated by a factor of 10 or greater.  It demonstrated that, for the included monitoring locations, several nearby lead sources are missing or have uncertain locations.  It also quantified the effects of mislocated sources.  The effect of spatial uncertainty in lead emissions was also examined using the MAXTOMON analysis.  Because this analysis did not reveal higher predicted concentrations within a relatively large region surrounding the monitor, it also suggests that emission sources may be missing or underestimated.  However, this analysis may be less definitive for point source pollutants in rural areas.  Such monitors tend to represent small (<0.5 km) areas.  The surrounding regions also have few census tracts, and the small number of ASPEN receptors limits the opportunity to find other peak concentrations.  Regarding deposition velocities, we estimated that ASPEN is biased toward underpredicting average ambient lead concentrations by 20-30% because of high coarse particle deposition velocities.
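As a rough illustration of how much of the lead discrepancy the deposition bias could account for, the sketch below rescales a median ratio under a simplifying assumption that is not stated in the report: that the 20-30% bias acts as a uniform multiplicative factor on the modeled concentration.  The function name is hypothetical, and the starting ratio is the reciprocal of the factor of 7.5 quoted for source-oriented monitors.

```python
def remove_deposition_bias(ratio, bias_fraction):
    """Rescale a model/monitor ratio, assuming the model is low by bias_fraction.

    Assumption: modeled = (1 - bias_fraction) * unbiased_modeled at every site.
    """
    if not 0.0 <= bias_fraction < 1.0:
        raise ValueError("bias_fraction must be in [0, 1)")
    return ratio / (1.0 - bias_fraction)

# Median ratio implied by the factor of 7.5 for source-oriented monitors.
median_ratio = 1.0 / 7.5

# Adjusted ratios and remaining underestimation factors for both ends
# of the 20-30% range.
adjusted = {b: remove_deposition_bias(median_ratio, b) for b in (0.2, 0.3)}
remaining_factor = {b: 1.0 / r for b, r in adjusted.items()}
```

Under this assumption the remaining underestimation factor is still well above 2, which is consistent with the text's view that several factors in combination, rather than deposition alone, are responsible.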

It appears that the current modeling system is underestimating lead for a large percentage of the monitors used in this evaluation.  It must be noted that the monitors do not represent a random sampling of all census tracts.  To attain better modeled results in the vicinity of isolated point sources, emissions and source locations should be more accurately characterized.  In addition, a denser receptor network may be required.

The results for cadmium and chromium also show many locations with low modeled concentrations.  However, the amount of disagreement between the modeled estimates and monitored concentrations appears to depend on the source regions represented.  This suggests possible differences in the State inventories, but generalizations are difficult because of the limited number of monitoring locations included.

Efforts to Support Future Model Comparisons

The evaluation of the NATA air quality modeling system is an iterative process.  The current evaluation has demonstrated the need for better information which in turn will permit an improved evaluation in the future.

First, an improved, expanded and more representative air toxics monitoring network will be available in the near future to better support model evaluation.  To assist with the development of this network and to provide better model evaluation information in the short term, FY-01 pilot monitoring studies have been initiated which will have multiple monitors in four urban areas and also provide information in six smaller communities across the country.  Limited information on background concentrations will also be provided.  These new data will provide a wider range of concentrations, which is lacking in the current monitoring data set.

Second, improvements will be made to future emission inventories.  In addition, it would be desirable to conduct a study on a small sample of sources to check whether the emissions are accurately located and whether their rates are accurately estimated.  The pilot cities mentioned above would be a logical starting point.

Third, if these model estimates are used on a local scale, it is crucial to:

  • Get better data on the source locations and releases, particularly for pollutants dominated by point sources;
  • Improve spatial allocation methods and reduce the need to utilize them for pollutants dominated by area sources; and
  • Improve the estimates for background concentrations and utilize regional estimates to the extent possible.
