Environmental Modeling 101: Training Module

Overview

The U.S. Environmental Protection Agency (EPA) uses a variety of models to inform decisions that support its mission of protecting human health and safeguarding the natural environment — air, water, and land — upon which life depends.

This module has four main objectives:

1. Provide a basic introduction to environmental modeling
2. Define the categories of environmental models
3. Explain how and why models are used in environmental sciences
4. Introduce the model "life-cycle"

What is a model?

According to the EPA (2009a) a model is defined as:

"A simplification of reality that is constructed to gain insights into select attributes of a physical, biological, economic, or social system. A formal representation of the behavior of system processes, often in mathematical or statistical terms. The basis can also be physical or conceptual."

Models are representations of the environment that can be used to inform regulation or management decisions.

Definition

The term model can be an ambiguous word used to describe an 'abstraction (or parameterization) of reality.' Models can take many forms; the most common and relevant are computational and conceptual models.

In a broader sense, there can be many types of models (EPA, 2009a):

* While the last two types of models are not conventional models, the statistical models used to extrapolate from these abstractions to the 'real' system are. They are included here to distinguish among the types of models.

Definition: Computational Models

Computational models express the relationships among components of a system using mathematical representations (Van Waveren et al., 2000).

The Tier 1 Rice Model – A computational model

Definition: Conceptual Models

A conceptual model is a hypothesis regarding the important factors that govern the behavior of an object or process of interest. This can be an interpretation or working description of the characteristics and dynamics of a physical system.

Conceptual model of the AQUATOX model

Diagram courtesy of AQUATOX website.
Registry of EPA Applications, Models and Databases (READ).

Definition: Analogous Models

* While the last two types of models are not conventional models, the statistical models used to extrapolate from these abstractions to the 'real' system are. They are included here to distinguish among the types of models.

Why Are Models Used?

Models have a long history of helping to explain scientific phenomena and predict outcomes and behavior in settings where empirical observations are limited or not available (EPA, 2009a).

Models are based on simplifying assumptions of environmental processes and cannot completely replicate the inherent complexity of the entire environmental system. Despite these limitations, models are essential for a variety of purposes, which fall into two broad categories:

• To diagnose (i.e., assess what happened) and examine causes and precursor conditions (i.e., why it happened) of events that have taken place
• To forecast outcomes and future events (i.e., what will happen).

The NRC (2007) describes a model as:

"A simplification of reality that is constructed to gain insights into select attributes of a particular physical, biological, economic, or social system."

Models can be used to inform a variety of activities including:

• Research
• Toxicity screening
• Policy analysis
• National regulatory decision making
• Implementation applications

Model Structure

In any modeling exercise, the system of interest should be defined. This definition is not only used to identify the boundaries of the model, but also serves to define how the model can be applied and to which systems/situations.

System:
A collection of objects or variables and the relations among them.

Model developers should answer the following questions:

1. What processes is the model attempting to reproduce and include?
2. At what time scale(s) are the included processes occurring?
3. At what spatial scale(s) are the included processes occurring?

Therefore, model structure can be described in two ways:

1. Included Processes (chemical, physical, or biological)
2. Scope / Scale (time or space)

Examples of decreasing scale for generic air quality models.

Model Structure: A Modeling Caveat

Models are typically (and should be) developed for a well-defined system and a set of conditions under which the use of the model is scientifically defensible: the application niche. Identifying the application niche is a key step during model development and helps guide future application of the model.

Types Of Computational Models

The remainder of this module will focus on computational models. The types of computational models are determined by the available data, the intended use, and the interpretation of model-generated results. However, the types of models are not mutually exclusive (see Summary Table).

Empirical vs. Mechanistic models

Empirical models – include very little information on the underlying mechanisms and rely upon the observed relationships among experimental data. These can be thought of as 'best-fit' models whose parameters may or may not have real-world interpretation.

Parameters:
Terms in the model that are fixed during a model run or simulation but can be changed in different runs as a method for conducting sensitivity analysis or to achieve calibration goals.

Mechanistic models – unlike empirical models, explicitly include the mechanisms or processes between the state variables. The parameters in mechanistic models should be supported by data and have real-world interpretations (EPA, 2009b).

State variables:
The dependent variables calculated within the model, which are also often the performance indicators of the models that change over the simulation.

A Modeling Caveat

When data quality is otherwise equivalent, extrapolation from mechanistic models (e.g. biologically based dose-response models) often carries higher confidence than extrapolation using empirical models (EPA, 2009b).
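
The contrast between the two model types, and the caveat above, can be sketched in a few lines. This is an illustration only: the first-order decay process, the rate constant, the data values, and all function names are assumptions chosen for the demonstration and do not come from the module. The empirical model is a least-squares straight line whose slope and intercept have no necessary physical meaning; the mechanistic model encodes an assumed decay process whose parameter is a physical rate constant.

```python
import math

# Hypothetical observations: concentration (mg/L) at times (days).
# Generated for illustration from C = 10 * exp(-0.3 * t).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
conc = [10.0 * math.exp(-0.3 * t) for t in times]


def fit_line(x, y):
    """Empirical model: ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def decay_model(c0, k, t):
    """Mechanistic model: first-order decay, C(t) = C0 * exp(-k * t)."""
    return c0 * math.exp(-k * t)


a, b = fit_line(times, conc)

# Extrapolate both models well beyond the observed range (t = 20 days).
empirical_20 = a + b * 20.0
mechanistic_20 = decay_model(10.0, 0.3, 20.0)

print(f"empirical line at t=20:    {empirical_20:.2f} mg/L")
print(f"mechanistic decay at t=20: {mechanistic_20:.4f} mg/L")
```

In this constructed case the straight line extrapolates to a negative concentration, while the decay model stays positive. The outcome favors the mechanistic model only because the data were generated from the assumed mechanism; with real data the comparison depends on how well the mechanism is understood.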

Types Of Computational Models: Deterministic vs. Probabilistic models

Deterministic models – provide a solution for the state variable(s) rather than a set of probabilistic outcomes. This type of model does not explicitly simulate the effects of data uncertainty or variability. Changes in model outputs are solely due to changes in model components, the boundary conditions, or initial conditions (EPA, 2009a). Therefore, repeated simulations under constant conditions will produce consistent results.

Uncertainty:
The unknown effects of parameters, variables, or relationships that cannot or have not been verified or estimated by measurement or experimentation.

Variability:
Observable diversity in biological sensitivity or response, and in exposure parameters (such as breathing rates, food consumption, etc.). These differences can be better understood, but generally not reduced, by further research.

Probabilistic models – utilize the entire range of input data to develop a probability distribution of model output (i.e., exposure or risk) rather than a single point value.

Probabilistic models are sometimes referred to as statistical or stochastic models. Probabilistic models can be used to evaluate the impact of variability and uncertainty in the various input parameters, such as environmental exposure levels, fate and transport processes, etc.
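
A minimal sketch of the difference, assuming a hypothetical dose equation and made-up input distributions (none of the names, values, or distributions below come from the module): the deterministic run maps single point values to a single output, while the probabilistic run samples the inputs repeatedly (a Monte Carlo approach, used here as one common technique) and summarizes the resulting output distribution.

```python
import random
import statistics


def exposure(concentration, intake_rate, body_weight):
    """Hypothetical dose equation: concentration * intake / body weight."""
    return concentration * intake_rate / body_weight


# Deterministic run: single point values in, single point value out.
point_dose = exposure(concentration=2.0, intake_rate=1.5, body_weight=70.0)

# Probabilistic run: sample inputs from assumed distributions and
# collect a distribution of outputs.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
doses = []
for _ in range(10_000):
    c = rng.lognormvariate(0.6, 0.4)        # assumed concentration variability
    ir = rng.uniform(1.0, 2.0)              # assumed intake-rate range
    bw = max(rng.gauss(70.0, 10.0), 30.0)   # assumed body weights, floored at 30
    doses.append(exposure(c, ir, bw))

doses.sort()
median_dose = statistics.median(doses)
p95_dose = doses[int(0.95 * len(doses))]

print(f"deterministic dose: {point_dose:.4f}")
print(f"probabilistic median: {median_dose:.4f}, 95th percentile: {p95_dose:.4f}")
```

Repeating the deterministic call with the same inputs always returns the same value, which mirrors the statement that repeated simulations under constant conditions give consistent results.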

Types Of Computational Models: Dynamic vs. Static models

Dynamic models – make predictions about the way a system changes with time or space. Solutions are obtained by taking incremental steps through the model domain. For most situations, where a differential equation is being approximated, the simulation model will use a finite time step (or spatial step) to estimate changes in state variables over time (or space).

Static models make predictions about the way a system changes as the value of an independent variable changes.
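
The two approaches can be sketched as follows. The example is hypothetical: the first-order decay equation, the well-mixed water body, and all parameter values are assumptions made for illustration. The dynamic model steps a differential equation forward with a finite time step (explicit Euler, chosen here for simplicity), while the static model evaluates a steady-state relationship for different values of an independent variable (the pollutant load).

```python
import math


def simulate_decay(c0, k, dt, n_steps):
    """Dynamic model: step dC/dt = -k*C forward with explicit Euler."""
    c = c0
    history = [c]
    for _ in range(n_steps):
        c = c + dt * (-k * c)  # finite time step approximates the ODE
        history.append(c)
    return history


def steady_state_conc(load, flow, k, volume):
    """Static model: steady-state concentration of a well-mixed water body,
    C = W / (Q + k*V); no time stepping is involved."""
    return load / (flow + k * volume)


# Dynamic: 100 steps of 0.1 day, compared with the exact solution at day 10.
history = simulate_decay(c0=10.0, k=0.3, dt=0.1, n_steps=100)
exact = 10.0 * math.exp(-0.3 * 10.0)
print(f"Euler C(10) = {history[-1]:.4f}, exact = {exact:.4f}")

# Static: how the steady-state concentration responds to the load.
for load in (100.0, 200.0, 400.0):
    c_ss = steady_state_conc(load, flow=50.0, k=0.3, volume=500.0)
    print(f"load {load:.0f} -> steady-state concentration {c_ss:.2f}")
```

The gap between the Euler result and the exact value shrinks as the time step is reduced, which is the usual trade-off between step size and accuracy in a dynamic simulation.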

[Table: example equation for each model type: Deterministic, Probabilistic, Dynamic, Static]
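
As an illustrative sketch of what an equation for each type might look like (these forms and every symbol in them are assumptions for illustration, not the module's own equations), consider a simple linear relationship and a first-order decay process:

```latex
\begin{align*}
\text{Deterministic:} \quad & y = a + b\,x \\
\text{Probabilistic:} \quad & y = a + b\,x + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2) \\
\text{Dynamic:} \quad & \frac{dC}{dt} = -k\,C \\
\text{Static:} \quad & C = \frac{W}{Q + k\,V}
\end{align*}
```

Here the probabilistic form adds a random term to the deterministic one, the dynamic form describes change over time, and the static form relates a steady-state concentration C to a load W, flow Q, decay rate k, and volume V without any time dependence.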

Types Of Computational Models: Other Relevant Modeling Terms

The model framework is defined as the system of governing equations, parameterization and data structures that represent the formal mathematical specification of a conceptual model (EPA, 2009a).

Mode (of a model): The manner in which a model operates. Models can be designed to represent phenomena in different modes. Prognostic (or predictive) models are designed to forecast outcomes and future events, while diagnostic models work "backwards" to assess causes and precursor conditions (EPA, 2009a).
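
The two modes can be sketched with one assumed relationship run in opposite directions. The first-order decay equation, the parameter values, and the function names are hypothetical and serve only to illustrate the idea: the prognostic function forecasts a future value from a known starting point, and the diagnostic function works backwards from an observation to the starting condition that would explain it.

```python
import math


def forecast_conc(c0, k, t):
    """Prognostic mode: predict a future concentration from a known start."""
    return c0 * math.exp(-k * t)


def back_calculate_c0(c_observed, k, t):
    """Diagnostic mode: infer the initial concentration that would explain
    an observation made t days after a release."""
    return c_observed * math.exp(k * t)


predicted = forecast_conc(c0=10.0, k=0.3, t=5.0)
inferred = back_calculate_c0(c_observed=predicted, k=0.3, t=5.0)

print(f"forecast at day 5: {predicted:.3f}")
print(f"initial concentration inferred from that value: {inferred:.3f}")
```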

Summary Table Of Model Type

| | Probabilistic Models | Deterministic Models | Empirical Models | Mechanistic Models |
|---|---|---|---|---|
| Also Known As | Statistical or Stochastic Models | --- | 'Best Fit' Models | --- |
| Input Data | Measured Values or Estimated Distributions | Measured Values | Measured Values or Estimated Distributions | Measured Values or Estimated Distributions |
| Model Output | Probability Distribution | Single Point Value | Probability Distributions or Single Point Value | Probability Distributions or Single Point Value |
| Description | Utilize the entire range of input data to develop a probability distribution of model output | Provide a solution for the state variables rather than a set of probabilistic outcomes | Rely upon the observed relationships among experimental data | Explicitly include the mechanisms or processes between the state variables |

The Role of Modeling

"Fundamentally, the reason for modeling is a lack of full access, either in time or space, to the phenomena of interest. In areas where public policy and public safety are at stake, the burden is on the modeler to demonstrate the degree of correspondence between the model and the material world it seeks to represent and to delineate the limits of that correspondence."

The use of models has increased significantly. Although models do not generate "truth," they can provide analyses and information that inform the EPA's decision-making process. Policy decisions should be informed by the best information and data. However, researchers are confronted with many constraints when obtaining data, e.g., time, access, and resources (funding, equipment, staff).

Where there is a shortage of data and information, models can be used to provide useful insight. In general, models can help users study the behavior of ecological systems, design field studies, interpret data, and generalize results (EPA, 2009a). Models are used to make long- and short-term forecasts to extrapolate from the past and answer "what-if" questions. Models can also be used to provide concise summaries of data, in both diagnostic and regulatory contexts (NRC, 2007).

The relationship between data and models is changing. The increasing availability of data may promote new model development or application of existing models to new data. However, this requires that data are used appropriately with models. The limitations from uncertainties and assumptions associated with any model must be considered, as with observational data, before model-generated results are applied in any context.

Environmental Models Used by EPA

Environmental models are categorized into groups representing a continuum of processes which translate the interactions between human activities and natural processes into human health and environmental impacts. The CREM Guidance Document (EPA, 2009a) identifies the classes of environmental models used by the EPA:

• Human Activity Models - Simulate human activities and the behaviors that result in emission of pollutants.
• Natural Systems Process Models - Simulate dynamics of ecosystems that give rise to fluxes of nutrients and/or emissions.
• Emissions Models - Estimate the rate or amount of pollutant emissions to water bodies and atmosphere.
• Fate and Transport Models - Calculate the movement of pollutants in the environment. Further classified into Subsurface Water Quality Models, Surface Water Quality Models, and Air Quality Models.
• Exposure Models - Estimate the dose of pollutant to which humans or animals are exposed.
• Human Health Effects Models - Provide a statistical relationship between a dose of a chemical and an adverse human health effect.
• Ecological Effects Models - Provide a statistical relationship between a level of pollutant exposure and a particular ecological indicator.
• Economic Impact Models - Used in rule making, priority setting, enforcement; model output as a monetary value.
• Noneconomic Impact Models - Evaluate the effects of contaminants on a variety of noneconomic parameters (e.g. crop yields).

Classes of Environmental Models: These classes represent a research continuum from human activities and natural system processes to environmental and economic impacts. Modified from NRC (2007).

Registry of EPA Applications, Models and Databases (READ) houses the models used, developed, or funded by the EPA. It serves as the central repository of the Agency's models, across all disciplines.

The Model Life-cycle

The model life-cycle is ongoing, and there are many instances when earlier stages are revisited to refine the model. The life-cycle follows a general iterative progression shown in the figure and described below (from EPA, 2009a):

Life-cycle of a model: the process of developing and applying models; modified from EPA (2009a).

Further information regarding the model life-cycle can be found in the Model Life-cycle module.

An Alternative Life-cycle

Not every project requires the full development of a new model; often there are existing models which can be applied to a specific situation. In these instances, there is an alternative model life-cycle, which involves model evaluation, application, and, as needed, post-auditing.

Post-auditing:
Assesses a model's ability to provide valuable predictions of future conditions for management decisions.

In the modified life-cycle, a model is selected that meets the requirements of the specified problem. Once selected, a model may require calibration or site-specific parameter values. Likewise, other qualitative evaluations of the model may further corroborate its application.

Calibration:
Comparison of a measurement standard, instrument, or item with a standard or instrument of higher accuracy to detect and quantify inaccuracies and to report or eliminate those inaccuracies by adjustments.
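
Obtaining a site-specific parameter value for an existing model can be sketched as a parameter search. This is a minimal illustration under stated assumptions: the first-order decay model, the observation values, and the function names are hypothetical, and a simple grid search stands in for whatever calibration procedure a real project would use.

```python
import math

# Hypothetical site observations: (time in days, concentration in mg/L).
site_data = [(0.0, 8.0), (2.0, 4.9), (4.0, 2.9), (6.0, 1.8)]


def model(c0, k, t):
    """Existing model selected for the problem: first-order decay."""
    return c0 * math.exp(-k * t)


def sum_squared_error(k, data, c0):
    """Misfit between model predictions and site observations."""
    return sum((model(c0, k, t) - obs) ** 2 for t, obs in data)


def calibrate_k(data, c0, k_min=0.01, k_max=1.0, steps=1000):
    """Grid search for the decay rate that minimizes the misfit."""
    candidates = [k_min + i * (k_max - k_min) / steps for i in range(steps + 1)]
    return min(candidates, key=lambda k: sum_squared_error(k, data, c0))


k_site = calibrate_k(site_data, c0=8.0)
print(f"calibrated site-specific k: {k_site:.3f} per day")
```

A grid search is easy to read and has no dependencies, but it only resolves the parameter to the grid spacing; a numerical optimizer would be the more usual choice once more than one parameter is adjusted.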

After the model has been applied, post-auditing can determine whether the predicted model outcome(s) were observed. The model post-audit process involves monitoring the modeled system, after implementing a remedial or management action, to determine whether the actual system response concurs with that predicted by the model. Post-audits can also be used to evaluate how well stakeholder and decision-making roles were integrated during the development stages (Manno et al., 2008; EPA, 2009a).
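
The comparison at the heart of a post-audit can be sketched as a check of predictions against later monitoring data. The function name, the data values, and the acceptance tolerance are hypothetical; a real post-audit would define its own criteria for agreement.

```python
def post_audit(predicted, observed, tolerance):
    """Compare model predictions with later monitoring data.

    Returns the mean error (predicted minus observed) and whether every
    prediction fell within `tolerance` of the corresponding observation.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    errors = [p - o for p, o in zip(predicted, observed)]
    mean_error = sum(errors) / len(errors)
    within = all(abs(e) <= tolerance for e in errors)
    return mean_error, within


# Hypothetical yearly concentrations (mg/L) after a remedial action.
predicted = [5.0, 4.0, 3.2, 2.6]
observed = [5.3, 4.3, 3.5, 3.1]

mean_error, within = post_audit(predicted, observed, tolerance=0.4)
print(f"mean error: {mean_error:+.2f} mg/L, all within tolerance: {within}")
```

A consistently negative mean error, as in these made-up numbers, would suggest the model predicted a faster improvement than was observed, which is the kind of finding that sends a project back to earlier life-cycle stages.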

An Alternate Version of the Model Life-cycle: When model development is not required, a modified version of the life-cycle is appropriate. If an existing model will work for the specified problem, model development (and design) is circumvented, leaving three steps in the life-cycle (shown above with dashed lines). The stages of the life-cycle defined by EPA (2009a) appear in the solid boxes. Recall that model evaluation occurs during the Development and Application Stages.

The Importance Of Data Quality

The quality of the data is fundamental to environmental modeling and is pertinent not only during model application but throughout the modeling life-cycle. The quality of a model is also governed by model structure, scientific understanding, evaluation, etc. Quality assurance is therefore necessary throughout the stages of the modeling life-cycle.

A Foundation of Data Quality: Data provide the foundation for our understandings which motivate the development and application of environmental models. Data are used during parameter estimation events, calibration processes, and ultimately model application. Model developers and users should consider:

"what goes in is equal to what comes out"

That is to say, data that are poor in quality will not yield model results of higher quality.

The Importance Of Data Quality: Indicators of Data Quality

Indicators of data quality include the quantitative and qualitative measures of principal quality attributes (EPA, 2009a):

• Precision - the quality of being reproducible in amount or performance
• Bias - systematic deviation between a measured (i.e., observed) or computed value and its "true" value.
• Representativeness - the measure of the degree to which data accurately and precisely represent a characteristic of a population, parameter variations at a sampling point, a process condition, or an environmental condition
• Comparability - a measure of the confidence with which one data set or method can be compared to another
• Completeness - a measure of the amount of valid data obtained from a measurement system
• Sensitivity - the degree to which the model outputs are affected by changes in selected input parameters
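
Several of these indicators can be expressed numerically. The sketch below shows one common way to quantify four of them; the specific formulas (sample standard deviation for precision, mean deviation from a reference value for bias, valid fraction for completeness, and a finite-difference ratio for sensitivity), the example data, and the `dose` model are assumptions for illustration rather than definitions taken from the module.

```python
import statistics


def precision(replicates):
    """Precision as the sample standard deviation of replicate measurements
    (smaller means more reproducible)."""
    return statistics.stdev(replicates)


def bias(measurements, reference_value):
    """Bias as the mean deviation from a known reference ("true") value."""
    return statistics.mean(measurements) - reference_value


def completeness(n_valid, n_planned):
    """Completeness as the fraction of planned measurements that are valid."""
    return n_valid / n_planned


def sensitivity(model, params, name, rel_step=0.01):
    """Sensitivity as the relative change in output per relative change in
    one input parameter, estimated with a small perturbation."""
    base = model(**params)
    perturbed = dict(params)
    perturbed[name] = params[name] * (1.0 + rel_step)
    return ((model(**perturbed) - base) / base) / rel_step


def dose(concentration, intake_rate, body_weight):
    """Hypothetical model used only to demonstrate the sensitivity function."""
    return concentration * intake_rate / body_weight


replicates = [4.8, 5.1, 5.0, 4.9, 5.2]  # hypothetical repeat analyses of one sample
params = {"concentration": 2.0, "intake_rate": 1.5, "body_weight": 70.0}

print(f"precision (std dev): {precision(replicates):.3f}")
print(f"bias vs. reference 5.2: {bias(replicates, 5.2):+.3f}")
print(f"completeness: {completeness(18, 20):.0%}")
print(f"sensitivity to body_weight: {sensitivity(dose, params, 'body_weight'):+.2f}")
```

Representativeness and comparability are harder to reduce to a single number and are usually assessed from the sampling design and methods, so they are left out of this sketch.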

The Importance Of Data Quality: Quality Assurance

Quality assurance (QA), quality control, and peer review also play important roles in the Agency's modeling efforts. The data are subject to data quality objectives and other QA measures. Similarly, Quality Assurance Project Plans help guide model development, evaluation, and application. Together, quality assurance requirements are the means to overall transparency.

Peer review:
In EMAP, peer review means written, critical response provided by scientists and other technically qualified participants in the process. EMAP documents are subject to formal peer review procedures at laboratory and program levels. In EMAP, Level 1 peer reviews are performed by EPA's Science Advisory Board, Level 2 by the NAS National Research Council, Level 3 by specialist panel peer reviews, and Level 4 by internal EPA respondents.

Transparency:
Open, comprehensive, and understandable presentation of information.

Data and Model Quality Assurance

Additional information (including guidance documents) can be found at the Agency's website for the Quality System for Environmental Data and Technology.

Legal Aspects When EPA Uses Models

A number of laws serve as EPA's foundation for protecting the environment and public health. The Administrative Procedure Act (5 U.S.C. § 553) requires EPA to provide the public notice and an opportunity to comment on its rule makings.

If a rule is supported by a model, this legal obligation means the Agency must provide the public notice of the Agency's use of the model and an opportunity to comment on the assumptions and algorithm built into the model, along with the other scientific components of the regulation or rule-making.

Algorithm:
A precise rule (or set of rules) for solving some problem.

Further, it must be clear how a particular model may be used, and the Agency must provide sufficient information about the model for public comment. The legal challenges to the Agency's actions in enforcing those laws can be classified into the two categories described below (adapted from McGarity and Wagner, 2003).

Process Challenges

Procedural challenges are usually directed at the overall transparency of the modeling exercise and the adequacy of any notice and opportunity for public comment that the agency might be required to provide.

Substantive Challenges
These challenges are mounted against areas of technical disagreements with assumptions of the model or the context in which the model was applied.

In the Legal Aspects of Environmental Modeling module, we explore how the Agency's regulatory actions (related to modeling) have been challenged and point to best modeling practices related to those challenges.

Summary

• According to the EPA (2009a) a model is defined as:
"A simplification of reality that is constructed to gain insights into select attributes of a physical, biological, economic, or social system. A formal representation of the behavior of system processes, often in mathematical or statistical terms. The basis can also be physical or conceptual."
• The types of the environmental models used by the EPA include fate and transport models, emissions and activities models, exposure models, and impact models.
• The model life-cycle includes problem identification, development, evaluation, and application. Iterative peer reviews are an important component throughout a model's life-cycle.
• Models can provide meaningful data to inform the decision making process when the appropriate actions and precautions have taken place during the life-cycle of the model.
• Models cannot improve the data that go into them. Model results should not be considered truths.

Transparency: In the past, models have been considered a 'black box' of the research or regulatory process (Pascual, 2004). Through better understandings of the model life-cycle and best modeling practices, models can be built from Plexiglass!

End of Module

The Environmental Modeling 101 Module