The Reach Indexing Tool for the National Hydrography Dataset: Functionality and Impacts on State Water Programs

Abstract

The U.S. Environmental Protection Agency (EPA) is developing a Reach Indexing Tool (RIT) for the National Hydrography Dataset (NHD). NHD is a spatial database containing national surface water features. The NHD-RIT is an ArcView-based tool that helps states georeference surface water entities to NHD using the dynamic segmentation data model. The tool is primarily used to index Clean Water Act Section 303(d) and 305(b) waterbodies, and state Water Quality Standards. This paper discusses the evolution, purpose, and functionality of the NHD-RIT and its impacts on states' Clean Water Act programs.

Introduction/History

Until 1998, most States in the U.S. did not have spatial datasets allowing them to analyze, display and map their waterbodies for programs under the Clean Water Act (e.g. impaired waterbodies under Sections 303(d) and 305(b), and water quality standards). The lack or incompleteness of such datasets was a problem, since it prevented an accurate determination of the environmental health of the Nation's surface waters. Visualizing impaired waters would also have helped states and the federal government decide how to allocate resources to improve water quality.

Beyond the lack of datasets, the datasets that States do use often vary in quality, scale, and origin. Some States use hydrographic datasets they developed themselves, while others use EPA's standardized, national Reach File 3 (RF3) dataset or the U.S. Geological Survey's (USGS's) digital line graph (DLG) dataset. RF3 is a spatial digital dataset that was first compiled in 1992, and is at a scale of 1:100,000 (for more information see: https://www.epa.gov/waters/doc/rfindex.html). Yet States that adopted RF3 or DLG sometimes modified the datasets to suit their needs, by changing the scale or the reach structure. This great diversity in spatial datasets reduces the confidence in results from analyses using those datasets.

In order to promote the use of a single, national dataset, EPA contracted the Research Triangle Institute (RTI) in 1997 to design and develop a tool that would allow States to georeference their water-related information to RF3 (referred to as reach indexing). The RF3 Reach Indexing Tool (RF3-RIT) is designed to allow users to assign attributes to entire or partial RF3 reaches by using ESRI's dynamic segmentation model. Indexing only portions of a reach was a major improvement, since it allows tracking water-related information at its exact location along reaches without modifying the underlying spatial dataset.

In December 1999, the first (preliminary) version of the National Hydrography Dataset (NHD) was made available to the public. The NHD was developed through a joint effort of USGS and EPA during the last few years and is the next generation Reach File. It was created by integrating USGS's DLG spatial information and EPA's RF3a attribute information, and is intended to provide a standardized, national data model for the surface water network of the U.S. and any associated water features. The dataset contains information about surface water features such as lakes, ponds, streams, springs, and coastlines. Within NHD, homogeneous hydrologic features are combined into reaches, which are assigned unique identifiers (reach codes). Reaches and their reach codes provide a framework that allows any water-related information to be linked to the surface water drainage network. In addition, NHD provides the following advantages over RF3 (Figures 1 and 2):

Two-dimensional features (polygons) are used to represent two-dimensional waterbody features such as lakes, ponds, and wide rivers
Artificial paths through lakes, ponds and wide rivers simplify modeling and size estimations
NHD will have a protocol in place for updating the database to include features that were not in the original 1:100,000 version

Figure 1: Lake representation in RF3; the lake actually consists of 3 reaches (red, green, purple).

Figure 2: Lake representation in NHD; the lake consists of a polygon and a single reach (artificial path)

Due to the new features in NHD, EPA decided to develop a new version of the Reach Indexing Tool (NHD-RIT) that works with NHD. The design of the tool was started in the summer of 1999, and development started shortly after the first release of NHD. The rest of this paper will lay out some of the major functions of the tool, and the benefits that may accrue to state water programs through the use of the NHD-RIT.

Reach Indexing Tool Functionalities

The following section discusses the functionality of the Reach Indexing Tool by describing the requirements and the design approach used to satisfy them. The requirements for the new NHD version of the tool were collected over a period of two months. Input was obtained from users of the RF3 Reach Indexing Tool and from a group at EPA that is involved in the development of applications and tools for NHD.

General Design

The NHD-RIT is designed as an ArcView extension. This allows the user to include the tool with any existing ArcView project as well as any other extension. As an extension, it also has the advantage of being easily upgraded without having to rebuild existing ArcView projects: the extension is simply replaced with a new version of the NHD-RIT whenever one becomes available.

In order to avoid any conflicts with other extensions or project specific scripts, all the scripts in the NHD-RIT are prefixed with 'rit.'. In addition, the tool does not make use of global variables, since they can be easily overwritten by other applications. Furthermore, the use of global variables can lead to confusion, since developers/programmers can easily lose track of when, where, and by whom they were created or last modified. To avoid these problems, the NHD-RIT keeps track of application wide information by storing it in a Script Editor (Sed) class as a script object. This also allows one to keep track of the information across indexing sessions: after the user exits the ArcView project, the indexing can be picked up where it was left off the next time the project is opened. Another advantage is that a project (including the necessary external data files) can be sent to another user, who can pick up the work where it was left off. This is especially useful in an environment where multiple people are working on the same project.

The information in the script object is stored in a hierarchical structure by keyword. For example:

<Keyword 1>
	<Keyword 1.1>
		Information line 1
		Information line 2
	<Keyword 1.2>
		Information line 1
<Keyword 2>
	<Keyword 2.1>
...

In order to retrieve data for a given keyword, the text string for the script is searched for the keyword. Any information under the keyword or any of its sub-keywords is returned. New information or keywords can easily be added, and old information can be removed when the need arises. The manipulation and retrieval of information in the script object occurs through three scripts:

rit.readinfo	reads the desired information from the specified keyword
rit.writeinfo	writes new information to the desired keyword; if the keyword does not exist yet it is created, otherwise the existing information is overwritten
rit.deleteinfo	deletes the specified keyword or information
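
The keyword hierarchy and the three operations above can be illustrated with a small sketch. This is not the tool's own code: it is written in Python rather than as ArcView scripts, and the function names (read_info, write_info, delete_info, dump) and the nested-dictionary layout are assumptions made only for illustration.

```python
def new_node():
    """One keyword: its information lines plus any sub-keywords."""
    return {"lines": [], "children": {}}


def write_info(root, path, lines):
    """Write lines under the keyword path, creating missing keywords.

    Existing information under the final keyword is overwritten.
    """
    node = root
    for key in path:
        node = node["children"].setdefault(key, new_node())
    node["lines"] = list(lines)


def read_info(root, path):
    """Return the node for the keyword path (lines and sub-keywords), or None."""
    node = root
    for key in path:
        node = node["children"].get(key)
        if node is None:
            return None
    return node


def delete_info(root, path):
    """Delete the keyword at path, with everything below it. True if it existed."""
    if not path:
        return False
    parent = read_info(root, path[:-1])
    if parent is None:
        return False
    return parent["children"].pop(path[-1], None) is not None


def dump(node, depth=0):
    """Render the hierarchy as tab-indented text lines, one per keyword or entry."""
    out = []
    for key, child in node["children"].items():
        out.append("\t" * depth + f"<{key}>")
        out.extend("\t" * (depth + 1) + line for line in child["lines"])
        out.extend(dump(child, depth + 1))
    return out
```

Calling write_info(root, ["Keyword 1", "Keyword 1.1"], [...]) on an empty root creates both keyword levels in one step, which mirrors the create-or-overwrite behavior described for rit.writeinfo.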

Assigning Attributes to Entire or Sections of Reaches

The requirement to allow the user to assign attributes to entire reaches or portions of a reach is essential to the tool. It lets the user specify the exact location to which water-related information applies without changing the underlying spatial data. By avoiding any structural changes to the spatial dataset, information from different people as well as across state water programs can be processed and compared in a consistent manner.

The following is an example of how this functionality becomes crucial: a user wants to display the location where the effluent from a wastewater treatment plant enters a stream. At the same time he/she wants to display the section of the stream where the water quality is impacted by the effluent wastewater. This can be easily accomplished by creating a point event for the end of the pipe, and a linear event for the impacted stream section as described in the following paragraph.

To achieve this functionality, the NHD-RIT applies ESRI's dynamic segmentation data model available in ArcView. This data model is based on the idea that attributes applying only to a section of a feature can be displayed by simply specifying the start (From measure) and end (To measure) points of the section as measures along the route feature (Figure 3). The start and end points and the unique identifier of the feature are then stored with the attribute data in a database table (Note: the record in the table is called an event). The same methodology is used when a point event is created, yet instead of specifying a start and end point a single point position (Point measure) is stored.

The NHD-RIT uses this data model by applying it to the transport and coastline reach feature in NHD. In the transport reach feature, each reach has a length of 100, and therefore the From and To measures are from 0 to 100 (Note: with the exception of branched artificial paths in lakes where the measures are from 0 - 200). The tool allows users to select one or more features (reaches) from the transport reach feature, determines the appropriate measure, and stores the ID of the reach (reach code), the From and To measures, and some additional information in a dBase table (event table).

Figure 3: The From (F_meas) and To (T_meas) measure specify the exact location along a reach (Rch_code) to which a set of attributes apply. If the user specifies more than two sets of attributes to a reach, each set can be displayed offset from the original reach by a distance specified in the Eoffset field.
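
As a rough illustration of the data model, the sketch below locates point and linear events on a reach geometry from their measures. It is a simplified Python stand-in, not ESRI's implementation: the function names are hypothetical, the reach is assumed to be a simple polyline of (x, y) vertices with non-zero length, and measures are assumed to run from 0 to 100 in vertex order, proportional to distance along the line.

```python
import math


def vertex_measures(route, max_measure=100.0):
    """Measure of each vertex, proportional to distance along the route."""
    cumulative = [0.0]
    for a, b in zip(route, route[1:]):
        cumulative.append(cumulative[-1] + math.dist(a, b))
    total = cumulative[-1]
    return [max_measure * c / total for c in cumulative]


def point_at_measure(route, measure, max_measure=100.0):
    """Locate a point event: the (x, y) position at the given measure."""
    measures = vertex_measures(route, max_measure)
    for i in range(len(route) - 1):
        m0, m1 = measures[i], measures[i + 1]
        if m1 > m0 and m0 <= measure <= m1:
            t = (measure - m0) / (m1 - m0)
            (x0, y0), (x1, y1) = route[i], route[i + 1]
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    raise ValueError("measure lies outside the route")


def locate_linear_event(route, f_meas, t_meas, max_measure=100.0):
    """Locate a linear event: the vertices of the section between two measures."""
    lo, hi = sorted((f_meas, t_meas))
    measures = vertex_measures(route, max_measure)
    inner = [p for p, m in zip(route, measures) if lo < m < hi]
    start = point_at_measure(route, lo, max_measure)
    end = point_at_measure(route, hi, max_measure)
    return [start] + inner + [end]


# An event row stores only the reach code and measures, never the geometry.
# "REACH-A" is a placeholder, not a real NHD reach code.
event = {"Rch_code": "REACH-A", "F_meas": 25.0, "T_meas": 75.0}
reach_geometry = {"REACH-A": [(0, 0), (10, 0), (10, 10)]}
section = locate_linear_event(
    reach_geometry[event["Rch_code"]], event["F_meas"], event["T_meas"]
)
```

Because the event row carries only a reach code and two measures, the reach geometry itself is never edited; the section is recomputed from the unchanged reach each time the event is displayed.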

Indexing of 2D features

A feature in NHD that was not available in RF3 is the polygon topology and three associated region themes representing lakes, wide rivers, swamps, and other hydrological features. Of the three region types, the waterbody reach feature has a unique identifier (reach code), and can exist for headwater, terminal, in-line, and isolated waterbodies. The waterbody reach provides a link to waterbodies for external information.

The NHD-RIT also allows the user to index regions from the waterbody reach theme. Since the dynamic segmentation model cannot be applied to polygon features, the tool creates a shapefile that contains the waterbody features selected by the user. The selection is made by drawing a polygon around the waterbodies to be indexed.

By using a shapefile, the user can also index only partial waterbodies. For this purpose, the user needs to draw a polygon around the portion of the waterbody that should be indexed. The tool then clips the desired section of the waterbody with the polygon, and saves the resulting shape in the waterbody shapefile (Figure 4).

Figure 4: The polygon (thin black line) created by the user is used to clip the 07080107_WBrch theme (blue), and create the waterbody shapefile 07080107w.shp (orange).

Other Requirements/Functionalities of the NHD-RIT

Use the standard NHD event table format

The NHD Team at EPA developed a standardized structure for event tables that are used with NHD. In addition to the default event table fields (F_meas, T_meas, Rch_code, and Eoffset), the tables include the following fields:

EVENT_ID: A unique identifier used for event maintenance

ENTITY_ID: Identifier used to link the event to an external data source

ATTR_PRG: Attribute describing the program/purpose/classification of the event table

ATTR_VAL: Value related to the ATTR_PRG field

META_ID: Identifier used to link the event to metadata information

STATE: Abbreviation of the state in which the reach is located

RCH_DATE: Date when the associated reach was created

DUU_ID: NHD DUU identifier

Create, modify and delete events from the event table and waterbody shapefiles through a user-friendly interface

The NHD-RIT is designed to hide the complexity of the dynamic segmentation data model and its data structure from the user. This is accomplished by providing the user with tools for adding, modifying and deleting linear and point events and waterbody shapefiles.

Figure 5: Menus, buttons, and tools available to the user. The buttons with the blue symbols are the additional buttons and tools provided by the NHD-RIT.

To add linear events, selection tools (see below) are available to the user. After the user selects the desired reaches from the transport reach theme and provides an ID for the new events, the tool populates the fields in the event table with the appropriate data values.

To create a point event the user has two choices: (1) a point located on the reach, and (2) a point offset from the reach. In either case, he/she only has to click on the location where the point should be created.

Tools for editing spatial and attribute information in the event tables and waterbody shapefiles are provided as well. The user can change the extent of indexed entities by moving the end points, or an entity can be split into two if the need arises. Entity ID, Attribute program (Attr_prg) and value (Attr_val) fields can be changed as well.

The NHD-RIT also allows the user to delete linear and point events by selecting them, and clicking on the Delete button.

Provide a set of selection tools to facilitate the selection of continuous sets of reaches

The NHD-RIT includes several selection tools to help the user select multiple stream reaches that make up a surface water entity connected by flow relations. Currently, only two are implemented: the 'point-to-point' selection, which selects all reaches between two points along the transport route, and the 'upstream' selection, which selects all reaches upstream of a selected reach. At a later stage, additional selection tools will be added such as:

select only reaches along the main stem upstream of the selected reach

select all reaches between two reaches including tributaries

select all reaches along the main stem downstream of the selected reach
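
The 'upstream' selection that is already implemented amounts to a traversal of the flow relations against the direction of flow. The sketch below shows the idea in Python; it is an assumption-laden illustration rather than the tool's code. The flow relations are assumed to be available as (from reach, to reach) pairs, the reach identifiers are placeholders, and the selected reach itself is included in the result.

```python
from collections import defaultdict, deque


def select_upstream(flow, start):
    """Return the start reach plus every reach that drains into it.

    flow is an iterable of (upstream_reach, downstream_reach) pairs.
    """
    # Index the relations by downstream reach so they can be walked backwards.
    feeders = defaultdict(list)
    for up, down in flow:
        feeders[down].append(up)

    selected = {start}
    queue = deque([start])
    while queue:
        reach = queue.popleft()
        for up in feeders[reach]:
            if up not in selected:  # guards against loops and braided channels
                selected.add(up)
                queue.append(up)
    return selected
```

The planned main stem and downstream variants would follow the same pattern, either by filtering which feeders are followed or by indexing the relations in the opposite direction.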

Create and maintain FGDC compliant meta data for event and waterbody tables

The tool maintains meta data for every event. In order to minimize the amount of meta data, events share meta data entries. If events were created with the same source information and by the same user, they share the same meta data and therefore the information is stored only once.

Meta data are maintained in a dBase table, which contains a separate field for each piece of information that is required. Most of the information is automatically maintained by the tool. The user is only responsible for providing information about himself/herself and the sources that were used to index the entities (Figure 6).

Figure 6: Main entry screen for the meta data created by the NHD-RIT.

Provide the user with a list of IDs to facilitate indexing

To facilitate the task of assigning IDs to events, the tool provides the user with a list of IDs. The list is specified in one of the following ways:

As a predefined list in dBase format that is loaded into the project

Created while the user indexes, in this case IDs are added to the list as the user enters them

Extracted from an existing database through ODBC.

The list is displayed (Figure 7) whenever the user adds a new linear/point entity or a new waterbody feature. If the user wishes to change entity information for one or more events, the list is displayed as well.

Figure 7: This dialog is displayed whenever the user adds a new entity, or updates information related to entity ID, attribute program or values.

Conflate information from user specified coverages to event tables

To provide an easy way for new users to convert their existing data to event tables, an automatic conflation tool is integrated into the NHD-RIT. The conflation functionality creates events by creating buffers around features to be indexed. These buffers are then used to select the closest NHD reach, on which the event is created. Once the conflation process is completed, the user is encouraged to visually verify the results. For this purpose the tool contains a QA/QC utility that steps through the conflated entities and lets the user compare them to the original features.
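
The buffer-and-select step can be sketched as a nearest-reach search limited to a buffer distance. The Python below is a simplified illustration under stated assumptions, not the tool's algorithm: the feature to be indexed is reduced to a single point, reaches are polylines with at least two vertices in planar coordinates, and the function names and reach identifiers are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def closest_reach(point, reaches, buffer_dist):
    """Return the code of the nearest reach within buffer_dist, or None.

    reaches maps a reach code to its polyline, a list of (x, y) vertices.
    """
    best_code, best_dist = None, buffer_dist
    for code, line in reaches.items():
        d = min(distance_to_segment(point, a, b) for a, b in zip(line, line[1:]))
        if d <= best_dist:
            best_code, best_dist = code, d
    return best_code
```

A feature with no reach inside its buffer returns None, which is the kind of case the QA/QC step would surface for manual review.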

Work on NHD coverages and shapefiles

To provide users with maximum flexibility, the tool was implemented to work with NHD in coverage or measure shapefile format.

Create a transaction file for updating a central database

Throughout the indexing process, the tool records any creation, modification or deletion of an event in a transaction table. This table is used for event maintenance in a central event database such as EPA's Reach Addressing Database (RAD).
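
One way to picture the transaction table is as an ordered log that is later replayed against the central event database. The sketch below is illustrative only: the layout of the actual transaction table and the update mechanism of RAD are not described here, so the field names, the three action labels, and the dictionary standing in for the central database are all assumptions.

```python
def record(log, action, event_id, event=None):
    """Append one create/modify/delete entry to the transaction log."""
    if action not in ("create", "modify", "delete"):
        raise ValueError(f"unknown action: {action}")
    log.append({"action": action, "EVENT_ID": event_id, "event": event})


def apply_transactions(central, log):
    """Replay the log, in order, against a central store keyed by EVENT_ID."""
    for entry in log:
        if entry["action"] == "delete":
            central.pop(entry["EVENT_ID"], None)
        else:  # create and modify both store the latest version of the event
            central[entry["EVENT_ID"]] = entry["event"]
    return central


# "REACH-A" and "REACH-B" are placeholders, not real NHD reach codes.
log = []
record(log, "create", 1, {"Rch_code": "REACH-A", "F_meas": 0, "T_meas": 100})
record(log, "modify", 1, {"Rch_code": "REACH-A", "F_meas": 0, "T_meas": 40})
record(log, "create", 2, {"Rch_code": "REACH-B", "F_meas": 10, "T_meas": 20})
record(log, "delete", 2)
central = apply_transactions({}, log)
```

Replaying the entries in the order they were recorded is what lets the central copy end up in the same state as the local event table.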

Impact on State Water Programs

The benefits of using the reach indexing tool can be evaluated on two levels: on a technical level and on a management level.

Technical benefits

On the technical level the following benefits can be identified:

The NHD-RIT automates the process of georeferencing surface water information. The users only need a working knowledge of ArcView and the surface waters they are indexing. This not only benefits states, which often have a shortage of GIS specialists, but also enables smaller agencies and groups to take advantage of NHD. The tool reduces the learning curve for indexing water-related information, since it does not require the user to learn the complex data structure of NHD.

Event tables provide a dynamic environment. Changes in water-related information can lead to frequent updates in waterbody delineation. Events can be easily added, modified, deleted or copied with the NHD-RIT. Also, the offset feature in ArcView allows several sets of information that apply to the same reach to be displayed at the same time in a clear manner.

Information for different water programs (e.g. 303(d), 305(b), etc.) can be displayed using the same spatial dataset, which reduces the storage space requirements. Also, event tables can be easily exchanged between users, since these tables are often only a fraction of the size of a spatial dataset.

One state that uses the RF3 version of the RIT extensively is Tennessee. Before they used the RF3-RIT, the GIS staff had to visually identify reaches and look up their IDs in the attribute table of the RF3 coverage. The IDs were then stored with the water quality information, and were used to link the information back to the coverage. This process involved various steps and was not automated, which made it fairly time consuming and prone to errors. In addition, it was not possible for the state to assign attributes to only a section of a reach.

Through the use of the RF3-RIT (and now the NHD-RIT), Tennessee is able to simply select the reaches that need to be indexed and assign them an ID with a few mouse clicks. Also, they are now able to assign attributes to the exact location on the stream network, since the tool is not limited to indexing entire reaches only.

Another benefit according to the state is that mapping of the information can be done much more quickly. Previously, the creation of a map involved various join operations among different datasets. This is no longer required, since the event tables provide an easy-to-use structure that facilitates the linking of spatial and attribute information.

The state recently indexed their 305(b) waters for the entire state. For this purpose, they had two staff working together, one with EPA's 305(b) Assessment Database and one with the RIT. While one person entered the data into the 305(b) database, the second person delineated the waterbody and was able to provide additional spatial information for the database (e.g. length of waterbody). Through this method, a waterbody entity could be loaded into the database and indexed at the same time.

Management benefits

State and federal resource managers will benefit from spatial information in a consistent format that they have never had before.

Maps can be created easily and quickly from event tables, and can provide management with excellent visual tools for decision making. For example, problem areas and their potential source can be located easily.

Event tables can also be used as a link between the different databases, since they contain the reach code, which can be used to spatially relate information and make decisions based on it. For example, if point discharge information were mapped with the NHD-RIT (point events) and any streams or lakes with water quality problems were also indexed, a map displaying both could help management determine which facilities may lead to water quality problems and which may not.
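
The reach code link in this example can be sketched as a simple relate between a point event table and a linear event table. The Python below is a hypothetical illustration: the P_meas field name for the point measure and all identifiers are assumptions, and it only matches a discharge to an indexed section on the same reach, whereas a real analysis would also follow flow relations downstream.

```python
def facilities_on_indexed_sections(point_events, linear_events):
    """Pair each point event with the linear events that cover its location.

    A pair is reported when both events share a reach code and the point
    measure falls between the From and To measures of the linear event.
    """
    pairs = []
    for point in point_events:
        for section in linear_events:
            lo, hi = sorted((section["F_meas"], section["T_meas"]))
            same_reach = point["Rch_code"] == section["Rch_code"]
            if same_reach and lo <= point["P_meas"] <= hi:
                pairs.append((point["ENTITY_ID"], section["ENTITY_ID"]))
    return pairs


# Placeholder reach codes and entity IDs, for illustration only.
discharges = [
    {"ENTITY_ID": "PLANT-1", "Rch_code": "REACH-A", "P_meas": 60.0},
    {"ENTITY_ID": "PLANT-2", "Rch_code": "REACH-A", "P_meas": 10.0},
    {"ENTITY_ID": "PLANT-3", "Rch_code": "REACH-B", "P_meas": 60.0},
]
impaired = [
    {"ENTITY_ID": "WB-1", "Rch_code": "REACH-A", "F_meas": 40.0, "T_meas": 100.0},
]
matches = facilities_on_indexed_sections(discharges, impaired)
```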

Conclusion

The Reach Indexing Tool allows the user to assign and display water-related information easily and consistently. It is designed to work with the recently released National Hydrography Dataset, and provides a full set of functions that offer a user-friendly way to georeference any water-related information. The output of the indexing work (event tables) can be used for modeling and display purposes to improve decision making. EPA and several states are actively using the RIT.

In Tennessee, for example, the tool was used to create and submit a 303(d) list to EPA for the year 2000. The tool had already been used to index their 303(d) list for the previous cycle (1998), which made the data easily accessible for the cycle in the year 2000. EPA is using the NHD-RIT to georeference several types of waterbodies to the NHD for national tracking and decision making.

References

USGS and EPA. Feb. 2000. "The National Hydrography Dataset: Concepts and Content", http://nhd.usgs.gov/chapter1/index.html
Research Triangle Institute. 2000. "Reach Indexing Tool for the National Hydrography Dataset (NHD-RIT): Requirements Document".
Research Triangle Institute. 2000. "Reach Indexing Tool for the National Hydrography Dataset (NHD-RIT): Design Document".
