Procedures_Used:
The data were received as compressed GIRAS tar files representing
either a 1:250,000-scale (1:250K) quadrangle or a 1:100,000-scale (1:100K)
quadrangle. Each file was named after its respective quadrangle. A
coverage of 1:250k quadrangles was used to divide the country up into
four sections and get a list of names for each section. Using GIRASARC2,
an aml designed to create an ARC/INFO data set (coverage) from a GIRAS file
and a corresponding neat line coverage, it was quickly discovered that many
of the quad names were too long for the program (e.g. sault_saint_marie)
and a generic naming system for files and coverages was incorporated. In
1 of 10 cases, the name of the quadrangle did not correspond with the name
of the file. These problems were traced down and corrected (after all four
sections were converted there were many files left over...these wound up being
all the 1:100k quads which did not have similar names to the 1:250k
files).
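The generic naming step can be illustrated with a minimal Python sketch.
The actual naming scheme and the length limit enforced by the program are
not recorded here, so MAX_NAME_LEN, the "q" prefix, and the function name
generic_names are all assumptions made for illustration.

```python
# Illustrative only: the real naming scheme is not documented in this record.
MAX_NAME_LEN = 13  # assumed name-length limit; replace with the real one


def generic_names(quad_names, prefix="q"):
    """Map each quadrangle name to a short generic name (q001, q002, ...)."""
    unique = sorted(set(quad_names))
    width = max(3, len(str(len(unique))))
    lookup = {}
    for i, name in enumerate(unique, start=1):
        short = f"{prefix}{i:0{width}d}"
        if len(short) > MAX_NAME_LEN:
            raise ValueError(f"generic name {short!r} exceeds the limit")
        lookup[name] = short
    return lookup


quads = ["sault_saint_marie", "seattle", "bakersfield"]
lookup = generic_names(quads)
# Names that would not have fit under the assumed limit.
too_long = [q for q in quads if len(q) > MAX_NAME_LEN]
```

Keeping the lookup table alongside the converted files is what allows a
generic name to be traced back to its quadrangle later.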
After the files for a given section were all converted into ARC/INFO
format, a loop aml was run which copied a coverage and its neatline
cover into temporary storage (there was not enough room in info to
deal with a large number of files in one directory), attached to
that directory, built line topology, and went into the editor, ARCEDIT.
In ARCEDIT, the outer edge (original neatline) was selected and deleted and
the mathematically-calculated neatline coverage from the GIRASNEAT AML
program was copied in using the ARCEDIT GET command. The original
neatline was replaced with a calculated neatline because in all cases, the
outline of the coverage quad never quite conformed to a "true" neatline
causing overlaps and gaps between adjacent maps. The new neatline was
connected to the internal arcs where they intersected. Lines which did
not quite join the new neatline were extended to the edge with a maximum
tolerance of 500 meters. All extensions were made within this tolerance.
All arcs which extended beyond the new neatline were clipped off within
a 500 meter tolerance as arguments to the CLEAN command into a separate
directory. Both the neatline and huc coverages were deleted from the
temporary space, and the program looped to the next coverage.
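The 500-meter extend/clip rule can be sketched in Python as follows. This
is not the AML that was run. It assumes, for simplicity, that the neatline
is an axis-aligned rectangle in projected meters (a real quadrangle
neatline generally is not), and the function name snap_to_neatline is
hypothetical.

```python
import math

TOLERANCE_M = 500.0  # maximum extension/clip distance stated in the text


def _clamp(value, low, high):
    return max(low, min(high, value))


def snap_to_neatline(point, box, tol=TOLERANCE_M):
    """Move an arc endpoint onto the nearest neatline edge if within tol.

    point: (x, y) in meters.
    box:   (xmin, ymin, xmax, ymax), a simplified rectangular neatline.
    Endpoints just inside the box are extended to the edge; endpoints just
    outside are pulled back to it. Anything farther than tol is unchanged.
    """
    x, y = point
    xmin, ymin, xmax, ymax = box
    # Nearest position on each of the four edges.
    candidates = [
        (xmin, _clamp(y, ymin, ymax)),
        (xmax, _clamp(y, ymin, ymax)),
        (_clamp(x, xmin, xmax), ymin),
        (_clamp(x, xmin, xmax), ymax),
    ]
    nearest = min(candidates, key=lambda c: math.hypot(c[0] - x, c[1] - y))
    dist = math.hypot(nearest[0] - x, nearest[1] - y)
    return nearest if dist <= tol else point


box = (0, 0, 100000, 100000)
undershoot = snap_to_neatline((300, 50000), box)     # 300 m inside the edge
overshoot = snap_to_neatline((100400, 20000), box)   # 400 m outside the edge
untouched = snap_to_neatline((800, 50000), box)      # beyond the tolerance
```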
Another program was then run which added an item to the .aat called OUTER,
went into INFO, and populated the attribute for all arcs composing the
new neatline. This was done by reselecting for the identity of the polygon
to the left or right of each arc whose value was "1", the identity of the
outer "universe" polygon (reselect lpoly# = 1 or rpoly# = 1 in the .aat and
calculated outer to = 1). All coverages were checked for additional
dangles and then a MAPJOIN was run using NET as the feature option.
Finally, most map edge lines were removed from the MAPJOINed coverage
using the DISSOLVE to create a seamless basin coverage with polygons
(basins) and arcs (boundaries) with attributes.
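The OUTER selection logic can be expressed as a short Python sketch. The
item names lpoly#, rpoly#, and outer come from the text above; holding the
.aat records as a list of dictionaries, and leaving unselected arcs at 0,
are assumptions made for illustration.

```python
UNIVERSE_POLY = 1  # internal number of the outer "universe" polygon


def flag_outer_arcs(aat_rows):
    """Set outer = 1 on arcs whose left or right polygon is the universe."""
    for row in aat_rows:
        on_neatline = (
            row["lpoly#"] == UNIVERSE_POLY or row["rpoly#"] == UNIVERSE_POLY
        )
        # Arcs not selected keep the assumed default of 0.
        row["outer"] = 1 if on_neatline else 0
    return aat_rows


aat = [
    {"arc": 1, "lpoly#": 1, "rpoly#": 5},   # borders the universe polygon
    {"arc": 2, "lpoly#": 5, "rpoly#": 6},   # interior basin boundary
    {"arc": 3, "lpoly#": 7, "rpoly#": 1},   # borders the universe polygon
]
flags = [row["outer"] for row in flag_outer_arcs(aat)]
```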
Quality control methods were applied to the resulting coverage by detecting
and fixing node and label errors and remaining neat line arc problems
(i.e. long neat lines still in the coverage). Many more problems arose
in the western part of the country than in the east. Bordering HUC code
disagreements between quads caused a number of cases in which neatlines
did not dissolve. These were provisionally corrected for the most part;
however, several cases required external review and editing to fix, and
those corrections are now incorporated in the final data set. After all 1:250K
sections were completed, the same procedure was run for the handful of
1:100k quads. These were mapjoined with the 1:250k quads to provide more
detailed coverage where it was available.
Revisions:
Revision #1. See above for all the details
Process_Date 10/92
Revision #2. Seattle and Bakersfield quadrangles were missing
from the composite supplied by Pete Steeves. These were manually
pasted in using Arcedit with small tolerances. Labelerrors were
remedied and most dangles were removed using the Eliminate command.
Process_Date 1/93
Revision #3. The following changes were made to a
1:250,000-scale version derived from National Mapping Division's
Geographic Information Retrieval and Analysis System (GIRAS) data.
The discrepancies in the hydrologic unit codes (HUCs) in California
were changed because the California State Hydrologic Unit Map (HUM)
was revised in 1978 but the 1:250,000-scale digital dataset was not.
This has been reviewed by Bill Battaglin, Doug Nebert, and Paul
Kapinos and is noted under Reviews (#6 below).
The areas in which the HUC labels were incorrect in California were
180701, 180702, 180703, 180600, 180300, and 180400. Boundaries were
added in 180702 and 180600 from the 1:2 million source. Along the
Oregon/California border, a boundary was added in 180102. In Wyoming,
a boundary was added in 100902 from the 1:2 million source. Labels
were corrected in these HUCs to reflect state updates, and where
necessary, to add new labels to the newly-drawn boundaries. Map edges
were manually removed in Arkansas, California, and along the
Oregon/California border.
After the changes were made and saved in Arcedit, the build and clean
commands were executed, followed by labelerrors. Three polygons had
duplicate labels and were corrected. The labels were centered in the
polygons by the centroidlabels command. Verification of the coverage
was done by the describe command.
Process_Date 12/93.
Revision #4. The NAMES file was added to the data set and its
attributes were defined in the ATT file of the documentation. This
table is a lookup table to correlate the 8-digit numbers with verbose
names officially assigned to the basins.
Process_Date 3/94.
Revision #5. The following corrections were made to the 1:250,000-scale
coverage of Hydrologic Unit Codes (HUC250):
Valid HUC code, 7140103, added to HUC250.NAM. Bourbeuse, Missouri.
HUC250.NAM was sorted on HUC.
HUC frequency >1, tiny polygons were deleted that were erroneous:
17010212 deleted small poly to NW of main poly
10130305 deleted small poly to S of main poly
10230005 deleted small poly to S of main poly
14020001 deleted small poly to N of main poly
15050201 deleted small poly to W of main poly
04080203 deleted small poly to N of main poly
03120001 deleted small poly to S of main poly
Invalid HUC codes, not in names file, were corrected:
18020023 HUC should be 18020111 (in N-central California)
18070010 HUC should be 18070303 (in so. California)
15010017 HUC should be 15010007, delete arc separating it
(in nw Arizona)
1870201 HUC should be 18070201 (in so. California, missing an 0)
1870204 HUC should be 18070204 (in so. California, missing an 0)
18060012 HUC should be 18060011 (in so. California,
improper polygon closure)
18060011 HUC label added after polygon closure of 18060011
HUC frequency >1, larger polys were checked and corrected:
18020126 western poly is 18020108 in HUC2M (CA)
18050005 southern poly is 18050006 in HUC2M (CA)
18060006 split into 2 polys, no apparent reason, delete arc
splitting polys (CA)
04110001 and 04100001 together are 04100001 in HUC2M (MI)
(MAPEDGE was deleted)
02080108 northwestern poly is 02080208 in HUC2M (VA)
The invalid HUC codes and 7140103 were found by relating to the
HUC250.NAM file, and identifying polygons with no match in the names
file. The rest were found by looking at the 96 polygons which had
HUC codes with frequencies >1 in the PAT. Most of these seemed to
be correct, and were along the US-Canada boundary, or were islands
along the coasts.
These errors were found in the HUC250 coverage published as OFR 94-0326.
Process_Date 12/94 & 1/95
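The two checks described for Revision #5 (codes with no match in the names
file, and codes carried by more than one polygon) can be sketched in
Python. This is not the procedure that was run in ARC/INFO. Storing HUC
codes as strings and the function name check_huc_codes are assumptions,
and the inputs are a toy subset built from codes listed above.

```python
from collections import Counter


def check_huc_codes(pat_hucs, valid_hucs):
    """Return (codes missing from the names file, codes on >1 polygon)."""
    counts = Counter(pat_hucs)
    unmatched = sorted(code for code in counts if code not in valid_hucs)
    repeated = sorted(code for code, n in counts.items() if n > 1)
    return unmatched, repeated


names_file = {"18020111", "18070201", "18060011"}       # toy valid codes
pat = ["18020023", "1870201", "18060011", "18060011"]   # toy polygon labels
unmatched, repeated = check_huc_codes(pat, names_file)
```

A repeated code is only a candidate for review; as noted above, most
polygons with frequencies >1 turned out to be correct.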
Reviews_Applied_to_Data:
Peer review, 10/18/93, Bill Battaglin, USGS-WRD, Lakewood, CO, memo to
Doug Nebert: -
"I have completed a review of the 1:250,000 scale hydrologic units coverage
(HUC) and found the digital data and metadata to be of high quality. I
have a few suggested improvements to the digital data and to the
documentation. Below is a summary of the methods I used to check feature
accuracy in the digital data base and the problems I found.
Digital Features:
The line work for the HUC coverage was checked against the line work from:
(1) the 1:2,000,000 HUC coverage by plotting both data sets out on one large
graphic (about 1:3,000,000). No major discrepancies were found except in
coastal areas where the 1:2,000,000 scale coverage had more detail than the
1:250,000 scale coverage.
(2) line work from 1:24,000 scale digitized drainage basins in Colorado,
Illinois, and New Jersey. The match was generally good with departures
generally less than 2500 meters. The biggest departures were in Colorado
and were as large as 4000 meters.
(3) line work from the 1:2,000,000 scale rivers coverage for the USA by
plotting both data sets out on one large graphic (about 1:3,000,000). In
general the nesting of streams in HUCs was good and HUC boundaries
intersected streams at stream intersections. In some places (SE New Mexico,
SE California and NW Utah), the streams coverage does not match the HUC
coverage that well, but this could easily be because of the unusual nature
of streams in these areas or because of inaccuracies in the streams coverage.
(4) line work from 1:100,000 scale streams from Colorado, Illinois, and
Kansas. The nesting of streams in HUCs was very good. Stream arcs for
the most part did not cross HUC arcs except at stream intersections. The
error (distance from intersection to HUC line) between HUC lines and stream
intersection was less than 500 meters at all intersections checked
(about 25).
Problems with Line work:
(1) There was a very large number of very short arcs in the coverage (3211
Lt 1000 meters long and 1729 Lt. 100 meters long). Most of these arcs were
internal (did not border on outside polygon) and coded as 250k edges(3)
(almost 3000) but some were 250k (2) lines and one was a 2m dlg (4). Arcs
with lengths of less than 100 meters (maybe even less than 1000 meters) are
difficult to deal with when editing subsets of the coverage, and they also
add to the overall size of the database. I know many of these lines were
created in the process of edgematching the quads, but I think the informa-
tion content of these very short arcs is less valuable than the hassle and
overhead involved in keeping them in the coverage.
(2) The edit distance for the coverage was set to a very small value.
This may have been required for earlier processing, however, it makes
the finished coverage difficult to work with. I had to reset the edit
distance to a larger value when I wanted to select arcs in ARCEDIT
interactively. This, of course, will be one of the things users will
want to do with the new HUC coverage.
Polygon labels/attributes:
(1) Label point accuracy was checked by making a point cover of polygon
labels from the 1:2,000,000 HUC coverage and then doing an identify of
those points in the 1:250,000 scale HUC polygon. This procedure looked
for both new or missing polygons, and was also used to check attribute
values. I also dissolved both coverages by accounting unit and compared
the number and location of remaining polygons.
Problems with labels/attributes:
(1) I discovered a total of 649 places where the HUC codes from the label
point of the 1:2,000,000 coverage did not match the HUC code for the
1:250,000 HUC polygon that it fell within. As you had indicated in the
documentation, there were a lot of differences in California. The 2m HUC had
lots of label points resulting from islands, bays, and estuaries that are
not included in the 1:250,000 scale HUC coverages. In other places the
polygons seemed to be the same but the HUC codes were different. For example
HUC 18020111 in the 1:2,000,000 coverage is coded as HUC 18020023 in the
1:250,000 coverage. There were also many differences in the Great Lakes.
It seems odd that the 1:2,000,000 coverage should have more detail with
regard to coastal features than the 1:250,000 scale coverage has. There
were also internal polygon label differences in Minnesota (7100001 in 250k,
70200001 in 2m), Colorado (10090204 in 250k, 10180007 in 2m), Illinois
(mistake in the 2m HUC I think), and Louisiana (11140203 in 250k, 11140202
in 2m). Texas and Florida also have a few that look like they should be
checked.
(2) The dissolved 1:2,000,000 coverage contained 350 accounting unit
polygons while the dissolved 1:250,000 HUC coverage only contained 177.
There were large differences in the way the Accounting unit polygons
looked in the Great Lakes Region, and in parts of California, Wyoming,
and Florida. Again, many of the differences result from the use of a
cruder coastline in the 1:250,000 scale HUC coverage.
Coverage Documentation:
The coverage documentation was reviewed both editorially and for overall
completeness. The documentation was editorially sound and did not need any
corrections.
Problems with the Documentation:
(1) The redefined items in the pat file were not defined in the data
dictionary portion of the documentation file.
(2) The complete reference to the source material for the data is not in the
documentation file."
Response to Peer review by Bill Battaglin, 1/5/93, Doug Nebert,USGS-WRD
Reston
Data were reviewed for attribute accuracy against a 1:2million base through
random audit of polygon features. Line attributes were verified by symbol-
ization on the screen. Regions were shaded in to verify correct polygon
values for HUC at the Hydrologic Region level. Documentation was updated.
The short arcs along the quadrangle boundaries were kept in the data set
due to the importance of maintaining as much original information as
possible. Basin codes were updated and additional erroneous neatlines
removed.
Peer review, 11/10/93, Doug Nebert, USGS-WRD, Reston, memo to Paul Kapinos:
"As you are aware, we have several digital versions of the hydrologic unit
maps for the United States and I am in the process of verifying and publishing
a 1:250,000-scale version derived from National Mapping Division Geographic
Information Retrieval and Analysis System (GIRAS) data as part of their land
use mapping program of the 1970s and early 1980s.
In comparing the 1:250,000-scale data reviewers noticed differences in both
basin definition and hydrologic unit codes in Southern California and in the
San Joaquin valley. The 1974 state map, at 1:500,000-scale agrees with the
1:250,000-scale GIRAS data in boundaries and numbers, whereas the 1:2.5 million
"wall map" of the U.S. agrees with the 1:2,000,000 digital data set. Both p
maps are authoritative sources of information, but apparently something changed
between the two maps.
On a related note, it is worthwhile to mention that the 1:2.5 million-scale
wall map for the western U.S. is being revised to include new Alaska hydrologic
unit codes before reprinting. It would be wise to be sure that the boundaries
depicted there are also the authoritative ones.
I would appreciate your review and adjudication of the California hydrologic
unit definitions in order for us to publish this digital data set. Please
provide a written response (e-mail and paper copy) and marked-up maps as to
which basins and boundaries are current."
Peer review, 11/29/93, Paul Kapinos, USGS-WRD, memo to Doug Nebert:
"The discrepancies in the hydrologic unit codes (and some boundaries)
in the State of California are due to the fact that the California
State Hydrologic Unit Map (HUM) was revised in 1978 but the 1:250,000-scale
digital data set was not. The events that most likely occurred can be
summarized as follows:
o The 1:500,000-scale HUMs were published by OWDC over a period of about
four years between 1974 and 1978.
o The National Mapping Division (NMD) overlaid the hydrologic unit
boundaries on their 1:250,000-scale land-use and land-cover map
series after each State HUM was completed, and later digitized these
boundaries and their respective codes.
o In 1978, the State of California asked OWDC to revise the hydrologic
unit boundaries and codes in the central valley.
o The 1:500,000-scale California HUM was revised and reprinted but NMD
was either not informed of the revisions or chose not to revise or
redigitize their 1:250,000-scale overlays.
o Once all the HUMs were printed (including the 1978 revisions of
California and South Dakota), the 1980 1:2.5 million-scale United
States wall map was published using the up-to-date (1978) boundaries
and codes.
Based on the above summary, I would recommend using the boundaries
and codes from the 1:2.5 million-scale map and the 1:2,000,000 digital
data set. Please be aware that other hydrologic unit boundaries and/or
codes may have been revised when individual State HUMs were reprinted
by OWDC. I doubt if there has been any attempt to update any of the
digital data sets with these changes."
Response to Peer Review by Paul Kapinos, Doug Nebert 2/14/94:
-------------------------------------------------------------
The areas in question in California were updated to reflect the more
current information as contained in the 1:2 million data set. Polygon
hydrologic unit codes were updated in the Central Valley and in coastal
Southern California. Where necessary, 1:2 million-scale linework was
substituted to define the correct basin boundaries where no corresponding
information was available at a different scale.
Related_Spatial_and_Tabular_Data_Sets:
Any data set which has hydrologic unit codes as part of its data may
be able to use this data.
Other_References_Cited:
U.S. Geological Survey, 1990. Land Use and Land Cover Digital Data from
1:250,000- and 1:100,000-Scale Maps. Data Users Guide 4, 33 pp, Reston,
Virginia.
Notes: