GSP 536 Final Technical Report

City of Flagstaff Urban Geodatabase Project

A Technical Report

John Bonifas

Advisor: Dr. Ray Huang

Committee members:

(TBD)

December 16, 2016

Northern Arizona University

Flagstaff, AZ

928-214-8214

(total words: 2,852)

Table of Contents

Abstract  
Introduction  
Goals and Objectives  
Methodology  
Conclusion and Final Discussion  
List of Figures  
References  

Abstract

This report chronicles the development process of the enterprise geodatabase project for the GIS department of the city of Flagstaff, Arizona. It begins with an introduction and background section, then lists the project's goals and objectives, leading into a methodology section that explains the mechanics of how the system was designed and developed, and concludes with a discussion concerning any major mishaps in the project that could have been avoided, and comments related to future projects of this type.

Introduction

In the mid-1980s, a landmark study was conducted for the province of Ontario, Canada. The project, one of the first enterprise geodatabase efforts, inventoried the needs of local government agencies throughout the province. Since then, enterprise-class computing power has increased dramatically, as have the number of features and the power of GIS software. For example, in the late 1990s Montgomery County, Maryland used ESRI's ArcGIS 8 SDE enterprise software and an Oracle server to centralize data that was formerly scattered among many Unix and Windows servers (Chen, 2004).

Prior to the development of Flagstaff's centralized geodatabase, many departments in the city's government were maintaining their own spatial data sets. Data sharing was carried out mainly by email and sometimes by delivering disks through the internal mail system. Later, some departments chose to share data on public folders on the intranet. This introduced inconsistency in the data and hampered data quality control efforts (Huang, 2016).

Goals and Objectives

The purpose of our project was to apply modern enterprise GIS principles to centralize Flagstaff's spatial data, host the attribute data related to that spatial data, and provide a robust multiuser GIS environment that would maintain data quality control.

Our objectives were to:

Overview

A Geographic Information System, or GIS, is a computing system that stores, processes, analyzes, and reports spatial data in addition to non-spatial, or attribute, data. A geodatabase is a relational database that contains spatial data. An enterprise geodatabase is a geodatabase that is hosted by an enterprise-class relational database management system. An enterprise-class relational database management system, such as the Oracle database server that we chose, is designed to support up to thousands of simultaneous users and multiple terabytes of data (each terabyte being 1,000 GB).

Prior to the implementation of the enterprise geodatabase, the city departments that used spatial data stored it mostly in ESRI shapefiles, a computer file format that supports the storage of spatial data. A few departments were also storing and exchanging their spatial data in ESRI coverage files, a more robust form of spatial data storage. Users who stored these types of files in common LAN file directories did not need email or physical disks to exchange data, as long as the groups of users needing access to common data files were given the appropriate access permissions.

The server could easily store all of the disparate datasets held by the city's departments in one place. Moreover, it could support many users interacting with it at once, and effectively manage many versions of each spatial feature dataset (a collection of similar feature classes, each class representing a real-world feature such as roads or rivers) without an appreciable degradation of performance. The ArcSDE GIS server that we chose supports not only feature dataset versioning but also spatial data archiving as a backup process. Unlike the city's old shapefiles, the ArcSDE geodatabase could support spatial constraints and relationships, or topology (explained in a later section), preventing spatial features from crossing or overlapping each other. Like the city's old coverage files, the geodatabase could enforce connectivity rules between spatial features, such as those in a transportation or utility network. In addition, features in the geodatabase can have behaviors, which are especially useful when editing and moving features in a transportation or utility network. As an added bonus, the Oracle server could host the ArcSDE business objects and middleware (the communication software sitting between the user's client and the back-end geodatabase). Thus, the system we implemented met our objective of centralizing the data processing, storage, analysis, and output functions of the city departments that use spatial data.

Figure 1. GIS Data Server System Architecture (Chen, 2004)
 
Figure 2. Feature Datasets in the Database (Chen, 2004)

Methodology

After we completed a survey of the various city departments using spatial data, we interviewed representatives of the users to get an understanding of the city's organizational structure, and identified organizational functions and their business requirements. We then identified several business functions and the data items of those functions, and related them to common GIS functions. From this we designed a conceptual model of the data we were going to store.

Figure 3. Conceptual Data Groupings (Huang, 2016)

Next, we developed the logical data model using Unified Modeling Language development software. We designed the feature classes as tables in the geodatabase that would store the data of each real world feature as a row in the table, and identified the attributes of each feature, assigning each attribute as a column in each feature class table.

Figure 4. Logical Model Example: Administration (Huang, 2016)

Keeping our business requirements in mind, and the fact that we couldn't assume that our users would be perfect or technically aware, we then identified and assigned global range and coded-value domains for certain attributes. Finally, we identified certain relationships between feature classes, and included those relationships between feature class tables into the model. Relationships between feature classes allowed us to enforce referential data integrity within the database, and minimized data duplication and corruption.


Figure 5. Relationship Examples (Huang, 2016)
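
The way a relationship between two tables protects referential integrity can be illustrated with a minimal sketch. It uses SQLite as a stand-in for the Oracle host, and the table names (owners, parcels) and their columns are hypothetical, not taken from the project schema. Geodatabase relationship classes are managed through ArcGIS rather than declared this way, so this only shows the underlying idea: a row that points at a nonexistent related row is rejected.

```python
import sqlite3

# Hypothetical tables: a parcels feature class table related to an owners table.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE owners (owner_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE parcels (
           parcel_id INTEGER PRIMARY KEY,
           zoning    TEXT NOT NULL,
           owner_id  INTEGER NOT NULL REFERENCES owners(owner_id)
       )"""
)
conn.execute("INSERT INTO owners VALUES (1, 'City of Flagstaff')")
conn.execute("INSERT INTO parcels VALUES (100, 'R1', 1)")  # valid: owner 1 exists
conn.commit()  # make the setup rows permanent before testing rejections


def insert_parcel(parcel_id, zoning, owner_id):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO parcels VALUES (?, ?, ?)", (parcel_id, zoning, owner_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = insert_parcel(101, "C2", 99)  # owner 99 does not exist
valid_accepted = insert_parcel(102, "C2", 1)    # owner 1 exists
```

Because the owner's name lives only in the owners table, each parcel row stores a key rather than a copy of the name, which is how a relationship also reduces duplication.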

A domain is a database constraint on an attribute: it defines the set or range of values that the attribute is allowed to take. Range domains are usually numeric and continuous; coded-value domains are usually text and discrete. For example, a coded-value domain for the speed limits of a street type could be: "10", "15", "20", etc., while a range domain for the size of a pipe might be anywhere from 16" to 30".

Figure 6. Domain Example (Huang, 2016)
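
The two kinds of domain can be sketched as simple validators. This is an illustration of the concept only, not the ArcGIS implementation; the class names are hypothetical, and the values reuse the speed limit and pipe size examples from the text (the speed limit list is truncated in the text, so only the three stated codes are used).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeDomain:
    """Accepts any numeric value between minimum and maximum, inclusive."""
    minimum: float
    maximum: float

    def allows(self, value):
        return self.minimum <= value <= self.maximum


@dataclass(frozen=True)
class CodedValueDomain:
    """Accepts only values drawn from a fixed list of codes."""
    codes: frozenset

    def allows(self, value):
        return value in self.codes


# Values taken from the examples in the text.
speed_limit_domain = CodedValueDomain(frozenset({"10", "15", "20"}))
pipe_size_domain = RangeDomain(16.0, 30.0)  # inches

checks = {
    "speed 15": speed_limit_domain.allows("15"),  # a listed code
    "speed 17": speed_limit_domain.allows("17"),  # not a listed code
    "pipe 24": pipe_size_domain.allows(24.0),     # inside the range
    "pipe 36": pipe_size_domain.allows(36.0),     # outside the range
}
```

A value that fails its domain check would be flagged during validation instead of being silently stored, which is the protection against user input errors that motivated the domains.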

In an ESRI ArcSDE enterprise geodatabase, all spatial data must be related to a specific coordinate system of the earth's surface in order to guarantee spatial location accuracy. Feature datasets - groups of related feature classes - can only be assigned to one coordinate system at a time. While considering how we were going to group the feature classes into feature datasets before importing the data into the geodatabase, we discovered that spatial data sourced by the city was based on a projected coordinate system, while data needed from the U.S. Census was based on the Geographic Coordinate System of the North American Datum of 1983. Thus, we could not directly implement the logical groupings that we had identified, so we identified the feature datasets as categories instead. The feature datasets that we identified - Administration, Land, Utilities, Transportation, and Environment - were all based on a projected coordinate system of the State of Arizona.
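
The grouping problem can be sketched as a check for logical groups that mix coordinate systems. The feature class names and their group assignments below are hypothetical examples, and the coordinate systems are reduced to two labels; the sketch only shows why a logical group containing both city data and Census data cannot become a single feature dataset.

```python
from collections import defaultdict

# (feature class, logical group, coordinate system) - hypothetical examples.
feature_classes = [
    ("CityLimits", "Administration", "projected"),
    ("CensusTracts", "Administration", "geographic NAD83"),
    ("Parcels", "Land", "projected"),
    ("Streets", "Transportation", "projected"),
]


def coordinate_systems_by_group(classes):
    """Map each logical group to the set of coordinate systems found in it."""
    systems = defaultdict(set)
    for _name, group, coordinate_system in classes:
        systems[group].add(coordinate_system)
    return dict(systems)


def mixed_groups(classes):
    """Return the groups that cannot be stored as a single feature dataset."""
    by_group = coordinate_systems_by_group(classes)
    return sorted(group for group, systems in by_group.items() if len(systems) > 1)


problem_groups = mixed_groups(feature_classes)
```

In this example only the Administration group is reported, because it is the only one that combines projected and geographic data.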

We identified certain feature classes that could be further refined into database subclasses, or subtypes. For example, we identified four different subtypes for real-world road features - streets, highways, freeways, and freeway onramps - then assigned different constraints to each subtype. Constraints on feature classes and their subtypes included attribute domains, relationship constraints, connectivity rules between subtypes, and topological rules. Topological rules are constraint rules on spatial data, and within a geodatabase they are collectively known as a topology.

Once the geodatabase schema - the total set of object, relationship, and constraint definitions within the geodatabase - had been completed, we imported the data into the database, and reconciled the results, correcting any errors we found. We calculated an estimate of the spatial data storage requirements for the geodatabase, and decided upon a data precision figure of 2126 storage units for every 1-foot map unit. The resolution of the input data was known, and we based the calculations on a total extent for the city of Flagstaff of about 1 million by 1 million feet.
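
The relationship between precision and extent can be checked with a short calculation. The report does not state how the figure of 2126 was derived, so the sketch below rests on an assumption: that coordinates are stored as 32-bit positive integers, as in ArcGIS geodatabases before version 9.2, which caps the product of extent and precision at 2,147,483,647. The function names are hypothetical.

```python
# ASSUMPTION: coordinates are stored as 32-bit positive integers, so the
# largest storable value is 2**31 - 1 = 2,147,483,647.
INT32_MAX = 2**31 - 1


def max_precision(extent_in_map_units):
    """Largest whole number of storage units per map unit that still fits."""
    return INT32_MAX // extent_in_map_units


def fits(extent_in_map_units, precision):
    """True if every coordinate in the extent can be stored at this precision."""
    return extent_in_map_units * precision <= INT32_MAX


extent_ft = 1_000_000      # about 1 million feet on each side (from the text)
chosen_precision = 2126    # storage units per foot (from the text)

upper_bound = max_precision(extent_ft)       # ceiling under the assumption above
chosen_fits = fits(extent_ft, chosen_precision)
smallest_step_ft = 1 / chosen_precision      # finest distinction the storage can record
```

Under that assumption the chosen figure sits just below the ceiling for a 1 million foot extent, and the smallest recordable step is a little under five ten-thousandths of a foot.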

Once the database was prepared and the data loaded, we began training the users on how to interact with the geodatabase using ArcGIS Desktop, and helped the users develop data input and management processes. We chose the ArcMap and ArcCatalog tools as the user clients because of their compatibility with the ArcSDE server software.

Figure 7. ArcGIS/ArcSDE Software Model (Trust, 2011)

We created an account for each user and an account for applications to use, and set up database roles. Unlike Microsoft Access software hosting personal geodatabases, the Oracle server host of the enterprise geodatabase supports a robust security model, and we set up user roles, developer roles, and administrator roles. For those requiring Microsoft Access, or those who prefer it over such clients as Microsoft Excel, we developed a Microsoft Access application in the Visual Basic .NET computer language.

The Development Review Board was our pilot customer for the development of versioning policy and the versioning/archiving process for the geodatabase. Support for multiple versions of a feature dataset allows multiple personnel from different departments to work on the update of a feature dataset(s) and its classes, and any related non-spatial tables, simultaneously. Versions are managed at the database level by ArcSDE and the Oracle host.

For each submittal to the Review Board, plat developers are to develop at least two alternative preliminary plat designs, designated A and B. When a submittal package is received by the board from the developers, the receival clerk creates a version in the database for each alternative plat of the submittal by cloning it from the default version in ArcCatalog. The paperwork is then forwarded on to the appropriate departments.

GIS personnel are assigned in each department to input the plat designs into the database. Each person ensures that the feature datasets to be updated are versioned without the option to move edits from the default version to the database (base version). This allows data archiving (backups) to be enabled. The GIS specialist then enables archiving, switches to the appropriate preliminary version of a plat, and inputs the data using ArcMap. Usually, the lead GIS specialist of the department is assigned for preliminary data input as accuracy is vital; data input errors at this stage are very difficult to correct.

Under the hood, when a feature dataset is registered as versioned, delta tables in the database are created to hold the data. Over time, changes in all versions descended from the default version are merged into the default version. Every 24 hours, the server launches a compression job, maintained by the geodatabase administrator. This job merges the default version data into the database, making the changes permanent. The access permission of the default version of the database is set to public by convention; the access permission of each department's version is set to either protected or private, depending on departmental policy.
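
The delta table mechanism can be pictured with a toy model. This is a deliberate simplification and not the ArcSDE implementation: it tracks a single line of edits with one "adds" table and one "deletes" table, whereas the real system also tracks states and version lineage. All names are hypothetical.

```python
class VersionedTable:
    """Toy model: a base table plus adds and deletes delta tables."""

    def __init__(self, base_rows):
        self.base = dict(base_rows)  # object id -> attributes, the permanent rows
        self.adds = {}               # rows inserted or updated since the last compress
        self.deletes = set()         # base object ids removed or superseded

    def insert(self, oid, attrs):
        self.adds[oid] = attrs

    def update(self, oid, attrs):
        # An update hides the old base row and records the new row as an add.
        if oid in self.base:
            self.deletes.add(oid)
        self.adds[oid] = attrs

    def delete(self, oid):
        self.adds.pop(oid, None)
        if oid in self.base:
            self.deletes.add(oid)

    def view(self):
        """What an editor sees: base rows minus deletes, plus adds."""
        rows = {oid: a for oid, a in self.base.items() if oid not in self.deletes}
        rows.update(self.adds)
        return rows

    def compress(self):
        """Fold the delta tables into the base table and empty them."""
        self.base = self.view()
        self.adds.clear()
        self.deletes.clear()


streets = VersionedTable({1: "Main St", 2: "Elm St"})
streets.insert(3, "Oak Ave")
streets.update(1, "Main Street")
streets.delete(2)
base_before = dict(streets.base)  # edits have not touched the base table yet
view_before = streets.view()
streets.compress()                # the scheduled job makes the edits permanent
```

Until the compress step runs, the base table is untouched and every edit lives in the delta tables, which is why the edits can be reviewed, reconciled, or discarded before they become permanent.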

After both preliminary plat designs are input into the database, ArcGIS Map Documents (.MXD files) are forwarded to the Board for review, and the Board will approve either plat A or plat B. The developers will be notified of the Board's decision, and construction on the plat will then begin based on the approved plat design.

Figure 8. ArcSDE Versioning Concept (Huang/ESRI 2016)

Once construction on the plat is completed, the developers will notify the Board. When the receival clerk is notified that construction on the plat is complete, the clerk clones a final version of the submittal from the approved plat version, and notifies the appropriate department that worked on that preliminary plat design. The GIS specialist in that department ensures that the appropriate feature datasets are versioned without the option to move edits from the default version to the database, ensures that archiving is enabled, switches to the final plat version for the submittal, and archives the version as a backup. The specialist then makes any required changes to the version, and reconciles it. The specialist completes the process by correcting any errors generated by reconciliation, posts the final plat version to the database (base version), and notifies the receival clerk that the submittal process is complete. The receival clerk then signs off on the paperwork as complete and stores the submittal.

Figure 9. ArcSDE Archiving Table Snapshot (Huang, 2016)
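
The reconcile step in the workflow above can be sketched as a three-way comparison. This is a simplified illustration under stated assumptions, not the ArcSDE algorithm: each version is modeled as a dictionary of object id to attribute value, a conflict is any feature changed differently in both versions since their common ancestor, and conflicting features keep the edit version's value until someone resolves them. The function names and sample values are hypothetical.

```python
def changed_keys(ancestor, version):
    """Object ids whose value differs from the common ancestor (or was added/removed)."""
    keys = set(ancestor) | set(version)
    return {k for k in keys if ancestor.get(k) != version.get(k)}


def reconcile(ancestor, target, edit):
    """Bring the target version's changes into the edit version.

    Returns (merged_rows, conflicts). Conflicts keep the edit version's
    value and are listed so they can be resolved by hand.
    """
    target_changes = changed_keys(ancestor, target)
    edit_changes = changed_keys(ancestor, edit)
    conflicts = sorted(
        k for k in target_changes & edit_changes if target.get(k) != edit.get(k)
    )
    merged = dict(edit)
    for k in target_changes - edit_changes:
        if k in target:
            merged[k] = target[k]   # changed or added in the target only
        else:
            merged.pop(k, None)     # deleted in the target only
    return merged, conflicts


ancestor = {1: "Lot 1 draft", 2: "Lot 2 draft", 3: "Lot 3 draft"}
default_version = {1: "Lot 1 draft", 2: "Lot 2 rezoned", 3: "Lot 3 resurveyed"}
final_plat = {1: "Lot 1 as built", 2: "Lot 2 as built", 3: "Lot 3 draft"}

merged, conflicts = reconcile(ancestor, default_version, final_plat)
```

Here lot 2 is reported as a conflict because both versions changed it, while lot 3 picks up the default version's change without intervention. Only after such conflicts are corrected would the version be posted.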

Transportation and utility networks

Because network spatial data is in most cases directional, city departments managing GIS data relating to streets, highways, railroads, and water lines stored their transportation and utility networks in ESRI coverage files. Topological rules had been added to prevent certain transportation and utility lines from crossing each other. We assisted them in porting their network datasets into the new enterprise geodatabase.

While we were porting utility network data for certain water mains in a newly completed development in a southwest portion of Flagstaff, a major pipeline burst was reported. An emergency impact network analysis was conducted, and the results showed that the water main that had been shut down, flagged as uninitialized, supplied water to about 60% of the development. Crews were sent to the area immediately, and water was restored to the affected residents within a few hours. Fortunately, only a few other indeterminate flow areas were uncovered, and they were corrected promptly. An uninitialized area of a utility network is a subnetwork that is not connected to the rest of the network. An indeterminate flow area is an area of the network where not enough information about its sources and sinks is available to determine flow. Determinate flows have their sources and sinks present and properly defined.
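
The idea of an uninitialized subnetwork can be sketched with a breadth-first search. This is not the ArcGIS network analysis itself: the network is modeled as a plain list of pipe segments between named junctions, all names are hypothetical, and a segment counts as uninitialized when no path connects it to any source.

```python
from collections import deque


def reachable_from(sources, edges):
    """Return every junction connected to at least one source."""
    adjacency = {}
    for a, b in edges:  # pipes are treated as undirected for connectivity
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def uninitialized_edges(sources, edges):
    """Segments with no connection to any source.

    Checking one endpoint is enough: if either end were connected,
    the search would have reached the other end as well.
    """
    connected = reachable_from(sources, edges)
    return [edge for edge in edges if edge[0] not in connected]


# Hypothetical network: one water source and a segment left disconnected.
sources = ["treatment_plant"]
edges = [
    ("treatment_plant", "junction_1"),
    ("junction_1", "junction_2"),
    ("junction_3", "junction_4"),  # not joined to the rest of the network
]

disconnected = uninitialized_edges(sources, edges)
```

A check of this kind, run after a port, would list the segments that still need to be joined to the network before flow can be analyzed.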

Porting street, highway, freeway onramp, and freeway data from the coverage files into the new database was straightforward. The old coverage files did not support annotations in the form of street names, so these were added later, after a reconciliation was completed. One-way streets were defined by setting certain attributes of the street sections, and turn features were added.

Conclusion and Final Discussion

The successful implementation of the city's enterprise geodatabase, and the centralization and conversion of the file-based data, dramatically improved the city's GIS operations. The improvements it brought to the city include:

  1. Centralization of the city's GIS data
  2. Elimination of data redundancy
  3. Implementation of data access control mechanisms
  4. Multi-user and concurrent data manipulation support
  5. Reliable data backup mechanism

A well-designed and easy-to-use robust database is the foundation for successful applications of enterprise GIS technology (Chen, 2004). Collaboration among the various departments at Flagstaff city hall using spatial data has increased through the data sharing and data integration capabilities of the new geodatabase system. The versioning and archiving capabilities of the new system have added a mechanism for auditing and accountability. In the future, client software running on mobile devices will enable personnel in the field to input data into the system directly, wherever they are. These mobile devices will communicate with the geodatabase via satellite, eliminating the need for WiFi or cellular support.

References

  1. Chen, Lian. "An Enterprise Geodatabase: Montgomery County." Maryland: ESRI 2004 Users Conference, Paper, 2004.
  2. Huang, Ruihong. "Enterprise Geodatabases." Northern Arizona University, course GSP 536, project 2, 2016. Web. Accessed 9-18-2016.
  3. Trust, Michael. "EEOS 381 Spring 2011, Lecture 6, Slide 6." Web. Accessed 12-16-2016.
  4. ESRI. "An Overview of Versioning." Web. http://desktop.arcgis.com/en/arcmap/10.3/manage-data/geodatabases/an-overview-of-versioning.htm. Accessed 12-16-2016.