Sustainability of Digital Formats: Planning for Library of Congress Collections


ESRI Arc Geodatabase

Format Description Properties

Identification and description

Full name ESRI Arc Geodatabase
Description

The ESRI Arc Geodatabase (GeoDB) implements a data model that extends the georelational data model that is the basis for the ESRI ArcInfo_Coverage data format. For example, based on technological developments not available when the Coverage format was developed, the Geodatabase adds support for object-oriented functionality and takes advantage of the capabilities of off-the-shelf relational database management systems. The Geodatabase data model serves as the common data storage and management framework for all ArcGIS software from v8.0 onward. The data model supports, as standard, a rich collection of objects (rows in a database table) and features (objects with geometry). It also supports advanced feature types such as geometric and logical networks, true curves, complex polylines, and user-defined features. Vector features can have two, three, or four dimensions (x, y, z, and m). Users can define topological and association relationships and rules that define how feature classes interact.
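The distinction between objects, features, and coordinate dimensions can be illustrated with a minimal Python sketch. This is not ESRI's implementation or API: the class names `Vertex`, `GeoObject`, and `Feature` are hypothetical, and the sketch assumes that z is typically an elevation value and m a measure value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Vertex:
    """A coordinate with required x, y and optional z and m (hypothetical class)."""
    x: float
    y: float
    z: Optional[float] = None  # assumed to be an elevation value
    m: Optional[float] = None  # assumed to be a measure value

    def dimensions(self) -> int:
        # x and y are always present; z and m each add one dimension.
        return 2 + int(self.z is not None) + int(self.m is not None)


@dataclass
class GeoObject:
    """An object: one row in a table, with an ID and attributes."""
    object_id: int
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class Feature(GeoObject):
    """A feature: an object that also carries geometry."""
    geometry: List[Vertex] = field(default_factory=list)


# Illustrative data only.
owner = GeoObject(1, {"name": "A. Smith"})
road = Feature(2, {"class": "local"},
               [Vertex(0.0, 0.0, m=0.0), Vertex(3.0, 4.0, m=5.0)])
```

Here `owner` is a plain object with no geometry, while `road` is a feature whose vertices carry x, y, and m.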

Included in the Geodatabase model is a storage mechanism for spatial and attribute data that contains specific storage structures for features, collections of features, attributes, relationships between attributes, and relationships between features. The Geodatabase has two major concepts: first, a physical store of geographic information inside a relational database management system (DBMS); second, a data model that supports objects with attributes and behavior, and transactional views of the database including versioning. Behavior describes how an object or feature can be edited and displayed. Behavior includes, but is not limited to, relationships, validation rules, subtypes, and default values. With associated behaviors, data entry is regulated more efficiently, and data contamination issues can be avoided.

The DBMS for a Geodatabase is implemented using the user's choice of a commercial off-the-shelf database management system that stores all spatial data (vector, raster, address, measures, CAD, etc.) in multiple formats including:

  • Simple features such as shapefiles
  • Custom features with business logic and editing rules
  • Attribute data
  • Metadata
  • Images
  • Raster/Grid data
  • CAD data

The geodatabase schema includes the definitions, integrity rules, and behavior for these and for extended capabilities. These include properties for coordinate systems, coordinate resolution, feature classes, topologies, networks, raster catalogs, relationships, domains, and so forth. This schema information is persisted in a collection of geodatabase meta tables in the DBMS. These tables define the integrity and behavior of the geographic information.

The Geodatabase data model is expressed in three different geodatabase types: file geodatabases (GeoDB_File), personal geodatabases (GeoDB_Personal), and spatial database engine (GeoDB_SDE) geodatabases. The three expressions allow progressively more capability for basic and advanced spatial analysis.

  • File geodatabases are stored as folders in a file system. Each dataset is held as a file that can scale up to 1 terabyte in size. The file geodatabase is the recommended native data format for ArcGIS data stored in a file system folder.
  • Personal geodatabases have all datasets stored within a Microsoft Access data file that is limited in size to 2 gigabytes, and tied to the Windows operating system.
  • ArcSDE geodatabases are stored in a relational database using Oracle, Microsoft SQL Server, IBM DB2, IBM Informix, or PostgreSQL. These multiuser geodatabases require the use of ArcSDE software and can be unlimited in size and numbers of users. ArcSDE is the recommended native data format for ArcGIS data stored and managed in a relational database.
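The size and user constraints listed above can be summarized in a short Python sketch that filters the three types by requirement. It is an illustration only, not an ESRI tool: the names `GeodatabaseType`, `TYPES`, and `candidate_types` are hypothetical, the byte values assume binary units (1 gigabyte = 2^30 bytes), and the check is a necessary condition rather than a full sizing analysis.

```python
from dataclasses import dataclass
from typing import List, Optional

GIB = 1024 ** 3  # assumption: binary gigabyte
TIB = 1024 ** 4  # assumption: binary terabyte


@dataclass(frozen=True)
class GeodatabaseType:
    name: str
    storage: str
    limit_bytes: Optional[int]  # None means no stated size limit
    multiuser: bool


# Limits as stated in the list above. The personal limit applies to the
# whole Access file; the file geodatabase limit applies to each dataset.
TYPES = [
    GeodatabaseType("personal", "single Microsoft Access file", 2 * GIB, False),
    GeodatabaseType("file", "folder in a file system", 1 * TIB, False),
    GeodatabaseType("ArcSDE", "relational DBMS", None, True),
]


def candidate_types(largest_dataset_bytes: int, needs_multiuser: bool) -> List[str]:
    """Return the names of types that are not ruled out by the stated limits."""
    result = []
    for t in TYPES:
        if needs_multiuser and not t.multiuser:
            continue  # single-user storage cannot serve multiple users
        if t.limit_bytes is not None and largest_dataset_bytes > t.limit_bytes:
            continue  # a dataset this large cannot fit
        result.append(t.name)
    return result
```

For example, `candidate_types(5 * GIB, False)` rules out the personal geodatabase but leaves the file and ArcSDE options.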

For a more complete comparison of the key characteristics of the three different geodatabase types, see Types of geodatabases. More information about the individual database types is available at GeoDB_File and GeoDB_SDE. Geodatabases can be exported from ArcGIS as GeoDB_XML workspaces.

The primary mechanism used in a Geodatabase to organize and use geographic information in ArcGIS is the dataset. Three primary dataset types are used:

  • Feature classes
  • Raster datasets
  • Attribute tables

Creating a collection of these dataset types is the first step in designing and building a geodatabase. Users typically start by building a number of these fundamental dataset types. They then add to or extend their geodatabase with more advanced capabilities (such as by adding topologies, networks, or subtypes) to model GIS behavior, maintain data integrity and work with a set of spatial relationships.

Relationship to other formats
    Affinity to ArcInfo_Coverage, ESRI ArcInfo Coverage. GeoDB replaces ArcInfo_Coverage for coverage data; ArcInfo_Coverage is no longer editable in ArcGIS software for releases subsequent to 8.3. Coverages must be stored in an ESRI Geodatabase format to be editable.
    Affinity to GeoDB_XML, ESRI Geodatabase (XML). Exchange format used by ArcGIS to import and export all items and data in a geodatabase such as domains, rules, feature datasets, and topologies.
    Has subtype GeoDB_File, GeoDB, ESRI Geodatabase (File-based). The file-based geodatabase is one option for data storage for a single-user ESRI Geodatabase. It is implemented as a collection of binary files in a file system.
    Has subtype GeoDB_SDE, GeoDB, ESRI Geodatabase ArcSDE. The spatial database engine is the multi-user and/or enterprise option for data storage for an ESRI Geodatabase.
    Has subtype GeoDB_Personal, GeoDB, ESRI Geodatabase (Personal). An option for data storage for a single-user ESRI Geodatabase that is implemented as a single Microsoft Access file. ESRI recommends file geodatabases over Microsoft Access Personal Geodatabases, because they offer more functionality and better performance. The Personal Geodatabase format is not described at this time on this website.

Local use

LC experience or existing holdings  
LC preference  

Sustainability factors

Disclosure A proprietary data framework used for ESRI GIS software applications. The partial documentation that is available is cited below.
    Documentation Partial documentation in ESRI application help information: An overview of the geodatabase and The architecture of a geodatabase. The different storage options for ESRI geodatabases are described in Types of geodatabases.
Adoption The Geodatabase data model was introduced by ESRI in the late 1990s with the release of ArcGIS version 8.0. The release of the ArcGIS suite constituted a major change in ESRI's software offerings, aligning all their client and server products under one software architecture known as ArcGIS, developed using Microsoft Windows COM standards. While the ESRI shapefile is still quite prevalent in the industry, at least for sharing and transferring datasets among different systems, the geodatabase is becoming the mechanism of choice for data sharing and data interoperability among organizations and among departments within a single organization. While older ESRI (non-ArcGIS) products are still available, most of the GIS software market share that ESRI holds (approximately 36 percent worldwide as of 2002) is taken by ArcGIS products. See ArcGIS from Wikipedia and COTS GIS: The Value of a Commercial Geographic Information System for more information.
    Licensing and patents ESRI Licensing agreements detail the terms of use and compliance for ESRI GIS software per http://www.esri.com/legal/software-license.
Transparency Transparency depends on the storage option used.
Self-documentation The Geodatabase format supports the application of metadata and requires specifications of data types for attribute data. Semantic descriptions of a dataset and its attributes (variables) are optional.
External dependencies Software dependencies: GeoDB_File and GeoDB_Personal (not described on this Web site) geodatabases can be used within ArcGIS software, i.e., ArcView, ArcEditor, and ArcInfo. GeoDB_Personal geodatabases have been used in ArcGIS since Version 8.0 using the Microsoft Access data file structure (.mdb file). GeoDB_SDE (spatial database engine) geodatabases work with a variety of DBMS storage models including IBM DB2, IBM Informix, Oracle, PostgreSQL, and Microsoft SQL Server.
Technical protection considerations Whether technological protection can be applied will depend on the storage option used.

Quality and functionality factors

GIS images and datasets
Normal functionality

The geodatabase data model allows users to take advantage of both basic and advanced spatial analysis when GIS data is stored within the geodatabase. Complex business logic can be applied to GIS data to create more detailed and accurate spatial data models that represent real-world GIS application workflows. Examples include land parcel management; natural resources management; river and stream system modeling; utility network system modeling, such as gas, water, and sewage pipelines; and three-dimensional surface modeling of the landscape.

By storing feature classes within a feature dataset, geospatial relationships can be modeled between the feature classes, enabling more advanced GIS analysis. The more common types of geospatial relationship data structures in the geodatabase are:

  • Topology -- Defines and enforces data integrity rules for features. For example, there should be no gaps between polygons. It supports topological relationship queries and navigation, such as feature adjacency or connectivity and sophisticated feature editing tools, and allows feature construction from unstructured geometry (for example, constructing polygon features from line features).
  • Geometric Networks -- Consists of a set of connected edges and junctions (line and point features) that, along with connectivity rules, are used to represent and model the behavior of a common network infrastructure in the real world. Water distribution, electrical lines, gas pipelines, telephone services, and water flow in a stream are all examples of resource flows that can be modeled and analyzed using a geometric network.
  • Network Dataset -- Consists of a set of connected edges and junctions, as well as turn features, along with connectivity rules, that represent and model the behavior of transportation network systems. Highways, roads, and streets in a city; rail lines; and bus routes are examples of undirected network flows that can be modeled with a network dataset.
  • Terrain -- A data structure that is generated from a mass collection of elevation measurement points, typically from remote-sensing data sources. It is a triangulated irregular network (TIN)-based data structure with multiple levels of resolution and is used to represent surface morphology. A terrain is used for 3D surface modeling applications.
  • Cadastral Fabric -- A continuous surface of connected parcel features that represents the record of survey for an area of land. This data structure enables GIS data to be integrated with survey data to maintain a consistent and accurate survey record.
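The idea behind a geometric network, connected edges and junctions that can be traced, can be sketched in plain Python as a graph traversal. This is a simplified illustration, not the geodatabase's network model: the function names and the sample junction names are hypothetical, and connectivity rules and flow direction are omitted.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple


def build_network(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an adjacency map from edges, each given as a pair of junction names."""
    adjacency: Dict[str, Set[str]] = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def trace_connected(adjacency: Dict[str, Set[str]], start: str) -> Set[str]:
    """Return every junction reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        junction = queue.popleft()
        for neighbor in adjacency.get(junction, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Illustrative water network: two pipes form one system, a third is isolated.
pipes = [("reservoir", "valve_1"), ("valve_1", "hydrant_7"), ("well", "tank")]
network = build_network(pipes)
```

Tracing from `"reservoir"` reaches `valve_1` and `hydrant_7` but not the isolated `well` and `tank` pair, which is the kind of connectivity question a network trace answers.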

Additional business logic in the geodatabase, in the form of subtypes and attribute domains, can also be applied to GIS data. Subtypes enable categorization of data in a table or feature class. Collectively, these examples of business logic in the geodatabase help streamline data entry and ensure the integrity of a user's GIS data. The geodatabase data model is designed to enable users to leverage and optimize their GIS data to its full potential and maintain a consistent, accurate repository of GIS data. See The Geodatabase: Modeling and Managing Spatial Data for more information.
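How subtypes and attribute domains can regulate data entry is sketched below in plain Python. The sketch assumes that an attribute domain restricts a field either to a numeric range or to a fixed set of coded values, and that a subtype supplies default values for new rows. All class, field, and subtype names are hypothetical and do not reflect ESRI's API.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Union


@dataclass(frozen=True)
class RangeDomain:
    minimum: float
    maximum: float

    def allows(self, value) -> bool:
        return self.minimum <= value <= self.maximum


@dataclass(frozen=True)
class CodedValueDomain:
    codes: FrozenSet[str]

    def allows(self, value) -> bool:
        return value in self.codes


Domain = Union[RangeDomain, CodedValueDomain]

# Hypothetical domains for a pipe feature class.
DOMAINS: Dict[str, Domain] = {
    "diameter_mm": RangeDomain(10, 2000),
    "material": CodedValueDomain(frozenset({"PVC", "steel", "copper"})),
}

# Hypothetical subtypes, each with its own default attribute values.
SUBTYPE_DEFAULTS = {
    "main": {"diameter_mm": 300, "material": "steel"},
    "service": {"diameter_mm": 25, "material": "copper"},
}


def new_row(subtype: str) -> dict:
    """Create a row pre-filled with the defaults of the given subtype."""
    row = {"subtype": subtype}
    row.update(SUBTYPE_DEFAULTS[subtype])
    return row


def validate(row: dict) -> List[str]:
    """Return a list of rule violations; an empty list means the row is valid."""
    errors = []
    if row.get("subtype") not in SUBTYPE_DEFAULTS:
        errors.append("unknown subtype: %r" % row.get("subtype"))
    for field_name, domain in DOMAINS.items():
        if field_name not in row:
            errors.append("missing field: %s" % field_name)
        elif not domain.allows(row[field_name]):
            errors.append("value out of domain for %s: %r" % (field_name, row[field_name]))
    return errors
```

A row created with `new_row("main")` passes validation as is, while changing its `material` to a value outside the coded set produces an error message instead of silently storing bad data.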


File type signifiers and format identifiers


Notes

General See The geodatabase is object-relational for more information about the object-relational model behind GeoDB.
History Prior to the development of the ArcGIS data model and software suite, ESRI developed the Arc/INFO (now usually written as ArcInfo) workstation and various GUI-based products for a suite known as ArcView GIS. In 1999, ESRI released ArcGIS 8.0 to provide a single integrated software architecture that included the geodatabase, an object-relational model. All subsequent ArcGIS products to date have used that model. More information about the history of ArcGIS products can be found in the Wikipedia article ArcGIS.

Format specifications


Useful references

URLs


Last Updated: 02/22/2017