A shapefile is ESRI’s vector data format for storing the geometry and attribute information of geographic features. Features in a shapefile are represented as points, lines, or polygons, for example water wells, rivers, or lakes. The three main files, identified by their extensions, that define the geometry and attributes of geographically referenced features are:
- .shp: The main file, which stores the geometric data, such as points, lines, or polygons.
- .shx: The index file, which allows GIS software to quickly access the geometric data stored in the .shp file.
- .dbf: The attribute file, which stores the attribute data of the shapefile in a tabular format.
Other files that may accompany a shapefile include .sbn/.sbx, .fbn/.fbx, .ixs, .mxs, .prj, and .cpg.
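As a minimal sketch of how the three main files relate to each other, the following Python snippet checks whether the .shp, .shx, and .dbf files all exist side by side under the same base name. The function name `missing_components` is hypothetical and chosen only for illustration; it is not part of any GIS library.

```python
from pathlib import Path

# The three main files described above.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_components(shp_path):
    """Return the required extensions that have no file next to shp_path.

    Hypothetical helper: it only tests for file existence and does not
    open or validate the contents of any file.
    """
    base = Path(shp_path)
    # with_suffix swaps the extension while keeping directory and base name,
    # so object.shp is checked against object.shx and object.dbf.
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]
```

Calling `missing_components("data/object.shp")` returns an empty list when all three files are present. On case-sensitive file systems this sketch does not match upper-case extensions such as `.SHP`.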
When working with these files and their field names, the following restrictions apply:
- All files that make up the shapefile must be stored in the same project workspace/directory.
- Each file must have the same prefix, for example, object.shp, object.shx, and object.dbf.
- When copying a shapefile between GIS software packages, all files that compose the shapefile must be copied to ensure compatibility and proper functioning of the entire project.
- Neither the .shp file nor the .dbf file can exceed 2 GB in size.
- The maximum permissible length for a field name in the .dbf file is 10 characters.
- The maximum number of fields allowed in the .dbf file is 255.
- In the .dbf file, floating-point numbers may contain rounding errors because they are stored as text.
- Field names in the .dbf file have inadequate support for Unicode.
- The supported field types for the .dbf file are floating point (13-character storage), integer (4- or 9-character storage), date (8-character storage), and text (maximum 254-character storage).
- Field names may contain only alphanumeric characters (letters and numbers) and underscores (_). Underscores are allowed only as separators between words, not at the beginning or end of the field name.
- Spaces and special characters are not allowed in field names.
- Avoid field names that are reserved keywords with special meanings in the GIS software or its database.
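The field-name restrictions above can be sketched as a small Python check. This is an illustrative example, not a validator from any GIS package: the names `field_name_problems` and `check_schema` are hypothetical, the check treats "alphanumeric" as ASCII letters and digits (an assumption, made because Unicode support in field names is limited), and it does not test for reserved keywords, since that list depends on the software in use.

```python
import re

MAX_FIELD_NAME_LENGTH = 10   # limit for a .dbf field name
MAX_FIELD_COUNT = 255        # limit for fields in one .dbf file


def field_name_problems(name):
    """Return a list of rule violations for one field name (empty if none)."""
    if not name:
        return ["name is empty"]
    problems = []
    if len(name) > MAX_FIELD_NAME_LENGTH:
        problems.append("longer than 10 characters")
    # Assumption: only ASCII letters, digits, and underscores are accepted.
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        problems.append("contains characters other than letters, digits, and underscores")
    if name.startswith("_") or name.endswith("_"):
        problems.append("starts or ends with an underscore")
    return problems


def check_schema(field_names):
    """Return (problems by field name, whether the field count exceeds 255)."""
    report = {}
    for name in field_names:
        problems = field_name_problems(name)
        if problems:
            report[name] = problems
    return report, len(field_names) > MAX_FIELD_COUNT
```

For example, `field_name_problems("well_depth")` returns an empty list, while `field_name_problems("_depth")` reports the leading underscore. Running such a check before export helps catch names that GIS software might otherwise truncate or reject.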
