April 2014


Overview

This dataset consists of 2-dimensional roof outlines ("roofprints") for all buildings larger than 150 square feet, as interpreted by a contractor (Rolta) for the whole area of the Commonwealth using color, 30 cm. DigitalGlobe ortho images obtained in 2011 and 2012, supplemented with LiDAR (Light Detection And Ranging) data collected from 2002 to 2011 for the eastern half of the state.

The roofprints as delivered were enhanced by MassGIS using Normalized Digital Surface Models (NDSMs) derived from the same LiDAR data. Other layers were used, including the Level 3 Parcels, to aid in review, especially where LiDAR data were not available. See the section "Roofprint Shifting" below for details on MassGIS' work to edit the roofprints. For information on other updates see the "Maintenance" section.

In ArcSDE the layer is named STRUCTURES_POLY.

Production

Rolta created the polygons based on the 2011 and 2012 DigitalGlobe Ortho images (see the Year of Photography Index: DigitalGlobe Ortho Imagery Index Map 2011-2012, PDF, 1 MB), the latest available orthos at the time, using LiDAR as a supplement to determine the shape of structures that were difficult to distinguish in the orthos. The data were saved in ESRI shapefile format and delivered to MassGIS for QA review, then processed to create the final deliverable.

Criteria Used for Creating a Roofprint

The following is a summary of the guidelines used in creating roofprints (as described in the Request for Response for this project):

A roofprint is a map polygon, with real world coordinates, representing the perimeter outline as it appears in aerial imagery of every structure or portion of a structure which has a roof. Roofprints shall be mapped for all structures equal to or larger than 150 square feet including the following:

  • Residential, commercial, and industrial structures (including roof over porches and decks)
  • Trailer homes and offices
  • Mobile homes
  • Garages, sheds, and other isolated structures

Additionally:

Features that do not have a roof covering usable areas, such as an open deck, the top surface of an electrical transmission or cell tower base, platforms for utility equipment, or other structures which do not have a usable “interior” or covered volume, shall not be interpreted for mapping. Also, vehicles, including truck trailers that are parked with or without a tractor attached, boats, airplanes, etc. should not be mapped. However, trailers with any kind of residential or business use such as temporary classrooms, construction site field offices and the like must be captured.

Greenhouses were generally considered not structures, unless attached to a roofed structure. Roofed dugouts of sufficient size were also included as structures. Tanks and covered reservoirs and pools with temporary covers were not considered structures.

Polygon creation had the following guidelines:

  • Outlines will usually be made up of orthogonal segments (all segments parallel or at right angles) unless the building is octagonal, round, triangular, etc.
  • Outlines are to be traced at the elevation of the eaves or lowest part of the roof adjacent to the exterior vertical walls. If there are multiple roof levels for a single structure then internal boundaries created by joining the separate roofprints must be dissolved. Any part of the structure which is covered by a roof is included, so two buildings connected by a covered walkway are to be represented as one polygon.
  • Any roof offset, jog or projection for which all sides are more than 3 feet in length should be captured.

In the creation of the outlines, building “lean” was not compensated for. (MassGIS addressed this issue for part of the state. See the section "Roofprint Shifting" below.) No attributes were included in the creation of the polygons.

Criteria for Acceptance

The interpretation error rate was less than 0.5%. Conformance to this standard was determined as follows:

For each of the six delivery areas, MassGIS selected tiles randomly (using a ‘randomizing’ spreadsheet created within MassGIS) from the 2008/2009 ortho imagery. Tiles were selected until the total number of structures in the selected area exceeded 15,000. The roof outlines in the selected tiles were then reviewed against the DigitalGlobe imagery. Additional layers were used to supplement the review, including the LiDAR datasets, and the Level 3 Parcel dataset, especially where LiDAR data were not available.
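The selection procedure described above can be sketched as follows. This is only an illustration, not the spreadsheet MassGIS used: the function name, the tile IDs, and the `structure_counts` mapping are hypothetical, and the sketch assumes each tile is drawn at most once.

```python
import random


def select_review_tiles(structure_counts, min_structures=15000, seed=None):
    """Pick tiles at random until the structures they contain exceed min_structures.

    structure_counts: dict mapping a tile ID to the number of structures in
    that tile (hypothetical input; the source does not describe the layout
    of the randomizing spreadsheet).
    Returns the selected tile IDs and the total number of structures in them.
    """
    rng = random.Random(seed)
    tile_ids = list(structure_counts)
    rng.shuffle(tile_ids)  # random order, each tile drawn at most once

    selected, total = [], 0
    for tile_id in tile_ids:
        if total > min_structures:
            break
        selected.append(tile_id)
        total += structure_counts[tile_id]
    return selected, total
```

If the delivery area holds fewer structures than the threshold, the sketch simply returns every tile; how such a case was handled in practice is not stated in the source.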

The error rate was defined using two statistics from the review of the sample tiles for each delivery:
Eo = The number of errors of omission – structures that were missed
Ec = The number of errors of commission – structures that are not in fact structures (as defined above).

The combined error rate for interpretation was calculated to be Eo + Ec.
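As a rough illustration of the acceptance check, the sketch below expresses the combined errors as a percentage of the structures reviewed. The division by the number of reviewed structures is an assumption made here so that the result can be compared with the 0.5% threshold; the source states only that the combined figure is Eo + Ec. The function names are hypothetical.

```python
def interpretation_error_rate(omissions, commissions, structures_reviewed):
    """Return the combined errors (Eo + Ec) as a percentage of structures reviewed.

    Dividing by structures_reviewed is an assumption; the source only states
    that the combined error is Eo + Ec and that the rate was below 0.5%.
    """
    if structures_reviewed <= 0:
        raise ValueError("structures_reviewed must be positive")
    return 100.0 * (omissions + commissions) / structures_reviewed


def meets_acceptance(omissions, commissions, structures_reviewed, threshold_pct=0.5):
    """True when the combined error rate is strictly below the threshold."""
    rate = interpretation_error_rate(omissions, commissions, structures_reviewed)
    return rate < threshold_pct
```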

Roofprint Shifting

Elevated objects such as roof outlines in aerial imagery may appear displaced with respect to the base of the structure. In order to minimize or eliminate the effects of such displacement (often referred to as "building lean"), MassGIS undertook several automated processing steps to shift roofprint polygons as delivered by Rolta. Building lean effect may cause some buildings to cross over into adjacent parcels or overlap other features such as streets and water bodies. The shifting process was performed only in areas where MassGIS' LiDAR Terrain Data were available (Eastern Mass. inside of I-495; for details see the section "Adjustment Method" below). As a result, many of the shifted polygons better approximate building footprints.

Background on Building Lean

Ortho image data layers are really mosaics made up of portions of many overlapping aerial photo frames. The yellow lines represent the seam lines of these photos:

Ortho photo seams

The principal point of an aerial photo is the intersection of the optical axis of the camera lens and the photo image. The nadir is the point directly beneath the camera at the time of exposure. On a vertical aerial photograph (looking downward) the nadir and the principal point will be at the same location.

If a building is close to the principal point, the roof and base will appear to coincide (the base and sides of the building will not be visible; note the 26-story JFK Federal Building in Boston, at left in the image below):

Ortho image with no building lean

If a building is far from the principal point, toward the edge of the photo, the top of the building will appear to be farther away from the principal point than the bottom of the building. The building will appear to "lean" away from the principal point:

Ortho image with different building lean directions

The red line in the above image is the "seam" between two different photos. The buildings on either side of this line are from different photos, so the buildings seem to lean away from their respective principal points.

The magnitude of lean can be determined by:

  H / (D+d) = h / d

Where H is the camera height, h is the building height, D is the distance of the building from the principal point, and d is the apparent displacement of the roof relative to the base of the building (the lean).

Or, since d is usually much smaller than D, D+d ~ D, so

  d ~ ( D / H ) x h

The average height of each building has been obtained from a LiDAR Normalized Difference Surface Model (NDSM). This raster is the difference between the LiDAR last-return elevations and the LiDAR model of the ground.

It was assumed that the location of each principal point is at the mean center of each seam polygon, and that the aircraft altitude is 5,000 meters.
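A short worked example of the lean formula, using the assumed camera height of 5,000 meters. The function names are hypothetical. The second function solves H / (D+d) = h / d for d directly, to show how little the approximation D+d ~ D changes the result.

```python
def lean_displacement(distance_m, building_height_m, camera_height_m=5000.0):
    """Approximate roof displacement d ~ (D / H) x h, all values in meters."""
    return distance_m / camera_height_m * building_height_m


def lean_displacement_exact(distance_m, building_height_m, camera_height_m=5000.0):
    """Solve H / (D + d) = h / d for d, without assuming d is much smaller than D."""
    return distance_m * building_height_m / (camera_height_m - building_height_m)


# A 30 m tall building located 2,000 m from the principal point:
approx = lean_displacement(2000.0, 30.0)        # 12.0 m
exact = lean_displacement_exact(2000.0, 30.0)   # about 12.07 m
```

For buildings of ordinary height the two values differ by only a few centimeters, well below the 30 cm pixel size of the imagery.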

Roofprint adjustment method diagram

The lean also has a direction, so polygons representing "roofprints" have been moved a distance d in a direction opposite to the apparent displacement in the photo.

Result of shifted roofprint

In the image above, the red polygon is the original roofprint; the green polygon is the rectified (shifted) roofprint.
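The shift itself can be sketched in a few lines. Because buildings lean away from the principal point, moving a roofprint opposite to the displacement means moving it toward the principal point. This is a simplified illustration and not the ArcGIS script MassGIS ran: the function name is hypothetical, coordinates are assumed to be in a meter-based projected system, and the mean of the vertices stands in for the polygon centroid.

```python
import math


def shift_roofprint(vertices, principal_point, building_height_m, camera_height_m=5000.0):
    """Translate a roofprint toward the principal point by d ~ (D / H) x h.

    vertices: list of (x, y) tuples in a projected, meter-based coordinate system.
    The vertex mean is used as a simple stand-in for the polygon centroid.
    """
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    dx, dy = principal_point[0] - cx, principal_point[1] - cy
    distance = math.hypot(dx, dy)
    if distance == 0:
        return list(vertices)  # directly beneath the camera: no lean to correct
    d = distance / camera_height_m * building_height_m
    # Unit vector pointing toward the principal point, scaled by the shift distance.
    ox, oy = dx / distance * d, dy / distance * d
    return [(x + ox, y + oy) for x, y in vertices]


# A 50 m tall building whose roofprint is centered 5,000 m from the principal point at (0, 0):
roof = [(2990.0, 3990.0), (3010.0, 3990.0), (3010.0, 4010.0), (2990.0, 4010.0)]
shifted = shift_roofprint(roof, (0.0, 0.0), 50.0)
```

Every vertex receives the same offset, so the polygon keeps its shape and orientation and only its position changes.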

Adjustment Method

MassGIS used five Input Datasets:

1. DigitalGlobe 2011-2012 Orthoimagery (six blocks)

- Boston High Value Area
- Worcester High Value Area
- Cape Cod Refresh Area
- Standard Block 4171
- Standard Block 4172
- Standard Block 4272

2. NDSM Raster Images for LiDAR Project Areas (see the Areas of LiDAR used in Shifting Index Map: Roofprint Shift Area, PDF):

- FEMA 2010-2011:
       Nashua
       Concord River
       Charles River
       Blackstone
       Quincy
- LiDAR for the Northeast
- 2004 SE Massachusetts Pilot
- Buzzards Bay (parts of Bristol and Plymouth Counties)

[Manageable processing areas were determined based on the intersections of these regions.]

For each LiDAR project area, all LiDAR returns were filtered to create two ArcGIS Terrains:

  • Any return classified as "Ground" was used in a "Bare-Earth" Terrain
  • The last returns classified as "Ground" or as "Unclassified" went into a "Last Return" Terrain

These Terrains were then linearly interpolated to two 1.0 meter rasters. Finally, the Bare-Earth raster was subtracted from the Last Return raster, resulting in a Normalized Difference Surface Model (NDSM).
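The raster subtraction can be illustrated with a minimal sketch. Plain nested lists stand in for the 1.0 meter rasters so that the example has no dependencies; the function names and the nodata handling are assumptions, and the actual work was done with ArcGIS Terrains and rasters.

```python
def normalized_surface_model(last_return, bare_earth, nodata=None):
    """Subtract the bare-earth raster from the last-return raster, cell by cell.

    Both rasters are lists of rows with identical dimensions (elevations in meters).
    A cell equal to nodata in either input stays nodata in the output.
    """
    if len(last_return) != len(bare_earth):
        raise ValueError("rasters must have the same number of rows")
    ndsm = []
    for lr_row, be_row in zip(last_return, bare_earth):
        if len(lr_row) != len(be_row):
            raise ValueError("rasters must have the same number of columns")
        ndsm.append([
            nodata if lr == nodata or be == nodata else lr - be
            for lr, be in zip(lr_row, be_row)
        ])
    return ndsm


def mean_height(ndsm, cells, nodata=None):
    """Average NDSM value over the (row, col) cells covered by one roofprint."""
    values = [ndsm[r][c] for r, c in cells if ndsm[r][c] != nodata]
    return sum(values) / len(values) if values else None
```

Cells over open ground come out near zero, while cells over a building hold its height above ground, which is the quantity the shifting process averages for each roofprint.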

3. Orthoimage polygon tiles (irregularly-shaped "seam polygons") corresponding to each DigitalGlobe area

NDSMs were cut into smaller subimages (tiles) using the seam polygons for the corresponding area.

4. Seam center points were determined for each seam polygon.

5. Un-rectified roofprint polygons (unshifted polygons as delivered by Rolta)

Processing

A model was developed in Trimble eCognition Developer 8.7.2 and run on eCognition Server that determined the distance and direction from each roofprint centroid to the tile's seam center, as well as the mean height of the building. Output was a point shapefile.

An ArcGIS Toolbox script that prepared the output points and roofprints for rectification was run, followed by an ArcGIS Python script that created a dataset of shifted roofprint polygons.

The two sets of shifted roofprints in overlapping processing areas were examined, and where there were differences, the roofprints with the more accurate shift were kept.

Roofprints straddling seam lines usually contain two (or more) points with different values for angle and distance. These roofprints were generally not moved, but were coded TOUCH_SEAM = 1 so they could be tracked after processing.
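One way to identify such roofprints is to group the output points by roofprint and check whether they disagree. The sketch below is a hypothetical illustration: the tuple layout of the point attributes and the function name are assumptions, not the structure of the actual point shapefile.

```python
from collections import defaultdict


def flag_seam_roofprints(points):
    """Return {roofprint_id: TOUCH_SEAM}, where 1 marks a roofprint left unshifted.

    points: iterable of (roofprint_id, angle_deg, distance_m) tuples, one per
    output point (hypothetical layout of the point shapefile's attributes).
    """
    shifts = defaultdict(set)
    for roofprint_id, angle_deg, distance_m in points:
        shifts[roofprint_id].add((angle_deg, distance_m))
    # More than one distinct (angle, distance) pair means the roofprint spans a seam.
    return {rid: 1 if len(pairs) > 1 else 0 for rid, pairs in shifts.items()}


points = [
    ("A", 45.0, 1200.0),
    ("B", 90.0, 800.0),
    ("B", 270.0, 650.0),  # second point from the neighboring seam polygon
]
flags = flag_seam_roofprints(points)
```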

Sources of possible error in the shifting process include:

  • The orthoimage used to determine the roofprint
  • The roofprint polygon as drawn
  • The estimated position of the principal point
  • The estimated camera altitude
  • The LiDAR NDSM raster
  • The estimate of the building height derived from the LiDAR NDSM raster

Situations which may cause the roofprint to shift more or less than it should:

  • The building represented by a roofprint was not built at the time of the LiDAR acquisition.
  • Trees may overhang a building, so that the elevation obtained may be higher than the building height.
  • Greenhouses may be represented in the roofprints layer, but not in the LiDAR.
  • A single roofprint representing a complex roof with different elevations may be shifted based on a single elevation value.

In a small number of cases, the shifting process caused some polygons to overlap others. These were found using ArcGIS topology and the polygons were moved manually so that no overlaps were present. Once the shifting process was complete, the shifted polygons replaced those in a copy of the original Rolta deliverable. The version of the Structures dataset distributed by MassGIS, therefore, is a hybrid of as-delivered polygons and those shifted by MassGIS. Finally, MassGIS took the hybrid layer and performed an Identity operation with the Survey-based Communities layer to populate the TOWN_ID fields.

Attributes

Polygons in this layer contain the following fields, all added by MassGIS:
Field Name | Description
STRUCT_ID | Unique polygon identifier, based on X,Y centroid coordinate of the feature in NAD83 Mass. State Plane meters
SOURCE | Polygon source. All polygons from the original compilation are coded "ROLTA". Polygons digitized by MassGIS are coded "MAGIS". Other codes (in data delivered by the town of Dedham) include "PLANIMETRY", "SCANNED_PLAN", and "HEADSUP_DIG".
SOURCETYPE | Type of feature. Current codes are "ROOFPRINT", "ROOFPRINT_SHIFTED", and "FOOTPRINT".
SOURCEDATE | Date (year) of source data used to create the structure polygon. Coded "20110000" or "20120000" (see the Year of Photography Index: DigitalGlobe Ortho Imagery Index Map 2011-2012, PDF, 1 MB) for original compilation. The eight-digit format is to allow for more accurately recording the date as local datasets and newer imagery are used to update the statewide data layer.
SOURCEDATA | Indicates what imagery was used as a source for digitizing structure polygons. Current values are "DG Worldview2 8-band satellite" and "USGS 2013 30 CM AERIAL IMAGERY"
MOVED | Indicates with "Y" or "N" whether or not a ROLTA-compiled polygon was shifted to account for building lean in source imagery.
AREA_SQ_FT | Area of the structure polygon in square feet, calculated with the "Calculate Geometry" tool in ArcGIS software.
TOWN_ID | Identifier (1-351) for the city/town in which the structure is located. An ID of "0" (zero) indicates the structure is located out-of-state.
TOWN_ID2 | Second identifier (1-351) for the city/town in which the structure is located, if the structure falls within two municipalities. In SDE format this field will be Null if structure falls within one town, 0 if partially out of state. Nulls are converted to zero in shapefile format.
TOWN_ID3 | Third identifier (1-351) for the city/town in which the structure is located, if the structure falls within three municipalities. In SDE format this field will be Null if structure falls within one or two towns, 0 if partially out of state and within two Mass. towns. Nulls are converted to zero in shapefile format.
LOCAL_ID | Identifier used by local entity. Currently not used as no local data are included in this layer.
GLOBALID | Identifier used by MassGIS for in-house ArcSDE versioned replica editing. Not included in shapefile downloads.
ARCHIVED | Yes/No code used by MassGIS for editing. Polygons coded 'Y' are deleted and added to the in-house STRUCTURES_POLY_ARCHIVED feature class. Not included in shapefile downloads.
ARCHIVEDATE | Date of source imagery MassGIS used when deleting polygons and moving them to the in-house STRUCTURES_POLY_ARCHIVED feature class. Not included in shapefile downloads.
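Users who need to work with SOURCEDATE values may find a small parser useful. The sketch below assumes the eight digits follow a year-month-day layout and that a month or day of "00" means that part of the date is not recorded, as in "20110000". Both points are inferred from the field description rather than stated outright, and the function name is hypothetical.

```python
def parse_sourcedate(value):
    """Split an eight-digit SOURCEDATE string into (year, month, day).

    A month or day of "00" is returned as None, on the assumption that zeros
    mean that part of the date is not recorded (as in "20110000").
    """
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"expected eight digits, got {value!r}")
    year, month, day = int(value[:4]), int(value[4:6]), int(value[6:])
    return year, month or None, day or None
```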

Maintenance

MassGIS maintains this layer. Data from municipal or other sources may replace features in this dataset as they become available.

In April 2013 footprint data from the town of Dedham replaced the majority of polygons in that community.

In April 2014 MassGIS published the first of several planned updates using 2013 USGS imagery as a base reference. By comparing the high definition DigitalGlobe imagery from 2011 that was used in the original compilation of the data to the 2013 USGS Ortho Images, new structures were identified and added to the data layer, and structures that have since been demolished were deleted. In the case of structures that have changed significantly, the original polygon was deleted (and saved in-house to an “archive” dataset) and a new one was added.

Updated on April 2, 2014: Arlington, Auburn, Bedford, Belmont, Boylston, Brookline, Burlington, Clinton, Dedham, Douglas, Dover, Dudley, Leicester, Lexington, Medfield, Millbury, Needham, Newton, Norwood, Oxford, Paxton, Shrewsbury, Southbridge, Sterling, Sutton, Walpole, Waltham, Watertown, Webster, Wellesley, West Boylston, Weston, Westwood and Worcester.

Updated on April 7, 2014: Abington, Avon, Canton, Cohasset, Foxborough, Hanover, Hanson, Hingham, Holbrook, Hull, Milton, Norwell, Pembroke, Quincy, Randolph, Rockland, Scituate, Sharon, Stoughton and Whitman.


Last Updated 4/7/2014