Source of Imagery
Our strategy is to reuse images from existing benchmark datasets as much as possible and to annotate them manually with new land cover labels. We selected the xBD, Inria, Open Cities AI, SpaceNet, Landcover.ai, AIRS, GeoNRW, and HTCD datasets. For countries and regions not covered by these datasets, we collected publicly available aerial images to mitigate the regional gap that affects most existing benchmark datasets. These open data were downloaded from OpenAerialMap and from geospatial agencies in Peru and Japan. The attribution of source data is summarized here.
Classes and Annotations
We provide annotations with eight classes: bareland, rangeland, developed space, road, tree, water, agriculture land, and building. The color and pixel proportion of each class are summarized below. All labeling was done manually and took 2.5 hours per image on average.
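As a rough illustration of how per-class pixel proportions can be computed from a label mask, here is a minimal Python sketch. The eight class names come from the text above; the integer IDs (1 to 8, in the order listed), the `class_proportions` helper, and the toy mask are assumptions made for this example and may not match the encoding used in the released label files.

```python
from collections import Counter

# Class names are taken from the text; the integer IDs (1-8, in listed order)
# are an assumption for illustration, not the confirmed label encoding.
CLASS_NAMES = [
    "bareland", "rangeland", "developed space", "road",
    "tree", "water", "agriculture land", "building",
]
CLASS_IDS = {i + 1: name for i, name in enumerate(CLASS_NAMES)}


def class_proportions(mask):
    """Return the fraction of pixels per class for a 2D mask (list of rows).

    Pixels whose value is not one of the eight assumed IDs still count
    toward the total but are not reported under any class.
    """
    counts = Counter(value for row in mask for value in row)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in CLASS_NAMES}
    return {name: counts.get(cid, 0) / total for cid, name in CLASS_IDS.items()}


# Hypothetical 3x4 mask: three tree, three building, three road,
# two water, and one rangeland pixel.
toy_mask = [
    [5, 5, 8, 8],
    [5, 4, 4, 8],
    [6, 6, 4, 2],
]
props = class_proportions(toy_mask)
for name, fraction in props.items():
    print(f"{name}: {fraction:.3f}")
```

For real label rasters, the same counting would normally be done with an array library rather than nested lists; the pure-Python version is used here only to keep the sketch self-contained.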
Comparison with Related Datasets
OpenEarthMap represents a major advance over existing datasets in geographic diversity and annotation quality (e.g., spatial detail), as summarized below.
|Image level||GSD (m)||Dataset||Task||Classes||Countries||Regions||Area (km²)||Segments|
|Sub-meter level||0.3–0.5||SpaceNet 1/2||B||2||5||5||5,555||685,235|
|Sub-meter level||0.02–0.2||Open Cities AI||B||2||8||11||419||792,484|
License of Labels in OpenEarthMap
Label data of OpenEarthMap are provided under the same license as the original RGB images, which varies with each source dataset. For more details, please see the attribution of source data here. For regions where the original RGB images are in the public domain or carry no explicitly stated license, the label data are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
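The licensing rule above can be read as a simple lookup with a fallback. The Python sketch below restates that rule under stated assumptions: the `label_license` function, the way a missing license is represented (`None` or an empty string), and the example input strings are all hypothetical. It is not an official tool, and the attribution of source data remains the authoritative reference.

```python
# Fallback named in the text for public-domain or unlicensed imagery.
FALLBACK_LICENSE = "CC BY-NC-SA 4.0"


def label_license(image_license):
    """Return the license that applies to labels, given the image license.

    image_license: the license string of the source RGB imagery,
    "public domain", or None/"" when no license is explicitly stated.
    (This input convention is an assumption made for the sketch.)
    """
    if image_license is None or image_license.strip() == "":
        # No explicitly stated license: labels use the fallback.
        return FALLBACK_LICENSE
    if image_license.strip().lower() == "public domain":
        # Public-domain imagery: labels use the fallback.
        return FALLBACK_LICENSE
    # Otherwise labels inherit the license of the original images.
    return image_license.strip()


# Hypothetical inputs; these strings are not claims about any source dataset.
for example in ["CC BY 4.0", "Public Domain", None]:
    print(repr(example), "->", label_license(example))
```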