@donmccurdy
Last active September 14, 2021 22:32
EXT_feature_metadata: Optional and Nullable Values

(1) Optional Property Column

Already defined in schema.

(2) Optional Property Value

noData: A list of sentinel values (NoData values) that represent missing data. If noData is omitted, property values are considered present for all features. The one exception is a variable-length ARRAY value, which may be left empty without a NoData value.

"classes": {
    "building": {
        "properties": {
            "height": {
                "type": "FLOAT32"
            },
            "owners": {
                "type": "ARRAY",
                "componentType": "STRING",
                "noData": ["", "NULL"]
            },
            "buildingType": {
                "type": "ENUM",
                "enumType": "buildingType",
                "noData": [0]
            }
        }
    }
}
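
One way a reader might apply these noData lists when decoding a property column is sketched below in Python. The helper name `mask_no_data` and the use of `None` to stand for a missing value are assumptions made for illustration; neither is part of the extension.

```python
from typing import Any, Optional, Sequence


def mask_no_data(values: Sequence[Any], no_data: Optional[Sequence[Any]] = None) -> list:
    """Replace sentinel values with None (hypothetical helper, not part of the spec).

    If no_data is omitted, every value is treated as present.
    """
    if not no_data:
        return list(values)
    return [None if v in no_data else v for v in values]


# "buildingType" declares noData [0], so a stored 0 reads as missing.
building_types = mask_no_data([2, 0, 1], no_data=[0])

# "height" declares no noData, so all values are treated as present.
heights = mask_no_data([12.5, 0.0, 30.0])
```

Here `building_types` has its middle entry masked, while `heights` comes back unchanged, including the `0.0`.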

Details and questions:

  1. for ENUM, should noData be {always, never, sometimes} part of the enum set?
    • Opinion: I'd consider it best practice to include an "UNSPECIFIED" value for enums used in APIs and public data, per https://google.aip.dev/126. So I would advise that it should be at least valid for the noData value to appear in the enum; whether this should be required I'm not sure.
  2. for STRING would you explicitly list "" as a noData value?
    • Opinion: Empty string should be considered a value like any other, unless specified in noData.
  3. for FLOAT32 and FLOAT64, do we allow NaN? IEEE-754 supports it. If so, is NaN implicitly a noData value?
    • Opinion: I'm not sure whether NaN should be allowed. If it is allowed, I think it should be implicitly considered noData, because it can't be serialized as JSON.
  4. for BOOL, noData is disallowed?
    • Opinion: Probably better to disallow it than to complicate boolean storage for this case... recommend use of enum (UNSPECIFIED | TRUE | FALSE) when noData is required.
  5. can a variable-length array have NoData values?
    • Opinion: When order is important, I suppose [1, 2, 3, _, 4] might be meaningful... what about a single-element [ _ ] array, whose only element is a noData value? This seems "bad" but I'm not sure whether prohibiting it is (a) providing useful consistency to the format, or (b) overzealous.
  6. can a fixed-length array have NoData values? or should it be variable-length instead? VECN/MATN types?
    • Opinion: ???
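
Two of the opinions above can be illustrated with a short Python sketch. For question 3, the standard `json` module raises `ValueError` for NaN when called with `allow_nan=False`, which is the strict-JSON behavior. For question 4, a tri-state enum can stand in for a nullable boolean. The `TriState` name, its numeric values, and the `is_strict_json_number` helper are all hypothetical and only meant to show the idea.

```python
import json
import math
from enum import IntEnum


class TriState(IntEnum):
    """Hypothetical enum standing in for a nullable boolean (question 4)."""
    UNSPECIFIED = 0  # would be listed as the noData value
    FALSE = 1
    TRUE = 2


def is_strict_json_number(x: float) -> bool:
    """Return True if x can be written as a number in strict JSON (question 3)."""
    try:
        json.dumps(x, allow_nan=False)
        return True
    except ValueError:
        # NaN and +/-Infinity have no representation in standard JSON.
        return False


nan_ok = is_strict_json_number(math.nan)
finite_ok = is_strict_json_number(1.5)
```

With this check, NaN cannot appear in a JSON noData list, which is the reason given above for treating it as implicitly noData if it is allowed at all.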

(3) Optional Feature ID

schema.class.count defines the number of features, with feature IDs starting at 0. Feature ID storage may include values outside [0, count-1], which are to be interpreted as "not a feature".
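
A minimal sketch of that range check in Python follows. The function name `resolve_feature_id` and the use of `None` for "not a feature" are assumptions for illustration.

```python
from typing import Optional


def resolve_feature_id(feature_id: int, count: int) -> Optional[int]:
    """Return feature_id if it lies in [0, count - 1], else None ("not a feature")."""
    if 0 <= feature_id < count:
        return feature_id
    return None


count = 3  # stands in for schema.class.count
resolved = [resolve_feature_id(i, count) for i in [0, 2, 3, -1, 255]]
```

Under this reading, any out-of-range value works as the "not a feature" marker, so no dedicated sentinel needs to be declared.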
