Dataset Validation for Column Data Type Values

When uploading data into Socrata via the API, the data should be in the specific format best suited to the column type being used. These formats are not required when uploading through the Socrata Dataset Management Experience, which can handle validation errors through Transforms and existing logic.

The table below describes the valid formats for each data type. Following these standards will ensure that data is uploaded successfully.

Type | Valid Formats | Acceptable Value Examples | Invalid Value Examples

Text strings up to 4 KB in size, or about 4,000 characters for strings containing only ASCII characters. (This estimate is per cell value; the character limit is lower for strings that contain non-ASCII characters.)

Null values are permitted and are interpreted as the absence of a value (not as an empty string, "").

Acceptable value examples:

  • "Hello, world"
  • "2"

Invalid value examples:

  • 2 | This is the number value 2, not a string containing the text '2'
  • false | This is a boolean value, not the text string 'false'
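The text rules above can be checked before upload. The following is a minimal sketch, not Socrata's own validation logic: the function name `is_valid_text` is hypothetical, and the limit of 4096 bytes measured in UTF-8 is an assumption, since the article only says "4 KB" and does not name an encoding.

```python
# Assumption: "4 KB" means 4096 bytes of UTF-8; the article does not say so explicitly.
MAX_TEXT_BYTES = 4 * 1024


def is_valid_text(value):
    """Return True if value looks acceptable for a text column (hypothetical helper)."""
    if value is None:
        # Null is permitted and means "no value", not an empty string.
        return True
    if not isinstance(value, str):
        # Numbers and booleans are not text values.
        return False
    # Non-ASCII characters use more than one byte each in UTF-8,
    # so they reduce how many characters fit under the limit.
    return len(value.encode("utf-8")) <= MAX_TEXT_BYTES
```

With this sketch, `is_valid_text("2")` passes while `is_valid_text(2)` does not, matching the examples above.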

The values true and false, with or without enclosing quotation marks.

Null values are permitted and are interpreted as the absence of a value (not false).

Acceptable value examples:

  • true
  • "false"

Invalid value examples:

  • "N/A" or "null" | These are not boolean values of true/false
  • 0 or 1 | These are number values
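A pre-upload check for the boolean rules above might look like the sketch below. The function name `is_valid_boolean` is hypothetical, and accepting only the lowercase strings "true" and "false" is an assumption; the article does not say whether other capitalizations are accepted.

```python
def is_valid_boolean(value):
    """Return True if value looks acceptable for a boolean column (hypothetical helper)."""
    if value is None:
        # Null is permitted and means "no value", not false.
        return True
    if isinstance(value, bool):
        return True
    if isinstance(value, str):
        # Assumption: only the exact lowercase strings are accepted.
        return value in ("true", "false")
    # Numbers such as 0 or 1 are number values, not booleans.
    return False
```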

Numeric values or strings that can be correctly parsed as numbers.

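We permit numbers as strings in order to correctly handle values larger than MAX_SAFE_INT (2^53 - 1, the largest integer that double-precision numbers, such as those used by JavaScript, can represent safely).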

Null values are permitted and are interpreted as the absence of a value (not zero).

Acceptable value examples:

  • 0
  • 3.14
  • "3.14"

Invalid value examples:

  • "N/A" or "null" or "" | These are strings with a text value and cannot be parsed as a number
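The number rules above can be approximated with Python's `decimal` module, which parses numeric strings without losing precision on very large values. This is a sketch under assumptions: the function name `is_valid_number` is hypothetical, and rejecting NaN and infinite values is a choice made here, not something the article states.

```python
import math
from decimal import Decimal, InvalidOperation


def is_valid_number(value):
    """Return True if value looks acceptable for a number column (hypothetical helper)."""
    if value is None:
        # Null is permitted and means "no value", not zero.
        return True
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return False
    if isinstance(value, int):
        return True
    if isinstance(value, float):
        # Assumption: NaN and infinities are not uploadable numbers.
        return math.isfinite(value)
    if isinstance(value, str):
        try:
            parsed = Decimal(value)
        except InvalidOperation:
            # "N/A", "null" and "" cannot be parsed as numbers.
            return False
        return parsed.is_finite()
    return False
```

Passing a large value as a string, for example `is_valid_number("9007199254740993")`, keeps every digit intact, which is the reason the article gives for permitting numbers as strings.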

floating_timestamp (calendar_date)

ISO-8601 strings that do not include a timezone component.

For more information on ISO-8601 see:

Acceptable value examples:

  • "2001-01-01T00:00:00.000"
  • "2001-01-01"

Invalid value examples:

  • "2001-01-01T00:00:00.000Z" | 'Z' is a timezone component
  • "2001-01-01T00:00:00.000-0700" | '-0700' is a timezone component
  • 978307200000 | A count of milliseconds since the epoch is not a valid ISO-8601 string
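One way to test the floating_timestamp rules above is to parse the string with Python's standard `datetime.fromisoformat` and then confirm that no timezone was attached. This is a sketch only: the function name `is_valid_floating_timestamp` is hypothetical, and `fromisoformat` may accept or reject a slightly different set of ISO-8601 variants than Socrata does. The article does not say whether null is allowed for this type, so the sketch handles strings only.

```python
from datetime import datetime


def is_valid_floating_timestamp(value):
    """Return True for ISO-8601 strings without a timezone (hypothetical helper)."""
    if not isinstance(value, str):
        # Epoch counts such as 978307200000 are not ISO-8601 strings.
        return False
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        # Older Python versions reject 'Z' and '-0700' at this point.
        return False
    # Newer Python versions parse 'Z' and '-0700'; reject them here instead.
    return parsed.tzinfo is None
```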


Well-known Text (WKT) objects of the respective types using decimal degrees in the WGS84 projection as the x and y parameters.

Note that the order of parameters is x followed by y (which is equivalent to longitude followed by latitude).

For more information on Socrata's support for geographic primitives please see:

For more information on the Well-known Text format see:

Acceptable value examples:

  • "POINT (-122.335167 47.608013)"
  • "LINESTRING (30 10, 10 30, 40 40)"
  • "MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)),
    ((15 5, 40 10, 10 20, 5 10, 15 5)))"

Invalid value examples:

  • "POINT (47.608013, -122.335167)" | This is not a WKT-formatted object: coordinates are separated by a space, not a comma, and the order is (x y), that is, (long lat)
  • "[(47.608013, -122.335167), (32.698437, -114.650398)]" | This is not a WKT-formatted object
  • "LINE(-122.335167 47.608013)" | The correct prefix is LINESTRING, not LINE
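The geospatial examples above can be partly screened with a small check. The sketch below is deliberately limited: it only verifies the type prefix and fully validates POINT values, it is not a complete WKT parser, and the helper names are hypothetical. The list of type names in `WKT_TYPES` and the uppercase-only matching are assumptions, as the article does not enumerate the supported types or their capitalization.

```python
import re

# Assumption: these are the WKT geometry types of interest.
WKT_TYPES = ("POINT", "LINESTRING", "POLYGON",
             "MULTIPOINT", "MULTILINESTRING", "MULTIPOLYGON")

_PREFIX_RE = re.compile(r"^(?:" + "|".join(WKT_TYPES) + r")\s*\(")
_NUM = r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)"
# Two numbers separated by whitespace (no comma): x (longitude), then y (latitude).
_POINT_RE = re.compile(rf"^POINT\s*\(\s*({_NUM})\s+({_NUM})\s*\)$")


def has_known_wkt_prefix(value):
    """Return True if value starts with one of the assumed WKT type names."""
    return isinstance(value, str) and _PREFIX_RE.match(value) is not None


def is_valid_wkt_point(value):
    """Return True for 'POINT (x y)' with x, y in WGS84 decimal-degree ranges."""
    if not isinstance(value, str):
        return False
    match = _POINT_RE.match(value.strip())
    if match is None:
        return False
    x, y = float(match.group(1)), float(match.group(2))
    # x is longitude (-180..180) and y is latitude (-90..90).
    return -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0
```

The range check also catches swapped coordinates whenever the longitude falls outside the valid latitude range, as in "POINT (47.608013 -122.335167)".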

Currently, while we work on upgrading the technical storage for datasets on the platform, a background process supports specifying geospatial points in the legacy 'location' format. This format is as follows:

{
  "coordinates": {
    "latitude": <latitude>,
    "longitude": <longitude>
  },
  "human_address": {
    "street_address": "<street address>",
    "city": "<city>",
    "state": "<state>",
    "zip": "<zip code>"
  }
}

Note: These background processes will eventually be retired. To ensure that your data ingress procedures automatically benefit from future platform improvements, and to reduce the chance that you will need to change those procedures later, consider proactively updating location columns to point data types at your earliest convenience.

Acceptable value examples:

  • "(47.608013, -122.335167)"
  • "255 S King Street, Seattle WA"

Invalid value examples:

  • "POINT (-122.335167 47.608013)" | This is a WKT-formatted object, which is not compatible with the location column type
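For the migration from location columns to point data types recommended in the note above, a conversion could be sketched as follows. The function name `location_to_wkt_point` is hypothetical, and the input is assumed to be a dictionary shaped like the legacy format described earlier, with numeric or numeric-string coordinates. The address portion is dropped because a WKT point carries coordinates only.

```python
def location_to_wkt_point(location):
    """Convert a legacy location dict to a WKT POINT string (hypothetical helper)."""
    coords = location["coordinates"]
    latitude = float(coords["latitude"])
    longitude = float(coords["longitude"])
    # WKT order is x then y, which means longitude then latitude.
    return f"POINT ({longitude} {latitude})"


legacy = {
    "coordinates": {"latitude": 47.608013, "longitude": -122.335167},
    "human_address": {
        "street_address": "255 S King Street",
        "city": "Seattle",
        "state": "WA",
        "zip": "",
    },
}
wkt_point = location_to_wkt_point(legacy)
```

Note that the order flips during conversion: the legacy format lists latitude first, while the WKT point lists longitude first.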