Socrata Internationalization and Localization

Socrata is investing in support for data standards used in other countries. The changes are outlined below.

Non-US Date Formats

Most countries outside the US use the DD/MM/YYYY date format or some variation of it.

  • Getting data in: This depends on a setting for your Socrata domain. If your country/locale defaults to the DD/MM/YYYY date format, your domain will expect that format on import.
  • Displaying that data: Data publishers can change the date column display format to match their desired format, such as DD/MM/YYYY. This needs to be done on a column-by-column and dataset-by-dataset basis.
  • Exporting that data: Your data is exported in the format in which it is displayed. If your date data is displayed as DD/MM/YYYY, it will be exported that way.
  • API: The API formats dates according to the ISO 8601 standard so that they are machine readable.
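As a rough illustration of the difference between the display format and the API format, the sketch below converts a DD/MM/YYYY string into an ISO 8601 date using only the Python standard library. The function name `ddmmyyyy_to_iso` is hypothetical, and the exact timestamp layout the API returns is not shown here.

```python
from datetime import datetime


def ddmmyyyy_to_iso(text: str) -> str:
    """Parse a DD/MM/YYYY string and return an ISO 8601 date (YYYY-MM-DD)."""
    # strptime raises ValueError if the text does not match DD/MM/YYYY,
    # for example when a US-style 12/31/2015 is passed in.
    return datetime.strptime(text, "%d/%m/%Y").date().isoformat()


print(ddmmyyyy_to_iso("31/12/2015"))
print(ddmmyyyy_to_iso("03/04/2015"))  # 3 April, not 4 March
```

Parsing with an explicit format string avoids guessing whether an ambiguous value such as 03/04/2015 is day-first or month-first.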

European Number Formats

In most European countries (UK and Ireland excluded), number formats use a “.” for the thousands separator and a “,” for the decimal separator. For example, the number 1,234.56 in the US would be written as 1.234,56 in Spain.

  • Getting data in: This depends on a setting for your Socrata domain. If your country/locale defaults to the number format used in most European countries, your domain will expect that format on import.
  • Displaying that data: Data publishers can change the display format of number, percent, and money columns to match their desired format through the “localization” option. This needs to be done on a column-by-column and dataset-by-dataset basis.
  • Exporting that data: NEW! Number, Percent, and Money columns with localized number formats configured (such as a “.” for the thousands separator and “,” for the decimal separator) are preserved when exported with “CSV for Excel” or “CSV for Excel (Europe).” Other export types do not preserve formatting because they need to follow machine-readable web standards, which specify that commas should not be used with numbers.
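To make the two conventions concrete, here is a minimal sketch that converts between the European and the machine-readable representation of a number. The helper names are hypothetical and the code is independent of Socrata; it only illustrates the separator swap described above.

```python
from decimal import Decimal


def parse_european_number(text: str) -> Decimal:
    """Turn a string such as '1.234,56' into Decimal('1234.56')."""
    # Drop the "." thousands separators, then use "." as the decimal separator.
    normalized = text.strip().replace(".", "").replace(",", ".")
    return Decimal(normalized)


def format_european_number(value: Decimal) -> str:
    """Format a number with '.' for thousands and ',' for decimals."""
    us_style = f"{value:,.2f}"  # US-style grouping, two decimal places
    # Swap the two separators in a single pass.
    return us_style.translate(str.maketrans(",.", ".,"))


print(parse_european_number("1.234,56"))
print(format_european_number(Decimal("1234.56")))
```

`Decimal` is used instead of `float` so that monetary values survive the round trip without binary rounding error.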

Semicolon Separated Files

In the European countries that use the number formatting described above, semicolon-separated CSV files are more common because commas already appear in numbers. Excel in Europe defaults to reading CSVs with a semicolon separator rather than a comma.

  • Getting data in: Through the UI, you can import .csv files with semicolon separators without any additional work. Through DataSync, you can specify the separator as a semicolon using the “Advanced Configurations” option when mapping fields.
  • Exporting that data: NEW! There is a new “CSV for Excel (Europe)” export type, which exports data with semicolon separators and preserves date and number formats where applicable.
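If you process a semicolon-separated export outside of Excel, most CSV libraries need to be told about the separator. The sketch below uses Python's standard `csv` module; the sample data and the function name `read_semicolon_csv` are made up for illustration.

```python
import csv
import io

# Hypothetical sample resembling a semicolon-separated export.
sample = "name;amount\nMadrid;1.234,56\nSevilla;789,10\n"


def read_semicolon_csv(text: str) -> list:
    """Parse semicolon-separated text into a list of row dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return list(reader)


rows = read_semicolon_csv(sample)
for row in rows:
    print(row["name"], row["amount"])
```

Because the separator is a semicolon, the commas inside values such as 1.234,56 do not need quoting and are kept intact.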

Encoding (handling accents and special characters)

Many languages use accents and other non-English characters. With those characters, the encoding of the file is critical in ensuring that Socrata and other applications read the text correctly.

  • Getting data in: NEW! Through the UI, we’ve updated the library we use to automatically detect your file’s encoding, so that characters are imported correctly. Through DataSync, you can specify the file encoding, if it differs from UTF-8, using the “Advanced Configurations” option when mapping fields.
  • Displaying that data: If the data is imported correctly, characters will be displayed correctly.
  • Exporting that data: NEW! Characters are exported correctly and can be read in Excel if users download the “CSV for Excel” or “CSV for Excel (Europe)” file. These export options contain the “BOM” character that Excel needs to properly read and display the text data.
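The role of the BOM can be sketched in a few lines of Python. This example assumes the file is UTF-8 with a byte order mark (Python's `utf-8-sig` codec); the article does not state the exact encoding of the Socrata exports, and the function name `to_excel_csv_bytes` is hypothetical.

```python
import codecs
import csv
import io


def to_excel_csv_bytes(rows, delimiter=";"):
    """Serialize rows to CSV bytes that start with a UTF-8 BOM."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer, delimiter=delimiter).writerows(rows)
    # "utf-8-sig" prepends the BOM bytes (EF BB BF) when encoding.
    return buffer.getvalue().encode("utf-8-sig")


data = to_excel_csv_bytes([["ciudad", "país"], ["Málaga", "España"]])
print(data.startswith(codecs.BOM_UTF8))

# Reading with "utf-8-sig" strips the BOM again, so it does not leak
# into the first column name.
print(data.decode("utf-8-sig").splitlines()[0])
```

When reading such a file back in Python, opening it with `encoding="utf-8-sig"` works whether or not the BOM is present.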

Currencies

Non-US customers use currency symbols other than the $. These symbols appear in the data imported into Socrata, in how that data is displayed in the platform, and in the data that is exported.

  • Getting data in: Currencies other than the $ are accepted when importing data through the UI. The currency symbol must be consistent within the column, i.e. mixed currencies in a column of the “money” data type make the data invalid. Currency symbols of any kind ($ included) are not supported through DataSync.
  • Displaying that data: Data publishers can change the money column display format to match their desired currency symbol. This needs to be done on a column-by-column and dataset-by-dataset basis.
  • Exporting that data: The currency defined in the dataset and column is exported with the data.
  • API: Currency symbols of any kind ($ included) are not included in data returned by the API.
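Before importing, it can help to check a money column for mixed currency symbols, or to strip the symbols entirely when loading through DataSync. The following is a minimal sketch of such a pre-import check, not Socrata's own validation; the `KNOWN_SYMBOLS` list and all function names are assumptions made for illustration.

```python
# Assumption: an illustrative set of symbols, not Socrata's supported list.
KNOWN_SYMBOLS = "$€£¥"


def symbols_used(values):
    """Return the set of known currency symbols found in the values."""
    return {ch for value in values for ch in value if ch in KNOWN_SYMBOLS}


def has_consistent_currency(values):
    """True if the column uses at most one currency symbol."""
    return len(symbols_used(values)) <= 1


def strip_currency(value):
    """Remove known currency symbols, e.g. before a DataSync upload."""
    return "".join(ch for ch in value if ch not in KNOWN_SYMBOLS).strip()


column = ["€10,50", "€3,00", "$7,25"]
print(has_consistent_currency(column))
print([strip_currency(v) for v in column])
```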

 
