Data Transformation Examples

A collection of examples and how you can leverage the data transform functionality to manage incoming data. 

Table of Contents:

Usage notes:

Italicized text represents a column in the dataset and will need to be replaced with the column field name of the dataset being used.

Copying and pasting snippets directly from the web page will likely result in compiler failures as there may be hidden styling characters copied over. For best results copy a snippet into a plain text editor first and then transfer that over to a column's Data Transforms editor.

Date and Time Transformations

case(
date_extract_m(date_column) == 7, '1',
date_extract_m(date_column) == 8, '1',
date_extract_m(date_column) == 9, '1',
date_extract_m(date_column) == 10, '2',
date_extract_m(date_column) == 11, '2',
date_extract_m(date_column) == 12, '2',
date_extract_m(date_column) == 1, '3',
date_extract_m(date_column) == 2, '3',
date_extract_m(date_column) == 3, '3',
date_extract_m(date_column) == 4, '4',
date_extract_m(date_column) == 5, '4',
date_extract_m(date_column) == 6, '4'
)

case(
date_extract_m(date_column) == 7, to_text(date_extract_y(date_column) + 1),
date_extract_m(date_column) == 8, to_text(date_extract_y(date_column) + 1),
date_extract_m(date_column) == 9, to_text(date_extract_y(date_column) + 1),
date_extract_m(date_column) == 10, to_text(date_extract_y(date_column) + 1),
date_extract_m(date_column) == 11, to_text(date_extract_y(date_column) + 1),
date_extract_m(date_column) == 12, to_text(date_extract_y(date_column) + 1),
date_extract_m(date_column) == 1, to_text(date_extract_y(date_column)),
date_extract_m(date_column) == 2, to_text(date_extract_y(date_column)),
date_extract_m(date_column) == 3, to_text(date_extract_y(date_column)),
date_extract_m(date_column) == 4, to_text(date_extract_y(date_column)),
date_extract_m(date_column) == 5, to_text(date_extract_y(date_column)),
date_extract_m(date_column) == 6, to_text(date_extract_y(date_column))
)

to_floating_timestamp(
to_text(year_column) || '-01-01T00:00:00', '%Y-%m-%dT%H:%M:%S'
)

datetime_diff(
to_fixed_timestamp(start_date_column),
to_fixed_timestamp(end_date_column),
'timezone',
'unit of time'
)

note:
timezone example - 'US/Central'
available units of time: seconds, minutes, hours, days, business days, weeks, months

example source data formats:
20190101000000
20190101

case(
length(date_column) == 14, to_floating_timestamp(date_column,'%Y%m%d%H%M%S'),
length(date_column) == 8, to_floating_timestamp(date_column, '%Y%m%d')
)

note:
Valid time values on the platform fall within 00:00:00 - 23:59:59

to_floating_timestamp(
date_column ||
case(
time_column = '24:00:00', '23:59:59',
TRUE, time_column
),
'%Y%m%d%H:%M:%S'
)

Georeference Transformations

forgive(
case(
street_address_column is not null, geocode(street_address_column, city_column, state_column, zip_column),
city_column is not null, geocode(null, city_column, null, null),
zip_column is not null, geocode(null, null, null, zip_column)
)
)

note:
forgive() transform will nullify addresses that could not be geocoded in the case() statement

example value: "02327600 06959000"

to_point(
reproject_to_wgs84(set_projection(
to_point(('POINT (' ||x_y_column) || ')'),
'+init=epsg:2276'
))
)

note:
The x_y_column value must be in the format of "X Y"

If centroids are guaranteed to fall within county boundaries:
centroid(
county_boundary( to_text(state_column), to_text(county_column) )
)
If centroids may fall outside county boundaries:
polylabel(
county_boundary( to_text(state_column), to_text(county_column) )
)

Data Manipulation Transforms

case(
encoded_column = 'ATT', 'ATTEMPTED',
encoded_column = 'COM', 'COMPLETED',
TRUE, encoded_column
)

case(
column = 'null', null,
column = 'NA', null,
column = 'n/a', null,
column = 'none', null,
true, column
)

Was this article helpful?
1 out of 2 found this helpful
Have more questions? Submit a request

Comments

0 comments

Article is closed for comments.