Tips and Tricks: Data Preparation Tips

Before you start uploading, there are some things you can do to make your data openly discoverable and accessible for the long term, something we call data curation. Here are some keys to successful data curation:

Use raw data

There’s an adage in database architecture: “Don’t store what you can calculate.” It’s best to store only raw data and leave all of the totaling, subtotaling and other calculations up to the platform after the data is uploaded. This makes your data more flexible, so it can be consumed and visualized in more ways. It also ensures that, as you update your data, the calculations will always reflect the current values. Finally, it makes the database smaller, which speeds up processing of your data, such as searching.
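As a rough illustration of why stored totals are unnecessary, the sketch below derives subtotals and a grand total from raw rows on demand. The column names and values are hypothetical, and this is not how the platform itself computes totals; it only shows that totals can always be recalculated from raw records.

```python
from collections import defaultdict

# Hypothetical raw rows: one record per item, with no total or subtotal rows.
raw_rows = [
    {"department": "Parks", "fee": 120.0},
    {"department": "Parks", "fee": 80.0},
    {"department": "Streets", "fee": 200.0},
]

def subtotal_by(rows, key, value):
    """Sum `value` for each distinct `key`, computed fresh from the raw rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)

# Totals are derived when needed, so they always reflect the current rows.
subtotals = subtotal_by(raw_rows, "department", "fee")
grand_total = sum(row["fee"] for row in raw_rows)
```

If a row is added or corrected later, rerunning the calculation picks up the change, whereas a total stored alongside the data would have to be fixed by hand.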

Eliminate trailing spaces

Trailing spaces affect searches and other operations that require an exact match, because each space is treated as a character. To eliminate these spaces in Excel, you can use the TRIM() function.
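If your data lives in a CSV file rather than a spreadsheet, a similar cleanup can be sketched in Python. The sample text and the helper name strip_cells are made up for illustration. Note that this sketch removes leading and trailing whitespace only; Excel's TRIM() also collapses repeated spaces between words.

```python
import csv
import io

# Hypothetical CSV text; note the trailing spaces after "Parks" and "Streets".
raw_csv = "department,fee\nParks  ,120\nStreets ,200\n"

def strip_cells(csv_text):
    """Return all rows with leading and trailing whitespace removed from every cell."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[cell.strip() for cell in row] for row in reader]

clean_rows = strip_cells(raw_csv)

# Before cleaning, "Parks  " would not match a search for exactly "Parks".
matches = [row for row in clean_rows if row[0] == "Parks"]
```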

Use the right datatype

Whenever possible, use number and date datatypes, not plain text. Using numbers and dates allows the platform to perform calculations and build calendars, which it can’t do with plain text.

When choosing a datatype, think through how you want to use your data and how you would like it to appear. You can change the datatype after your data is uploaded, but doing so changes the schema, because changing the datatype actually deletes and recreates that column. This can lead to errors if you have already created views or charts from the dataset.
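One way to check before uploading that a column really holds numbers or dates is to try converting each value. The sketch below is a minimal example using Python's standard library; the helper names, the sample values and the month/day/year date format are assumptions, so adjust them to match your own data.

```python
from datetime import date, datetime

def to_number(text):
    """Convert text such as '1,250.50' to a float; raises ValueError if not numeric."""
    return float(text.replace(",", "").strip())

def to_date(text, fmt="%m/%d/%Y"):
    """Convert text to a date; the month/day/year format is an assumed default."""
    return datetime.strptime(text.strip(), fmt).date()

amount = to_number("1,250.50")
issued = to_date("03/15/2014")

def is_numeric(text):
    """Report whether a cell would convert cleanly to a number."""
    try:
        to_number(text)
        return True
    except ValueError:
        return False
```

Values that fail the check, such as "n/a" in a number column, are worth fixing before upload so the column does not have to fall back to plain text.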

Use human-readable, meaningful field names

Human-readable, meaningful field names help users who are not familiar with your data know immediately what information each field contains.
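If your source system exports abbreviated column names, they can be renamed before upload. The sketch below uses a lookup table; every field name in it is invented for illustration, so substitute the headers from your own file.

```python
# Hypothetical mapping from cryptic export headers to readable names.
READABLE_NAMES = {
    "PRMT_NO": "Permit Number",
    "ISS_DT": "Issue Date",
    "APLCNT_NM": "Applicant Name",
}

def rename_header(header, names=READABLE_NAMES):
    """Replace known cryptic names and leave unrecognized ones unchanged."""
    return [names.get(column, column) for column in header]

header = rename_header(["PRMT_NO", "ISS_DT", "FEE"])
```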

Keep your location information in separate fields

Leave your address, city, state and zip code in separate columns. This allows the platform to build a Location field that can be geocoded, which is critical if you want to create maps with your data (and you do want to create maps, right?). It also keeps your options open for filtering your data by any one of those fields.
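The sketch below shows why separate columns keep your options open: the parts can be joined into one address string when needed, and each part can still be filtered on its own. The rows and helper names are hypothetical, and the joining shown here is only an illustration, not the platform's own Location or geocoding logic.

```python
# Hypothetical rows with each location part in its own field.
locations = [
    {"address": "123 Main St", "city": "Springfield", "state": "IL", "zip": "62701"},
    {"address": "45 Oak Ave", "city": "Madison", "state": "WI", "zip": "53703"},
]

def full_address(row):
    """Join the separate parts into a single address string."""
    return f"{row['address']}, {row['city']}, {row['state']} {row['zip']}"

def filter_by(rows, field, value):
    """Keep only the rows whose `field` equals `value` exactly."""
    return [row for row in rows if row[field] == value]

first_address = full_address(locations[0])
illinois_rows = filter_by(locations, "state", "IL")
```

Going the other way, from one combined address string back to reliable city, state and zip values, is much harder, which is why it helps to start with the parts separated.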
