Tips and Tricks: Data Preparation Tips

Before uploading data, there are some things you can do to make it openly discoverable and accessible for the long term.

Here are some keys to successful data curation:

1. Use raw data. There’s an adage in database architecture: “Don’t store what you can calculate.” It’s best to store only raw data and leave all of the totaling, subtotaling, and other calculations to the platform after the data is uploaded.

This makes your data more flexible: it can be consumed and visualized in more ways. It also ensures that, as you update your data, the calculations always reflect the current values. Finally, it keeps the dataset smaller, which speeds up processing operations such as searching.
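As an illustration of why stored totals are unnecessary, here is a minimal Python sketch that derives subtotals and a grand total from raw rows on demand. The rows, the field names (`department`, `amount`), and the helper `subtotal_by` are hypothetical and are not part of the platform.

```python
from collections import defaultdict

# Hypothetical raw rows: one record per transaction, with no stored totals.
raw_rows = [
    {"department": "Parks", "amount": 120.0},
    {"department": "Parks", "amount": 80.0},
    {"department": "Roads", "amount": 300.0},
]


def subtotal_by(rows, key, value):
    """Sum the `value` field for each distinct `key`, computed on demand."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


subtotals = subtotal_by(raw_rows, "department", "amount")
grand_total = sum(row["amount"] for row in raw_rows)
```

Because the totals are recomputed from the rows each time, adding or correcting a row never leaves a stale subtotal behind.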


2. Eliminate trailing spaces. Each space is treated as a character, so trailing spaces break searches and other operations that require an exact match. In Excel, you can remove them with the TRIM() function, which strips leading and trailing spaces and also collapses repeated spaces between words into a single space.
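If your data lives in a CSV file rather than in Excel, the same cleanup can be sketched in Python. This is only an illustration: the function name `strip_cells` is hypothetical, and note that `str.strip()` removes leading and trailing whitespace but, unlike TRIM(), leaves spaces between words untouched.

```python
import csv
import io


def strip_cells(csv_text):
    """Return the CSV rows with leading/trailing whitespace removed from every cell."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return [[cell.strip() for cell in row] for row in reader]


# Example input with trailing spaces after "Ada" and "Seattle".
rows = strip_cells("name,city\nAda ,Seattle  \n")
```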


3. Use the right datatype. Whenever possible, use number and date datatypes, not plain text. Using numbers and dates allows the platform to perform calculations and build calendars, which it can’t do with plain text.
When choosing your datatype, think through how you want to use your data and how you would like it to appear. Once your data is uploaded you can still change the datatype, but doing so changes the schema, because changing the datatype actually deletes and recreates that column. This can lead to errors if you have already created views or charts from the dataset.
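One way to check before uploading that a column really holds numbers or dates is to try parsing every value. The sketch below assumes US-style number formatting (comma as thousands separator) and `MM/DD/YYYY` dates; the helper names `parse_amount` and `parse_date` are hypothetical.

```python
from datetime import date, datetime


def parse_amount(text):
    """Convert text such as '1,250.50' to a float (assumes comma thousands separators)."""
    return float(text.replace(",", "").strip())


def parse_date(text, fmt="%m/%d/%Y"):
    """Convert text such as '03/15/2021' to a date; raises ValueError if it does not match."""
    return datetime.strptime(text.strip(), fmt).date()


amount = parse_amount(" 1,250.50 ")
day = parse_date("03/15/2021")
```

A value that raises `ValueError` here is a sign that the column contains stray text and would not load cleanly as a number or date.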


4. Use human-readable, meaningful field names. Clear names let users who are not familiar with your data know immediately what information each field contains.
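If your source system exports cryptic column codes, a small lookup table can rename them before upload. The codes and readable names below are invented for illustration only.

```python
# Hypothetical mapping from cryptic export codes to readable field names.
FIELD_NAMES = {
    "DEPT_CD": "Department Code",
    "TXN_DT": "Transaction Date",
    "AMT": "Amount",
}


def rename_fields(header):
    """Replace known codes with readable names; leave unknown names unchanged."""
    return [FIELD_NAMES.get(name, name) for name in header]


readable = rename_fields(["DEPT_CD", "TXN_DT", "AMT", "Notes"])
```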


5. Keep your location information in separate fields. Leave your address, city, state, and zip code in separate columns. Keeping them separate allows the platform to build a Location field (point column) that can be geocoded. This is critical if you want to create maps with your data (and you do want to create maps, right?). It also keeps your options open for filtering your data by any one of those fields.
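If your addresses are already combined into one column, they may need to be split before upload. The sketch below handles only one strict, assumed layout ("street, city, ST 12345") and returns None for anything else; real address data is usually messier, and the function name `split_address` is hypothetical.

```python
import re

# Assumed layout: "<street>, <city>, <2-letter state> <5-digit zip, optional +4>"
ADDRESS_PATTERN = re.compile(
    r"^(?P<address>[^,]+),\s*(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$"
)


def split_address(text):
    """Split a combined address into separate fields, or return None if it does not match."""
    match = ADDRESS_PATTERN.match(text.strip())
    if match is None:
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


parts = split_address("123 Main St, Springfield, IL 62701")
```

Rows that come back as None are best reviewed by hand rather than guessed at.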
