Posted on September 21, 2011 12:16
What is it?
The Publishing Workflow changes some of the fundamental ways in which datasets work on Socrata. Instead of a dataset being an always-live representation of your data, where consumers see the changes or updates you make in real time, datasets will have two states: Published and Working Copy.
A Published dataset is a static, locked copy of your data intended for sharing either with the public or within your organization. It represents a particular version of your dataset, and thus is locked against editing. In order to make edits to your datasets, you must first create a Working Copy.
A Working Copy is an editable version of your dataset that is distinct from the Published Copy. Think of it as an internal draft of your dataset that you and your coworkers can collaborate on. You can make whatever changes you want to the Working Copy without those changes being reflected in the Published Copy. When you’re done making your changes, you can “publish” the Working Copy to be the new Published Copy, and your changes will become live. Even better, the old Published Copy becomes a Snapshot (if your platform configuration includes that feature), which you can reference as a historical version of the dataset.
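If it helps to picture the lifecycle, here is a minimal Python sketch of the states described above. It is only a conceptual model, not Socrata's implementation or API: the Dataset class, its method names, the snapshots_enabled flag, and the idea of holding a single Working Copy at a time are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    """Toy model of the Publishing Workflow (hypothetical, not the Socrata API)."""
    # A brand-new dataset starts out as an empty Working Copy.
    working_copy: Optional[List[dict]] = field(default_factory=list)
    # The locked, live rows; None until the first publish.
    published: Optional[List[dict]] = None
    # Previous Published Copies, kept only if the feature is enabled.
    snapshots: List[List[dict]] = field(default_factory=list)
    # Hypothetical stand-in for "your platform configuration includes Snapshots".
    snapshots_enabled: bool = True

    def create_working_copy(self) -> None:
        # Assumption of this sketch: only one Working Copy exists at a time.
        if self.working_copy is not None:
            raise RuntimeError("a Working Copy already exists")
        # The draft starts as a copy of the current Published rows.
        self.working_copy = [dict(row) for row in (self.published or [])]

    def edit(self, row: dict) -> None:
        # Edits are only possible on a Working Copy; Published Copies are locked.
        if self.working_copy is None:
            raise RuntimeError("Published Copy is locked; create a Working Copy first")
        self.working_copy.append(row)

    def publish(self) -> None:
        if self.working_copy is None:
            raise RuntimeError("there is no Working Copy to publish")
        # The old Published Copy becomes a Snapshot, if that feature is on.
        if self.published is not None and self.snapshots_enabled:
            self.snapshots.append(self.published)
        self.published = self.working_copy
        self.working_copy = None
```

In this model, calling edit() right after publish() raises an error until create_working_copy() is called, and rows added to the Working Copy do not show up in published until the next publish().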
What are its benefits?
There are many benefits to the Publishing Workflow, but here are a few of our favorites:
Since there is an explicit publishing step, Published copies can be heavily cached, allowing us to improve the performance of datasets both large and small.
Since you can make changes to a Working Copy without affecting the Published copy, you have a chance to review updates for correctness, clarity, and completeness before they go live. You can even share them within your organization to get approval before the changes are published. No longer will you have to fear that the changes or updates you’re making will break your dataset!
Since old versions of your dataset are archived as Snapshots (as available in your platform configuration) and are even accessible via the Consumer API, you can keep historical archives of your dataset around to show how it has changed over time.
How does it affect how I work with my data?
The Publishing Workflow does have a few effects on your dataset management process:
When you first create a dataset, it will be a Working Copy. Before you can make your dataset public, you'll need to publish it to create a Published Copy for the public to access.
In order to make modifications to the structure or contents of your dataset, you'll need to create a Working Copy to make your modifications against.
Frequently Asked Questions
Does my Published Copy have to be public? - No, not at all. Published Copies have permissions just like any other dataset or view, so you can keep them private even after they're "published".
What if I want to collaborate with another user on a dataset? - When the Working Copy is created, it receives the same sharing grants that you have configured on the dataset itself. So if you want to collaborate with another user, grant them a Contributor or Owner share, and they'll be able to make modifications to Working Copies created on that dataset. For more details, see Permissions and the Publishing Workflow.
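As a rough illustration of that last answer, the sketch below models a Working Copy inheriting the dataset's sharing grants at creation time. The function names and the dictionary layout are hypothetical, and the "Viewer" role is only a placeholder for any share other than Contributor or Owner; none of this is Socrata's actual permissions API.

```python
from typing import Dict

# Share types that, per the answer above, allow modifying a Working Copy.
EDITING_SHARES = {"Contributor", "Owner"}


def grants_for_new_working_copy(dataset_grants: Dict[str, str]) -> Dict[str, str]:
    """Return the grants a new Working Copy starts with (hypothetical helper).

    The Working Copy receives the same grants as the dataset, copied at
    creation time, so later changes to one mapping do not alter the other.
    """
    return dict(dataset_grants)


def can_modify_working_copy(user: str, working_copy_grants: Dict[str, str]) -> bool:
    """True if the user holds a Contributor or Owner share on the Working Copy."""
    return working_copy_grants.get(user) in EDITING_SHARES


# Example: "alice" and "bob" are made-up users; "Viewer" is a placeholder role.
dataset_grants = {"alice": "Owner", "bob": "Viewer"}
wc_grants = grants_for_new_working_copy(dataset_grants)
```

Here can_modify_working_copy("alice", wc_grants) is True and can_modify_working_copy("bob", wc_grants) is False; granting bob a Contributor share on the dataset before the Working Copy is created would let him edit it as well.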