Data publishers can use Data & Insights Gateway to create and update datasets through the Gateway agent and plugins. Both the agent and the plugin can be created in the Data & Insights Dataset Management Experience (SDMX).
- Connect to an External Data Source
- Provision Agent
- Create a Plugin
- Creating a Dataset
- Schedule Dataset Updates
Connect to an External Data Source
First, either create a new dataset or edit an existing one. Either option takes you to the main page in SDMX.
You will then need to select Add Data in the Data Actions section.
From there, you will want to choose the option Connect to an External Data Source.
You will now be on the Data & Insights Gateway page for your dataset. This page lists any plugins that have already been created for you to choose from. If your plugin is already created, skip to Creating a Dataset. Otherwise, your next step will be to provision a new agent.
Provision Agent
To provision a new agent, click the Provision Agent button at the top right of the page.
This will launch a new window with additional instructions. First, name your agent and download it. The agent will download as a .zip file containing an executable .jar program and a README* file. Once your agent is downloaded, click Next to continue.
*Tip: The README file is a great resource with instructions for macOS, Windows, and Linux.
Once you have downloaded the agent, the next step is to set it up. The window will allow you to view instructions for Windows, Mac, or Linux machines. In general, the steps for setting up a new agent are as follows:
- Place the downloaded .zip file on the server or computer you are connecting to Data & Insights (if it’s not there already).
- Extract/Unzip the downloaded file.
- Open the new subfolder containing the extracted contents of the download.
- Run the agent as a service.
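On macOS or Linux, the first three steps above might look like the following in a terminal. This is only a sketch: the function name `unpack_agent` and the paths in the usage line are hypothetical placeholders. It deliberately stops before the last step, because the README in the download is the authoritative source for running the agent as a service on your operating system.

```shell
# Sketch: extract the downloaded agent archive into a target folder.
# Usage: unpack_agent <path_to_downloaded_zip> <target_folder>
# Both arguments are placeholders you supply; nothing here is Gateway-specific.
unpack_agent() {
  agent_zip="$1"
  target_dir="$2"

  # Stop early if the download is not where we expect it.
  if [ ! -f "$agent_zip" ]; then
    echo "Download not found: $agent_zip" >&2
    return 1
  fi

  # Create the target folder and extract the archive into it.
  mkdir -p "$target_dir" || return 1
  unzip "$agent_zip" -d "$target_dir" || return 1

  # List the extracted contents; the .jar program and README should be inside.
  ls "$target_dir"
}
```

For example, `unpack_agent "$HOME/Downloads/Gateway_my_agent.zip" "$HOME/gateway"` would extract a hypothetical download into a `gateway` folder in your home directory.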
Once the agent is running, go back to the Agent Stepper and click refresh at the bottom of the page. This prompts Data & Insights to check whether the agent is running. You will not be able to move to the next page until the agent is confirmed to be running. Once it is confirmed, click Next in the window.
You can now either set up a plugin or choose to do it at a later time. The following section walks through creating a plugin.
Create a Plugin
A plugin is used to connect directly to a specific source system. Once a plugin is configured, users can create datasets from any data the plugin is configured to access for that source. You can set up a plugin on the final page of the agent setup tool or at any point with the Add Plugin button.
When adding a plugin, first search or browse the list to find the plugin associated with your source system.
In this example, I have chosen the CSV option. Selecting Set-Up will open a new window with steps for configuring your plugin.
The first page is a plugin overview. It contains a description and any required fields needed to run the plugin:
On the second page, you will name your plugin. The name should help you understand the data source you are connecting to.
The final page will give you the setup instructions for the plugin.
- If you didn’t note the path of your downloaded agent file during the agent setup process, gather that path now. It will be called <Gateway_The name you called your agent>.
- Open the command prompt (Windows) or Terminal (Mac) and navigate to the folder where you have placed the agent. This can be done using the cd (change directory) command.
- Copy and paste this command into the command prompt/terminal.
java -jar socrata-ingress-agent.jar --configure-plugin csv:Andrew_CSV_1
- Run the command by pressing Enter or Return. This will download the plugin from Data & Insights, verify it, and run it on your server. If you get an error, confirm that you are in the folder that contains socrata-ingress-agent.jar and change directories if you are not.
- When the plugin runs, it will display a window on your server asking for plugin-specific configuration and credentials (if needed). This information is not stored on Data & Insights.
- Fill out the configuration information and click “OK”.
Once completed, return to Data & Insights. The Done option in the setup window should now be available.
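The terminal steps in this section can be wrapped in a small shell function for macOS or Linux that checks the two most common causes of errors before running the command. This is a sketch under stated assumptions: the function name `configure_plugin` is hypothetical, while the .jar name and the `--configure-plugin` option are taken from the command shown above. Copy the plugin identifier (such as csv:Andrew_CSV_1) from your own setup window rather than building it by hand.

```shell
# Sketch: run the plugin setup command only after basic checks pass.
# Usage: configure_plugin <agent_folder> <plugin_id>
#   <agent_folder> is the folder that holds the extracted agent.
#   <plugin_id> is the value shown in your setup window, e.g. csv:Andrew_CSV_1
configure_plugin() {
  agent_dir="$1"
  plugin_id="$2"

  # The command must be run from the folder that holds the agent .jar.
  if [ ! -f "$agent_dir/socrata-ingress-agent.jar" ]; then
    echo "socrata-ingress-agent.jar not found in: $agent_dir" >&2
    return 1
  fi

  # The agent is a Java program, so java must be available on the PATH.
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH" >&2
    return 2
  fi

  # Run in a subshell so the caller's working directory is left unchanged.
  ( cd "$agent_dir" && java -jar socrata-ingress-agent.jar --configure-plugin "$plugin_id" )
}
```

A return code of 1 means the folder is wrong, and 2 means Java could not be found. Any other non-zero code comes from the agent itself.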
Creating a Dataset
Now that your plugin is available, you can select it and connect it to the data source you are using for the dataset.
In this example, I will select the CSV plugin. An arrow dropdown shows a list of all available files that can be published to your dataset. To choose the source data, select Use this data source.
Selecting the data source will then load the data into Data & Insights, and you can proceed to set up the dataset just as you would any other!
Schedule Dataset Updates
Once the dataset has been published, access the Primer page and open the dropdown menu to find the Schedule Updates option.
This will open a modal where you can configure the frequency of updates in days and the time of day the update will occur, as well as view or change the data source settings. In the example below, the dataset would update at 6 AM Pacific time every day:
Once the time and frequency are set, click Save Schedule. The schedule can be modified at any time by choosing the Schedule Updates option.
Congratulations! You now have a fully automated Gateway-powered dataset!
NOTE: If the query powering Gateway automation is updated, the automation will keep working only if you manually create a draft of the dataset, reload the dataset in the Data & Insights Dataset Management Experience, and then publish it.
Interested in being notified when your schedule completes? Check out our article on Gateway and Schedule Monitoring to set up your notifications!