Shape and Join Data with the SoQL Query Editor on the Enterprise Data Platform


When to Shape or Join with the SoQL Query Editor

SoQL Query Editor

Shape Data

Join Data

Save and Publish the Result

Error Messages, Explained

Query the Results of a Query

Querying DateTime Datatype Columns

SoQL Functions


When to Shape or Join with the SoQL Query Editor

Published datasets on Data & Insights may be shaped by filtering, sorting, grouping, and performing other functions against the source dataset. Published datasets may also be joined together with other published datasets on Data & Insights. Both actions, shaping and joining, populate a new derived view with the desired data. The new view continuously refers back to the source or sources it has been produced from.

Available on Enterprise Data Platform, the SoQL Query Editor is an advanced surface for building and running queries on your published datasets in order to produce these views. This Query Editor enables a view to be articulated with Socrata Query Language, including all functions documented at Socrata Query Language is similar to Structured Query Language (SQL). The syntax will likely feel familiar to you if you are familiar with SQL.

The SoQL Query Editor also supports joins. The JOIN keyword may be used to join published Data & Insights datasets together as part of a SoQL query.


To open the Query Editor:

  1. On a published dataset, “View Data”.
  2. Select “Create View” on the action bar, in blue. This creates a draft view of the source dataset. You can only filter and perform other queries on a view of the source dataset.
  3. Open the Filter menu, in blue, and select SoQL Query Editor from the Filter pane.


SoQL Query Editor

The Editor is a surface for building a query. The columns in your dataset are listed after a SELECT keyword in the editor when you first open it up.   


To the right, Keywords, Columns, and Functions are listed. When hovering over one of these elements, you can get “Info” about the element or “Add” that keyword, column name, or function to your query at the location of your cursor.

Keywords are the foundational verbs of your query.

  • SELECT – selects columns from your source
  • JOIN – allows for data from another published Data & Insights dataset to be queried and included
  • WHERE – filters the records to be returned based on conditions you provide
  • GROUP BY – determines any groupings to use based on columns you provide
  • HAVING – filters records returned from an aggregation
  • ORDER BY – determines the order of the results

Underneath the editing space, a preview of your query results will be returned after you “Run” your query.

Aliasing Column Field Names

Within the SELECT statement, you may alias any columns with new field names. Be sure to use only valid field names:

  • all lowercase
  • no spaces
  • cannot be the same as another field in the query
  • cannot start with a number

Shape Data

Shaping data refers to actions such as aggregating, grouping, or calculating values from the column and records of your source dataset. Your inputs may be manipulated by the functions documented in the side panel of the query editor.

This example aggregates the total number of units within affordability bands. When applying a function to a column, place the function’s name just before the column name. Be sure to surround the column name with parenthesis to indicate that it is the object to be acted upon by that function.



You may rename the result of a function being applied to a column with the word ‘as’ followed by the new column name.

sum(totalunits) as sum_total_units



When applying functions that aggregate your data, you must account for columns that are not yet aggregated or grouped. Aggregating one column without providing a rationale for aggregating or grouping other columns in your query is not a valid query.

In an aggregation, the columns in the SELECT statement must either be used with an appropriate function, added to a GROUP BY condition, or removed from the SELECT statement if no longer relevant. In this example, Project ID and Building ID are levels of detail that should be removed when trying to determine at a higher level how many units in total there are within each affordability band.

Add the “GROUP BY” keyword and any relevant columns to set unique groups based on the values in those columns.


Once you have written a valid query, the results of the query may be viewed by selecting “Run”.


Join Data

The JOIN keyword in the Query Editor is used with identifiers of published datasets on Data & Insights to perform a join. You can check this article for more information on the different types of joins supported. Before building your query with the JOIN keyword, locate the identifier for the datasets you wish to join, and the column(s) you wish to join on:

  1. On a published dataset, locate the string of characters at the end of the URL. The string will have the format xxxx-xxxx and is referred to as the dataset’s 4x4 (four by four).
  2. While you’re there, locate the field name of the column you wish to join on, and any other field names needed. Field names are readily found on the dataset’s Primer page, in the “Columns in this Dataset” section. Expand each column’s row to discover the field name. Your column names may well be the field_names if they are short and contain no spaces.


To start writing the join, navigate to the primary dataset you wish to use in the join.

  1. Navigate to the SoQL Query Editor, as described above, at top.
  2. Add the JOIN keyword to the end of the prepopulated SELECT statement.
  3. Add an @ symbol, then the 4x4 of the second dataset you want to join to.
  4. Alias the second dataset with “as” followed by a short, descriptive alias for that table. It’s common practice to alias the first table as “a” and the second table as “b”, etc.

In this example, we have referenced the 4x4 of the second dataset and provided the alias “afford”, as it refers to an affordability dataset.

Now, it’s time to include the columns we want from our second table in the SELECT statement.

  1. Add column field names from the second table with an @ symbol and the alias before each field name, separating each with a dot.

For example:





Rename any of the newly referenced columns as needed.

To execute the join, define the columns that you want to match on.

  1. In your JOIN expression, and the operator “on”.
  2. Reference the column from the second table by name. For example, @afford.projectid
  3. Set that column equal to the column from the primary table that you will attempt the join with. For example, projectid.
  4. Select Run to preview the results.

Your join expression will look something like this:

JOIN @jqsc-vxz4 as a on @a.projectid = projectid

[join keyword] [@4x4] [as] [alias for 2nd table] [column, 2nd table] [column, primary table]


Save and Publish the Result 

Your query results are previewed in the Query Editor. Select Update Query in the lower right hand corner of the editor to reflect the query results in the view itself.

At this point you may decide to save the draft of this view. Saving the draft allows you to return to your view and your query at a later time to continue working. Saving will not publish your view.

To change the default title of your new view, select “Edit Metadata” next to the title and enter a more descriptive title. Save your metadata changes, then select return to dataset.

 You may also decide to publish this draft view to a specified audience, like private, internal, or public. At the same time, you can add and manage collaborators that may be working on the draft with you. Be sure to change the title of your view before publishing, for high-quality and clear assets.

Error Messages, Explained

“Invalid SoQL Query”

  • The query as written is not valid SoQL, meaning the query is attempting to perform some action that is not possible or not supported
  • Make sure all columns are accounted for in an aggregation
  • Make sure any functions used are accurately written. Make sure functions are receiving all of their required arguments.
  • Make sure your operators are correct.
    • For example, “is” is not the same as “=”. The operator “is” belongs with functions like “null” or “not …", whereas “=” is for comparisons with numbers or strings, etc.
  • Make sure you refer to the correct data type when writing out values.
    • For example, if a column is a number data type, you can refer to a value or compare to a value written plain, like 4587. However, if the column is a text data type, you must compare or refer to text strings surrounded by single or double quotes, like “4587”.

“Error Parsing SoQL Query... ”

  • There may be a missing comma after one of your column names in your SELECT statement. Note, though, that no comma is placed after the last column in your SELECT statement. Commas are used to separate the field names of your query. If nothing follows your last field name for a specific Keyword, then you don’t need a comma.
  • Check your keywords for proper order.
  • Check functions for necessary arguments, closed parenthesis, etc. 
  • Check for misspellings.
  • Be sure your aliases are valid. They must be lowercase with no spaces, and cannot begin with a number or include special characters. 


“Unexpected token … ”

  • There is likely a syntax error. Check for missing commas where they should be separating field names.
  • This may also refer to keywords being out of order

“Expression expected”

  • Related to above. If a comma is hanging around where it isn’t necessary, or if all parentheses are not closed, the query will wonder what you intend to write next and return this error.

“Identifier expected”

  • A column with an alias is not using a valid alias.
  • Aliases must be valid field names. They may not start with a number nor be just a number.


“Collocation is rejected because the total size in bytes exceeds the limit ...” 

  • Each domain has a limit on a necessary process called collocation. Reach out to Data & Insights support to mediate this error message. Please provide links to the datasets you are attempting to join and the corresponding query. 


“SoQL expression referred to column that does not exist: column” 

  • Check for misspellings of your column names or for an alias that you may be using inconsistently. 
  • If this is a join, check to make sure any columns from your joined table are referred to properly. 

Query the Results of a Query 

Sometimes you may want to query the results of a previous query, like when aggregating twice on the same column or when trying to count unique values. To accomplish the second query, create a new view with the results of your first query as the source.

  1. Save and publish your first view.
  2. Select Create View from the action bar.
  3. Open the Filter pane and open the SoQL Query Editor.
  4. Write and run your query, then save and publish the results.

Views are chained to one another. When you query a view, you may only query what is in that view, not everything from the parent dataset.

You may use any view as the source for a new view.


Querying DateTime Datatype Columns

Dates and times may be represented in many different formats on the Data & Insights platform. However, for querying, all dates are stored in ISO 8601 format. When including DateTime values in your query, you will need to write them out in ISO 8601 format. For example:


If you need to check the format or timezone of the DateTime values in a column, navigate to the dataset’s API endpoint to check on the full DateTime values themselves. A dataset’s API endpoint is {}/resource/{4x4}.json

SoQL Functions

Socrata Query Language Functions Listing

Was this article helpful?
0 out of 0 found this helpful
Have more questions? Submit a request



Article is closed for comments.