This documentation is related to an older version of Firebolt. For the most current documentation, see Firebolt documentation.
Working with arrays
This section covers querying and manipulating arrays in Firebolt.
- Declaring ARRAY data types in table definitions
- Simple array functions
- Manipulating arrays with Lambda functions
- UNNEST
Declaring ARRAY data types in table definitions
Array types are declared using ARRAY(<type>)
where <type>
can be any data type that Firebolt supports. This includes the ARRAY
data type, so arrays can be arbitrarily nested.
Array literals are also supported. For example, the SELECT
statement shown below is valid.
SELECT [1,2,3,4]
Basis for examples
All examples in this topic are based on the table below, named visits
. The column id
is of type INTEGER
. All other columns are of type ARRAY(TEXT)
.
Simple array functions
There are several fundamental functions that you can use to work with arrays, including LENGTH, ARRAY_CONCAT, and FLATTEN. See the respective reference for a full description. Brief examples are shown below.
LENGTH example
LENGTH
returns the number of elements in an array.
SELECT
id,
LENGTH(agent_props_keys) as key_array_length
FROM visits;
Returns:
+-----------------------+
| id | key_array_length |
+-----------------------+
| 1 | 3 |
| 2 | 2 |
| 3 | 3 |
+-----------------------+
ARRAY_CONCAT example
ARRAY_CONCAT
combines multiple arrays into a single array.
SELECT
id,
ARRAY_CONCAT(agent_props_keys, agent_props_vals) as concat_keys_and_vals
FROM visits;
Returns:
+----+------------------------------------------------------------------------------+
| id | concat_keys_and_vals |
+----+------------------------------------------------------------------------------+
| 1 | ["agent","platform","resolution","Mozilla/5.0","Windows NT 6.1","1024X4069"] |
| 2 | ["agent","platform","Safari","iOS 14"] |
| 3 | ["agent","platform","platform","Safari","iOS 14","Windows 11"] |
+----+------------------------------------------------------------------------------+
FLATTEN example
FLATTEN
converts one or more nested arrays into a single array.
SELECT FLATTEN([ [[1,2,3],[4,5]], [[2]] ]) as flattened_array;
Returns:
+-----------------+
| flattened_array |
+-----------------+
| [1,2,3,4,5,2] |
+-----------------+
Manipulating arrays with Lambda functions
Firebolt Lambda functions are a powerful tool that you can use on arrays to extract results. Lambda functions iteratively perform an operation on each element of one or more arrays. Arrays and the operation to perform are specified as arguments to the Lambda function.
Lambda function general syntax
The general syntax pattern of a Lambda function is shown below. For detailed syntax and examples see the reference topics for Lambda functions.
<LAMBDA_FUNC>(<arr1_var>[, <arr2_var>][, ...<arrN_var>]) -> <operation>, <array1>[, <array2>][, ...<arrayN>])
Parameter | Description |
---|---|
<LAMBDA_FUNC> | Any array function that accepts a Lambda expression as an argument. For a list, see Lambda functions. |
<arr1_var>[, <arr2_var>][, ...<arrN_var>] | A list of one or more variables that you specify. The list is specified in the same order and must be the same length as the list of array expressions (<array1>[, <array2>][, ...<arrayN>] ). At runtime, each variable contains an element of the corresponding array. The specified <operation> is performed for each variable. |
The operation that is performed for each element of the array. This is typically a function or Boolean expression. | |
A comma-separated list of expressions, each of which evaluates to an ARRAY data type. |
Lambda function example–single array
Consider the following TRANSFORM array function that uses a single array variable and reference in the Lambda expression. This example applies the UPPER
function to each element t
in the ARRAY
-typed column tags
. This converts each element in each tags
array to upper-case.
SELECT
id,
TRANSFORM(t -> UPPER(t), tags) AS up_tags
FROM visits;
Returns:
+----+--------------------------+
| id | up_tags |
+----+--------------------------+
| 1 | ["SUMMER-SALE","SPORTS"] |
| 2 | ["GADGETS","AUDIO"] |
| 3 | ["SUMMER-SALE","AUDIO"] |
+----+--------------------------+
Lambda function example–multiple arrays
ARRAY_FIRST is an example of a function that takes multiple arrays as arguments in a map of key-value pairs. One array represents the keys and the other represents the values.
ARRAY_FIRST
uses a Boolean expression that you specify to find the key in the key array. If the Boolean expression resolves to true, the function returns the first value in the value array that corresponds to the key’s element position. If there are duplicate keys, only the first corresponding value is returned.
The example below returns the first value in the agent_props_vals
array where the corresponding position in the agent_props_keys
array contains the key platform
.
SELECT
id,
ARRAY_FIRST(v, k -> k = 'platform', agent_props_vals, agent_props_keys) AS platform
FROM visits;
Returns:
+----+----------------+
| id | platform |
+----+----------------+
| 1 | Windows NT 6.1 |
| 2 | iOS 14 |
| 3 | iOS 14 |
+----+----------------+
ARRAY_SORT sorts one array by another. One array represents the values and the other represents the sort order.
The example below sorts the first array by the positions defined in the second array
SELECT
ARRAY_SORT(x,y -> y, [ 'A','B','C'],[3,2,1]) AS res;
Returns:
+-----------------+
| res |
+-----------------+
| ["C", "B", "A"] |
+-----------------+
UNNEST
You might want to transform a nested array structure to a standard tabular format so that you can expose views to BI tools that can’t handle Firebolt array syntax, or you might find the tabular format more natural to query using standard SQL idioms. UNNEST
serves these purposes.
UNNEST is part of the FROM clause and resembles a JOIN. Given an ARRAY
-typed column, UNNEST
unfolds the elements of the array and duplicates all other columns found in the SELECT
clause for each array element.
A single UNNEST
acts similarly to JOIN
. You can use a single UNNEST
command to unnest several arrays if the arrays are the same length.
Multiple UNNEST
statements in a single FROM
clause result in a Cartesian product. Each element in the first array has a record in the result set corresponding to each element in the second array.
Example–single UNNEST with single ARRAY-typed column
The following example unnests the tags
column from the visits
table.
SELECT
id,
tags
FROM visits
UNNEST(tags);
Returns:
+----+---------------+
| id | tags |
+----+---------------+
| 1 | "summer-sale" |
| 1 | "sports" |
| 2 | "gadgets" |
| 2 | "audio" |
+----+---------------+
Example–single UNNEST using multiple ARRAY-typed columns
The following query specifies both the agent_props_keys
and agent_props_vals
columns to unnest.
SELECT
id,
a_keys,
a_vals
FROM
visits
UNNEST(agent_props_keys as a_keys,
agent_props_vals as a_vals)
Returns:
+----+------------+------------------+
| id | a_keys | a_vals |
+----+------------+------------------+
| 1 | agent | “Mozilla/5.0” |
| 1 | platform | “Windows NT 6.1” |
| 1 | resolution | “1024x4069” |
| 2 | agent | “Safari” |
| 2 | platform | “iOS 14” |
+----+------------+------------------+
Example–multiple UNNEST clauses resulting in a Cartesian product
The following query, while valid, creates a Cartesian product.
SELECT
id,
a_keys,
a_vals
FROM
visits
UNNEST(agent_props_keys as a_keys)
UNNEST(agent_props_vals as a_vals)
Returns:
+-----+------------+------------------+
| INTEGER | a_keys | a_values |
+-----+------------+------------------+
| 1 | agent | "Mozilla/5.0" |
| 1 | agent | "Windows NT 6.1" |
| 1 | agent | "1024x4069" |
| 1 | platform | "Mozilla/5.0" |
| 1 | platform | "Windows NT 6.1" |
| 1 | platform | "1024x4069" |
| 1 | resolution | "Mozilla/5.0" |
| 1 | resolution | "Windows NT 6.1" |
| 1 | resolution | "1024x4069" |
| 2 | agent | "Safari" |
| 2 | agent | "iOS 14" |
| 2 | platform | "Safari" |
| 2 | platform | "iOS 14" |
+-----+------------+------------------+
Example–error on UNNEST of multiple arrays with different lengths
The following query is invalid and will result in an error as the tags
and agent_props_keys
arrays have different lengths for row 1.
SELECT
id,
tags,
a_keys
FROM
visits
UNNEST(tags, agent_props_keys as a_keys)