Expression Reference
Expression List Guide
Introduction
This section explains the available expressions you can use when creating calculated fields. These expressions enable powerful data transformations and aggregations directly within your dataset configuration.
Aggregate Functions
Aggregate functions summarize multiple values into a single result, providing concise insights for analysis.
Average (avg)
Definition: Returns the arithmetic mean of values in a column.
Syntax: avg(column)
Return Type: double
Example: For prices [100, 200, 300], avg(price) = 200.0
Count (count)
Definition: Counts the number of non-null rows in a column.
Syntax: count(column)
Return Type: bigint
Example: With 5 valid sales records, count(sales_id) = 5
Maximum (max)
Definition: Returns the largest value in a column.
Syntax: max(column)
Return Type: Same as input column
Example: For prices [100, 200, 150], max(price) = 200
Minimum (min)
Definition: Returns the smallest value in a column.
Syntax: min(column)
Return Type: Same as input column
Example: For prices [100, 200, 150], min(price) = 100
Sum (sum)
Definition: Adds all values in a column.
Syntax: sum(column)
Return Type: Same as input column
Example: For prices [100, 200, 150], sum(price) = 450
Mathematical Functions
These functions perform numeric transformations to support analytical calculations.
Absolute Value (abs)
Definition: Returns the absolute (non-negative) value.
Syntax: abs(column)
Return Type: Same as input
Example: abs(-154.85) = 154.85
Cube Root (cbrt)
Definition: Returns the cube root of a number.
Syntax: cbrt(column)
Return Type: double
Example: cbrt(27) = 3.0
Ceiling (ceil)
Definition: Rounds a number up to the nearest integer.
Syntax: ceil(column)
Return Type: Same as input
Example: ceil(3.2) = 4
Exponential (exp)
Definition: Returns e raised to the power of the input.
Syntax: exp(column)
Return Type: double
Example: exp(1) ≈ 2.71828
Floor (floor)
Definition: Rounds a number down to the nearest integer.
Syntax: floor(column)
Return Type: Same as input
Example: floor(3.8) = 3
Natural Logarithm (ln)
Definition: Returns the base-e logarithm.
Syntax: ln(column)
Return Type: double
Example: ln(2.71828) ≈ 1.0
Common Logarithm (log10)
Definition: Returns the base-10 logarithm.
Syntax: log10(column)
Return Type: double
Example: log10(1000) = 3.0
Round (round)
Definition: Rounds to the nearest integer (standard rounding).
Syntax: round(column)
Return Type: Same as input
Example: round(3.6) = 4
Sign Function (sign)
Definition: Returns -1 for negative, 0 for zero, 1 for positive.
Syntax: sign(column)
Return Type: Same as input
Example: sign(-5) = -1; sign(0) = 0
String Functions
Used to manipulate and analyze text data.
Length (length)
Definition: Returns the number of characters in a string.
Syntax: length(column)
Return Type: bigint
Example: length('Hello') = 5
Reverse (reverse)
Definition: Reverses the character order of a string.
Syntax: reverse(column)
Return Type: varbinary
Example: reverse('hello') = 'olleh'
Data Types Overview
| Type | Description | Examples |
|---|---|---|
BOOLEAN | Logical true/false values | true, false |
BIGINT | 64-bit signed integer | 9223372036854775807, -10000000000 |
DOUBLE | Double-precision floating-point number | 3.14159, -0.00001 |
INT | 32-bit signed integer | -5, 0, 123 |
FLOAT | Single-precision floating-point number | -87.5, 0.0, 3.14159 |
DATE | Calendar date (no time) | 2024-08-26 |
TIMESTAMP | Date and time with second precision | 2024-08-26 14:23:05 |
STRING | Immutable character sequence | "Hello World", "Blue" |
VARCHAR | Variable-length character string | "User123", "ProductABC" |
