Scalar Types

The schema should describe any irreducible scalar types. Scalar types can be used as the types of columns, or in general as the types of object fields.

Scalar types define several kinds of operations, which extend the capabilities of the query and mutation APIs: comparison operators, aggregate functions, and extraction functions.

Type Representations

A scalar type definition must include a type representation. The representation indicates to potential callers what values can be expected in responses, and what values are considered acceptable in requests.

Supported Representations

type | Description | JSON representation
--- | --- | ---
boolean | Boolean | Boolean
string | String | String
int8 | An 8-bit signed integer with a minimum value of -2^7 and a maximum value of 2^7 - 1 | Number
int16 | A 16-bit signed integer with a minimum value of -2^15 and a maximum value of 2^15 - 1 | Number
int32 | A 32-bit signed integer with a minimum value of -2^31 and a maximum value of 2^31 - 1 | Number
int64 | A 64-bit signed integer with a minimum value of -2^63 and a maximum value of 2^63 - 1 | String
float32 | An IEEE-754 single-precision floating-point number | Number
float64 | An IEEE-754 double-precision floating-point number | Number
biginteger | Arbitrary-precision integer string | String
bigdecimal | Arbitrary-precision decimal string | String
uuid | UUID string (8-4-4-4-12 format) | String
date | ISO 8601 date | String
timestamp | ISO 8601 timestamp | String
timestamptz | ISO 8601 timestamp-with-timezone | String
geography | GeoJSON, per RFC 7946 | JSON
geometry | GeoJSON Geometry object, per RFC 7946 | JSON
bytes | Base64-encoded bytes | String
json | Arbitrary JSON | JSON
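
For example, a scalar type backed by 32-bit integers might declare its representation as follows. This is a minimal sketch: the scalar type name Int32 is illustrative, and the representation appears in the representation field of the scalar type definition:

{
  "scalar_types": {
    "Int32": {
      "representation": {
        "type": "int32"
      },
      "aggregate_functions": {},
      "comparison_operators": {}
    }
  },
  ...
}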

Enum Representations

A scalar type with a representation of type enum accepts one of a set of string values, specified by the one_of argument.

For example, this representation indicates that the only three valid values are the strings "foo", "bar" and "baz":

{
  "type": "enum",
  "one_of": ["foo", "bar", "baz"]
}

Comparison Operators

Comparison operators extend the query AST with the ability to express new binary comparison expressions in the predicate.

For example, a data connector might augment a String scalar type with a LIKE operator which tests for a fuzzy match against a pattern.

A comparison operator is either a standard operator, or a custom operator.

To define a comparison operator, add a ComparisonOperatorDefinition to the comparison_operators field of the schema response.

For example:

{
  "scalar_types": {
    "String": {
      "aggregate_functions": {},
      "comparison_operators": {
        "like": {
          "type": "custom",
          "argument_type": {
            "type": "named",
            "name": "String"
          }
        }
      }
    }
  },
  ...
}
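
Once defined, such an operator can appear as a binary comparison expression in a query predicate. The sketch below assumes a collection with a String-typed title column; the exact field names of the expression may vary between specification versions:

{
  "type": "binary_comparison_operator",
  "column": {
    "type": "column",
    "name": "title"
  },
  "operator": "like",
  "value": {
    "type": "scalar",
    "value": "%Functional%"
  }
}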

Standard Comparison Operators

Equal

An operator defined using type equal tests if a column value is equal to a scalar value, another column value, or a variable.

Note: syntactic equality

Specifically, a predicate expression which uses an operator of type equal should implement syntactic equality:

  • An expression which tests for equality of a column with a scalar value or variable should return that scalar value exactly (equal as JSON values) for all rows in each corresponding row set, whenever the same column is selected.
  • An expression which tests for equality of a column with another column should return the same values in both columns (equal as JSON values) for all rows in each corresponding row set, whenever both of those columns are selected.

This type of equality is quite strict, and it might not be possible to implement such an operator for all scalar types. For example, a case-insensitive string type's natural case-insensitive equality operator would not meet the criteria above. In such cases, the scalar type should not provide an equal operator.
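
Since a standard operator carries its semantics in its type field, declaring one requires only the type tag. A sketch (the operator name eq is a convention, not a requirement):

"comparison_operators": {
  "eq": {
    "type": "equal"
  }
}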

In

An operator defined using type in tests if a column value is a member of an array of values. The array may be specified as a scalar value, a variable, or the value of another column.

It should accept an array type as its argument, whose element type is the scalar type for which it is defined. It should be equivalent to a disjunction of individual equality tests on the elements of the provided array, where the equality test is an equivalence relation in the same sense as above.
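
As with equal, only the type tag is needed when declaring the operator, since its argument type (an array of the defining scalar type) is determined by the definition above. A sketch:

"comparison_operators": {
  "in": {
    "type": "in"
  }
}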

less_than, greater_than, less_than_or_equal, greater_than_or_equal

An operator defined using type less_than tests if a column value is less than a specified value. Similarly for the other comparisons here.

If a connector defines more than one of these standard operators, then they should be compatible:

  • When using less_than, a row should be included in the generated row set if and only if it would not be returned in the corresponding greater_than_or_equal comparison, and vice versa. More succinctly, it is expected that x < y holds exactly when x >= y does not hold.
  • It is expected that x < y holds exactly when y > x holds.
  • It is expected that x <= y holds exactly when y >= x holds.

The less_than_or_equal and greater_than_or_equal operators are expected to be reflexive. That is, they should return a superset of those rows returned by the corresponding equal (syntactic equality) operator.

Each of these four operators is expected to be transitive. That is, for example x < y and y < z together imply x < z, and similarly for the other operators.
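
A sketch of declaring all four ordering operators on a scalar type (the operator names lt, lte, gt, and gte are conventions):

"comparison_operators": {
  "lt": { "type": "less_than" },
  "lte": { "type": "less_than_or_equal" },
  "gt": { "type": "greater_than" },
  "gte": { "type": "greater_than_or_equal" }
}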

contains, icontains, starts_with, istarts_with, ends_with, iends_with

These operators must only apply to scalar types whose type representation is string.

An operator defined using type contains tests if a string-valued column on the left contains a string value on the right. icontains is the case-insensitive variant.

An operator defined using type starts_with tests if a string-valued column on the left starts with a string value on the right. istarts_with is the case-insensitive variant.

An operator defined using type ends_with tests if a string-valued column on the left ends with a string value on the right. iends_with is the case-insensitive variant.
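
A sketch of declaring these operators on a String scalar type; as with the other standard operators, each needs only its type tag, and the operator names are conventions:

"comparison_operators": {
  "contains": { "type": "contains" },
  "icontains": { "type": "icontains" },
  "starts_with": { "type": "starts_with" },
  "istarts_with": { "type": "istarts_with" },
  "ends_with": { "type": "ends_with" },
  "iends_with": { "type": "iends_with" }
}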

Custom Comparison Operators

Data connectors can also define custom comparison operators using type custom. A custom operator is defined by its argument type; its semantics are not prescribed by this specification.

Aggregate Functions

Aggregate functions extend the query AST with the ability to express new aggregates within the aggregates portion of a query. They also allow sorting the query results via the order_by query field.

Note: data connectors are required to implement the count and count-distinct aggregations for columns of all scalar types, and those aggregations are distinguished in the query AST. There is no need to define these aggregates as aggregate functions.

For example, a data connector might augment a Float scalar type with a SUM function which aggregates a sum of a collection of floating-point numbers.

Just like for comparison operators, an aggregate function is either a standard function, or a custom function.

To define an aggregate function, add an AggregateFunctionDefinition to the aggregate_functions field of the schema response.

For example:

{
  "scalar_types": {
    "Float": {
      "aggregate_functions": {
        "sum": {
          "type": "sum",
          "result_type": "Float"
        },
        "stddev": {
          "type": "custom",
          "result_type": {
            "type": "named",
            "name": "Float"
          }
        }
      },
      "comparison_operators": {}
    }
  },
  ...
}
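
Once defined, an aggregate function is referenced by name from the aggregates portion of a query. The sketch below assumes a collection with a Float-typed price column; note that the built-in count appears as a distinguished star_count aggregate rather than as an aggregate function:

{
  "aggregates": {
    "total_price": {
      "type": "single_column",
      "column": "price",
      "function": "sum"
    },
    "item_count": {
      "type": "star_count"
    }
  }
}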

Standard Aggregate Functions

sum

An aggregate function defined using type sum should return the numerical sum of its provided values.

The result type should be provided explicitly, in the result_type field, and should be a scalar type with a type representation of either int64 or float64, depending on whether the scalar type defining this function has an integer or floating-point representation.

A sum function should ignore the order of its input values, and should be invariant of partitioning, that is: sum(x, sum(y, z)) = sum(x, y, z) for any partitioning x, y, z of the input values. It should return 0 for an empty set of input values.

average

An aggregate function defined using type average should return the average of its provided values.

The result type should be provided explicitly, in the result_type field, and should be a scalar type with a type representation of float64.

An average function should ignore the order of its input values. It should return null for an empty set of input values.
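
A sketch of declaring an average function, assuming a Float scalar type whose representation is float64 (following the same result_type convention as the sum example above):

"aggregate_functions": {
  "avg": {
    "type": "average",
    "result_type": "Float"
  }
}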

min, max

An aggregate function defined using type min or max should return the minimal/maximal value from its provided values, according to some ordering.

Its implicit result type, i.e. the type of the aggregated values, is the same as the scalar type on which the function is defined, but with nulls allowed if not allowed already.

A min/max function should return null for an empty set of input values.

If the set of input values is a singleton, then the function should return the single value.

A min/max function should ignore the order of its input values, and should be invariant of partitioning, that is: min(x, min(y, z)) = min(x, y, z) for any partitioning x, y, z of the input values.
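
Because the result type of min and max is implicit, their definitions need only the type tag. A sketch (the function names are conventions):

"aggregate_functions": {
  "min": { "type": "min" },
  "max": { "type": "max" }
}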

Custom Aggregate Functions

A custom aggregate function has type custom and is defined by its result type - that is, the type of the aggregated data. The result type can be any type, not just a scalar type.

Extraction Functions

Extraction functions extend the query AST with the ability to extract components from a value with a scalar type. Extraction functions can be used to group by components of a scalar type.

For example, a Date scalar type might expose extraction functions which extract the individual year, month and day components as integers.

Just like for comparison operators and aggregate functions, an extraction function is either a standard function, or a custom function.

To define an extraction function, add an ExtractionFunctionDefinition to the extraction_functions field of the schema response.

For example:

{
  "scalar_types": {
    "Date": {
      "extraction_functions": {
        "year": {
          "type": "year",
          "result_type": "Int"
        }
      },
      "aggregate_functions": {},
      "comparison_operators": {}
    }
  },
  ...
}

Standard Extraction Functions

The following standard extraction functions are supported:

  • day
  • day_of_week
  • day_of_year
  • hour
  • microsecond
  • minute
  • month
  • nanosecond
  • quarter
  • second
  • week
  • year

For each of these, the return type should be a scalar type whose representation is one of int8, int16, int32, or int64.
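
A sketch of a Date scalar type exposing several of these functions, assuming an Int scalar type with an int32 representation is also defined in the schema:

"extraction_functions": {
  "year": { "type": "year", "result_type": "Int" },
  "month": { "type": "month", "result_type": "Int" },
  "day": { "type": "day", "result_type": "Int" }
}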

Custom Extraction Functions

A custom extraction function has type custom and is defined by its result type - that is, the type of the extracted data. The result type can be any type, not just a scalar type.
