The ML.QUANTILE_BUCKETIZE function
This document describes the ML.QUANTILE_BUCKETIZE function, which lets you break a continuous numerical feature into buckets based on quantiles.
When used in the TRANSFORM clause, the same quantiles are automatically used in prediction.
Syntax
ML.QUANTILE_BUCKETIZE(numerical_expression, num_buckets) OVER()
Arguments
ML.QUANTILE_BUCKETIZE takes the following arguments:
- numerical_expression: the numerical expression to bucketize.
- num_buckets: an INT64 value that specifies the number of buckets to split numerical_expression into.
Output
ML.QUANTILE_BUCKETIZE returns a STRING value that contains the name of the bucket. The returned bucket names are in the format of bin_<bucket_index>, with bucket_index starting at 1.
Example
The following example breaks a numerical expression of five elements into three buckets:
SELECT f, ML.QUANTILE_BUCKETIZE(f, 3) OVER() AS bucket FROM UNNEST([1,2,3,4,5]) AS f;
The output looks similar to the following:
+---+--------+
| f | bucket |
+---+--------+
| 3 | bin_2  |
| 5 | bin_3  |
| 2 | bin_2  |
| 1 | bin_1  |
| 4 | bin_3  |
+---+--------+
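Using the function in the TRANSFORM clause, so that the same quantiles are applied at prediction time, might look like the following. This is a minimal sketch under stated assumptions, not a definitive implementation: the dataset, model, and table names (mydataset.mymodel, mydataset.mytable, mydataset.new_data) and the columns f and label are hypothetical, and the linear regression model type is chosen only for illustration.

```sql
-- Hypothetical dataset, model, table, and column names; replace with your own.
CREATE OR REPLACE MODEL `mydataset.mymodel`
  TRANSFORM(
    -- Quantile boundaries are derived from the training data for column f.
    ML.QUANTILE_BUCKETIZE(f, 3) OVER() AS f_bucket,
    label
  )
  OPTIONS (
    model_type = 'linear_reg',      -- assumed model type, for illustration only
    input_label_cols = ['label']
  )
AS
SELECT f, label
FROM `mydataset.mytable`;

-- At prediction time, pass the raw column f. The TRANSFORM clause
-- bucketizes it with the quantiles saved during training.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.mymodel`,
  (SELECT f FROM `mydataset.new_data`)
);
```

Because the bucketing is declared inside TRANSFORM, the prediction query does not call ML.QUANTILE_BUCKETIZE itself.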
What's next
- For information about feature preprocessing, see Feature preprocessing overview.
- For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.