public final class Bucketizer extends Model<Bucketizer>
Bucketizer maps a column of continuous features to a column of feature buckets.

| Constructor and Description |
|---|
| Bucketizer() |
| Bucketizer(java.lang.String uid) |

| Modifier and Type | Method and Description |
|---|---|
| static double | binarySearchForBuckets(double[] splits, double feature) Binary searching in several buckets to place each data point. |
| static boolean | checkSplits(double[] splits) We require splits to be of length >= 3 and to be in strictly increasing order. |
| Bucketizer | copy(ParamMap extra) Creates a copy of this instance with the same UID and some extra params. |
| double[] | getSplits() |
| Bucketizer | setInputCol(java.lang.String value) |
| Bucketizer | setOutputCol(java.lang.String value) |
| Bucketizer | setSplits(double[] value) |
| DoubleArrayParam | splits() Parameter for mapping continuous features into buckets. |
| DataFrame | transform(DataFrame dataset) Transforms the input dataset. |
| StructType | transformSchema(StructType schema) :: DeveloperApi :: |
| java.lang.String | uid() An immutable unique ID for the object and its derivatives. |

Methods inherited from class Transformer: transform, transform, transform

Methods inherited from class PipelineStage: transformSchema

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams

Methods inherited from interface Identifiable: toString

Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning

public Bucketizer(java.lang.String uid)
public Bucketizer()
public static boolean checkSplits(double[] splits)
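
As a rough illustration of the rule that checkSplits enforces (at least three split points, in strictly increasing order), the following standalone Java sketch re-implements the check without any Spark dependency. The class name SplitsCheckSketch and the method name isValidSplits are hypothetical and are not part of the Spark API; the real implementation may differ in details such as how it treats null or NaN input.

```java
/** Hypothetical standalone sketch; this is not the Spark implementation. */
final class SplitsCheckSketch {

    /**
     * Returns true when splits has at least 3 entries and every entry is
     * strictly smaller than the entry after it.
     */
    static boolean isValidSplits(double[] splits) {
        // Assumption: null input is treated as invalid rather than as an error.
        if (splits == null || splits.length < 3) {
            return false;
        }
        for (int i = 0; i < splits.length - 1; i++) {
            // Written as a negated "<" so that NaN entries are rejected as well.
            if (!(splits[i] < splits[i + 1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] ok = {Double.NEGATIVE_INFINITY, -0.5, 0.0, 0.5, Double.POSITIVE_INFINITY};
        System.out.println(isValidSplits(ok));                            // true
        System.out.println(isValidSplits(new double[] {0.0, 1.0}));       // false: too short
        System.out.println(isValidSplits(new double[] {0.0, 1.0, 1.0}));  // false: repeated value
    }
}
```

Three split points are the minimum because n+1 splits define n buckets, so a shorter array would leave fewer than two buckets.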
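public static double binarySearchForBuckets(double[] splits, double feature)
Binary searching in several buckets to place each data point.
Parameters:
splits - (undocumented)
feature - (undocumented)
Throws:
SparkException - if a feature is < splits.head or > splits.last

public java.lang.String uid()
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.

public DoubleArrayParam splits()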
public double[] getSplits()
public Bucketizer setSplits(double[] value)
public Bucketizer setInputCol(java.lang.String value)
public Bucketizer setOutputCol(java.lang.String value)
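public DataFrame transform(DataFrame dataset)
Description copied from class: Transformer
Transforms the input dataset.
Specified by: transform in class Transformer
Parameters:
dataset - (undocumented)

public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
:: DeveloperApi ::
Derives the output schema from the input schema.
Specified by: transformSchema in class PipelineStage
Parameters:
schema - (undocumented)

public Bucketizer copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params.
Specified by: copy in interface Params
Specified by: copy in class Model<Bucketizer>
Parameters:
extra - (undocumented)
See Also: defaultCopy()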
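
The bucket placement performed by binarySearchForBuckets can be sketched in plain Java as follows. This is a minimal sketch under stated assumptions, not the Spark source. It assumes that bucket i covers the half-open range [splits[i], splits[i+1]) and that the last bucket also includes splits.last, which is consistent with an exception being raised only for features < splits.head or > splits.last. It throws IllegalArgumentException where Spark throws SparkException. The class name BucketSearchSketch and the method name bucketFor are hypothetical.

```java
/** Hypothetical standalone sketch; this is not the Spark implementation. */
final class BucketSearchSketch {

    /**
     * Returns the index of the bucket containing feature, as a double to
     * mirror the documented return type. Assumes splits is strictly increasing.
     */
    static double bucketFor(double[] splits, double feature) {
        int last = splits.length - 1;
        // Assumption: the upper bound belongs to the last bucket.
        if (feature == splits[last]) {
            return last - 1;
        }
        int idx = java.util.Arrays.binarySearch(splits, feature);
        if (idx >= 0) {
            // Exact hit on a split point: that point opens bucket idx.
            return idx;
        }
        // On a miss, binarySearch returns (-(insertion point) - 1).
        int insertPos = -idx - 1;
        if (insertPos == 0 || insertPos == splits.length) {
            // Stand-in for the SparkException described in the documentation.
            throw new IllegalArgumentException(
                "Feature value " + feature + " is outside ["
                    + splits[0] + ", " + splits[last] + "]");
        }
        return insertPos - 1;
    }

    public static void main(String[] args) {
        double[] splits = {Double.NEGATIVE_INFINITY, -0.5, 0.0, 0.5, Double.POSITIVE_INFINITY};
        System.out.println(bucketFor(splits, -0.3)); // 1.0
        System.out.println(bucketFor(splits, 0.0));  // 2.0
        System.out.println(bucketFor(splits, 0.2));  // 2.0
    }
}
```

Using infinite outer splits, as in the usage above, means no finite feature falls outside the bounds. With finite outer splits, any feature below the first split or above the last one triggers the exception.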