Column

A column that will be computed based on the data in a DataFrame.

A new column is constructed from the input columns present in a DataFrame:

  df.col("columnName")          // On a specific DataFrame.
  var F = sqlFunctions;
  F.col("columnName")           // A generic column not yet associated with a DataFrame.
  F.col("columnName.field")     // Extracting a struct field
  F.col("`a.column.with.dots`") // Escape `.` in column names.
  F.expr("a + 1")               // A column that is constructed from a parsed SQL Expression.
  F.lit("abc")                  // A column that produces a literal (constant) value.

Constructor

new Column()

Note: Do not use directly (see above).

Since:
  • 1.3.0
Source:

Methods

alias(alias)

Gives the column an alias. Equivalent to as(alias).

Parameters:
Name Type Description
alias
Since:
  • 1.4.0
Source:
Example
// Renames colA to colB in select output.
  df.select(F.col("colA").alias("colB"))

and(other)

Boolean AND.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.select( people.col("inSchool").and(people.col("isEmployed")) );

as(alias)

Gives the column an alias. Equivalent to alias(alias).

Parameters:
Name Type Description
alias
Since:
  • 1.4.0
Source:

asc()

Returns an ascending ordering used in sorting.

Since:
  • 1.3.0
Source:
Example
df.sort(df.col("age").asc());

between(lowerBound, upperBound)

True if the current column is between the lower bound and upper bound, inclusive.

Parameters:
Name Type Description
lowerBound
upperBound
Since:
  • 1.4.0
Source:
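Example
The inclusive bounds can be illustrated with a plain-JavaScript model. The helper betweenInclusive below is hypothetical and not part of the Column API; it is only a sketch of the comparison described above for non-null values.

```javascript
// Hypothetical helper: models the inclusive range check for plain, non-null values.
function betweenInclusive(value, lowerBound, upperBound) {
  return value >= lowerBound && value <= upperBound;
}

var ages = [17, 18, 40, 65, 66];
var inRange = ages.filter(function (age) {
  return betweenInclusive(age, 18, 65);
});
// inRange keeps both boundary values (18 and 65) and drops 17 and 66.
```

By analogy with the other examples on this page, the corresponding DataFrame call would presumably be df.filter(df.col("age").between(18, 65)).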

bitwiseAND(other)

Compute bitwise AND of this expression with another expression.

Parameters:
Name Type Description
other
Since:
  • 1.4.0
Source:
Example
df.select(F.col("colA").bitwiseAND(F.col("colB")));

bitwiseOR(other)

Compute bitwise OR of this expression with another expression.

Parameters:
Name Type Description
other
Since:
  • 1.4.0
Source:
Example
df.select(F.col("colA").bitwiseOR(F.col("colB")));

bitwiseXOR(other)

Compute bitwise XOR of this expression with another expression.

Parameters:
Name Type Description
other
Since:
  • 1.4.0
Source:
Example
df.select(F.col("colA").bitwiseXOR(F.col("colB")));

cast(to)

Casts the column to a different data type, using the canonical string representation of the type. The supported types are: string, boolean, byte, short, int, long, float, double, decimal, date, timestamp.

Parameters:
Name Type Description
to
Since:
  • 1.3.0
Source:
Example
// Casts colA to integer.
  df.select(df.col("colA").cast("int"))

contains(other)

Contains the other element.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:

desc()

Returns a descending ordering used in sorting.

Since:
  • 1.3.0
Source:
Example
df.sort(df.col("age").desc());

divide(other)

Division. Divides this expression by another expression.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.select( people.col("height").divide(people.col("weight")) );

endsWith(other)

String ends with.

Parameters:
Name Type Description
other

(Column or String).

Since:
  • 1.3.0
Source:

eqNullSafe(other)

Equality test that is safe for null values.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
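Example
The difference from equalTo shows up only when a null is involved. The sketch below models both tests over plain JavaScript values, with null standing in for SQL NULL. The function names sqlEqualTo and sqlEqNullSafe are hypothetical, and the model assumes standard SQL semantics, where ordinary equality yields null if either side is null.

```javascript
// Hypothetical models of the two equality tests over plain JavaScript values.
function sqlEqualTo(a, b) {
  // Ordinary SQL equality: the result is unknown (null) if either side is null.
  if (a === null || b === null) return null;
  return a === b;
}

function sqlEqNullSafe(a, b) {
  // Null-safe equality: always answers true or false, never null.
  if (a === null && b === null) return true;
  if (a === null || b === null) return false;
  return a === b;
}

var plainResult = sqlEqualTo(null, null);       // null (unknown)
var nullSafeResult = sqlEqNullSafe(null, null); // true
```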

equalTo(other)

Equality test.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
df.filter(F.col("colA").equalTo(F.col("colB")) );

explain(extended)

Prints the expression to the console for debugging purposes.

Parameters:
Name Type Description
extended
Since:
  • 1.3.0
Source:

geq(other)

Greater than or equal to an expression.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.select( people.col("age").geq(21) );

getField(fieldName)

An expression that gets a field by name in a StructType.

Parameters:
Name Type Description
fieldName
Since:
  • 1.3.0
Source:

getItem(key)

An expression that gets the item at the ordinal position given by key out of an array, or the value stored under key in a MapType.

Parameters:
Name Type Description
key
Since:
  • 1.3.0
Source:
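Example
The two lookup modes can be sketched over plain JavaScript containers. The helper getItemModel is hypothetical; the sketch assumes zero-based array ordinals and a null result for a missing position or key, which may differ depending on the engine's configuration.

```javascript
// Hypothetical model: look up by zero-based position in an array,
// or by key in a map-like object. Missing entries yield null.
function getItemModel(container, key) {
  var found;
  if (Array.isArray(container)) {
    found = container[key];
  } else if (Object.prototype.hasOwnProperty.call(container, key)) {
    found = container[key];
  }
  return found === undefined ? null : found;
}

var firstTag = getItemModel(["red", "green", "blue"], 0); // "red"
var height = getItemModel({ height: 180, weight: 75 }, "height"); // 180
```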

gt(other)

Greater than.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.select(people.col("age").gt(21));

isin(list)

A boolean expression that evaluates to true if the value of this expression is contained in the evaluated values of the arguments.

Parameters:
Name Type Description
list
Since:
  • 1.5.0
Source:

isNaN()

True if the current expression is NaN.

Since:
  • 1.5.0
Source:

isNotNull()

True if the current expression is NOT null.

Since:
  • 1.3.0
Source:

isNull()

True if the current expression is null.

Since:
  • 1.3.0
Source:
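Example
NaN and null are distinct: NaN is a numeric value, while null marks a missing value. The plain-JavaScript sketch below models isNaN, isNotNull, and isNull over single values. The function names are hypothetical, and the model assumes that a null input is not considered NaN.

```javascript
// Hypothetical models of the three predicates over plain JavaScript values.
function isNaNModel(value) {
  // Only a numeric NaN qualifies; null is not NaN.
  return typeof value === "number" && value !== value;
}

function isNullModel(value) {
  return value === null;
}

function isNotNullModel(value) {
  return value !== null;
}

var readings = [1.5, NaN, null];
var nanFlags = readings.map(isNaNModel);   // [false, true, false]
var nullFlags = readings.map(isNullModel); // [false, false, true]
```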

leq(other)

Less than or equal to.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.select( people.col("age").leq(21) );

like(literal)

SQL like expression.

Parameters:
Name Type Description
literal
Since:
  • 1.3.0
Source:
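Example
In a SQL LIKE pattern, % matches any sequence of characters and _ matches exactly one character, and the pattern must match the whole value. The sketch below translates such a pattern into a JavaScript RegExp to illustrate the matching rules. The helper likeToRegExp is hypothetical, and escape characters in the pattern are not handled.

```javascript
// Hypothetical helper: converts a SQL LIKE pattern into an anchored RegExp.
function likeToRegExp(pattern) {
  var source = pattern.split("").map(function (ch) {
    if (ch === "%") return "[\\s\\S]*"; // any sequence of characters
    if (ch === "_") return "[\\s\\S]";  // exactly one character
    // Escape characters that are special in a regular expression.
    return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  }).join("");
  return new RegExp("^" + source + "$");
}

var names = ["John", "Joan", "Jon", "john"];
var matched = names.filter(function (name) {
  return likeToRegExp("Jo_n").test(name);
});
// matched keeps "John" and "Joan": four characters, starting with "Jo", ending with "n".
```

By analogy with the other examples on this page, the corresponding DataFrame call would presumably be people.filter(people.col("name").like("Jo_n")), where the column name is an assumption.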

lt(other)

Less than.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.select( people.col("age").lt(21) );

minus(other)

Subtraction. Subtract the other expression from this expression.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.select( people.col("height").minus(people.col("weight")) );

mod(other)

Modulo (a.k.a. remainder) expression.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:

multiply(other)

Multiplication of this expression and another expression.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.select( people.col("height").multiply(people.col("weight")) );

or(other)

Boolean OR.

Parameters:
Name Type Description
other
Since:
  • 1.3.0
Source:
Example
people.filter( people.col("inSchool").or(people.col("isEmployed")) );

otherwise(value)

Evaluates a list of conditions and returns one of multiple possible result expressions. If otherwise is not defined at the end, null is returned for unmatched conditions.

Parameters:
Name Type Description
value
Since:
  • 1.4.0
Source:
Example
// Example: encoding gender string column into integer.

  people.select(F.when(F.col("gender").equalTo("male"), 0)
    .when(F.col("gender").equalTo("female"), 1)
    .otherwise(2))
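
The evaluation order of such a chain can be sketched in plain JavaScript: conditions are tested in order, the first match wins, and unmatched input yields the otherwise value, or null if none was given. The helper caseWhen and its argument layout are hypothetical and only model the behaviour described above.

```javascript
// Hypothetical model of a when/otherwise chain over a plain value.
// branches: array of [predicate, result] pairs, tested in order.
function caseWhen(value, branches, otherwiseValue) {
  for (var i = 0; i < branches.length; i++) {
    if (branches[i][0](value)) return branches[i][1];
  }
  // Without an otherwise value, unmatched input yields null.
  return otherwiseValue === undefined ? null : otherwiseValue;
}

var genderBranches = [
  [function (g) { return g === "male"; }, 0],
  [function (g) { return g === "female"; }, 1]
];

var encoded = ["male", "female", "unknown"].map(function (g) {
  return caseWhen(g, genderBranches, 2);
});
var withoutOtherwise = caseWhen("unknown", genderBranches);
```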

plus(other)

Sum of this expression and another expression.

Parameters:
Name Type Description
other

Since:
  • 1.3.0
Source:
Example
people.select( people.col("height").plus(people.col("weight")) );

rlike(literal)

SQL RLIKE expression (LIKE with Regex).

Parameters:
Name Type Description
literal
Since:
  • 1.3.0
Source:
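Example
Unlike a LIKE pattern, a regular expression matches when it is found anywhere in the value unless it is anchored with ^ or $. The sketch below models this with JavaScript's RegExp. The helper rlikeModel is hypothetical, and JavaScript's regex dialect is assumed to be close enough to the engine's for simple patterns.

```javascript
// Hypothetical model: RLIKE succeeds when the pattern matches any part of the value.
function rlikeModel(value, pattern) {
  return new RegExp(pattern).test(value);
}

var unanchored = rlikeModel("John Smith", "Smi");     // true: found inside the value
var anchored = rlikeModel("John Smith", "^Smi");      // false: value does not start with "Smi"
var wholeValue = rlikeModel("John Smith", "^J.*h$");  // true: starts with "J", ends with "h"
```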

startsWith(other)

String starts with.

Parameters:
Name Type Description
other

(Column or String).

Since:
  • 1.3.0
Source:

substr(startPos, len)

An expression that returns a substring.

Parameters:
Name Type Description
startPos

starting position (Column or Number).

len

length of the substring (Column or Number).

Since:
  • 1.3.0
Source:
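Example
The sketch below models substr over a plain string, assuming SQL-style positions where the first character is at position 1. The helper substrModel is hypothetical and covers only positive values of startPos.

```javascript
// Hypothetical model for a positive startPos, assuming 1-based positions as in SQL.
function substrModel(text, startPos, len) {
  var begin = startPos - 1; // convert to JavaScript's zero-based index
  return text.substring(begin, begin + len);
}

var head = substrModel("Spark SQL", 1, 5); // "Spark"
var tail = substrModel("Spark SQL", 7, 3); // "SQL"
```

By analogy with the other examples on this page, the corresponding DataFrame call would presumably be df.select(df.col("name").substr(1, 5)), where the column name is an assumption.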