DataFrameReader

Interface used to load a DataFrame from external storage systems (e.g. file systems, key-value stores). Use SQLContext#read to access this.

Constructor

new DataFrameReader()

Note: Do not call this constructor directly; obtain a reader through SQLContext#read instead.

Since:
  • 1.4.0

Methods

json(path, cb)

Loads a JSON file (one object per line) and returns the result as a DataFrame.

This function goes through the input once to determine the input schema.

You can set the following JSON-specific options using DataFrameReader#option to deal with non-standard JSON files:

  • primitivesAsString (default false): infers all primitive values as a string type
  • allowComments (default false): ignores Java/C++ style comments in JSON records
  • allowUnquotedFieldNames (default false): allows unquoted JSON field names
  • allowSingleQuotes (default true): allows single quotes in addition to double quotes
  • allowNumericLeadingZeros (default false): allows leading zeros in numbers (e.g. 00012)

Parameters:
Name  Type  Description
path
cb          Node-style callback function (error-first).

Since:
  • 1.4.0
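
As a minimal sketch of the callback form, the helper below wraps json in a Promise. It assumes that `reader` is a DataFrameReader obtained through SQLContext#read and that the callback receives the resulting DataFrame as its second argument; `loadJson` is a hypothetical name used only for illustration.

```javascript
// Hypothetical helper: wraps the error-first callback of DataFrameReader#json
// in a Promise. `reader` is assumed to come from SQLContext#read.
function loadJson(reader, path) {
  return new Promise((resolve, reject) => {
    reader.json(path, (err, df) => {
      if (err) {
        reject(err); // error-first: a truthy first argument signals failure
      } else {
        resolve(df); // assumed: the second argument is the resulting DataFrame
      }
    });
  });
}
```

The returned Promise settles once the callback fires, so callers can use `then`/`catch` or `await` instead of nesting callbacks.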

jsonSync(path)

The synchronous version of DataFrameReader#json.

Parameters:
Name  Type  Description
path

Since:
  • 1.4.0
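
A synchronous call has no callback to report errors through, so a caller would typically guard it with try/catch. The sketch below assumes that jsonSync throws when loading fails, which this reference does not state; `tryLoadJsonSync` is a hypothetical helper name.

```javascript
// Hypothetical helper: calls jsonSync and reports the outcome as a plain object.
// Assumption: jsonSync throws on failure (not confirmed by the reference above).
function tryLoadJsonSync(reader, path) {
  try {
    return { ok: true, df: reader.jsonSync(path) };
  } catch (err) {
    return { ok: false, error: err };
  }
}
```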

option(key, value)

Adds an input option for the underlying data source.

Parameters:
Name  Type  Description
key
value

Since:
  • 1.4.0
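
When several options are needed, they can be applied from a plain object. This is a sketch under two assumptions that the reference does not confirm: that option either returns a reader for chaining or modifies the reader in place, and that string values are accepted. `applyOptions` is a hypothetical helper name.

```javascript
// Hypothetical helper: applies every key/value pair in `options` to the reader.
function applyOptions(reader, options) {
  let current = reader;
  for (const [key, value] of Object.entries(options)) {
    // Values are converted to strings (assumption: string values are accepted).
    // If option() returns nothing, keep using the same reader.
    current = current.option(key, String(value)) || current;
  }
  return current;
}
```

For example, `applyOptions(reader, { allowComments: true, allowSingleQuotes: false })` would set two of the JSON-specific options listed under json before the file is loaded.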

text(cb, path)

Loads a text file and returns a DataFrame with a single string column named "text". Each line in the text file is a new row in the resulting DataFrame.

Parameters:
Name  Type  Description
cb          Node-style callback function (error-first).
path

Since:
  • 1.6.0
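
The shape of the result can be pictured with a small in-memory model. This is plain JavaScript written for illustration, not the library's implementation: `linesToRows` is a hypothetical function, and its handling of a trailing newline is an assumption.

```javascript
// Illustrative model only: mimics the described result shape
// (one row per line, a single column named "text") on an in-memory string.
function linesToRows(contents) {
  const lines = contents.split(/\r?\n/);
  // Assumption: a trailing newline does not produce an extra empty row.
  if (lines.length > 0 && lines[lines.length - 1] === '') {
    lines.pop();
  }
  return lines.map((line) => ({ text: line }));
}
```

Under this model, a file containing the two lines `alpha` and `beta` maps to two rows, each with a single "text" field holding that line.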

textSync(path)

The synchronous version of DataFrameReader#text.

Parameters:
Name  Type  Description
path

Since:
  • 1.6.0