Spark Context

Provides functions to create and maintain the spark context.

class xframes.spark_context.SparkInitContext[source]

Spark Context initialization.

This may be used to initialize the spark context with the supplied values. If this mechanism is not used, then the spark context will be initialized using the config file the first time a context is needed.

static set(context)[source]

Sets the spark context parameters and then creates the context. If the spark context has already been created, this call has no effect.


context : dict

Dictionary of property/value pairs. These are passed to spark as config parameters. If a config file is present, these parameters override those in the file.


The following values are the most commonly used. They will be given default values if none are supplied in a configuration file. Other values can be found in the spark configuration documentation.

spark.master : str, optional

The URL of the spark cluster to use. To use the local spark, give ‘local’. To use a spark cluster with its master on a specific IP address, give the IP address or the hostname, as in the following example:

spark.master=mesos://my_mesos_host:5050

spark.app.name : str, optional
The app name is used on the job monitoring server, and for logging.
spark.cores.max : str, optional
The maximum number of cores to use for execution.
spark.executor.memory : str, optional
The amount of main memory to allocate to executors. For example, ‘2g’.
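As a sketch of how these parameters fit together (the specific values below — ‘local’, ‘my_app’, ‘4’, ‘2g’ — are placeholder assumptions, not defaults from the docs), the context might be configured before any other xframes call:

```python
# Hypothetical configuration sketch: the parameter names come from the
# documentation above; the values are placeholders chosen for illustration.
config = {
    'spark.master': 'local',         # run spark locally
    'spark.app.name': 'my_app',      # shown on the job monitoring server
    'spark.cores.max': '4',          # maximum number of cores for execution
    'spark.executor.memory': '2g',   # main memory to allocate to executors
}

# SparkInitContext.set must be called before the first context is needed;
# once a context exists, calling it has no effect. Shown commented out so
# the sketch does not require xframes to be installed:
# from xframes.spark_context import SparkInitContext
# SparkInitContext.set(config)
```

Any key omitted from the dictionary falls back to the config file, if one is present, or to the default values described above.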