Installing Spark on Ubuntu

STEPS

1. Install VirtualBox

2. Install Ubuntu on VirtualBox

Setting up Spark

Spark is pretty simple to set up and get running on your machine.

Assuming you already have Java and Python:

1. Visit the Spark downloads page.

2. Select the latest Spark release (1.2.0 at the time of this writing), a prebuilt package for Hadoop 2.4, and download directly.

3. Unzip the Spark folder and rename it to spark.
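The unzip-and-rename step can be sketched as a small shell function. This is only a sketch: unpack_spark is a hypothetical helper name, and it assumes the download is a gzipped tar archive with a single top-level folder.

```shell
# Hypothetical helper: unpack a downloaded Spark archive into a target
# directory and rename the extracted folder to "spark".
unpack_spark() {
    archive="$1"      # path to the downloaded .tgz file
    target_dir="$2"   # directory that should end up containing "spark"

    # Refuse to overwrite an existing spark folder.
    if [ -e "$target_dir/spark" ]; then
        echo "$target_dir/spark already exists" >&2
        return 1
    fi

    # Read the top-level folder name from the first entry in the archive.
    top_dir=$(tar -tzf "$archive" | sed -n '1s,/.*,,p')
    if [ -z "$top_dir" ]; then
        echo "could not determine the top-level folder in $archive" >&2
        return 1
    fi

    tar -xzf "$archive" -C "$target_dir" || return 1
    mv "$target_dir/$top_dir" "$target_dir/spark"
}
```

For example, `unpack_spark ~/Downloads/spark-1.2.0-bin-hadoop2.4.tgz "$HOME"` would leave the release in ~/spark, assuming that is the name and location of the file you downloaded.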

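If PySpark cannot find py4j.java_gateway, add Spark's Python directory and its bundled py4j archive to PYTHONPATH. The py4j version in the file name varies by Spark release, so match it to the zip file in $SPARK_HOME/python/lib: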

export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH

export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH
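Because the py4j version in that file name differs between Spark releases, one alternative is to let the shell find whichever py4j zip is present. The following is a minimal sketch, assuming SPARK_HOME is already set; add_pyspark_to_pythonpath is a hypothetical helper name, not something Spark provides.

```shell
# Hypothetical helper: put Spark's Python sources and the bundled py4j
# archive on PYTHONPATH without hard-coding the py4j version.
add_pyspark_to_pythonpath() {
    for zip in "$SPARK_HOME"/python/lib/py4j-*-src.zip; do
        # Skip the unexpanded pattern when no archive matches.
        [ -f "$zip" ] || continue
        PYTHONPATH="$zip${PYTHONPATH:+:$PYTHONPATH}"
    done
    # Prepend the pyspark package directory itself.
    PYTHONPATH="$SPARK_HOME/python${PYTHONPATH:+:$PYTHONPATH}"
    export PYTHONPATH
}
```

Calling `add_pyspark_to_pythonpath` from your profile, after SPARK_HOME is exported, should have the same effect as the two export lines above.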

4. Edit your Bash profile to add Spark to your PATH and to set the SPARK_HOME environment variable. These settings make the Spark commands available on the command line. On Ubuntu, simply edit the ~/.bash_profile or ~/.profile file and add the following:
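A minimal sketch of such profile lines, assuming the renamed spark folder sits directly in your home directory (the path is an assumption; adjust it to wherever you unpacked Spark):

```shell
# Assumed location of the unpacked and renamed Spark folder.
export SPARK_HOME="$HOME/spark"
# Put Spark's launcher scripts (pyspark, spark-submit, ...) on the PATH.
export PATH="$SPARK_HOME/bin:$PATH"
```

Putting $SPARK_HOME/bin at the front of PATH means this copy of pyspark is found before any other copy installed on the system.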

5. After you source your profile (or simply restart your terminal), you should be able to run a pyspark interpreter locally. Execute the pyspark command, and you should see the Spark startup banner followed by a Python >>> prompt.
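If the pyspark command is not found, a quick check along these lines can narrow down the cause. This is only a sketch: check_spark_setup is a hypothetical helper, and it assumes the launcher script lives at $SPARK_HOME/bin/pyspark.

```shell
# Hypothetical helper: report the first problem found with the Spark setup.
check_spark_setup() {
    if [ -z "$SPARK_HOME" ]; then
        echo "SPARK_HOME is not set"
        return 1
    fi
    if [ ! -x "$SPARK_HOME/bin/pyspark" ]; then
        echo "pyspark not found under $SPARK_HOME/bin"
        return 1
    fi
    if ! command -v pyspark >/dev/null 2>&1; then
        echo "pyspark is not on PATH"
        return 1
    fi
    echo "Spark setup looks OK"
}
```

Run `check_spark_setup` in a fresh terminal; each message points at the profile line that most likely needs fixing.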

Analytics Classroom

Friday, 8 July 2016
