1 article tagged as pyspark

I remember it took me some time to get this configured when I first started trying Jupyter and Spark out. Hopefully this is helpful for others. This works for Hadoop 2.6.0-CDH5.9.1 and Spark 1.6.0 using Python 2.7 and Python 3; for other versions, adjust the paths accordingly. Basically, you just need to tell Spark four things:

- The location of your (Ana)conda installation
- The location of your Jupyter installation and its configuration
- The location of your Python installation
- The resources your Spark executors need

Type the following from your bash terminal (If you are using Cloudera, this would be your Edge node. If …
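As a rough sketch of what that looks like, the standard PySpark driver environment variables cover the Conda, Jupyter, and Python locations. The paths below are assumptions for a typical Anaconda install under `/opt/anaconda`; substitute your own:

```shell
# Assumed Anaconda prefix -- adjust to wherever your install lives.
# Point PySpark's workers at your Python interpreter:
export PYSPARK_PYTHON=/opt/anaconda/bin/python

# Launch the driver through Jupyter instead of the plain REPL:
export PYSPARK_DRIVER_PYTHON=/opt/anaconda/bin/jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser"
```

With these exported, invoking `pyspark` (passing executor resources such as `--executor-memory` on the command line) starts a Jupyter notebook server whose kernels already have a `SparkContext` available.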

Read more →