This blog collects development and operations notes on Cloudera CDH/CDP, the Hadoop ecosystem, Semantic IoT, and related technologies. Send questions to gooper@gooper.com.
0. This guide assumes that Spark, Scala, pip, Python, Hadoop, and Jupyter are already installed.
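The assumption in step 0 can be checked before going further. The following is a minimal sketch, not part of the original procedure: missing_commands is a hypothetical helper, and the binary names passed to it (spark-submit, scala, pip, python, hadoop, jupyter) are assumed to be the ones used on this host.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

# The binary names below are assumptions; adjust them to match the local install.
missing_commands spark-submit scala pip python hadoop jupyter
```

If nothing is printed, every listed command was found; any name that is printed should be installed or added to PATH first.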
1. Set the environment variable (add the export line below to /etc/profile)
sudo vi /etc/profile
export SPARK_HOME=$HOME/spark
2. Apply the updated environment variable to the current shell (source is a shell builtin, so it is run without sudo)
source /etc/profile
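Once the profile has been reloaded, it may help to confirm that SPARK_HOME is visible in the current shell before installing Toree. The sketch below assumes a standard Spark layout with bin/spark-submit under SPARK_HOME; check_spark_home is a hypothetical helper name, not something provided by Spark or Toree.

```shell
# Hypothetical helper: verify that SPARK_HOME is set and looks like a Spark install.
# Assumes the usual layout where spark-submit lives in $SPARK_HOME/bin.
check_spark_home() {
    if [ -z "${SPARK_HOME:-}" ]; then
        echo "SPARK_HOME is not set"
        return 1
    fi
    if [ ! -x "$SPARK_HOME/bin/spark-submit" ]; then
        echo "no executable spark-submit under $SPARK_HOME/bin"
        return 2
    fi
    echo "SPARK_HOME looks valid: $SPARK_HOME"
}

# Usage: check_spark_home && echo "ready to install Toree"
```

A non-zero return value means the toree install command in the next step would receive an empty or wrong --spark_home value.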
3. Install Toree
sudo pip install toree
sudo jupyter toree install --spark_home=$SPARK_HOME --interpreters=Scala,PySpark,SparkR,SQL
4. Verify in a browser
https://gsda4:8888/
5. Check the Jupyter kernel list (jupyter kernelspec list)
/usr/local/lib/python2.7/dist-packages/jupyter_client/session.py:48: VisibleDeprecationWarning: zmq.eventloop.minitornado is deprecated in pyzmq 14.0 and will be removed.
Install tornado itself to use zmq with the tornado IOLoop.
from zmq.eventloop.ioloop import IOLoop
[ListKernelSpecs] WARNING | Native kernel (python2) is not available
[ListKernelSpecs] WARNING | Native kernel (python2) is not available
Available kernels:
apache_toree_pyspark /usr/local/share/jupyter/kernels/apache_toree_pyspark
apache_toree_scala /usr/local/share/jupyter/kernels/apache_toree_scala
apache_toree_sparkr /usr/local/share/jupyter/kernels/apache_toree_sparkr
apache_toree_sql /usr/local/share/jupyter/kernels/apache_toree_sql
python3 /usr/local/share/jupyter/kernels/python3
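The listing above can also be checked by script instead of by eye. This is a minimal sketch under the assumption that the four kernel names shown in the output are the ones expected after the install; missing_toree_kernels is a hypothetical helper that reads the listing on stdin.

```shell
# Hypothetical helper: read `jupyter kernelspec list` output on stdin and
# print each expected Toree kernel that does not appear in it.
missing_toree_kernels() {
    listing=$(cat)
    for name in apache_toree_scala apache_toree_pyspark apache_toree_sparkr apache_toree_sql; do
        # Match the kernel name as the first whitespace-delimited field of a line.
        if ! printf '%s\n' "$listing" | awk -v k="$name" '$1 == k { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
}

# Usage (warnings go to stderr and are discarded here):
#   jupyter kernelspec list 2>/dev/null | missing_toree_kernels
```

An empty result means all four Toree kernels were registered; any printed name points to an interpreter that the install step did not set up.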