To use Anaconda Python from the built-in Zeppelin notebooks in Oracle Big Data Cloud (BDC), follow these instructions. They are based on a single-node instance.
Versions:
Anaconda3 5.0.1 (Python 3.6.3)
Oracle BDC: 17.3.3-20
Steps I took.
- Get Anaconda
- Log in to the BDC node using ssh
- mkdir anaconda
- cd anaconda
- wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh
- Or use the latest installer version instead
- Install Anaconda (I installed the Python 3.6 version; the Python 2.7 version also works)
- sudo bash Anaconda3-5.0.1-Linux-x86_64.sh
- Accept the license
- Install to /opt/anaconda
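The download and install steps above can also be scripted. The following is a minimal sketch, not the procedure I actually ran: it only builds and prints the command lines, and it assumes the installer's batch-mode flags `-b` (no prompts, which means the license is accepted implicitly) and `-p` (install prefix). The function name `install_commands` is made up for illustration.

```python
import shlex

INSTALLER_URL = "https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh"
PREFIX = "/opt/anaconda"


def install_commands(url=INSTALLER_URL, prefix=PREFIX):
    """Return the commands for an unattended download and install."""
    installer = url.rsplit("/", 1)[-1]  # file name part of the URL
    return [
        ["wget", url],
        # -b: batch mode (no prompts, license accepted implicitly)
        # -p: directory to install into
        ["sudo", "bash", installer, "-b", "-p", prefix],
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them by hand.
    for cmd in install_commands():
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; paste the output into the ssh session once it looks right.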
- Check if everything is OK
- Log out of ssh
- Connect with ssh again
- /opt/anaconda/bin/conda list
- If you get a list of packages, you are good to go.
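The same check can be automated. This is a rough sketch that assumes the /opt/anaconda install path used above; the helper name `conda_works` is hypothetical. It runs `conda list` and reports whether any package lines came back.

```python
import os
import subprocess

CONDA = "/opt/anaconda/bin/conda"


def conda_works(conda_path=CONDA):
    """Return True if `conda list` runs and prints at least one package line."""
    if not os.path.isfile(conda_path):
        return False
    try:
        result = subprocess.run(
            [conda_path, "list"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        # The file exists but could not be executed.
        return False
    # Header lines in the `conda list` output start with '#'.
    packages = [
        line for line in result.stdout.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    return result.returncode == 0 and len(packages) > 0


if __name__ == "__main__":
    print("conda OK" if conda_works() else "conda not usable at " + CONDA)
```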
- Modify the Zeppelin interpreter in Oracle BDC to use Anaconda
- Log in to the BDC console
- Click the Settings tab
- Change/add the following properties
- Change zeppelin.pyspark.python – Set to /opt/anaconda/bin/python
- Add PYSPARK_DRIVER_PYTHON – Set to /opt/anaconda/bin/python
- Add PYSPARK_PYTHON – Set to /opt/anaconda/bin/python
- Click the SAVE button
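All three properties should point at the same interpreter, because PySpark refuses to run when the driver and the workers use different Python versions. As a small illustration (not part of the BDC console, and with a hypothetical helper name `wrong_properties`), the values copied by hand from the Settings tab can be checked like this:

```python
EXPECTED_PYTHON = "/opt/anaconda/bin/python"
PROPERTIES = ("zeppelin.pyspark.python", "PYSPARK_DRIVER_PYTHON", "PYSPARK_PYTHON")


def wrong_properties(settings, expected=EXPECTED_PYTHON):
    """Return the names of properties that are missing or point elsewhere."""
    return [name for name in PROPERTIES if settings.get(name) != expected]


if __name__ == "__main__":
    # Values as typed into the Settings tab (example data only).
    settings = {
        "zeppelin.pyspark.python": "/opt/anaconda/bin/python",
        "PYSPARK_DRIVER_PYTHON": "/opt/anaconda/bin/python",
        "PYSPARK_PYTHON": "/opt/anaconda/bin/python",
    }
    print(wrong_properties(settings) or "all properties point at Anaconda")
```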
- Test the Zeppelin interpreter
- Go to the Notebook tab
- Create a New Note
- In the first paragraph of the note, type:
- %pyspark
import sys
print(sys.version)
- If the output shows the Anaconda Python version (3.6.3 in my case), you are all set.
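The version string alone does not show which interpreter is running. A slightly fuller check, sketched here on the assumption that Anaconda lives in /opt/anaconda as above, also reports the executable path. The function name `interpreter_report` is made up for this example.

```python
import sys

ANACONDA_PREFIX = "/opt/anaconda"


def interpreter_report(prefix=ANACONDA_PREFIX):
    """Describe the running interpreter and whether it lives under `prefix`."""
    executable = sys.executable or ""  # can be empty in embedded interpreters
    return {
        "version": sys.version.split()[0],
        "executable": executable,
        "uses_anaconda": executable.startswith(prefix + "/"),
    }


print(interpreter_report())
```

In Zeppelin, put %pyspark on the first line of the paragraph and paste the code below it. If uses_anaconda comes back False, recheck the three interpreter properties.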