On Cascades, there are two different “flavors” of python available:
- locally-built versions of python
- the Anaconda python distribution from Continuum Analytics
Each of these options is available as a module and allows users to work with either python 2.7 or python 3.5. The Anaconda python distribution includes many popular packages and is the default python if no compiler is loaded. The Anaconda distribution can also be loaded explicitly using the module system:
module load Anaconda
If a compiler is loaded, locally-compiled versions of python are available and are loaded by default. That is,
module load gcc/5.2.0 python
will load a version of python that was compiled using gcc/5.2.0. A more limited set of python modules is provided for the locally-built versions of python. Several modules are packaged within the installation, most importantly setuptools and pip. Locally-optimized builds of numpy and scipy are also available as modules, which depend on atlas (for gcc compilers) or mkl (for intel compilers). These will be added to $PYTHONPATH by typing
module load gcc/5.2.0 python atlas numpy scipy
Beyond these basic python modules, ARC is not able to centrally install python modules due to the vast number of modules available. Instead, ARC supports straightforward mechanisms that allow users to customize their own python environments using easy_install or pip.
Managing python dependencies for locally-built python installations
For locally-built python installations, users have access to two primary tools to manage local installation of python modules: easy_install and pip. The environment variable $PYTHONUSERBASE provides a mechanism to customize the location where python modules are installed. The value of $PYTHONUSERBASE can be set differently on different systems. In this way, locally-installed python dependencies can be managed on a system-by-system basis. For example, on Cascades, we could set
export PYTHONUSERBASE=/home/<yourpid>/cascades/python
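One way to keep a separate install location per system is to choose the value at login, for instance from ~/.bashrc. The sketch below is only an illustration: the hostname pattern, the "system" variable, and the directory names are assumptions rather than ARC conventions, and the final check simply asks the interpreter which user base it resolved.

```shell
# Pick a per-system directory name from the node name.
# The "cascades*" pattern is an assumed naming scheme; adjust it to your systems.
case "$(uname -n)" in
    cascades*) system=cascades ;;
    *)         system=other ;;      # fallback for any other machine
esac

# Modules installed with --user go under this directory.
export PYTHONUSERBASE="$HOME/$system/python"

# Command-line scripts from --user installs are placed in $PYTHONUSERBASE/bin.
export PATH="$PYTHONUSERBASE/bin:$PATH"

# Ask the interpreter which user base it resolved; it should echo the path above.
# Falls back to python3 where no plain "python" command exists. The exit status is
# non-zero when user site-packages are disabled, hence the "|| true".
PY="$(command -v python || command -v python3)"
USER_BASE="$("$PY" -m site --user-base || true)"
echo "$USER_BASE"
```

If the printed path differs from $PYTHONUSERBASE, the variable was most likely set after python was started or was not exported.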
With the install location set, the --user option can be passed to pip or easy_install so that python modules are installed to $PYTHONUSERBASE. For example, one could install matplotlib by typing
module load gcc/5.2.0 python
easy_install --user matplotlib
Similarly, one could use pip to install scikit-learn, relying on the locally-optimized installations of numpy and scipy (available from the associated modules):
module load gcc/5.2.0 python atlas numpy scipy
pip install --user scikit-learn
In each case, the desired modules will be installed to the path specified by $PYTHONUSERBASE.
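To confirm where an installed module is actually picked up from, one could define a small shell helper such as the following. The name where_module is hypothetical (it is not an ARC or python tool); it only imports the named module and prints the file it was loaded from.

```shell
# Hypothetical helper: print the file a python module is imported from.
where_module() {
    # Fall back to python3 on machines that have no plain "python" command.
    py="$(command -v python || command -v python3)"
    # sys.argv[1] is the module name passed as the first argument to the function.
    "$py" -c 'import importlib, sys; print(importlib.import_module(sys.argv[1]).__file__)' "$1"
}
```

After the installs above, running where_module matplotlib should print a path under $PYTHONUSERBASE, while where_module numpy should point to the module-provided build, assuming the numpy module is loaded.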