This is one of a series of posts where I document software configurations for personal reference. This post documents the configuration for setting up a data science environment with Miniconda. The instructions below are based on Ubuntu 18.04.
Download and install Miniconda
The installer can be downloaded from the official site: https://conda.io/miniconda.html. Enter ‘yes’ when the installer asks whether to prepend the miniconda3 install location to PATH.
Run the following commands to add some channels provided by TUNA to speed up package downloads. Note the difference between the --prepend option and the --append option: the former adds a channel to the top of the list (with higher priority) while the latter adds a channel to the bottom of the list (with lower priority).
conda config --set show_channel_urls yes
conda config --prepend channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --prepend channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/  # high priority
conda config --append channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/  # low priority
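To make the resulting priority order concrete, here is a toy Python model of the channel list. It is only a sketch and not conda's own code: the helper names `prepend` and `append` are made up for illustration, and it assumes the list starts out containing only `defaults` (i.e. no channels were configured before).

```python
def prepend(channels, url):
    """Return a new list with url at the top (highest priority)."""
    return [url] + [c for c in channels if c != url]


def append(channels, url):
    """Return a new list with url at the bottom (lowest priority)."""
    return [c for c in channels if c != url] + [url]


TUNA = "https://mirrors.tuna.tsinghua.edu.cn/anaconda"

# Assumption: nothing was configured before, so only "defaults" is present.
channels = ["defaults"]
channels = prepend(channels, f"{TUNA}/pkgs/free/")
channels = prepend(channels, f"{TUNA}/pkgs/main/")
channels = append(channels, f"{TUNA}/cloud/pytorch/")

# Print the channels from highest to lowest priority.
for rank, url in enumerate(channels, start=1):
    print(rank, url)
```

Under that assumption, pkgs/main ends up first, followed by pkgs/free, then defaults, with the pytorch channel last. The real order on your machine can be checked with `conda config --show channels`.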
Install Nvidia graphics card driver
The proprietary Nvidia graphics card driver is required by the GPU version of some packages (e.g., tensorflow), and it can be installed from Ubuntu’s Software & Updates app.
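After installing the driver (and usually rebooting), a quick sanity check is to see whether the `nvidia-smi` tool can be run. The sketch below assumes `nvidia-smi` was installed together with the driver and is on PATH; the function name `nvidia_driver_visible` is my own.

```python
import shutil
import subprocess


def nvidia_driver_visible() -> bool:
    """Return True if nvidia-smi is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # tool not found, driver probably not installed
    try:
        result = subprocess.run([exe], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("Nvidia driver visible:", nvidia_driver_visible())
```

A `False` result only means this check failed; it does not say why, so run `nvidia-smi` by hand to see the actual error message.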
Install Python packages
The following command installs some frequently used packages.
conda install numpy scipy pandas matplotlib hdf5 pillow scikit-learn jupyterlab
conda install -c conda-forge feather-format
Additional packages (if not available from conda) can be installed with:
pip3 install dbt-core dbt-postgres dbt-mysql dbt-hive
pip3 install sqlfluff sqlfluff-templater-dbt
pip3 install sqlglot
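To confirm that the conda packages above are importable from the active environment, something like the following sketch can help. The mapping from package name to import name is my own and covers only the packages whose module names I am sure of (hdf5 is a C library without a Python module of its own, so it is left out); `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Package name (as installed) -> module name (as imported).
IMPORT_NAMES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "pillow": "PIL",
    "scikit-learn": "sklearn",
    "jupyterlab": "jupyterlab",
}


def missing_packages(import_names):
    """Return the package names whose module cannot be found."""
    return [
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    ]


print("Missing:", missing_packages(IMPORT_NAMES) or "none")
```

`find_spec` only locates the module without importing it, so the check is fast but will not catch a package that is installed yet broken.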
Install R packages
conda install -c r r-essentials
The -c r option tells conda to look for packages in the R channel; the r-essentials package contains many frequently used packages, including IRkernel. To make the R kernel visible to Jupyter (see below for instructions on how to configure Jupyter), run the following command in an R session:
IRkernel::installspec() # set `user = FALSE` to install the spec system-wide
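Whether the kernel spec was registered can be checked by parsing the output of `jupyter kernelspec list --json`. The sketch below assumes the R kernel is registered under IRkernel's default name `ir`; the helper names and the sample listing (including its paths) are illustrative only.

```python
import json
import subprocess


def kernel_names(listing_json):
    """Extract sorted kernel names from `jupyter kernelspec list --json` output."""
    return sorted(json.loads(listing_json)["kernelspecs"])


def installed_kernels():
    """Ask Jupyter for the installed kernel specs (requires jupyter on PATH)."""
    completed = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return kernel_names(completed.stdout)


# Illustrative listing with made-up paths, showing the shape of the JSON.
sample = json.dumps({
    "kernelspecs": {
        "python3": {"resource_dir": "/path/to/kernels/python3", "spec": {}},
        "ir": {"resource_dir": "/path/to/kernels/ir", "spec": {}},
    }
})
print("R kernel registered:", "ir" in kernel_names(sample))
```

On a real machine, replace the sample with a call to `installed_kernels()` and look for `ir` in the result.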
See my other post for instructions on configuring R.
Configure Jupyterlab on a server
If jupyter_notebook_config.py (usually located at ~/.jupyter/) does not exist, generate a new one with:
jupyter notebook --generate-config # optional
To enable https, generate some certificate files with the following command; then optionally set a password:
openssl req -x509 -nodes -days 9999 -newkey rsa:2048 -keyout mykey.key -out mycert.pem  # optional
jupyter notebook password  # optional
Edit jupyter_notebook_config.py and set the following parameters:
# Set options for certfile, ip, password, and toggle off browser auto-opening
c.NotebookApp.certfile = '/absolute/path/to/your/certificate/mycert.pem'  # optional
c.NotebookApp.keyfile = '/absolute/path/to/your/certificate/mykey.key'  # optional
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.open_browser = False  # optional
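If https is enabled, it may be worth checking that the certificate and key actually load as a pair before starting the server. The following is a minimal sketch using Python's standard `ssl` module; `cert_pair_loads` is a hypothetical helper, and it assumes the key is unencrypted (which is what the `-nodes` option above produces).

```python
import ssl


def cert_pair_loads(certfile, keyfile):
    """Return True if the certificate and private key can be loaded together."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    except (OSError, ssl.SSLError) as exc:
        # Covers missing files as well as a key that does not match the cert.
        print(f"Could not load certificate pair: {exc}")
        return False
    return True
```

Call it with the same absolute paths used for `c.NotebookApp.certfile` and `c.NotebookApp.keyfile`; a `False` result usually points to a wrong path or a mismatched key.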
After the above setup, start the server with:
jupyter lab  # Specify the `--allow-root` option if the server needs to be run as root.
Move custom configuration files (in the user-settings folder) to
By default, the server listens on port 8888. If https is enabled, the address to access the server is https://<server_address>:8888; otherwise it is http://<server_address>:8888.
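The address rule can be written down as a small helper, together with a basic reachability check. This is only a sketch: `server_url` and `port_open` are names I made up, and the scheme is `http` whenever https is not enabled.

```python
import socket


def server_url(host, port=8888, https=False):
    """Build the address of the Jupyter server."""
    scheme = "https" if https else "http"
    return f"{scheme}://{host}:{port}"


def port_open(host, port=8888, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(server_url("<server_address>", https=True))
```

`port_open` only tells you that something is listening on the port, which is handy for telling a firewall problem apart from a server that is not running.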