Introduction
To run the Python scripts for importing, plotting and modelling OSSL data described in other posts of this blog, you need a Python environment. This post describes how to install and set up Anaconda for defining virtual Python environments. The instructions are brief and were written using MacOS. I have also published more detailed instructions for installing Anaconda on MacOS, and a summary of how to install Eclipse and Anaconda on Ubuntu 20.
Anaconda is a cross-platform solution for developing virtual Python environments. Several other solutions are also available; an online search for your operating system (MacOS, Linux, Windows) together with "python virtual environment" will turn them up.
Conda, Anaconda and miniconda
Conda is the core python package manager for both Anaconda and miniconda. Conda is run from a Terminal window. The difference between Anaconda and miniconda is that Anaconda comes with a more extensive default library and a Graphical User Interface (Navigator). If you are new to Python or setting up environments for Python for the first time, I recommend that you install Anaconda.
Anaconda
Anaconda is a free platform for scientific Python. To download Anaconda, visit the Anaconda homepage and look for the download link. On the download page, find the distribution for your operating system (e.g. Windows, Linux, MacOS). For the OSSL Python project you need Python 3.8, so you should choose the latest Anaconda version for Python 3. If there are no versions to choose from, the default in September 2023 is 3, so just go ahead and press the button.
Once downloaded, start the installer and follow the instructions. You might need to accept some license terms or override some security warnings, depending on your OS. The default destination of the installation is your local user directory. If you want to change the installation destination you can do that as part of the installation process.
From here on, this post relies only on terminal commands.
Open a terminal window and check that conda was installed:
% conda -V
conda 4.13.0
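The same check can be scripted. The following is a minimal sketch, assuming conda prints its version in the form shown above ("conda 4.13.0"); the function names are hypothetical and only the Python standard library is used.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_conda_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract a numeric version tuple from output such as 'conda 4.13.0'."""
    match = re.search(r"conda\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_conda_version() -> Optional[Tuple[int, ...]]:
    """Return the version of the conda found on PATH, or None if absent."""
    conda_path = shutil.which("conda")
    if conda_path is None:
        return None
    result = subprocess.run(
        [conda_path, "-V"], capture_output=True, text=True, check=False
    )
    # Look at both streams in case the version is not printed on stdout.
    return parse_conda_version(result.stdout + result.stderr)
```

Calling installed_conda_version() returns None when no conda executable is found, which is a quick way to tell a failed installation from an outdated one.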
Before setting up the virtual environment for the OSSL package, update both conda and Anaconda.
% conda update conda
Depending on your version and OS, there may or may not be updates. The following excerpt shows the kind of responses you can get. Once the update is checked and prepared, you are asked whether to proceed. Enter y (the default) in the terminal to proceed, or n to stop.
% conda update conda
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /Applications/anaconda3
added / updated specs:
- conda
The following packages will be downloaded:
...
...
The following NEW packages will be INSTALLED:
...
...
The following packages will be REMOVED:
...
...
The following packages will be UPDATED:
...
...
The following packages will be SUPERSEDED by a higher-priority channel:
...
...
Proceed ([y]/n)?
% conda update anaconda
The terminal window reports what can be updated, and again you can choose to proceed or stop the update.
% conda update anaconda
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /Applications/anaconda3
added / updated specs:
- anaconda
The following packages will be downloaded:
...
...
The following packages will be UPDATED:
...
...
Proceed ([y]/n)? y
Your terminal indicates which conda environment is active within the parentheses “()” at the start of the command line, for example like this:
(base) your.user@machine ~ %
base is the default conda Python environment, but we want to create a custom virtual environment for plotting and modelling the OSSL spectra and wet laboratory data.
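The active environment can also be read from inside Python. The sketch below assumes the environment was activated with conda activate, which sets the CONDA_DEFAULT_ENV environment variable; the function name active_conda_env is hypothetical.

```python
import os
from typing import Mapping, Optional


def active_conda_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the active conda environment, or None."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    name = environ.get("CONDA_DEFAULT_ENV", "")
    return name or None
```

With the prompt shown above, active_conda_env() would be expected to return "base"; a return value of None suggests that no conda environment is active in that terminal session.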
To create the conda virtual python environment for the OSSL project you can either install the required packages from scratch or download a predefined environment (.yml) file. In either case the packages that are required, but not a part of the core conda python 3 environments, include:
- matplotlib,
- numpy,
- pandas, and
- scikit-learn
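To see which of these packages an environment already provides, a short check can be run with that environment's Python. This is a minimal sketch using only the standard library; the names REQUIRED_MODULES and missing_packages are hypothetical. Note that scikit-learn is imported under the module name sklearn.

```python
from importlib.util import find_spec
from typing import Dict, List

# Package name as installed -> module name as imported.
# scikit-learn is imported as "sklearn".
REQUIRED_MODULES: Dict[str, str] = {
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
}


def missing_packages(required: Dict[str, str] = REQUIRED_MODULES) -> List[str]:
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]
```

An empty list from missing_packages() means that all four packages can be located by the interpreter that ran the check.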
Create virtual environment
To set up a new virtual environment using conda you can either
- define default packages in a file called .condarc,
- directly specify packages as part of the creation,
- use an exported configuration (yml) file, or
- set up a default environment and add individual packages afterwards
The last alternative, installing individual packages afterwards, is not recommended. Because the packages are not installed simultaneously, conda cannot guarantee that the installed packages are mutually compatible.
Of the four additional packages required, numpy, pandas and scikit-learn are installed here from the Anaconda package list, while matplotlib is installed from the conda-forge repository.
Default packages (.condarc)
To set a default environment you need to have a .condarc configuration file in your personal folder. For in-depth details on the syntax of .condarc see the conda page on Managing environments.
Create or open the .condarc file in your home directory. How to do that depends on your OS. In MacOS you can use the terminal text editor pico:
% pico .condarc
Then edit the content of .condarc thus:
create_default_packages:
- matplotlib
- numpy
- pandas
- scikit-learn
channels:
- conda-forge
- defaults
The GitHub repo OSSL-pydev contains the .condarc file above.
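If you prefer to generate the file rather than type it in an editor, the same content can be produced from Python. This is only a sketch: render_condarc is a hypothetical helper that handles the simple "key followed by a list" layout used here, not general YAML.

```python
from typing import Dict, List


def render_condarc(settings: Dict[str, List[str]]) -> str:
    """Render simple 'key: list' settings as .condarc text."""
    lines: List[str] = []
    for key, values in settings.items():
        lines.append(f"{key}:")
        # Each list entry goes on its own indented line.
        lines.extend(f"  - {value}" for value in values)
    return "\n".join(lines) + "\n"


CONDARC_TEXT = render_condarc(
    {
        "create_default_packages": ["matplotlib", "numpy", "pandas", "scikit-learn"],
        "channels": ["conda-forge", "defaults"],
    }
)
```

The resulting text could then be written to the .condarc file in your home directory, for example with pathlib.Path.home() / ".condarc", taking care not to overwrite an existing file that holds other settings.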
Save the edited .condarc file. Back at the terminal window, create the virtual environment with the following command:
conda create --name ossl_py38 python=3.8
The terminal reports the progress and you might need to respond to conflicts. Sticking to default answers to any questions usually solves any conflicts:
% conda create --name ossl_py38 python=3.8
...
...
Solving environment: done
## Package Plan ##
environment location: /Applications/anaconda3/envs/ossl_py38
added / updated specs:
- matplotlib
- numpy
- pandas
- python=3.8
- scikit-learn
The following packages will be downloaded:
...
...
The following NEW packages will be INSTALLED:
...
...
Proceed ([y]/n)? y
Downloading and Extracting Packages
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
# $ conda activate ossl_py38
#
# To deactivate an active environment, use
# $ conda deactivate
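After activating the environment with conda activate ossl_py38, you may want to confirm that Python now runs from the new environment and has the expected version. The following is a minimal sketch; the function name matches_python is hypothetical, and the default of 3.8 is taken from the create command above.

```python
import sys
from typing import Sequence, Tuple


def matches_python(version: Sequence[int], expected: Tuple[int, int] = (3, 8)) -> bool:
    """True when the major.minor part of `version` equals `expected`."""
    return tuple(version[:2]) == expected


if __name__ == "__main__":
    # sys.prefix is the root folder of the interpreter that is running.
    print("Interpreter location:", sys.prefix)
    print("Python 3.8:", matches_python(sys.version_info))
```

Inside the activated environment the printed location would be expected to end with envs/ossl_py38, in line with the environment location reported in the excerpt above.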
Direct specification
An alternative to setting default packages using .condarc is to specify the additional packages directly when creating the virtual environment:
conda create -n ossl_py38a python=3.8 matplotlib numpy pandas scikit-learn
The reporting in the terminal will be similar to that shown when installing with default packages defined in .condarc.
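If you create environments from a script, the command above can be assembled as an argument list so that all packages are requested in a single conda create call. This is a sketch under stated assumptions: conda_create_command is a hypothetical helper, and the optional channels argument uses conda's --channel option (the long form of -c).

```python
from typing import List, Sequence


def conda_create_command(
    env_name: str,
    python_version: str,
    packages: Sequence[str],
    channels: Sequence[str] = (),
) -> List[str]:
    """Build the argument list for `conda create` with all packages at once."""
    command = ["conda", "create", "--name", env_name]
    for channel in channels:
        command += ["--channel", channel]
    command.append(f"python={python_version}")
    command.extend(packages)
    return command


if __name__ == "__main__":
    args = conda_create_command(
        "ossl_py38a", "3.8", ["matplotlib", "numpy", "pandas", "scikit-learn"]
    )
    # Print the command for inspection rather than running it.
    print(" ".join(args))
```

The list can be passed to subprocess.run(args, check=True) to run it; conda will still ask whether to proceed, as in the terminal.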
From an exported configuration (yml) file
The GitHub repo OSSL-pydev contains an exported conda virtual environment (yml) file. If you cloned or downloaded the repo you should have the file spectra-ossl_from-history_py38.yml on your local machine. To set up a new virtual environment from an exported environment, make sure the yml file is in the current working directory of the terminal and then type:
conda env create -f spectra-ossl_from-history_py38.yml
The reporting in the terminal will be similar to that shown when installing with default packages defined in .condarc.
Post-installation of individual packages
If you need to do a post-installation, this last section describes how to do it. To add a package to a particular virtual environment you first have to activate it:
conda activate ossl_py38
Then your installation commands will add the package to that particular environment.
matplotlib
In this post matplotlib is installed from conda-forge rather than from the default Anaconda repositories. The terminal command for installing matplotlib using conda is:
conda install -c conda-forge matplotlib
numpy
numpy is available from the default Anaconda repository and can thus be installed with the command:
conda install -c anaconda numpy
pandas
pandas is also available from the default Anaconda repository:
conda install -c anaconda pandas
scikit-learn
scikit-learn is also available from the default Anaconda repository:
conda install -c anaconda scikit-learn
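After a post-installation it can be useful to list which versions ended up in the environment. The sketch below uses importlib.metadata from the standard library (available from Python 3.8); installed_versions is a hypothetical helper, and it assumes the installed packages ship standard distribution metadata.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(distributions: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None."""
    versions: Dict[str, Optional[str]] = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = installed_versions(["matplotlib", "numpy", "pandas", "scikit-learn"])
    for name, version in report.items():
        print(name, version or "not installed")
```

Run with the Python of the activated ossl_py38 environment, any package reported as "not installed" still needs one of the install commands above.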