This post is the first in a series on organising and analysing data from the Open Soil Spectral Library (OSSL). You can download OSSL data with nothing more than a web browser. To follow the subsequent posts you need to clone or download a set of Python scripts that are described in other posts of this blog.
Introduction
This post builds on the two previous posts in this blog: Open Soil Spectral Library (OSSL) and OSSL explorer API. In this post you will download selected OSSL data that will then be used in the following posts in this blog on OSSL Machine Learning modelling. The data used as an example are Near-InfraRed (NIR) OSSL data over Sweden. You can change the study area by selecting any other geographical region, or restrict the selection to spectral data related to certain soil textures (attributes) or a particular source dataset, as outlined in the previous post.
Select OSSL data
The previous post illustrates how to access OSSL data using the OSSL API Explorer. As an example for this post, the selection was set to Sweden, as illustrated in the figure below.
For most country searches you can select sub-regions. In the example of Sweden (figure above) the data are divided into counties.
You can also select the dataset(s) to extract from the selected region by clicking the button in the Data selection window. The Swedish OSSL data stem from three different datasets: ICRAF-ISRIC, LUCAS-WOODWELL.SSL and LUCAS.SSL. LUCAS is a Europe-wide soil survey that is repeated every three years. The LUCAS dataset covers the visible to near-infrared (VIS-NIR) spectral region and is used as the example throughout this blog. The ICRAF-ISRIC dataset focuses on the Mid-Infrared (MIR) spectral region and will not be used, partly because it only contains 5 sample sites in Sweden. The LUCAS-WOODWELL.SSL dataset only contains a single sample site in Sweden, so including or omitting it will have little influence on the modelling.
Select NeoSpectra data
If you want to analyse and evaluate the predictive power of the NeoSpectra handheld NIR field spectrometer mentioned in the previous post, select NEOSPECTRA.SSL as the dataset from the global (or USA) geographical region.
Download OSSL data
Download your selected OSSL data by clicking the button. The data will be compressed to a zip file and then downloaded. Expand the zip file and you should get five csv files:
- mir.data.csv [MIR spectral scans]
- neon.data.csv [NeoSpectra spectral scans]
- soillab.data.csv [soil laboratory (wet chemistry) data]
- soilsite.data.csv [sample site data]
- visnir.data.csv [VIS-NIR spectral scans (excluding NeoSpectra)]
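To confirm that the expanded download contains the files listed above, a quick check along the following lines may help. This is only a sketch and not part of the scripts used in the following posts: the function name summarise_csv_files is hypothetical, and it assumes the files are comma-separated, UTF-8 encoded and start with a header row.

```python
import csv
from pathlib import Path

# File names as listed in this post.
EXPECTED_FILES = [
    "mir.data.csv",
    "neon.data.csv",
    "soillab.data.csv",
    "soilsite.data.csv",
    "visnir.data.csv",
]


def summarise_csv_files(data_dir):
    """Return {file name: (n_columns, n_data_rows)}, or None for a missing file.

    Assumes comma-separated, UTF-8 encoded files with one header row.
    """
    data_dir = Path(data_dir)
    summary = {}
    for name in EXPECTED_FILES:
        path = data_dir / name
        if not path.is_file():
            summary[name] = None
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])        # first row is taken as the header
            n_rows = sum(1 for _ in reader)  # remaining rows are data rows
        summary[name] = (len(header), n_rows)
    return summary


if __name__ == "__main__":
    # "data" is the folder created when the zip file is expanded.
    for name, info in summarise_csv_files("data").items():
        if info is None:
            print(f"{name}: missing")
        else:
            print(f"{name}: {info[0]} columns, {info[1]} rows")
```

A file reported as missing is not necessarily an error; it may simply mean that the selection you made contains no data of that kind.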
Arranging OSSL data
In the next post you will use a Python script to arrange (import) the data from the csv files into a combined json file. To keep your data in order I suggest that you create a dedicated folder (directory) under your user account where you keep the OSSL data that you download. On my system I have a folder called OSSL, and under that folder I create subfolders with regional and/or thematic names indicating the OSSL data that I downloaded. I save the downloaded file, which is always called data.zip, in that folder. When I unzip the file, the 5 csv files end up in a subfolder called data. Do not rename the downloaded and unzipped files if you want to use the Python scripts in the following posts. My OSSL folder now looks like this:
.
|____Sweden
| |____LUCAS
| | |____data.zip
| | |____data
| | | |____visnir.data.csv
| | | |____neon.data.csv
| | | |____soillab.data.csv
| | | |____mir.data.csv
| | | |____soilsite.data.csv
|____Uganda
| |____data.zip
| |____data
| | |____visnir.data.csv
| | |____soillab.data.csv
| | |____mir.data.csv
| | |____soilsite.data.csv
|____USA
| |____NeoSpectra
| | |____data.zip
| | |____data
| | | |____visnir.data.csv
| | | |____neon.data.csv
| | | |____soillab.data.csv
| | | |____mir.data.csv
| | | |____soilsite.data.csv
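If you prefer to unzip the download from Python rather than by hand, the step described above could be sketched as follows. The function name unpack_ossl_download is hypothetical, and the sketch assumes that data.zip has already been saved in its regional or thematic folder. It extracts the archive next to itself without renaming anything and returns the csv files it finds, so it works whether or not the archive already contains a top-level data folder.

```python
import zipfile
from pathlib import Path


def unpack_ossl_download(zip_path):
    """Extract data.zip into its own folder and list the csv files found.

    Returns the csv paths relative to the folder that holds the zip file,
    sorted alphabetically. No file is renamed.
    """
    zip_path = Path(zip_path)
    target_dir = zip_path.parent
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
    return sorted(
        p.relative_to(target_dir).as_posix() for p in target_dir.rglob("*.csv")
    )


# Example call (the location of the OSSL folder is an assumption):
# unpack_ossl_download(Path.home() / "OSSL" / "Sweden" / "LUCAS" / "data.zip")
```

Keeping data.zip next to the extracted data folder, as in the tree above, makes it easy to start over from the original download if something goes wrong.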