WoSIS snapshot - December 2023
ABSTRACT:
The World Soil Information Service (WoSIS) provides quality-assessed and standardized soil profile data to support digital soil mapping and environmental applications at broad scales. Since the release of the ‘WoSIS snapshot 2019’, many new soil datasets have been shared with us, registered in the ISRIC data repository, and subsequently standardized in accordance with the licenses specified by the data providers. Because the source data were contributed by a wide range of data providers, special attention was paid to the standardization of soil property definitions, soil analytical procedures, and soil property values (and units of measurement).
We presently consider the following soil chemical properties: organic carbon, total carbon, total carbonate equivalent, total nitrogen, phosphorus (extractable P, total P, and P retention), soil pH, cation exchange capacity, and electrical conductivity. The physical properties considered are soil texture (sand, silt, and clay), bulk density, coarse fragments, and water retention. All properties are grouped according to analytical procedures (aggregates) that are operationally comparable.
For each profile we provide the original soil classification (FAO, WRB, USDA, and version) and horizon designations as far as these have been specified in the source databases.
Three measures for 'fitness-for-intended-use' are provided: positional uncertainty (for site locations), time of sampling/description, and a first approximation for the uncertainty associated with the operationally defined analytical methods. These measures should be considered during digital soil mapping and subsequent earth system modelling that use the present set of soil data.
DATA SET DESCRIPTION:
The 'WoSIS 2023 snapshot' comprises data for 228k profiles from 217k geo-referenced sites that originate from 174 countries. The profiles represent over 900k soil layers (or horizons) and over 6 million records. The actual number of measurements for each property varies greatly between profiles and with depth, generally depending on the objectives of the initial soil sampling programmes.
The data are provided in TSV (tab-separated values) format and in GeoPackage format. The zip file (446 MB) contains the following files:
- Readme_WoSIS_202312_v2.pdf: Provides a short description of the dataset, file structure, column names, units and category values (this file is also available directly under 'online resources'). The PDF includes links to tutorials for importing the TSV files into R and Excel. See also 'HOW TO READ TSV FILES INTO R AND PYTHON' in the next section.
- wosis_202312_observations.tsv: This file lists the four- to six-letter codes for each observation, whether the observation is for a site/profile or layer (horizon), the unit of measurement, and the number of profiles or layers, respectively, represented in the snapshot. It also provides an estimate of the inferred accuracy of the laboratory measurements.
- wosis_202312_sites.tsv: This file characterizes the site location where profiles were sampled.
- wosis_202312_profiles.tsv: Presents the unique profile ID (i.e. primary key), site_id, source of the data, country ISO code and name, positional uncertainty, latitude and longitude (WGS 1984), maximum depth of soil described and sampled, as well as information on the soil classification system and edition. The number of fields varies with the soil classification system used.
- wosis_202312_layers.tsv: This file characterises the layers (or horizons) per profile, and lists their upper and lower depths (cm).
- wosis_202312_xxxx.tsv: This type of file presents results for each observation (e.g. “xxxx” = “BDFIOD”), as defined under “code” in file wosis_202312_observations.tsv (e.g. wosis_202312_bdfiod.tsv).
- wosis_202312.gpkg: Contains the above datafiles in GeoPackage format (which stores the files within an SQLite database).
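Because a GeoPackage is an SQLite database, its contents can be inspected with Python's standard library before loading anything into a GIS. The sketch below is illustrative only: the helper names `list_gpkg_tables` and `read_gpkg_table` are hypothetical, and the table names inside wosis_202312.gpkg are not stated here, so they are listed from the standard `gpkg_contents` registry table rather than assumed.

```python
import sqlite3

import pandas as pd


def list_gpkg_tables(gpkg_path):
    """Return (table_name, data_type) pairs registered in gpkg_contents.

    Note: sqlite3.connect() creates an empty file if gpkg_path does not
    exist, so check the path first when working with real data.
    """
    con = sqlite3.connect(gpkg_path)
    try:
        rows = con.execute(
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        ).fetchall()
    finally:
        con.close()
    return rows


def read_gpkg_table(gpkg_path, table, limit=5):
    """Read the first `limit` rows of one table into a pandas DataFrame.

    Geometry columns, if present, come back as raw binary blobs.
    """
    con = sqlite3.connect(gpkg_path)
    try:
        query = f'SELECT * FROM "{table}" LIMIT {int(limit)}'
        return pd.read_sql_query(query, con)
    finally:
        con.close()


# Example (assumes the file is in the working directory):
# for name, kind in list_gpkg_tables("wosis_202312.gpkg"):
#     print(name, kind)
```

Listing the tables first avoids guessing their names; any name returned can then be passed to `read_gpkg_table` for a quick preview.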
HOW TO READ TSV FILES INTO R AND PYTHON:
A) To read the data in R, please uncompress the ZIP file and specify the uncompressed folder.
setwd("/YourFolder/WoSIS_2023_December/") ## For example: setwd('D:/WoSIS_2023_December/')
Then use read_tsv to read the TSV files, specifying the data types for each column (c = character, i = integer, n = number, d = double, l = logical, f = factor, D = date, T = date time, t = time).
observations = readr::read_tsv('wosis_202312_observations.tsv', col_types='cccciid')
observations ## show columns and first 10 rows
sites = readr::read_tsv('wosis_202312_sites.tsv', col_types='iddcccc')
sites
profiles = readr::read_tsv('wosis_202312_profiles.tsv', col_types='icciccddcccccciccccicccci')
profiles
layers = readr::read_tsv('wosis_202312_layers.tsv', col_types='iiciciiilcc')
layers
## Do this for each observation 'XXXX', e.g. file 'wosis_202312_orgc.tsv':
orgc = readr::read_tsv('wosis_202312_orgc.tsv', col_types='iicciilccdccddccccc')
orgc
B) To read the files into Python, first decompress the ZIP file into your selected folder. Then, in Python:
# import the required library
import pandas as pd
# Read the observations data
observations = pd.read_csv("wosis_202312_observations.tsv", sep="\t")
# print the data frame header and some rows
observations.head()
# Read the sites data
sites = pd.read_csv("wosis_202312_sites.tsv", sep="\t")
# Read the profiles data
profiles = pd.read_csv("wosis_202312_profiles.tsv", sep="\t")
# Read the layers data
layers = pd.read_csv("wosis_202312_layers.tsv", sep="\t")
# Read the soil property data, e.g. 'cfvo' (do this for each observation)
cfvo = pd.read_csv("wosis_202312_cfvo.tsv", sep="\t")
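The "do this for each observation" step can also be written as a loop, since the per-property files follow the pattern wosis_202312_xxxx.tsv with the code in lower case. The following is a minimal sketch under those assumptions; the helper name `load_property_tables` is hypothetical, and the observations table is assumed to expose the codes in a column named `code`, as described in the file list above. Codes without a matching file in the folder are skipped.

```python
from pathlib import Path

import pandas as pd


def load_property_tables(folder, codes, prefix="wosis_202312"):
    """Read one TSV per observation code and return {code: DataFrame}."""
    tables = {}
    for code in codes:
        # File names use the lower-case code, e.g. BDFIOD -> wosis_202312_bdfiod.tsv
        path = Path(folder) / f"{prefix}_{str(code).lower()}.tsv"
        if not path.is_file():
            continue  # no file for this code in the folder
        tables[code] = pd.read_csv(path, sep="\t", low_memory=False)
    return tables


# Example (assumes the uncompressed files are in the working directory):
# observations = pd.read_csv("wosis_202312_observations.tsv", sep="\t")
# tables = load_property_tables(".", observations["code"])
# tables["ORGC"].head()
```

Loading every property at once may need considerable memory; passing a short list of codes, such as `["ORGC", "CFVO"]`, keeps the footprint small.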
CITATION:
Calisto, L., de Sousa, L.M., Batjes, N.H., 2023. Standardised soil profile data for the world (WoSIS snapshot – December 2023), https://doi.org/10.17027/isric-wdcsoils-20231130
Supplement to:
Batjes, N.H., Calisto, L. and de Sousa, L.M., 2023. Providing quality-assessed and standardised soil data to support global mapping and modelling (WoSIS snapshot 2023). Earth System Science Data (Discussions; https://doi.org/10.5194/essd-2024-14).
- Date (Publication)
- 2023-12-20
- Identifier
- e50f84e1-aa5b-49cb-bd6b-cd581232a2ec
- Identifier
- doi: / 10.17027/isric-wdcsoils.20231130
- Presentation form
- Digital map
- Status
- Required
- Theme
-
- bulk density
- cation exchange capacity
- soil classification
- coarse fragments
- clay
- effective cation exchange capacity
- electrical conductivity
- organic carbon
- pH
- sand
- silt
- calcium carbonate
- texture
- soil profiles
- water retention
- total nitrogen
- Stratum
-
- Soil science
- Region
-
- Global
- Access constraints
- License
- Use constraints
- License
- Other constraints
- Licensed per profile, as specified by the data provider and indicated in the data (CC-BY or CC-BY-NC)
- Spatial representation type
- Vector
- Denominator
- 100000
- Metadata language
- English
- Character set
- UTF8
- Topic category
-
- Geoscientific information
- Environment
- Begin date
- 1918-01-01
- End date
- 2022-12-01
- Reference system identifier
- EPSG / 4326
- Distribution format
-
- TSV and GeoPackage
- OnLine resource
- Download zipped dataset (WWW:DOWNLOAD-1.0-ftp--download): Zip file with the WoSIS December 2023 snapshot
- OnLine resource
- Scientific paper (WWW:LINK-1.0-http--link): Goes to the landing page for the ESSD snapshot paper
- OnLine resource
- Project webpage (FAQ) (WWW:LINK-1.0-http--related): Provides answers to frequently asked questions about WoSIS
- OnLine resource
- ReadMe file for 'wosis_snapshot_2023' (WWW:LINK-1.0-http--link): This PDF report describes the 'wosis snapshot 2023' dataset and includes links to guidelines on how to import the TSV files into R and Excel.
- Hierarchy level
- Dataset
- File identifier
- e50f84e1-aa5b-49cb-bd6b-cd581232a2ec
- Hierarchy level name
- dataset
- Date stamp
- 2024-09-02T08:15:00
- Metadata standard name
- ISO 19115:2003/19139
- Metadata standard version
- 2003/Cor.1:2006