Replication Data for: Clusters and Gradients for Classifying Land-Use Intensity in Ecological Research

Classifying land-use intensity is central to ecological research and soil health monitoring, yet widely used a priori categories (e.g., 'conventional', 'extensive', 'semi-natural') risk oversimplifying the diversity of management practices. Here, we tested whether a priori categories, post priori clusters derived from management survey data, or a continuous gradient better captured variation in grassland management. We surveyed 18 Dutch grasslands, recording fertilization, mowing, grazing, and related practices, and compared classification schemes using hierarchical clustering, sparse principal component analysis, and the remote-sensing-derived Sentinel-2 Red-Edge Position (S2REP) as an independent proxy of management intensity. A priori and post priori clusters showed moderate agreement (ARI = 0.72), but the weak cluster separation and the lack of an optimal number of clusters supported a gradient structure. Indeed, the gradient approach explained most variation in S2REP (AIC = 94.9, BIC = 102.6), outperforming both categorical schemes. Our results show that land-use intensity is better represented as a continuum than as discrete categories, providing a more nuanced basis for linking management to ecological outcomes. We advocate gradient-based classification, supported by transparent metrics, as a default approach in ecological and soil health research.
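The agreement statistic reported above (ARI, the adjusted Rand index) can be illustrated with a minimal base-R sketch. This is not the authors' code: the description does not say which function was used (the mclust package in the package list provides adjustedRandIndex()), and the labels below are made-up toy values, not data from this dataset.

```r
# Adjusted Rand index between two labelings, written in base R.
# Sketch for illustration; returns NaN in the degenerate case where
# both labelings put every item in a single group.
adjusted_rand_index <- function(x, y) {
  stopifnot(length(x) == length(y))
  tab <- table(x, y)                        # contingency table of the two labelings
  pairs <- function(n) n * (n - 1) / 2      # number of unordered pairs among n items
  index     <- sum(pairs(tab))              # pairs grouped together in both labelings
  row_pairs <- sum(pairs(rowSums(tab)))     # pairs grouped together in x
  col_pairs <- sum(pairs(colSums(tab)))     # pairs grouped together in y
  expected  <- row_pairs * col_pairs / pairs(length(x))
  max_index <- (row_pairs + col_pairs) / 2
  (index - expected) / (max_index - expected)
}

# Hypothetical toy labels (NOT from the dataset)
a_priori <- c("conventional", "conventional", "extensive",
              "extensive", "semi-natural", "semi-natural")
clusters <- c(1, 1, 2, 2, 2, 3)

ari <- adjusted_rand_index(a_priori, clusters)
print(ari)
```

An ARI of 1 means the two groupings are identical up to relabeling, and values near 0 mean agreement no better than chance.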

RStudio, version 2024.09.1+394

Files and variables

Folder: Clusters and gradient  

management_variables.csv: data on management practices from questionnaires and field observations.
Column names:
Name: [character] Sample IDs from sampling locations (three replicates per field)
Type: [character] A priori land-use intensity type of the grassland
Age_grassland: [numeric] age (in years) of the grassland, time since conversion to grassland
Age_management: [numeric] age (in years) of management, time since the current management was implemented
Grasscutting_year: [numeric] number of mowing events per year
Grazing_months: [numeric] number of months per year that the grassland is used for grazing
GVE_ha: [numeric] livestock units per hectare of grassland (LSU/ha)
Extra_feeding: [binary] Yes = 1, No = 0; whether cows have received extra feeding (roughage)
Manure: [factor with 3 levels]
Pesticide_use: [binary] Yes = 1, No = 0; whether farmers have used pesticides in the last 10 years of management
Chem_fert_use: [binary] Yes = 1, No = 0; whether farmers have used chemical fertiliser in the last 10 years of management
Amount_manure_solid_m3: [numeric] amount of solid manure that fields receive per year, in cubic meters
Moisture_content_perc: [numeric] percentage moisture content of the soil sample, dried at 105 °C
Plant_richness: [numeric] number of species recorded with Braun-Blanquet within a 2 x 2 m plot in the grassland
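As a rough sketch of how the documented column types might be applied after loading the file in R: the example below reads a small inline table in place of management_variables.csv, so it runs without the data. The rows, the three Manure level names, and the assumption that the file is comma-separated are all hypothetical and not taken from the dataset.

```r
# Hypothetical rows for illustration only; NOT values from the dataset.
# The Manure level names ("slurry", "solid", "none") are invented placeholders.
toy_csv <- "Name,Type,Grasscutting_year,Extra_feeding,Manure
F1.1,conventional,4,1,slurry
F2.1,extensive,2,0,solid
F3.1,semi-natural,1,0,none"

# With the real file this would be something like:
# mgmt <- read.csv("management_variables.csv", stringsAsFactors = FALSE)
mgmt <- read.csv(text = toy_csv, stringsAsFactors = FALSE)

# Coerce columns to the types listed in the variable description
mgmt$Type          <- factor(mgmt$Type)
mgmt$Manure        <- factor(mgmt$Manure)
mgmt$Extra_feeding <- as.integer(mgmt$Extra_feeding)

# Binary columns should only contain 0 and 1
stopifnot(all(mgmt$Extra_feeding %in% c(0L, 1L)))

str(mgmt)
```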

Plots_data_info.csv: data on fields and corresponding plots.
Column names:
Field: [character] Field name (three plots per field)
Sample: [character] plot name (three replicates per field, .1, .2, .3 etc.)
Type: [character] A priori land-use intensity type of the grassland

Folder: S2REP

Field_plots_points2022.csv: x and y coordinates of field plots from the 2022 field campaign, recorded with dGPS (Topcon).
Column names:
ObjName: [character] Name of field plot
Notes: [character] additional notes, including field observations from plot locations
Campaign: [number] year of sampling campaign
Type_Manag: [character] A priori land-use intensity type of the grassland
x: [number] X coordinate of the plot location. Locations were taken from the center of the plot (2 x 2 m)
y: [number] Y coordinate of the plot location. Locations were taken from the center of the plot (2 x 2 m)

S2Rep_mean_selected_dates_2022_1720April_RC.tif: TIFF file containing the S2REP values, extracted from Google Earth Engine. The file contains S2REP values averaged over 17 and 20 April 2022. Export settings: scale = 10, maxPixels = 1e9.
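For readers unfamiliar with the index, a minimal sketch of the commonly cited S2REP definition follows. It assumes the standard Sentinel-2 formulation based on bands B4, B5, B6 and B7; the description does not state the exact expression or averaging order used in Google Earth Engine, and the reflectance values are made-up toy numbers.

```r
# Commonly cited Sentinel-2 Red-Edge Position formula (an assumption here;
# the dataset description does not give the exact expression that was used):
#   S2REP = 705 + 35 * (((B4 + B7) / 2 - B5) / (B6 - B5))
s2rep <- function(b4, b5, b6, b7) {
  705 + 35 * (((b4 + b7) / 2 - b5) / (b6 - b5))
}

# Hypothetical surface reflectances for one pixel on two dates (NOT real data)
date_1 <- s2rep(b4 = 0.05, b5 = 0.10, b6 = 0.30, b7 = 0.35)
date_2 <- s2rep(b4 = 0.04, b5 = 0.10, b6 = 0.32, b7 = 0.38)

# One possible way to combine two dates: a simple mean of the per-date values
s2rep_mean <- mean(c(date_1, date_2))
print(s2rep_mean)
```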

Clusters and gradients.R: R script for the analysis of management variables, land-use intensity classification, and remote-sensing validation with S2REP.

Code/software

To run the script, RStudio (version 2024.09.1+394) is needed. The following packages are also required: "dplyr", "tibble", "forcats", "ggplot2", "ggrepel", "scales", "cluster", "fpc", "mclust", "corrplot", "uwot", "mixOmics", "raster", "sf", "exactextractr", "lme4", "lmerTest", "DHARMa", "MuMIn", "emmeans"
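A small sketch for checking which of the listed packages are already installed before running the script. The variable names are illustrative. The snippet only reports what is missing and does not install anything.

```r
# Packages listed in the dataset description
required_pkgs <- c(
  "dplyr", "tibble", "forcats", "ggplot2", "ggrepel", "scales",
  "cluster", "fpc", "mclust", "corrplot", "uwot", "mixOmics",
  "raster", "sf", "exactextractr",
  "lme4", "lmerTest", "DHARMa", "MuMIn", "emmeans"
)

# requireNamespace() returns TRUE/FALSE without attaching the package
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Note that mixOmics is distributed through Bioconductor, so install.packages() alone may not find it.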

Identifier
DOI https://doi.org/10.17026/LS/J4Y6FO
Metadata Access https://lifesciences.datastations.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.17026/LS/J4Y6FO
Provenance
Creator R. Boone; D. Bucur
Publisher DANS Data Station Life Sciences
Contributor Boone, Rosa
Publication Year 2026
Rights DANS Licence; info:eu-repo/semantics/restrictedAccess; https://doi.org/10.17026/fp39-0x58
OpenAccess false
Contact Boone, Rosa (Radboud University Nijmegen)
Representation
Resource Type Dataset
Format type/x-r-syntax; text/tab-separated-values; text/csv; text/rtf; image/tiff
Size 12006; 3702; 2987; 3431; 8426; 1788; 11318570
Version 3.0
Discipline Agricultural Sciences; Agriculture, Forestry, Horticulture, Aquaculture; Agriculture, Forestry, Horticulture, Aquaculture and Veterinary Medicine; Earth and Environmental Science; Environmental Research; Geosciences; Life Sciences; Natural Sciences