Data for binary classification experiments

Research context

This zip archive contains all the data and scripts necessary to reproduce the results of the following paper, co-authored by Markus Kattenbeck, Ioannis Giannopoulos, Negar Alinaghi, Antonia Golab, and Daniel R. Montello:

Predicting spatial familiarity by exploiting head and eye movements during pedestrian navigation in the real world

This paper will be published in Springer Nature Scientific Reports.

File overview

The structure of the archive is the following:

Folder "01_data" contains all the data files needed and a readme file describing the structure of each of these data files. These data files are:

lsp.csv [contains demographic data about participants]

matched_gaze_imu.csv [contains the segmented behavioral data, i.e. both gaze features and imu features]

matched_gaze_imu_feature_description.pdf [contains a description of the features contained in matched_gaze_imu.csv]

walking_dates.csv [contains an overview of the dates on which participants walked the familiar and unfamiliar routes]

users_polygons.csv [contains one or more polygons per participant, each outlining an area with which the participant is familiar]

polygons_markers.csv [contains the locations of the POIs, per polygon, with which participants reported being familiar]

user_routes.csv [contains the routes participants provided between a randomly selected pair of the POIs they had provided for a given polygon]
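Before consulting the readme file in "01_data", it can help to check which columns a data file actually contains. The following is a minimal sketch that uses only the Python standard library. It assumes the files are comma-separated with a header row; the column names and values in the stand-in data are invented for illustration and are not taken from the archive.

```python
import csv
import io


def describe_csv(handle):
    """Return (column_names, row_count) for an open CSV file handle."""
    reader = csv.DictReader(handle)
    row_count = sum(1 for _ in reader)
    # fieldnames is filled from the header row once reading has started
    return list(reader.fieldnames or []), row_count


# Stand-in content with made-up columns (assumption, not the real schema).
# With the archive unpacked, pass e.g.
# open("01_data/walking_dates.csv", newline="") instead.
sample = io.StringIO(
    "participant,condition,note\n"
    "p01,familiar,a\n"
    "p01,unfamiliar,b\n"
)
columns, n_rows = describe_csv(sample)
print(columns, n_rows)
```

If a file turns out to use a different delimiter, `csv.DictReader` accepts a `delimiter` argument.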

Folder "02_scripts" contains the data analysis scripts; they are organized in two subfolders:

01_ml_scripts: these are the scripts for the XGBoost classification; they are organized as two Python files, each of which contains further instructions for use.

80_20_code.py is the Python file that runs the ML experiments using an 80/20 train/test split.

L5O4T_code.py is the Python file that runs the ML experiments while holding out the full data of five different participants per condition as unseen test data.

requirements.txt lists the Python package versions used.

02_r_scripts:

cleaned_script.Rmd is an R notebook that can be opened in RStudio and provides the analysis scripts for the descriptive statistics presented in the paper.

package_versions.txt lists the R package versions used.
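The two evaluation schemes named for the ML scripts differ in what counts as unseen data: a row-wise 80/20 split can place segments of the same participant in both training and test sets, whereas holding out participants keeps all of their data out of training. The sketch below illustrates this difference with the standard library only. It is not the authors' code: the column names `participant` and `condition`, the synthetic rows, and the reading of "five different participants per condition" as five held-out participants within each condition are all assumptions, and the XGBoost classifier itself is omitted.

```python
import random


def random_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and split them row-wise into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def leave_participants_out(rows, n_held_out=5, seed=0):
    """Hold out all rows of n_held_out participants within each condition."""
    rng = random.Random(seed)
    by_condition = {}
    for row in rows:
        by_condition.setdefault(row["condition"], set()).add(row["participant"])
    # Draw the held-out participants separately for every condition.
    held_out = {
        condition: set(rng.sample(sorted(participants), n_held_out))
        for condition, participants in by_condition.items()
    }
    train, test = [], []
    for row in rows:
        target = test if row["participant"] in held_out[row["condition"]] else train
        target.append(row)
    return train, test


# Synthetic stand-in (assumption): 20 participants, 2 conditions, 3 segments each.
rows = [
    {"participant": f"p{i:02d}", "condition": cond, "segment": s}
    for i in range(20)
    for cond in ("familiar", "unfamiliar")
    for s in range(3)
]
train_a, test_a = random_split(rows)
train_b, test_b = leave_participants_out(rows)
print(len(train_a), len(test_a), len(train_b), len(test_b))
```

With the second scheme, no (participant, condition) pair appears on both sides of the split, so the test score reflects performance on people the model has not seen in that condition.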

Licenses

The code is licensed under the MIT License; the data is licensed under CC-BY.

Identifier
DOI	https://doi.org/10.48436/zjkky-pgs18
Related Identifier	IsVersionOf https://doi.org/10.48436/6kpr2-2ah89
Metadata Access	https://researchdata.tuwien.ac.at/oai2d?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:researchdata.tuwien.ac.at:zjkky-pgs18

Provenance
Creator	Kattenbeck, Markus; Golab, Antonia; Alinaghi, Negar; Giannopoulos, Ioannis
Publisher	Geoinformation, TU Wien
Publication Year	2025
Rights	Creative Commons Attribution 4.0 International; MIT License; https://creativecommons.org/licenses/by/4.0/legalcode; https://opensource.org/licenses/MIT
OpenAccess	true
Contact	tudata(at)tuwien.ac.at

Representation
Language	English
Resource Type	Dataset
Version	1.0.0
Discipline	Other