Supplementary data for the vassi Python package

This data repository provides supplementary data for "vassi - verifiable, automated scoring of social interactions in animal groups". Documentation and example usage of the package are available online at https://vassi.readthedocs.io/en/latest/. The source code is under version control at https://github.com/pnuehrenberg/vassi/ and is also archived here (vassi_source.zip).

1. Social cichlids dataset

We tested our package on this novel dataset, which consists of nine video recordings of groups of cichlid fish (15 Neolamprologus multifasciatus per group). The dataset also contains individual trajectories for each fish (posture data and corresponding time stamps, stored in an HDF5 file that can be loaded in Python as numpy arrays) and behavioral annotations (CSV files, one behavioral event per row). Re-encoded video files (compressed using FFmpeg) are available in datasets/social_cichlids/videos.

All scripts and notebooks used to produce the results presented in the paper share the same configuration for feature extraction (examples/social_cichlids/features-cichlids.yaml). We provide intermediate results (examples/social_cichlids/results.h5 and examples/social_cichlids/k_fold_predictions.h5) for the examples/social_cichlids/results_and_figures-cichlids.ipynb notebook (available in the GitHub repository or in vassi_source.zip). This notebook reproduces the figures presented in our paper. We also provide the results of hyperparameter optimization with the optuna framework alongside these files (examples/social_cichlids/optimization/).

The results from k-fold prediction on the entire dataset (used to visualize the networks presented in the paper) are available in examples/social_cichlids/k_fold_predictions_predictions.csv, which can be loaded as a dataset when complemented with the trajectories file (see vassi_source.zip/examples/social_cichlids/results_and_figures-cichlids.ipynb for details).
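Because the behavioral annotations are stored as CSV with one behavioral event per row, a first sanity check is to count events per behavioral category. The following is a minimal sketch using only the Python standard library. The column names (actor, recipient, category, start, stop) and the example rows are assumptions made for illustration and are not taken from cichlids_annotations.csv; inspect the header of the actual file and adjust category_column accordingly.

```python
import csv
import io
from collections import Counter


def count_events(rows, category_column="category"):
    """Count behavioral events per category.

    `rows` is any iterable of dict-like rows (e.g. a csv.DictReader).
    `category_column` is a hypothetical column name; adapt it to the real header.
    """
    counts = Counter()
    for row in rows:
        counts[row[category_column]] += 1
    return counts


# Made-up rows standing in for the annotation file (assumed layout, not real data).
EXAMPLE_CSV = """actor,recipient,category,start,stop
fish_01,fish_07,chase,120,134
fish_03,fish_01,display,200,260
fish_07,fish_02,chase,310,322
"""

counts = count_events(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
```

To run this on the real annotations, open datasets/social_cichlids/cichlids_annotations.csv with open(path, newline="") and pass csv.DictReader(file) to count_events.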
Our paper also presents a comparison between model predictions (behavior counts) and association time as an alternative behavioral proxy for interactions. The raw data files and the corresponding R script are available at examples/social_cichlids/predictions_vs_association.

2. CALMS21 dataset

In addition, we tested our package on an existing benchmark dataset, the CALMS21 mouse resident-intruder dataset. For convenience, we provide Python scripts to download and convert the original dataset (vassi_source.zip/src/vassi/case_studies/calms21/download.py and vassi_source.zip/src/vassi/case_studies/calms21/convert.py; also available once vassi is installed, see the online documentation for details).

The original CALMS21 dataset can be downloaded here: https://data.caltech.edu/records/s0vdx-0k302

[Dataset] Jennifer J. Sun, Tomomi Karigo, David J. Anderson, Pietro Perona, Yisong Yue, & Ann Kennedy. (2021). Caltech Mouse Social Interactions (CalMS21) Dataset (1.0) [Data set]. CaltechDATA. https://doi.org/10.22002/D1.1991

[Paper] Sun JJ, Karigo T, Chakraborty D, Mohanty SP, Wild B, Sun Q, Chen C, Anderson DJ, Perona P, Yue Y, Kennedy A. The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions. Adv Neural Inf Process Syst. 2021 Dec;2021(DB1):1-15. PMID: 38706835; PMCID: PMC11067713.

All scripts and notebooks used to produce the results presented in the paper share the same configuration for feature extraction (examples/CALMS21/features-mice.yaml). As for the other example dataset, we provide intermediate results (examples/CALMS21/results.h5) for the vassi_source.zip/examples/CALMS21/results_and_figures-mice.ipynb notebook, which reproduces the figures presented in our paper. We also provide the results of hyperparameter optimization with the optuna framework alongside these files (examples/CALMS21/optimization/).
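The hyperparameter optimization results include a table of trials (optimization-trials.csv). As a rough sketch of how such a table could be inspected without additional dependencies, the snippet below picks the trial with the best objective value. The column names (number, value, state, params_n_estimators), the assumption that higher values are better, and the example rows are all hypothetical; check the header of the actual file before reusing this.

```python
import csv
import io


def best_trial(rows, value_column="value", maximize=True):
    """Return the row with the best objective value.

    Rows with an empty value (for example, trials that did not finish)
    are skipped. `value_column` is an assumed column name.
    """
    scored = []
    for row in rows:
        raw = (row.get(value_column) or "").strip()
        if not raw:
            continue
        scored.append((float(raw), row))
    if not scored:
        raise ValueError(f"no usable values in column {value_column!r}")
    pick = max if maximize else min
    return pick(scored, key=lambda pair: pair[0])[1]


# Made-up rows standing in for a trials table (assumed layout, not real results).
EXAMPLE_TRIALS = """number,value,params_n_estimators,state
0,0.61,100,COMPLETE
1,,250,PRUNED
2,0.74,400,COMPLETE
"""

trial_rows = list(csv.DictReader(io.StringIO(EXAMPLE_TRIALS)))
best = best_trial(trial_rows)
```

Whether the objective should be maximized or minimized depends on how the optimization was set up, so the maximize flag is left to the caller.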
Files

datasets/
└── social_cichlids/
    ├── videos/
    │   ├── GH010423.MP4
    │   ├── GH010861.MP4
    │   ├── GH013974.MP4
    │   ├── GH019910.MP4
    │   ├── GH030423.MP4
    │   ├── GH030451.MP4
    │   ├── GH030861.MP4
    │   ├── GH039910.MP4
    │   └── GH039931.MP4
    ├── cichlids_annotations.csv
    └── cichlids_trajectories.h5
examples/
├── CALMS21/
│   ├── optimization/
│   │   ├── optimization-results.yaml
│   │   ├── optimization-summary.yaml
│   │   └── optimization-trials.csv
│   ├── features-mice.yaml
│   └── results.h5
└── social_cichlids/
    ├── optimization/
    │   ├── optimization-results.yaml
    │   ├── optimization-summary.yaml
    │   └── optimization-trials.csv
    ├── predictions_vs_association/
    │   ├── aggregated_counts-1bl.csv
    │   ├── aggregated_counts-3bl.csv
    │   ├── aggregated_counts-5bl.csv
    │   └── predictions_vs_association.Rmd
    ├── features-mice.yaml
    ├── results.h5
    └── k_fold_predictions_predictions.csv
vassi_source.zip
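After downloading the repository, the file listing above can be compared against a local copy. The sketch below checks only the datasets/ part of the listing and reports which files are missing. The helper name missing_files and the idea of keeping the listing as a Python list are illustrative choices, not part of the vassi package.

```python
import tempfile
from pathlib import Path

# Video file stems as they appear in the file listing.
VIDEO_STEMS = [
    "GH010423", "GH010861", "GH013974", "GH019910", "GH030423",
    "GH030451", "GH030861", "GH039910", "GH039931",
]

# Relative paths of the datasets/ part of the listing.
EXPECTED_FILES = [
    f"datasets/social_cichlids/videos/{stem}.MP4" for stem in VIDEO_STEMS
] + [
    "datasets/social_cichlids/cichlids_annotations.csv",
    "datasets/social_cichlids/cichlids_trajectories.h5",
]


def missing_files(root, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


# Demonstration on an empty temporary directory: every listed file is reported.
with tempfile.TemporaryDirectory() as tmp:
    demo_missing = missing_files(tmp)
```

For a real check, call missing_files with the directory into which the repository was downloaded; an empty list means all listed dataset files are present.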

Identifier
DOI https://doi.org/10.17617/3.3R0QYI
Metadata Access https://edmond.mpg.de/api/datasets/export?exporter=dataverse_json&persistentId=doi:10.17617/3.3R0QYI
Provenance
Creator Nührenberg, Paul
Publisher Edmond
Publication Year 2025
OpenAccess true
Contact pnuehrenberg(at)ab.mpg.de
Representation
Language English
Resource Type Dataset
Version 1
Discipline Other