MEANtools: multi-omics integration towards metabolite anticipation and biosynthetic pathway prediction

During evolution, plants have developed the ability to produce a vast array of specialized metabolites, which play crucial roles in helping plants adapt to different environmental niches. However, their biosynthetic pathways remain largely elusive. In the past decades, increasing numbers of plant biosynthetic pathways have been elucidated based on approaches utilizing genomics, transcriptomics, and metabolomics. These efforts, however, typically adopt a target-based approach that requires prior knowledge. Here, we present MEANtools, a systematic and unsupervised computational integrative omics workflow to predict candidate metabolic pathways de novo by leveraging knowledge of general reaction rules and metabolic structures stored in public databases. In our approach, possible connections between metabolites and transcripts that show correlated abundance across samples are identified using reaction rules linked to the transcript-encoded enzyme families. MEANtools thus assesses whether these reactions can connect transcript-correlated mass features within a candidate metabolic pathway. We validate MEANtools using a paired transcriptomic-metabolomic dataset recently generated to reconstruct the falcarindiol biosynthetic pathway in tomato. MEANtools correctly anticipated five out of seven steps of the characterized pathway and also identified other candidate pathways involved in specialized metabolism, which demonstrates its potential for hypothesis generation. Altogether, MEANtools represents a significant advance in integrating multi-omics data for the elucidation of biochemical pathways in plants and beyond.
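
The correlation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes Pearson correlation and a fixed cutoff; neither the statistic nor the threshold MEANtools actually uses is stated in this record, and the names `pearson`, `correlated_pairs`, `geneA`, and `mz_1` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length abundance profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 samples)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile carries no correlation signal
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_pairs(transcripts, features, threshold=0.9):
    """Return (transcript_id, feature_id, r) for pairs with r >= threshold.

    Both inputs map an identifier to a list of abundances measured across
    the same samples, in the same order. The threshold is an assumed value.
    """
    pairs = []
    for t_id, t_vals in transcripts.items():
        for f_id, f_vals in features.items():
            r = pearson(t_vals, f_vals)
            if r >= threshold:
                pairs.append((t_id, f_id, r))
    return pairs
```

Pairs that pass such a filter would then be candidates for the reaction-rule check described in the abstract; that later step is not sketched here.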

meantools, 1.0.0

Contains data for the meantools repository https://github.com/kumarsaurabh20/meantools

Data:

  1. LOTUS database in SQLite format
  2. RetroRules-formatted data (loose, medium, and strict datasets)
  3. EC-to-Pfam ID map

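
Because the table layout of the SQLite file is not described in this record, a reasonable first step after downloading it is to list its tables. The sketch below uses only Python's standard `sqlite3` module; the function name `list_tables` and the file name in the usage comment are hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in a SQLite file, sorted alphabetically."""
    # Open read-only through a file: URI so that a mistyped path raises an
    # error instead of silently creating a new, empty database file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Usage (file name is an assumption; use the name of the downloaded file):
# print(list_tables("lotus.sqlite"))
```
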
Identifier
DOI https://doi.org/10.34894/2MVBGK
Related Identifier IsCitedBy https://doi.org/10.1039/D2NP00032F
Related Identifier IsCitedBy https://doi.org/10.1016/j.pbi.2024.102657
Metadata Access https://dataverse.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.34894/2MVBGK
Provenance
Creator Singh, Kumar Saurabh
Publisher DataverseNL
Contributor Singh, Kumar Saurabh; UB Dataverse support
Publication Year 2025
Funding Reference NWO OCENW.GROOT.2019.063
Rights CC-BY-4.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0
OpenAccess true
Contact Singh, Kumar Saurabh (maastrichtuniversity.nl); UB Dataverse support (maastrichtuniversity.nl)
Representation
Resource Type SQLite database; Dataset
Format text/csv; application/octet-stream
Size 239905790; 809150; 260972544
Version 1.0
Discipline Basic Biological and Medical Research; Biology; Life Sciences; Medicine; Omics
Spatial Coverage Wageningen