Raw data from external antibody databases and scripts to homogenize and standardize them used to build AntiBody Sequence Database (for reproducibility)

Reproducibility data for the AntiBody Sequence Database (ABSD) article. This dataset contains the raw data (antibody sequences) extracted on June 20, 2024, from various external databases, together with the scripts used to process them, to ensure the reproducibility of our results.

External databases used: ABDB, AbPDB, CoV-AbDab, Genbank, IMGT, PDB, SACS, SAbDab, TheraSAbDab, UniProt, KABAT

Script usage: each external database has a corresponding script that formats all antibody sequences extracted from it. A final script merges all extracted antibody sequences, removing redundancy and standardizing and cleaning the data.
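
The merge step described above can be illustrated with a minimal sketch. This is not the code shipped in the dataset: the function names (`clean_sequence`, `merge_sources`), the record layout, and the cleaning rule (keep only the 20 standard amino-acid letters) are assumptions chosen for illustration.

```python
# Hypothetical sketch of a merge-and-deduplicate step; not the dataset's actual script.

# Assumption: only the 20 standard amino-acid letters are accepted.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw):
    """Remove whitespace and uppercase; return None if the result is empty
    or contains anything other than standard residues."""
    seq = "".join(raw.split()).upper()
    if not seq or not set(seq) <= VALID_RESIDUES:
        return None
    return seq


def merge_sources(sources):
    """Merge per-database records into one non-redundant mapping.

    sources: dict mapping a database name to an iterable of
             (identifier, raw_sequence) pairs (hypothetical layout).
    Returns a dict mapping each cleaned sequence to a sorted list of
    "database:identifier" tags recording where it was seen.
    """
    merged = {}
    for db_name, records in sources.items():
        for identifier, raw in records:
            seq = clean_sequence(raw)
            if seq is None:
                continue  # drop records that fail cleaning
            merged.setdefault(seq, set()).add(f"{db_name}:{identifier}")
    return {seq: sorted(tags) for seq, tags in merged.items()}


if __name__ == "__main__":
    example = {
        "SAbDab": [("1abc_H", "evqlv esgg")],
        "PDB": [("1ABC", "EVQLVESGG"), ("2XYZ", "EVQ?X")],
    }
    print(merge_sources(example))
```

Keying the merged result on the cleaned sequence means identical antibodies reported by several databases collapse into a single entry while the list of tags preserves their provenance.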

Identifier
DOI https://doi.org/10.57745/DDLHWU
Metadata Access https://entrepot.recherche.data.gouv.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.57745/DDLHWU
Provenance
Creator MAILLET, Nicolas; MALESYS, Simon
Publisher Recherche Data Gouv
Contributor Collection administrator; MAILLET, Nicolas; MALESYS, Simon; Institut Pasteur; Entrepôt-Catalogue Recherche Data Gouv
Publication Year 2024
Rights info:eu-repo/semantics/openAccess
OpenAccess true
Contact Collection administrator (Institut Pasteur); MAILLET, Nicolas (Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015 Paris, France); MALESYS, Simon (Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, F-75015 Paris, France)
Representation
Resource Type Dataset
Format application/x-gzip; text/markdown
Size 65497009; 80726198; 620431; 6833391387; 12475; 163643
Version 3.0
Discipline Computer Science; Life Sciences; Biology; Medicine