GraphML files for sequence networks of PETases and PURases

The GraphML files contain the sequence networks and annotated metadata for protein sequences.

Each edge carries the GraphML attribute "weight", which holds the pairwise sequence identity. Each node carries the identifiers from the ExED ("sequence_id", "protein_id", "hfam_id", and "sfam_id" for the sequence, protein, homologous family, and superfamily identifiers, respectively), the NCBI taxonomy ID ("tax_id"), the annotated name of the source organism ("tax_name"), the taxonomic lineage of the source organism ("lineage", with taxa separated by "<--"), and the length of the amino acid sequence ("sequence_length"). In addition, suggested color names are given for the fill color and the border color of each node ("color" and "color_border", respectively).
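As an illustration of how the attributes described above might be read, the following is a minimal sketch that uses only the Python standard library. The helper names read_graphml and split_lineage, the key ids (d0, d1, d2), and the toy graph in SAMPLE are assumptions made for this example and are not taken from the dataset. The scale of "weight" (fraction or percent) is not stated here, so the sketch only converts it to a float.

```python
import io
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def read_graphml(source):
    """Read a GraphML file (path or file-like object) into plain dicts.

    Returns (nodes, edges): nodes maps node id -> attribute dict,
    edges is a list of (source id, target id, attribute dict).
    Hypothetical helper, not part of the dataset.
    """
    root = ET.parse(source).getroot()
    # <key> elements map short ids (e.g. "d0") to names (e.g. "tax_id").
    names = {k.get("id"): k.get("attr.name") for k in root.iterfind(".//g:key", NS)}

    def attrs(elem):
        out = {}
        for d in elem.findall("g:data", NS):
            key = d.get("key")
            out[names.get(key) or key] = d.text or ""
        return out

    nodes = {n.get("id"): attrs(n) for n in root.iterfind(".//g:node", NS)}
    edges = [
        (e.get("source"), e.get("target"), attrs(e))
        for e in root.iterfind(".//g:edge", NS)
    ]
    return nodes, edges


def split_lineage(lineage):
    """Split a "lineage" string into taxa at the "<--" separator."""
    return [taxon.strip() for taxon in lineage.split("<--") if taxon.strip()]


# Invented toy graph with placeholder values; real files hold more attributes.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="tax_id" attr.type="string"/>
  <key id="d1" for="node" attr.name="lineage" attr.type="string"/>
  <key id="d2" for="edge" attr.name="weight" attr.type="double"/>
  <graph edgedefault="undirected">
    <node id="n0">
      <data key="d0">1</data>
      <data key="d1">taxon_a &lt;-- taxon_b &lt;-- taxon_c</data>
    </node>
    <node id="n1">
      <data key="d0">2</data>
      <data key="d1">taxon_d</data>
    </node>
    <edge source="n0" target="n1"><data key="d2">0.75</data></edge>
  </graph>
</graphml>"""

nodes, edges = read_graphml(io.StringIO(SAMPLE))
taxa = split_lineage(nodes["n0"]["lineage"])
weights = [float(attributes["weight"]) for _, _, attributes in edges]
```

To read one of the downloaded files instead of the toy graph, a file path can be passed to read_graphml in place of the io.StringIO object. Attribute values are returned as strings, so numeric fields such as "sequence_length" need an explicit conversion.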

Identifier
DOI https://doi.org/10.18419/darus-2054
Metadata Access https://darus.uni-stuttgart.de/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18419/darus-2054
Provenance
Creator Buchholz, Patrick C. F.
Publisher DaRUS
Contributor Pleiss, Jürgen
Publication Year 2021
Rights CC BY 4.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0
OpenAccess true
Contact Pleiss, Jürgen (Universität Stuttgart)
Representation
Resource Type Dataset
Format text/xml-graphml
Size 6035615; 570547; 16586
Version 1.0
Discipline Life Sciences; Medicine