pmoA gene reference database (fasta-formatted sequences and taxonomy)


This data set is part of the results associated with our manuscript on the pmoA gene, which encodes the alpha subunit of particulate methane monooxygenase. The taxonomy database consists of 7809 unaligned pmoA nucleotide sequences in fasta format and a corresponding taxonomy file, following the format specified by the software platforms Mothur and QIIME. The taxonomy file is a two-column text file in which the first column is the accession number of the sequence and the second column is a string of taxonomic information separated by semicolons. We created a comprehensive taxonomy database for the pmoA nucleotide sequences that can be targeted by the primer combination A189f and A682r. Sequences in this database were first retrieved from the NCBI database and then progressively screened with Biopython or R scripts. The corresponding taxonomy generally follows the NCBI taxonomy where explicit taxonomic ranks from phylum to species are available. For sequences with ambiguous taxonomies in the NCBI database, the taxonomic classification was improved as far as possible by referring to Dumont’s database (Frontiers in Microbiology, 2014, 5: 34. doi: 10.3389/fmicb.2014.00034).
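
The file layout described above can be illustrated with a minimal Python sketch that reads a two-column taxonomy file and a fasta file and checks that every sequence has a taxonomy entry. This is not the authors' screening code. The function names, the accession numbers `ACC0001` and `ACC0002`, and the short sequences are made up for illustration, and the sketch assumes the two columns are separated by whitespace (typically a tab).

```python
import io


def parse_taxonomy(handle):
    """Map each accession to its list of taxonomic ranks.

    Assumes one record per line: accession, whitespace, then a
    semicolon-separated lineage (trailing semicolon allowed).
    """
    taxonomy = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        parts = line.split(None, 1)  # split on the first whitespace run only
        lineage = parts[1] if len(parts) > 1 else ""
        taxonomy[parts[0]] = [r.strip() for r in lineage.split(";") if r.strip()]
    return taxonomy


def parse_fasta(handle):
    """Map each accession (first word of the header) to its sequence."""
    chunks = {}
    accession = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            accession = header[0] if header else ""
            chunks[accession] = []
        elif accession is not None:
            chunks[accession].append(line)
    return {acc: "".join(parts) for acc, parts in chunks.items()}


# Hypothetical records, not taken from the data set.
example_taxonomy = io.StringIO(
    "ACC0001\tBacteria;Proteobacteria;Gammaproteobacteria;\n"
    "ACC0002\tBacteria;Proteobacteria;Alphaproteobacteria;\n"
)
example_fasta = io.StringIO(">ACC0001\nATGGCT\nGGTACC\n>ACC0002\nATGCCA\n")

taxonomy = parse_taxonomy(example_taxonomy)
sequences = parse_fasta(example_fasta)

# Accessions present in the fasta file but absent from the taxonomy file.
missing = sorted(set(sequences) - set(taxonomy))
print(len(sequences), "sequences;", len(missing), "without taxonomy")
```

With the real files, the `io.StringIO` objects would be replaced by opened file handles. Splitting only on the first whitespace run keeps any spaces inside taxon names intact.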

Related Identifier
Metadata Access
Creator Yang, Sizhong; Wen, Xi; Liebner, Susanne
Publisher GFZ Data Services
Contributor Yang, Sizhong; Wen, Xi; Liebner, Susanne
Publication Year 2016
Rights CC BY 4.0
OpenAccess true
Contact Yang, Sizhong (GFZ German Research Center for Geosciences)
Language English
Resource Type Dataset
Format application/octet-stream; application/pdf
Size 7264805 Bytes; 5 Files
Discipline Geosciences
Temporal Coverage /2