µgreen database
Studying the ecology of communities of photosynthetic eukaryotic microalgae and prokaryotic cyanobacteria requires molecular tools to complement the traditional technique of morphological observation.
These tools rely on specific genetic markers and require specialised databases to achieve taxonomic assignment.
Here, we set up a reference database, called µgreen-db, for the plastidial 23S rRNA gene. Sequences were retrieved from generalist databases (NCBI, SILVA) and from the Comparative RNA Web (CRW) database, and complemented by a more original approach using recursive BLAST searches to maximise sequence recovery.
At present, µgreen-db includes 2,326 plastidial 23S rRNA sequences spanning four kingdoms (Eubacteria, Chromista, Protozoa and Plantae) and encompassing 442 unique genera and 736 species of eukaryotic algae, cyanobacteria and non-vascular land plants, based on the NCBI and AlgaeBase taxonomy. When the PR2/SILVA taxonomy is used instead, µgreen-db still contains 2,217 sequences (399 unique genera and 696 unique species). Using µgreen-db, we were able to assign, at the phylum level, 98.5% of the sequences of the V domain of the plastid 23S rRNA gene obtained by metabarcoding after amplification from soil-extracted DNA, thus highlighting the good coverage of the database.
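The recursive search idea mentioned above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `max_rounds` cap and the toy hit table are all hypothetical, and the `search` callable merely stands in for whatever similarity search (for example a BLAST run) returns the identifiers of sequences similar to a query.

```python
from typing import Callable, Iterable, Set


def recursive_search(
    seeds: Iterable[str],
    search: Callable[[str], Iterable[str]],
    max_rounds: int = 10,
) -> Set[str]:
    """Grow a set of sequence identifiers by re-querying with each new hit.

    `search` is a hypothetical stand-in for a similarity search: it takes
    one identifier and returns the identifiers of its hits.
    """
    found = set(seeds)
    frontier = set(found)  # identifiers not yet used as queries
    for _ in range(max_rounds):
        if not frontier:
            break  # no new sequences were recovered in the last round
        new_hits: Set[str] = set()
        for query in frontier:
            new_hits.update(search(query))
        frontier = new_hits - found  # keep only previously unseen hits
        found |= frontier
    return found


# Toy hit table (invented identifiers) used in place of a real search.
toy_hits = {"A": ["B"], "B": ["C"], "C": ["A"], "D": []}


def toy_search(query: str) -> Iterable[str]:
    return toy_hits.get(query, [])


result = recursive_search(["A"], toy_search)
print(sorted(result))
```

Each round uses only the newly recovered sequences as queries, so the loop stops as soon as a round brings nothing new or the round cap is reached.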
Download and availability
The first release of µgreen-db contains:
- 2,326 23S rDNA sequences for AlgaeBase and NCBI taxonomy
- 2,217 23S rDNA sequences for PR2/SILVA taxonomy
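As a quick sanity check after downloading a release, the number of sequences can be counted with a short Python sketch. This assumes the release is distributed as a FASTA file, which this page does not state; the function names and the example records below are hypothetical.

```python
import io
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle: TextIO) -> Dict[str, int]:
    """Count the records and report the shortest and longest sequence."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    return {
        "sequences": len(lengths),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
    }


# Invented two-record example; a real file would be opened with open(path).
example = io.StringIO(">seq1 hypothetical record\nACGT\nAC\n>seq2\nGGG\n")
summary = summarize(example)
print(summary)
```

Under that assumption, the file with the AlgaeBase and NCBI taxonomy would be expected to report 2,326 sequences, and the PR2/SILVA file 2,217.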
How to cite?
Christophe Djemiel, Damien Plassard, Sébastien Terrat, Olivier Crouzet, Joana Sauze, Samuel Mondy, Virginie Nowak, Lisa Wingate, Jérôme Ogée, Pierre-Alain Maron. µgreen-db: a reference database of the plastidial 23S rRNA gene of photosynthetic eukaryotic algae and cyanobacteria.
Manuscript submitted for publication to Scientific Reports (Sep. 2019)
Acknowledgments
This project has received funding from the Agence Nationale de la Recherche (ANR, ORCA project award no. ANR-13-BS06-0005-01).
This project has also received funding from the European Research Council (ERC) under the European Union’s Seventh Framework Programme (FP7/2007-2013) (grant agreement No. 338264).