Bioinformatic Tools

Platform specializing in diatom taxonomy and species statistics. Includes genus taxonomic names and classifications, environment, distribution, specimen type, images, and links to resources and references.

[Screenshot: DiatomBase]

A comprehensive database of Phaeodactylum tricornutum metabolism including metabolic pathways, reactions, compounds, proteins and genes.

[Screenshot: DiatomCyc]

Largest database, consisting of 90,000 expressed sequence tags (ESTs) from Phaeodactylum tricornutum grown under various conditions. Also includes Thalassiosira pseudonana.

[Screenshot: Diatom EST Database]

An aggregated knowledge base about life on Earth. Includes attributes such as geographic distribution, cell characteristics, and roles and relationships with other species.

[Screenshot: EOL]

The EnsemblProtists page for Thalassiosira pseudonana includes links to the genome sequence, gene annotation, comparative genomics resources and other data.

[Screenshot: Ensembl Protists]

The KEGG page for Thalassiosira pseudonana includes links for the genome sequence, metabolic pathway maps, modules, brite hierarchy and more.

[Screenshot: KEGG]

The UniProt page for Thalassiosira pseudonana includes links for the reference proteome, protein sequences, taxonomy, FASTA files with one protein sequence per gene, and the genome assembly.

[Screenshot: UniProt]

Codon usage table for Thalassiosira pseudonana.
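
For readers unfamiliar with how such a table is built, the following is a minimal sketch of counting codon frequencies (per thousand codons) from an in-frame coding sequence. The function name `codon_usage` and the example sequence are hypothetical and purely illustrative; the sequence is not a real T. pseudonana gene, and published tables are computed over many annotated coding sequences rather than a single one.

```python
from collections import Counter


def codon_usage(cds):
    """Return codon frequencies per thousand codons for an in-frame coding sequence."""
    # Normalise case and treat RNA input (U) as DNA (T).
    seq = cds.upper().replace("U", "T")
    # Ignore a trailing partial codon, if any.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Skip codons that contain ambiguous bases such as N.
    codons = [c for c in codons if set(c) <= set("ACGT")]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in sorted(counts.items())}


# Made-up 5-codon sequence (ATG GCT GCT AAG TAA), for illustration only.
example_cds = "ATGGCTGCTAAGTAA"
usage = codon_usage(example_cds)
for codon, per_thousand in usage.items():
    print(codon, round(per_thousand, 1))
```

When designing constructs for expression in a diatom, a table like this is typically used to prefer codons that the host uses frequently.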

Other helpful resources and publications:

Genome Annotation of a Model Diatom Phaeodactylum tricornutum Using an Integrated Proteogenomic Pipeline. (2018) https://doi.org/10.1016/j.molp.2018.08.005

  • Developed an integrated proteogenomic pipeline and applied it to improve annotation of the P. tricornutum genome using mass spectrometry-based proteomics data. The pipeline is available for use with any sequenced eukaryote

A Phaeodactylum tricornutum literature database for interactive annotation of content. (2016) https://doi.org/10.1016/j.algal.2016.06.020

  • Bibliome of P. tricornutum literature in a single database, in HTML format

Flux balance analysis of primary metabolism in the diatom Phaeodactylum tricornutum. (2016) https://doi.org/10.1111/tpj.13081

  • A metabolic network constructed for P. tricornutum from the genome, biochemical literature, and online bioinformatic databases. It can be used to explore different knockout conditions in silico
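
As background, flux balance analysis is usually posed as a linear program. The formulation below is the generic textbook form, not the specific model from the paper; the symbols (stoichiometric matrix S, flux vector v, objective weights c, and bounds) are assumptions introduced here for illustration.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

In this form, an in silico knockout of reaction k amounts to setting both of its bounds to zero, so that v_k = 0, and then re-solving the program.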

Genome-Scale Model Reveals Metabolic Basis of Biomass Partitioning in a Model Diatom. (2016) https://doi.org/10.1371/journal.pone.0155038

Pan-transcriptomic analysis identifies coordinated and orthologous functional modules in the diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum. (2016) https://doi.org/10.1016/j.margen.2015.10.011

  • Integrated analysis of expression patterns, gene functions, and cis-regulatory DNA sequence motifs shows coordination of transcriptional responses in diatoms across changing environmental conditions. Resource available at http://networks.systemsbiology.net/diatom-portal

Transcription factors in microalgae: genome-wide prediction and comparative analysis. (2016) https://doi.org/10.1186/s12864-016-2610-9

  • Pipeline combining BLAST, HMMER and InterProScan software to identify and classify transcription factors in algae including P. tricornutum

Plastid proteome prediction for diatoms and other algae with secondary plastids of the red lineage. (2015) https://doi.org/10.1111/tpj.12734

  • Created ASAFind, a customized prediction tool that identifies nuclear-encoded plastid proteins in algae with secondary plastids of the red lineage, based on the output of SignalP and the identification of conserved "ASAFAP" motifs and transit peptides. The tool was tested and subcellular localization was verified. Works for T. pseudonana and P. tricornutum

Tracking the sterol biosynthesis pathway of the diatom Phaeodactylum tricornutum. (2014) https://doi.org/10.1111/nph.12917

  • Using DiatomCyc, reconstructed mevalonate and sterol biosynthetic pathways for P. tricornutum in silico and experimentally verified the pathways using enzyme inhibitors, gene silencing and heterologous gene expression approaches

A Linear Programming Approach for Modeling and Simulation of Growth and Lipid Accumulation of Phaeodactylum tricornutum. (2013) https://doi.org/10.3390/en6105333

  • Mathematical model for simulating growth and storage-molecule accumulation in P. tricornutum

Identification and bioinformatics analysis of pseudogenes from whole genome sequence of Phaeodactylum tricornutum. (2013) https://doi.org/10.1007/s11434-012-5174-3

  • Pipeline for identification of pseudogenes in P. tricornutum

PlantRNA, a database for tRNAs of photosynthetic eukaryotes. (2012) https://doi.org/10.1093/nar/gks935

  • Database of tRNA gene sequences from nuclear, plastidial and mitochondrial genomes, with biological information relevant to the function of each tRNA. Covers 11 organisms, including P. tricornutum. http://plantrna.ibmp.cnrs.fr/

Genetic engineering of fatty acid chain length in Phaeodactylum tricornutum. (2011) https://doi.org/10.1016/j.ymben.2010.10.003

Computational prediction of microRNAs and their targets from three unicellular algae species with complete genome sequences. (2011) https://doi.org/10.1139/W11-102

  • A pipeline rather than a standalone tool: computational prediction of microRNAs in P. tricornutum and T. pseudonana by homology search against genomic sequences

MIReNA: finding microRNAs with high accuracy and no learning at genome scale and from deep sequencing data. (2010) https://doi.org/10.1093/bioinformatics/btq329

  • Created MIReNA, a genome-wide search algorithm for finding miRNA sequences, which has been applied to P. tricornutum

Comparative genomics of the pennate diatom Phaeodactylum tricornutum. (2005) https://doi.org/10.1104/pp.104.052829

  • Comparative EST analysis of P. tricornutum and other algae based on the GenBank nonredundant protein database, the COG profile database, the Pfam protein domains database, and the newly created diatom EST database

Diatomics: Toward diatom functional genomics. (2005) https://doi.org/10.1166/jnn.2005.003

  • First large-scale analysis of 12,000 ESTs, compiled into a queryable database known as the Phaeodactylum tricornutum database (PtDB)