I have mostly worked within the field of Computational Evolutionary Genetics.
I use Python to adapt the input and output of existing tools and to develop standalone programs where needed. I combine my own scripts with existing programs in Bash to form streamlined pipelines that run on clusters via Sun Grid Engine. Pipelines and their output are accompanied by statistics, manuals, reports and workflows generated with R, Inkscape and LaTeX.
- Insect Genomics
- Transposable Element Dynamics
- Evolutionary Genetics
- Artificial Intelligence
- Master's thesis: Comparative Genomics within Termites and Cockroaches.
My Master's thesis focussed on comparative genomics within termites and cockroaches. I developed a pipeline for transposable element prediction and repeat landscape creation, combining multiple approaches (Tandem Repeats Finder, LTRharvest, RECON, RepeatScout, TransposonPSI, RepeatMasker, RepeatLandscape). I analysed the results for the importance of repeats in the evolution of sociality.
To generate viable non-social outgroup genomes for comparison with social termite genomes, I worked on assembling and polishing cockroach genomes. For Diploptera punctata, I ran quality checks (k-mer analysis, FastQC) and trimming (Skewer, Trimmomatic, NextClip) before assembling the genome (ALLPATHS-LG). I also worked on verifying and improving the gene annotation of Blattella germanica, whose gene count was highly inflated. Using domain analysis, I found contamination in the annotation. I removed bacterial sequences via similarity searches (Blastn against the nr database) and developed an approach that uses sequence repetitiveness to separate eukaryotic from prokaryotic sequences. I am currently writing a machine-learning-based script that automatically determines sequence origin from repetitiveness. To reduce the gene count further, I developed scripts that revealed signs of split gene models in the annotation (placement on contigs, lengths compared to orthologs) and used RNA-seq data to join contigs, scaffolds and genes (BWA, AGOUTI). The quality of the assembly was checked after joining scaffolds (BUSCO, DOGMA, QUAST). A new gene annotation was created (MAKER), guided by a transcriptome assembled de novo (Trinity).
- Bachelor's thesis: Characteristics of De Novo Genes
Genes are considered de novo if they have evolved from previously intergenic regions, which leaves them without homology to other sequences. I developed an analysis pipeline focussing mainly on order and disorder prediction (IUPred, Seg-HCA, RNAfold, WoLF PSORT, PfamScan, SignalP, Garnier, Cusp, RepeatMasker). I analysed the results with respect to the potential role of disordered de novo genes as network hubs, which could quickly make them essential.
- Promos project: Ancestral Sequence Reconstruction within the Nitroreductase Superfamily
Within the nitroreductase superfamily there is a switch from cofactor utilisation to cofactor cannibalisation. To explore this switch, I generated an ancestral sequence that could be screened for function. The dataset was chosen using sequence similarity networks (Cytoscape), aiming to find the best representatives for all the different functional sites (HMMER). The sequences were aligned (MAFFT, T-Coffee) and the alignment was trimmed. I chose substitution models (ProtTest), generated a tree using a Bayesian approach (MrBayes) and reconstructed ancestors (PAML). The ancestral sequences were manually curated according to their amino acid properties, revealing three "lid"-like structures responsible for the switch to cannibalisation of cofactors.
- Hemimetabolous genomes reveal molecular basis of termite eusociality
Nature Ecology and Evolution
- Comparative genomic approaches to investigate molecular traits specific to social insects
Current Opinion in Insect Science