publications
2023
- Unraveling cell differentiation mechanisms through topological exploration of single-cell developmental trajectories. Emanuel Flores-Bautista and Matt Thomson. bioRxiv, 2023
Understanding the circuits that control cell differentiation is a fundamental problem in developmental biology. Single-cell RNA sequencing has emerged as a powerful tool for investigating this problem. However, the reconstruction of developmental trajectories is based on the assumption that cell states traverse a tree-like structure, which may bias our understanding of critical developmental mechanisms. To address this limitation, we developed a framework, TopGen, that enables identifying topological signatures of functional biological circuits as persistent homology groups in transcriptome space. First, we show that TopGen can identify genetic drivers of topological structures in simulated datasets. We then applied our approach to more than ten single-cell developmental atlases and found that topological transcriptome spaces are predominantly path-connected and only sometimes simply connected. Finally, we applied TopGen to analyze gene expression patterns in topological loops representing stem-like, transdifferentiation, and convergent cell circuits, found in C. elegans, H. vulgaris, and N. vectensis, respectively. Our results show that some essential differentiation mechanisms use non-trivial topological motifs, and that these motifs can be conserved in a cell-type–specific manner. Thus, our approach to studying the topological properties of developmental transcriptome atlases opens new possibilities for understanding cell development and differentiation.
2020
- Deciphering the regulatory genome of Escherichia coli, one hundred promoters at a time. William T Ireland, Suzannah M Beeler, Emanuel Flores-Bautista, and 7 more authors. eLife, Sep 2020
Advances in DNA sequencing have revolutionized our ability to read genomes. However, even in the most well-studied of organisms, the bacterium Escherichia coli, for ≈65% of promoters we remain ignorant of their regulation. Until we crack this regulatory Rosetta Stone, efforts to read and write genomes will remain haphazard. We introduce a new method, Reg-Seq, that links massively parallel reporter assays with mass spectrometry to produce a base pair resolution dissection of more than a hundred E. coli promoters in 12 growth conditions. We demonstrate that the method recapitulates known regulatory information. Then, we examine regulatory architectures for more than 80 promoters which previously had no known regulatory information. In many cases, we also identify which transcription factors mediate their regulation. This method clears a path for highly multiplexed investigations of the regulatory genome of model organisms, with the potential of moving to an array of microbes of ecological and medical relevance.
2019
- Quantitative characterization of random partitioning in the evolution of plasmid-encoded traits. Andrew D. Halleran, Emanuel Flores-Bautista, and Richard M. Murray. bioRxiv, Sep 2019
Plasmids are found across bacteria, archaea, and eukaryotes and play an important role in evolution. Plasmids exist at different copy numbers, the number of copies of the plasmid per cell, ranging from a single plasmid per cell to hundreds of plasmids per cell. This feature of a copy number greater than one can lead to a population of plasmids within a single cell that are not identical clones of one another, but rather have individual mutations that make a given plasmid unique. During cell division, this population of plasmids is partitioned into the two daughter cells, resulting in a random distribution of different plasmid variants in each daughter. In this study, we use stochastic simulations to investigate how random plasmid partitioning compares to a perfect partitioning model. Our simulation results demonstrate that random plasmid partitioning accelerates mutant allele fixation when the allele is beneficial and the selection is in an additive or recessive regime where increasing the copy number of the beneficial allele results in additional benefit for the host. This effect does not depend on the size of the benefit conferred or the mutation rate, but is magnified by increasing plasmid copy number.
2018
- Functional Prediction of Hypothetical Transcription Factors of Escherichia coli K-12 Based on Expression Data. Emanuel Flores-Bautista, Carenne Ludeña Cronick, Anny Rodriguez Fersaca, and 2 more authors. Computational and Structural Biotechnology Journal, Sep 2018
The repertoire of 304 DNA-binding transcription factors (TFs) in Escherichia coli K-12 has been described recently, with 196 TFs experimentally characterized and 108 proteins predicted by sequence comparisons. Based on 303 expression profile patterns retrieved from the Colombos database, 12 clusters were identified, including hypothetical and experimentally characterized TFs, using a spectral clustering algorithm based on a 3NN graph built using 14 principal components that represent 65% of the variance of the expression data. In a subsequent step, clusters were characterized in terms of their associated overrepresented functions, based on KEGG, Supfam annotations, and Pfam assignments, among other functional categories, using an enrichment test, reinforcing the notion that the TFs within each identified cluster are functionally similar. Based on these data, we identified 12 clusters in which hypothetical and known TFs share similar regulatory and physiological functions, such as module associations of toxin-antitoxin (TA) systems with DNA repair mechanisms, amino acid biosynthesis, and carbon metabolism/transport, among others. This analysis has increased our knowledge about gene regulation in E. coli K-12 and can be further expanded to other organisms.