15:10
|
Bc.
Martin
Engst
|
B1
|
prof. Mgr. Daniel Svozil, Ph.D.
|
Gathering the knowledge of terpene biosynthesis
|
About 60% of all known natural products are terpenoids; their precursors are called terpenes. Although terpenes are numerous, they are all made from a small set of very simple substrates, each composed of connected isoprene units. The enzymes that catalyze these reactions, terpene synthases, must employ very clever biochemistry, which makes them interesting subjects for studying the intricacies of enzymatic catalysis. We have built a comprehensive dataset of known mechanisms of terpene biosynthesis, which we are using to assess the feasibility of applying machine learning approaches such as large language models to predict features and properties of terpene synthases.
|
15:25
|
Bc.
Jozef
Fülöp
|
M2
|
prof. Mgr. Daniel Svozil, Ph.D.
|
A comprehensive analysis of RNA binders
|
The field of RNA-targeted therapeutics is growing quickly and holds great promise for treating many diseases. In my diploma thesis, I analyzed large amounts of data from sources such as the Enamine Hit Locator and other RNA libraries to find molecules that can modulate RNA behavior. I focused on the chemical properties and distributions of RNA-binding molecules, using the quantitative estimate of drug-likeness (QED), dimensionality reduction, and scaffold analysis to build a comprehensive profile of RNA binders. I also developed a machine learning model based on ECFP6 fingerprints that helps find promising RNA-binding molecules much faster. This study provides valuable insights for developing RNA-targeted therapies and advances the field of medicinal chemistry.
|
15:40
|
Bc.
Adam
Hanzlík
|
M1
|
Ing. Petr Čech, Ph.D.
|
Applications for Optical Chemical Structure Recognition
|
Chemical literature often presents information in raster images of chemical structures. These are interpretable by humans but are not suitable for computer tasks such as storage and querying. Optical Chemical Structure Recognition (OCSR) tools bridge this gap by converting images into machine-readable formats like SMILES and MOL files, facilitating the extraction of chemical knowledge, most commonly to be stored in large optimized databases.
The first OCSR tools, such as the open-source OSRA, MolVec, and Imago, relied on chemical drawing rules. Advances in deep learning have since led to machine-learning-based tools such as DECIMER, MolScribe, and MolGrapher. This work benchmarks these tools on a dataset, evaluating their performance under varying degrees of graphical damage.
The benchmarking indicates that each tool is weakest on a different type of damage. As such, a composite approach that employs multiple OCSR tools in tandem with compound validation checks is appropriate when seeking to maximize successful recognition rates. This strategy has been successfully employed in a real data extraction project.
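The composite strategy described above can be illustrated with a minimal Python sketch. It is not the implementation used in the project: the `Recognizer` interface, the `looks_valid` check, and the majority vote are all assumptions made for illustration. Real tools would need their own wrappers, and a real validation step would parse and canonicalize each structure with a cheminformatics toolkit so that equivalent SMILES strings are counted together.

```python
from collections import Counter
from typing import Callable, Iterable, Optional

# Hypothetical interface: a recognizer maps an image path to a SMILES string,
# or returns None (or raises) when recognition fails.
Recognizer = Callable[[str], Optional[str]]


def looks_valid(smiles: Optional[str]) -> bool:
    """Placeholder validation: non-empty string with balanced brackets.

    This only stands in for a real compound validation check.
    """
    if not smiles:
        return False
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and smiles.count("[") == smiles.count("]")


def recognize_composite(
    image_path: str,
    recognizers: Iterable[Recognizer],
    validate: Callable[[Optional[str]], bool] = looks_valid,
) -> Optional[str]:
    """Run every tool, drop invalid outputs, return the most frequent result."""
    votes: Counter = Counter()
    for tool in recognizers:
        try:
            result = tool(image_path)
        except Exception:
            continue  # a crashing tool simply contributes no vote
        if validate(result):
            votes[result] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

Because each tool fails on different kinds of damage, a result that survives validation and is agreed on by several tools is more likely to be correct than the output of any single tool taken alone.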
|
15:55
|
Oleksandra
Shumilina
|
B4
|
doc. Ing. Filip Lankaš, Ph.D.
|
Sequence-dependent structural dynamics of DNA containing radiation damage
|
Cyclobutane pyrimidine dimers, formed by photo-induced reactions, are highly mutagenic and carcinogenic DNA lesions. Plants repair this damage with photolyases, whereas placental mammals use the nucleotide excision repair pathway. However, exactly how the repair enzymes recognize a damaged site in DNA is not fully understood. Molecular dynamics simulations are a very powerful tool for studying macromolecular dynamics. They can provide information about structural changes in a DNA sequence and about its stability. Analyzing the simulation results may reveal patterns in damaged DNA, leading to a better understanding of recognition mechanisms and possibly even to the design of treatments.
|
16:10
|
Pavlína
Slavníková
|
M1
|
doc. Ing. Filip Lankaš, Ph.D.
|
Modelling sequence-dependent structure and deformability of DNA
|
The sequence of the DNA double helix plays a significant role in determining its three-dimensional structure and mechanical flexibility. Understanding this dependence is essential for explaining the shape-specific recognition of DNA's structural motifs by many proteins and small ligands, as it affects their binding affinity and provides a way to estimate the specificity of transcription factors, nucleosome core histone proteins, and other regulatory molecules involved in gene expression.
The most efficient way to explore this matter in greater detail is probably through computer simulations. Atomic-resolution molecular dynamics (MD) simulations are employed to measure the deviations of a DNA molecule from the ideal B-DNA form and to model its mechanical properties. To this end, DNA is described in a coarse-grained manner as an ensemble of interacting rigid bodies representing individual bases. The sequence-dependent structure and deformability are then deduced from statistical properties of internal coordinates describing the relative displacement and rotation of the bases. I plan to extend this standard model by adding the phosphate group to the description as another rigid body, so that DNA sequence-specific structure and deformability may be studied at a finer level.
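One common way to deduce deformability from coordinate statistics is the harmonic approximation, in which the stiffness matrix is the inverse covariance matrix of the internal coordinates scaled by the thermal energy, K = k_B T C^{-1}. The abstract does not state which estimator is used, so the Python sketch below is only an illustration under that assumption. The two-coordinate restriction, the function names, and the value of `KT` (roughly k_B T at 298 K in kcal/mol) are illustrative choices.

```python
KT = 0.593  # kcal/mol, approximately k_B * T at 298 K (assumed temperature)


def covariance(a, b):
    """Sample covariance of two equally long coordinate time series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def stiffness_2x2(q1, q2, kT=KT):
    """Harmonic stiffness matrix K = kT * C^-1 for two internal coordinates.

    q1 and q2 are time series of two internal coordinates (for example two
    rotational parameters) sampled from an MD trajectory.
    """
    c11 = covariance(q1, q1)
    c12 = covariance(q1, q2)
    c22 = covariance(q2, q2)
    det = c11 * c22 - c12 * c12
    if det <= 0.0:
        raise ValueError("covariance matrix is singular; need more varied samples")
    # Closed-form inverse of a symmetric 2x2 matrix, scaled by kT.
    return [
        [kT * c22 / det, -kT * c12 / det],
        [-kT * c12 / det, kT * c11 / det],
    ]
```

The equilibrium structure corresponds to the coordinate means, and large diagonal entries of K indicate coordinates that are stiff, that is, costly to deform. Adding a further rigid body such as the phosphate group would enlarge the coordinate vector and therefore the covariance and stiffness matrices.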
|
16:25
|
Bc.
Antonín
Zajíček
|
M1
|
Ing. Ivan Čmelo, Ph.D.
|
Stratified data selection using Kohonen maps
|
The splitting of data between a training and a testing set is one of the core tasks that
contribute to the overall quality of the validated machine learning model. There are numerous
approaches to conducting this split, all with their own strengths and weaknesses. This work
focuses on the development and testing of a new approach based on Kohonen self-organizing maps
for predicting biological activities of organic compounds from their structures (i.e., QSAR).
This approach was compared against a baseline method represented by a random data split between
the training and testing set.
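The general idea of a map-based stratified split can be sketched in plain Python as follows. This is a minimal illustration, not the method developed in this work: the map size, the linearly decaying learning rate and neighbourhood width, and the per-node sampling rule are all assumptions, and the function names are hypothetical. Each compound is represented by a numeric descriptor vector.

```python
import math
import random
from collections import defaultdict


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weights are closest to x."""
    return min(weights, key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular Kohonen map on a list of descriptor vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    # Initialise every node from a randomly chosen data point.
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood shrinks, with a floor
        for x in rng.sample(data, len(data)):    # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


def som_stratified_split(data, weights, test_fraction=0.2, seed=0):
    """Assign samples to map nodes, then draw the test set from every node."""
    rng = random.Random(seed)
    by_node = defaultdict(list)
    for idx, x in enumerate(data):
        by_node[best_matching_unit(weights, x)].append(idx)
    train, test = [], []
    for node in sorted(by_node):
        members = by_node[node]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)
```

Because the test samples are drawn node by node, every populated region of descriptor space contributes to both sets in roughly the requested proportion, whereas a purely random split can leave small clusters entirely in one set. Nodes holding very few samples may contribute no test samples in this sketch, so the overall test fraction is only approximate.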
|