Symposium of Polish Bioinformatics Society 2022 — Invited speakers

Pawel Teisseyre (14.09.2022)

Title: Binary classification under partial observability of class variable

Binary classification is a central problem in machine learning and appears frequently in bioinformatics tasks. Dozens of algorithms have been designed in recent years, but most of them assume full observability of the target variable, i.e. that each observation is assigned a label (positive or negative). In my presentation, I will focus on the problem of positive-unlabeled (PU) learning, which has attracted significant attention in the machine learning community. In the PU scenario, only a portion of the positive observations are assigned labels, whereas the remaining observations are unlabeled (they can be either positive or negative). For example, in medical databases, some patients have been diagnosed with a disease (positive observations), whereas the remaining patients have not been diagnosed (unlabeled observations). The absence of a diagnosis does not mean that the patient does not have the disease in question. It turns out that building classification models based on PU data raises many difficulties. I will introduce the basic concepts related to PU learning and discuss the main challenges. Finally, a novel method for joint estimation of the posterior probability and the propensity score function will be presented.
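
The PU labeling mechanism described above can be illustrated with a small simulation. This is only a sketch under a simplifying assumption that is not stated in the abstract: every positive observation is labeled with the same constant probability (a constant propensity score). The function names, the class prior of 0.3, and the propensity of 0.4 are hypothetical choices for illustration; this is not the joint estimation method announced in the talk.

```python
import random


def simulate_pu(n, class_prior, propensity, seed=0):
    """Simulate PU data: return true classes y and observed label indicators s.

    Assumption (illustrative only): each positive is labeled independently
    with the same probability `propensity`; negatives are never labeled.
    """
    rng = random.Random(seed)
    y, s = [], []
    for _ in range(n):
        yi = 1 if rng.random() < class_prior else 0
        # Only true positives can receive a label.
        si = 1 if (yi == 1 and rng.random() < propensity) else 0
        y.append(yi)
        s.append(si)
    return y, s


def summarize(y, s):
    """Return (true positive rate, labeled rate, number of hidden positives)."""
    n = len(y)
    true_rate = sum(y) / n
    labeled_rate = sum(s) / n
    hidden = sum(1 for yi, si in zip(y, s) if yi == 1 and si == 0)
    return true_rate, labeled_rate, hidden


y, s = simulate_pu(n=20000, class_prior=0.3, propensity=0.4, seed=0)
true_rate, labeled_rate, hidden = summarize(y, s)
print(f"true positive rate:        {true_rate:.3f}")
print(f"labeled (observed) rate:   {labeled_rate:.3f}")
print(f"positives hidden as unlabeled: {hidden}")
```

Under this assumption the labeled rate is close to the product of the class prior and the propensity, so a classifier that treats every unlabeled observation as negative would systematically underestimate the probability of the positive class.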

Paweł Krupa (15.09.2022)

Title: Prediction of structure and dynamics of biomacromolecules

Molecular dynamics (MD) in all-atom and coarse-grained force fields is the most popular computational method used to study the structure and dynamics of various biomacromolecules: proteins, peptides, nucleic acids, sugars, lipids, and their interactions with other compounds. Although results obtained by MD have to be validated by comparison with available data, especially experimental data, MD provides valuable information when other methods cannot be effectively used. One example is the computational study of intrinsically disordered peptides and proteins (IDPs), such as monomeric amyloid beta: experimental methods cannot be used effectively, mostly because of the transient nature of the peptide, whereas computational simulations are an invaluable source of information about its structure and dynamics. MD simulations can also provide information about the role of disulfide bonds in protein stability, the effects of substitutions of amino-acid residues (mutations), the binding of possible drug candidates, and potential nanotoxicity. The presentation will focus on possible applications and limitations of all-atom and coarse-grained force fields in studies of various biomacromolecules and their interactions with other compounds, based on some recent works [1-8].
This work was partially supported by the National Science Center (Poland) Sonata 2019/35/D/ST4/03156.

References
1 He Y, Mozolewska MA, Krupa P, et al (2013) Lessons from application of the UNRES force field to predictions of structures of CASP10 targets. Proc Natl Acad Sci U S A 110:14936–14941.
2 Krupa P, Mozolewska MA, Wiśniewska M, et al (2016) Performance of protein-structure predictions with the physics-based UNRES force field in CASP11. Bioinformatics 32:3270–3278.
3 Ahmed L, Rasulev B, Kar S, et al (2017) Inhibitors or toxins? Large library target-specific screening of fullerene-based nanoparticles for drug design purpose. Nanoscale 9:10263–10276.
4 Krupa P, Sieradzan AK, Mozolewska MA, et al (2017) Dynamics of disulfide-bond disruption and formation in the thermal unfolding of ribonuclease A. J Chem Theory Comput 13:5721–5730.
5 Krupa P, Quoc Huy PD, Li MS (2019) Properties of monomeric Aβ42 probed by different sampling methods and force fields: Role of energy components. J Chem Phys 151:055101.
6 Nguyen HL, Krupa P, Hai NM, et al (2019) Structure and Physicochemical Properties of the Aβ42 Tetramer: Multiscale Molecular Dynamics Simulations. J Phys Chem B 123:7253–7269.
7 Krupa P, Karczyńska AS, Mozolewska MA, et al (2021) UNRES-Dock - protein-protein and peptide-protein docking by coarse-grained replica-exchange MD simulations. Bioinformatics 37:1613–1615.
8 Marasco D, Vicidomini C, Krupa P, et al (2021) Plant isoquinoline alkaloids as potential neurodrugs: A comparative study of the effects of benzo[c]phenanthridine and berberine-based compounds on β-amyloid aggregation. Chem Biol Interact 334:109300.

Janusz Bujnicki (16.09.2022)

Title: Computational modeling of RNA 3D structures and interactions with the use of experimental data

Ribonucleic acid (RNA) molecules are the master regulators of cells. They are involved in many molecular processes: they transmit genetic information, sense cellular signals and transmit responses, and even catalyze chemical reactions. The function of RNA, and in particular its ability to interact with other molecules, is encoded in its sequence. Understanding how these molecules perform their biological tasks requires detailed knowledge of RNA structure and dynamics, as well as thermodynamics, which largely determines how RNA folds and interacts in the cellular environment.
Experimental determination of these properties is difficult, and several computational methods have been developed to model the folding of RNA 3D structures and their interactions with other molecules, especially proteins. However, computational methods are reaching their limits, especially when the biological implications require calculation of dynamics beyond a few hundred nanoseconds. For the researcher faced with such challenges, a better approach is to resort to coarse-grained modeling to reduce the number of data points and computational effort to a manageable size, while sacrificing as little critical information as possible.
I will present strategies for computational modeling of RNA 3D structures and their interactions with other molecules using a suite of methods developed in my laboratory that use coarse-grained representations of molecules, rely on the Monte Carlo method for sampling conformational space, and employ statistical potentials to approximate energy and identify conformations corresponding to biologically relevant structures. In particular, I will discuss the use of computational approaches to determine RNA structure based on low-resolution experimental data, including chemical probing and electron microscopy.
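
The general idea of Monte Carlo sampling over a coarse-grained representation can be sketched as follows. This is a toy illustration, not the software developed in the speaker's laboratory: the bead chain, the `toy_energy` function (a hypothetical stand-in for a statistical potential), and all parameter values are assumptions made for the example.

```python
import math
import random


def toy_energy(coords, bond_length=1.0):
    """Hypothetical stand-in for a statistical potential.

    Harmonic penalty on consecutive-bead distances plus a soft repulsion
    between non-bonded beads that come closer than `bond_length`.
    """
    energy = 0.0
    n = len(coords)
    for i in range(n - 1):
        d = math.dist(coords[i], coords[i + 1])
        energy += (d - bond_length) ** 2
    for i in range(n):
        for j in range(i + 2, n):
            d = math.dist(coords[i], coords[j])
            if d < bond_length:
                energy += (bond_length - d) ** 2
    return energy


def metropolis_accept(delta_e, temperature, rng):
    """Metropolis criterion: always accept downhill moves,
    accept uphill moves with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def run_mc(coords, steps=2000, step_size=0.3, temperature=0.5, seed=0):
    """Sample conformations by random single-bead displacements and
    return the lowest-energy conformation seen, with its energy."""
    rng = random.Random(seed)
    coords = [tuple(c) for c in coords]
    energy = toy_energy(coords)
    best_coords, best_energy = list(coords), energy
    for _ in range(steps):
        i = rng.randrange(len(coords))
        trial = list(coords)
        trial[i] = tuple(x + rng.uniform(-step_size, step_size) for x in coords[i])
        trial_energy = toy_energy(trial)
        if metropolis_accept(trial_energy - energy, temperature, rng):
            coords, energy = trial, trial_energy
            if energy < best_energy:
                best_coords, best_energy = list(coords), energy
    return best_coords, best_energy


# A compressed four-bead chain (one bead per residue) as the starting point.
start = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (0.4, 0.0, 0.0), (0.6, 0.0, 0.0)]
best, e_best = run_mc(start)
print(f"initial energy: {toy_energy(start):.3f}, best energy found: {e_best:.3f}")
```

In a realistic setting the energy function would be derived from known structures and could include restraint terms scoring agreement with experimental data; here it only serves to make the sampling loop concrete.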

References
1 Ponce-Salvatierra, A. et al. Biosci. Rep. 39, BSR20180430 (2019)
2 Boniecki, M. J. et al. Nucleic Acids Res. 44, e63 (2016)