Biological code of knots - identification of knotted patterns in biomolecules via AI approach

Research projects are (co)financed by the Slovenian Research and Innovation Agency

Member of the University of Ljubljana: Faculty of Mechanical Engineering
Project code: N1-0278
Science: Natural sciences and mathematics
SICRIS: Biological code of knots – identification of knotted patterns in biomolecules via AI approach (cobiss.net)

The main goals of the full project (of all three partners) are the following:

Analyze AlphaFold data to identify and study knotted protein structures.
Develop AI methods to efficiently detect knots and other topological features.
Identify sequence and structural patterns responsible for protein knotting.
Experimentally validate selected predicted knotted protein structures.
Develop new mathematical tools for describing and classifying complex topologies.
Model protein folding and knotting dynamics using neural networks.
Build the AlphaKnot database to store and share topological protein data.

The main goal of the Slovenian part of the project is to develop the mathematical background needed to discover and understand the amino acid patterns responsible for forming non-trivial knot types in proteins. We will create mathematical methods to study biological structures and develop computer algorithms to manipulate and identify these structures. We will:

5.1. describe and provide the topological characterizations of the bio-structures we wish to analyze,

5.2. develop an efficient notation of entangled structures in a machine-readable format,

5.3. develop methods for manipulating the structures (e.g. by means of a Reidemeister-type theorem) and implement the manipulation/simplification methods in computer code (in Python),

5.4. compute new invariants that can efficiently identify the knot-types of such structures (primarily these invariants will be based on the underlying spatial graph),

5.5. analyse the topological properties of the invariants (for example, whether they can detect primeness, chirality, and reversibility), as well as their strength and time complexity; where the invariants do not detect these properties, we will use more advanced topological methods (such as the theory of quandles, covering spaces, and canonical triangulations of the complement space in terms of the Epstein-Penner-Weeks theorem),

5.6. create knot tables of all theoretically possible entangled structures appearing in biomolecules. In particular, we will expand the existing Litherland-Moriuchi tables, which classify prime (i.e. non-divisible) theta-curves with up to 7 crossings in a minimal crossing projection, and create knot tables for bonded knots with 1, 2, and 3 bonds, which will also allow us to detect the very important cystine knot types.
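As a rough illustration of what items 5.2 and 5.3 could look like in practice, the following Python sketch stores a classical knot diagram as a Gauss code and simplifies it with Reidemeister I moves. It is only a minimal sketch under stated assumptions: the Gauss code stands in for the project's own notation (which must also handle theta-curves and bonded knots), and the names GaussCode and remove_kinks are hypothetical, not part of the project's software.

```python
from typing import List, Tuple

# Assumed encoding (not the project's notation): walking once around the
# knot, record each crossing passage as (crossing label, "O" or "U") for
# passing over or under. Every label therefore appears exactly twice.
GaussCode = List[Tuple[int, str]]


def remove_kinks(code: GaussCode) -> GaussCode:
    """Repeatedly apply Reidemeister I moves to a Gauss code.

    A crossing whose two passages are cyclically adjacent in the code is a
    kink (a small loop), so both of its entries can be deleted.
    """
    code = list(code)
    changed = True
    while changed and code:
        changed = False
        n = len(code)
        for i in range(n):
            j = (i + 1) % n  # cyclic neighbour of position i
            if i != j and code[i][0] == code[j][0]:
                # Drop both passages of the kinked crossing and rescan.
                code = [e for k, e in enumerate(code) if k not in (i, j)]
                changed = True
                break
    return code


# A standard trefoil diagram: no two passages of one crossing are adjacent.
trefoil: GaussCode = [(1, "O"), (2, "U"), (3, "O"), (1, "U"), (2, "O"), (3, "U")]

# The same diagram with an extra kink (crossing 4) added to one arc.
kinked: GaussCode = [(1, "O"), (4, "O"), (4, "U"), (2, "U"),
                     (3, "O"), (1, "U"), (2, "O"), (3, "U")]

simplified = remove_kinks(kinked)
```

Here simplified equals the trefoil code again, while a diagram consisting only of nested kinks reduces to the empty code, i.e. a diagram with no crossings. A full simplifier would also need Reidemeister II and III moves and additional moves for vertices and bonds, which this sketch does not attempt.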

In addition, we will prepare sequence and structural protein data for topological and machine learning analysis in cooperation with the Polish and Czech teams. We will also take part in in silico experiments aimed at discovering knotting events during knot synthesis and at finding the smallest possible biological knot.

The project is complete and has been fully realized according to its objectives.
