A ‘digital twin’ is a virtual representation of a real-world object (its ‘physical twin’) that allows the operator to explore this object virtually, free of real-world constraints such as object size or gravity. It can be more than a static virtual copy, however, and can contain algorithms that allow its behavior to be simulated. Digital twins have been implemented in various fields, e.g. urban planning, construction and health care; however, they are not yet commonly used in molecular biology. The SCALE consortium is generating digital twins of subcellular segments that will fulfill multiple purposes: Innovative Digital Biology, Data Management & Integration, Team Management, and Dissemination & Teaching.
1. Innovative Digital Biology
Atomistic models of macromolecular assemblies, as obtained in structural biology, mirror cellular components with very high accuracy. However, they are not yet sufficient to simulate the behavior of a cellular component, let alone a subcellular segment. For this, they have to be enriched with additional components that are not usually included in such models because they are too dynamic, ideally striving for virtual completeness. Such components could be e.g. intrinsically disordered proteins, flexibly interacting proteins and RNAs, lipids and membranes. This further information is generated by complementary techniques that capture additional aspects, e.g. cryo-electron tomography, super-resolution microscopy, or spatial proteomics and sequencing, which provide the spatial context of interactors and proximate molecules. Conformational measurements of flexible and dynamic components by FRET or NMR provide temporal context. Mass spectrometry can provide stoichiometries of lipids and proteins or information about post-translational modifications.
Generating such an enriched atomistic model is already a useful exercise and may motivate further experiments. It tests our knowledge of the stoichiometry and spatial packing of a given structure and makes knowledge gaps immediately obvious: Can all components fit into the given space, or are there unfilled gaps? How far can flexibly connected components reach?
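The two questions about packing and reach can be approximated with simple arithmetic before any detailed modeling. The following is a minimal sketch, not a method described by the consortium: the function names, the typical protein density of about 1.35 g/cm³, the spacing of about 0.38 nm per residue, and all example numbers are assumptions chosen for illustration.

```python
# Back-of-the-envelope checks for an enriched model (illustrative assumptions only).

AVOGADRO = 6.02214076e23           # molecules per mole
NM3_PER_CM3 = 1e21                 # cubic nanometers per cubic centimeter
PROTEIN_DENSITY_G_PER_CM3 = 1.35   # assumed typical density of folded protein
RESIDUE_SPACING_NM = 0.38          # assumed approximate spacing per residue


def protein_volume_nm3(mass_da, density_g_per_cm3=PROTEIN_DENSITY_G_PER_CM3):
    """Approximate volume of a folded protein from its mass in daltons."""
    mass_g = mass_da / AVOGADRO
    return mass_g / density_g_per_cm3 * NM3_PER_CM3


def occupied_fraction(components, segment_volume_nm3):
    """Fraction of a segment volume filled by (mass_da, copy_number) pairs."""
    total = sum(protein_volume_nm3(mass) * copies for mass, copies in components)
    return total / segment_volume_nm3


def max_reach_nm(linker_residues):
    """Upper bound on linker reach: the fully extended contour length."""
    return linker_residues * RESIDUE_SPACING_NM


if __name__ == "__main__":
    # Hypothetical segment: eight copies of a 50 kDa protein and two of a
    # 120 kDa protein inside a 4000 nm^3 volume.
    components = [(50_000, 8), (120_000, 2)]
    fraction = occupied_fraction(components, 4000.0)
    print(f"occupied fraction: {fraction:.2f}")
    print(f"reach of a 100-residue linker: at most {max_reach_nm(100):.1f} nm")
```

An occupied fraction well below the expected packing density points to unfilled gaps, and a value above 1 means the assumed stoichiometry cannot fit. The contour length is only an upper bound; a disordered linker typically spans much less than its fully extended length.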
Next, molecular simulations of such enriched atomistic models can be carried out. At this point the model meets the definition of a digital twin: it virtually simulates the behavior of a subcellular segment. Such simulations can be further augmented by machine learning or restrained with additional parameters, such as membrane tension or the affinity of intrinsically disordered proteins for each other. Digital twins can make predictions that would not have been obvious from a structural model, e.g. under which conditions a membrane ruptures, or which space intrinsically disordered proteins explore. These predictions can be tested experimentally, and experiments can in turn be designed to expand the capacity and accuracy of the digital twin.
Over time, the digital twins of subcellular segments will grow ever larger, and individual digital twins may eventually merge. The digital twins will be annotated with additional components and restraints, and will thus simulate cellular behavior in an increasingly accurate and comprehensive manner.