V. Panizza, P. Hauke, C. Micheletti and P. Faccioli
Protein Design by Integrating Machine Learning and Quantum-encoded Optimization
PRX Life 2 043012 (2024)
Abstract
The protein design problem involves finding polypeptide sequences folding into a given three-dimensional
structure. Its rigorous algorithmic solution is computationally demanding, involving a nested search in sequence
and structure spaces. Structure searches can now be bypassed thanks to recent machine-learning breakthroughs,
which have enabled accurate and rapid structure predictions. Similarly, sequence searches might be entirely
transformed by the advent of quantum annealing machines and by the required new encodings of the search
problem, which could be performant even on classical machines. In this work, we introduce a general protein
design scheme where algorithmic and technological advancements in machine learning and quantum-inspired
algorithms can be integrated, and an optimal physics-based scoring function is iteratively learned. In this first
proof-of-concept application, we apply the iterative method to a lattice protein model amenable to exhaustive
benchmarks, finding that it can rapidly learn a physics-based scoring function and achieve promising design
performance. Strikingly, our quantum-inspired reformulation outperforms conventional sequence optimization
even when adopted on classical machines. The scheme is general and can be extended, e.g., to encompass
off-lattice models, and it can integrate progress on various computational platforms, thus representing a new
paradigm for protein design.
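
To give a rough idea of what a quantum-annealing-style encoding of the sequence search can look like, the sketch below writes a toy design problem as a quadratic unconstrained binary optimization (QUBO). It is an illustration only and not the encoding used in the paper: the two-letter H/P alphabet, the contact energies in `eps`, the 2x2 lattice conformation, the `penalty` weight, and all function names are assumptions made for this example, and the scoring function is fixed here rather than learned iteratively. Each site carries one bit per residue type, and a penalty term enforces that exactly one bit per site is set. The tiny instance is minimized by brute force on a classical machine.

```python
from itertools import product


def build_qubo(n_sites, contacts, eps, penalty):
    """Build a QUBO as {(p, q): weight}, p <= q, over one-hot bits x[site, type].

    Hypothetical toy encoding: contact energies plus a one-hot penalty per site.
    """
    n_types = len(eps)
    Q = {}

    def idx(site, res_type):
        # Flatten (site, residue type) into a single bit index.
        return site * n_types + res_type

    def add(p, q, w):
        key = (min(p, q), max(p, q))
        Q[key] = Q.get(key, 0.0) + w

    # Pairwise contact energy for every non-bonded contact of the target structure.
    for i, j in contacts:
        for a in range(n_types):
            for b in range(n_types):
                add(idx(i, a), idx(j, b), eps[a][b])

    # penalty * (sum_a x_a - 1)^2, expanded using x^2 = x; the constant is dropped.
    for i in range(n_sites):
        for a in range(n_types):
            add(idx(i, a), idx(i, a), -penalty)
            for b in range(a + 1, n_types):
                add(idx(i, a), idx(i, b), 2.0 * penalty)
    return Q


def qubo_energy(Q, x):
    """Evaluate the QUBO objective for a tuple of 0/1 values."""
    return sum(w * x[p] * x[q] for (p, q), w in Q.items())


def brute_force_minimum(Q, n_bits):
    """Exhaustive minimization; only feasible for very small instances."""
    return min(product((0, 1), repeat=n_bits), key=lambda x: qubo_energy(Q, x))


def decode(x, n_sites, alphabet):
    """Turn a valid one-hot bit assignment back into a sequence string."""
    n_types = len(alphabet)
    return "".join(
        alphabet[[x[i * n_types + a] for a in range(n_types)].index(1)]
        for i in range(n_sites)
    )


# Toy instance (all values are assumptions): a 4-residue chain folded on a
# 2x2 square, whose only non-bonded contact is between sites 0 and 3.
alphabet = "HP"
eps = [[-1.0, 0.0], [0.0, 0.0]]  # H-H contacts are favorable
contacts = [(0, 3)]
n_sites, penalty = 4, 2.0

Q = build_qubo(n_sites, contacts, eps, penalty)
best = brute_force_minimum(Q, n_sites * len(alphabet))
sequence = decode(best, n_sites, alphabet)
best_energy = qubo_energy(Q, best)
```

In this toy instance the ground state places H on the two contacting sites, while the other two sites are degenerate. On annealing hardware or with a heuristic classical solver, the same dictionary `Q` would be handed to the optimizer in place of the brute-force search.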