Minel CengizT., Elif DuymazE.
Proteins, one of the four groups of macromolecules necessary for the continuation of life, begin as linear polypeptide chains, called the primary structure, formed by the combination of amino acids in a specific order. Proteins perform a wide variety of functions within organisms, including copying DNA, responding to stimuli, providing structure and organization to cells, and transporting molecules from one location to another; they also fold naturally into spatial, that is, three-dimensional (3D) structures1,2. Proteins, which are classified into two groups, structural and functional, adopt secondary, tertiary or quaternary structures depending on sequence length and amino acid content, and they are key players in all physiological and metabolic processes in living organisms3,4. Protein structure is complex. Understanding this complexity, which arises from the unique polypeptide sequence, multiple domains and the participation of side chains, is important in revealing the relationship between structure and function5-8. Experimental methods (X-ray crystallography, nuclear magnetic resonance spectroscopy, etc.) and computational approaches (machine learning, deep learning, artificial intelligence) developed for determining and predicting protein structure have made great strides in the field of structural biology. Today, the 3D structures of proteins can be predicted from the amino acid sequence with high accuracy. Structure-based models and interdisciplinary applications of computational science are regarded as the most important factors in revealing more and more functional properties of proteins9-11.
There are many natural proteins that have evolved over billions of years of evolutionary optimization. Despite the highly sophisticated developments in protein structure prediction, the time and cost they entail and the margin of error arising from the structural complexity of proteins have driven a great effort to find alternative solutions in fields of science that mimic nature and natural processes, such as synthetic biology.
Protein design is a computational engineering application that takes a theoretical approach to 3D structure prediction, aiming to create structures at a comparable level of development from present information and within a desired time frame, without waiting for evolutionary change12-14. This approach, which started with the redesign of natural proteins, opened the doors to the world of de novo protein design, which is based on building a protein with a predetermined function on a synthetic scaffold, in order to view the relationship between structure and function from a wider perspective15-17. This review aims to evaluate recent developments in the principles and application areas of de novo protein design.
1. De novo Protein Design Principles
When designing a protein whose amino acid sequence and structure are unknown, a linear peptide of n residues drawn from the 20 amino acids found in nature can take 20^n possible sequences. For this reason, research has advanced toward designing proteins whose folding dynamics, biochemistry and biophysics are known18. De novo design of proteins generally consists of two steps. First, a target structural model, called the scaffold or backbone, needs to be established. Second, the amino acid sequence and side chains are determined for all residue positions in this structure; that is, the outline must be established in the first step19,20. The physical basis of the design is that proteins fold into their lowest free energy states. Hydrophobic residues in the protein core drive folding by being buried away from the solvent. The side chains in the core need to be tightly packed to minimize the volume occupied by the protein and to maximize van der Waals interactions (Figure 1)21-23.
Figure 1. Energy dynamics in protein folding21.
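To give a sense of scale for the sequence space described above (20 to the power n), the following minimal sketch computes its size for a few chain lengths. The function names are illustrative; the lengths 34 and 93 are chosen only because peptides of those sizes are discussed in this review.

```python
import math

NUM_AMINO_ACIDS = 20  # the 20 standard amino acids found in nature


def sequence_space_size(n: int) -> int:
    """Exact number of distinct linear sequences of n residues (20**n)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return NUM_AMINO_ACIDS ** n


def log10_sequence_space(n: int) -> float:
    """Base-10 logarithm of the sequence space size, easier to read for large n."""
    return n * math.log10(NUM_AMINO_ACIDS)


for n in (10, 34, 93):
    print(f"n = {n:3d}: about 10^{log10_sequence_space(n):.1f} possible sequences")
```

Even for short chains the number of candidates is far beyond exhaustive testing, which is why design is restricted to well-understood folds and guided by energy functions.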
In light of the basic principles of de novo protein design, its historical development took place in three stages: manual design based on a physical model, physicochemical-based computational design, and finally fragment and bioinformatics-based computational design24.
1.1. Manual Protein Design
A 34-residue polypeptide with ribonuclease activity, designed and synthesized by Gutte and colleagues in 1979 on the basis of X-ray crystallography data, is regarded as the origin of de novo proteins. The starting point of the study was amino acid sequence prediction from the 3D structure of the Escherichia coli (E. coli) lac repressor published in 1973 and 1975 25. After this success, although studies on optimizing the resolution of manual protein design were carried out in the 1980s, they did not progress beyond drafts for further stages24.
1.2. Physicochemical Based Computational Protein Design
In protein design based on the characterization of physical and chemical properties, the general principle is to determine the conformation of protein backbones with mathematical equations. The design of three- and four-helix bundles has passed into the history of synthetic biology as a precursor to computational protein design (CPD) based on physicochemistry26-28. In a series of studies initiated by Lombardi and colleagues, the first de novo designs29-31 of helical-bundle metalloproteins indicated the important role of computational methods in addressing fundamental gaps in the protein folding problem.
1.3. Fragment and Bioinformatics Based Computational Protein Design
Each of the conformations that a side chain can adopt when packing within the protein is called a rotamer. Determining the most appropriate rotamers involves identifying the positions of side chains and chemical groups relative to the protein main chain by leveraging structural information in the Protein Data Bank (PDB). Rotamers and amino acid residues are then optimized with algorithms such as dead-end elimination, belief propagation, integer linear programming, simulated annealing and various Markov Chain Monte Carlo (MCMC) methods32.
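As an illustration of how one of these optimizers works, the sketch below applies simulated annealing with a Metropolis acceptance rule to a toy rotamer-packing problem. Everything in it is an assumption made for illustration: the energy tables are invented numbers in arbitrary units, the function names are hypothetical, and no real design package is being reproduced. A brute-force search is included only because the toy problem is small enough to check the result.

```python
import itertools
import math
import random

# Hypothetical toy energies (arbitrary units): 3 positions, 3 rotamers each.
SELF_ENERGY = [
    [0.0, 1.0, 2.0],
    [1.5, 0.2, 0.9],
    [0.3, 0.8, 0.1],
]
# Pairwise energies for interacting position pairs, indexed [rot_i][rot_j].
PAIR_ENERGY = {
    (0, 1): [[2.0, -1.0, 0.5], [0.0, 0.3, -0.2], [1.0, 1.0, 1.0]],
    (1, 2): [[0.0, 0.5, -0.5], [0.4, -0.8, 0.6], [0.2, 0.2, 0.2]],
}


def total_energy(assignment, self_energy, pair_energy):
    """Sum of per-position self energies and pairwise interaction energies."""
    energy = sum(self_energy[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pair_energy.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def anneal_rotamers(self_energy, pair_energy, steps=5000,
                    t_start=5.0, t_end=0.05, seed=0):
    """Simulated annealing over rotamer choices with single-position moves."""
    rng = random.Random(seed)
    n = len(self_energy)
    current = [rng.randrange(len(self_energy[i])) for i in range(n)]
    current_e = total_energy(current, self_energy, pair_energy)
    best, best_e = list(current), current_e
    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(n)
        trial = list(current)
        trial[pos] = rng.randrange(len(self_energy[pos]))
        trial_e = total_energy(trial, self_energy, pair_energy)
        delta = trial_e - current_e
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e


def brute_force_minimum(self_energy, pair_energy):
    """Exhaustive search, feasible only for tiny toy problems like this one."""
    choices = [range(len(rotamers)) for rotamers in self_energy]
    best = min(itertools.product(*choices),
               key=lambda a: total_energy(a, self_energy, pair_energy))
    return list(best), total_energy(best, self_energy, pair_energy)


best, best_energy = anneal_rotamers(SELF_ENERGY, PAIR_ENERGY)
print("annealed assignment:", best, "energy:", round(best_energy, 3))
```

In a real design problem the number of positions and rotamers makes exhaustive search impossible, which is the reason stochastic methods of this kind, or exact pruning methods such as dead-end elimination, are used.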
To optimize energy functions over the side chains of the protein backbone to be designed, physics-based methods such as RosettaDesign, EvoEF2 and PROTEUS or statistics-based methods such as TERM and ABACUS can be used33. RosettaDesign is distinguished by its handling of side chain conformations and its ability to screen suitable amino acid sequences34-36. The main reason for its prominence is that it was the first fragment-based computational method to successfully design an artificial protein, the 93-residue Top7 37. Current developments have made it possible to create PDB-derived backbone fragment libraries and use them in determining backbone-dependent rotamers of de novo proteins38-40.
Figure 2. Example of computational protein design generated using the Rotamer library40.
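Physics-based energy functions of the kind mentioned above typically include a van der Waals term that rewards tight packing and penalizes atomic overlap. A minimal sketch of such a term is the 12-6 Lennard-Jones potential below. The parameter values are illustrative assumptions and are not taken from RosettaDesign, EvoEF2 or any other package named in this review.

```python
def lennard_jones(r: float, epsilon: float = 0.2, r_min: float = 4.0) -> float:
    """12-6 Lennard-Jones energy for two atoms separated by distance r.

    epsilon is the well depth and r_min the separation at the energy minimum;
    both defaults are arbitrary illustrative values (assumed units: kcal/mol, angstrom).
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


for distance in (3.2, 4.0, 6.0, 10.0):
    print(f"r = {distance:4.1f}  E = {lennard_jones(distance):+.4f}")
```

The energy is strongly positive when atoms overlap, reaches its minimum of minus epsilon at r_min, and decays toward zero at long range, which is why well-packed cores score favorably.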
Developed to create more flexible backbones in fragment-based CPD, RosettaRemodel41 has given researchers a wide range of engineering capabilities, and the method is still applied successfully today42-44. However, when protein design is considered a folding puzzle, a significant limitation of fragment-based approaches is that they can restrict the exploration of a wide variety of protein folds because fragment libraries are relatively unrepresentative. Another disadvantage is that their use does not fully improve the understanding of the sequence-structure relationship45,46. Protein design is important not only for understanding protein dynamics but also for the development of industrial products and new therapeutics. So far, CPD has successfully delivered engineered products such as vaccines, new protein mechanisms, antibodies, membrane proteins, new folds and ligand-binding proteins47,48. Recent advances in synthetic biology and de novo protein design have increased interest in protein structure prediction algorithms, which help to understand the 3D structure and folding architectures of proteins, and thus in bioinformatic tools49.
AlphaFold50 and RoseTTAFold51 have recently attracted attention with accurate structure predictions and have produced effective solutions to the protein folding problem. AlphaFold, developed by DeepMind, is a deep learning algorithm that works with multiple sequence alignments (MSAs) and distograms (histograms of inter-residue distances). In the free modeling (FM) category of the 13th Critical Assessment of Structure Prediction (CASP13) competition, it generated a lot of excitement by predicting 25 of the 43 proteins with an accuracy of 6.6 Å (angstrom); however, both the time cost and the resolution problems encountered motivated further development of the algorithm52-54. For CASP14, AlphaFold2 was developed using an attention mechanism and a transformer-based neural network architecture, and it predicted protein backbones with a median accuracy approaching that of experimentally determined structures. Another team then designed RoseTTAFold, a three-track neural network architecture that builds on the deep learning approaches of Rosetta and AlphaFold2 and enables the successive transformation and integration of information on the amino acid sequence, the distances between pairs of amino acids, and 3D coordinates51,58. The common feature of these two methods, which use folding patterns for structure prediction, is that they rely on an MSA-based approach. The difficulty of compensating for missing data in amino acid sequences led to the development of ESMFold, which applies a language model approach, Evolutionary Scale Modeling (ESM), to 3D structure prediction of proteins. Making structure predictions from a single amino acid sequence instead of an MSA has provided concrete evidence of the contribution of artificial intelligence (AI) to the protein world59,60.
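The distograms mentioned above can be pictured as a matrix in which every pair of residues is assigned to a distance bin. The sketch below builds such a matrix from a handful of coordinates. The bin range, the number of bins and the toy coordinates are assumptions chosen for readability; they are not the settings of AlphaFold or any other predictor, which output a probability distribution over bins rather than a single bin per pair.

```python
import math


def pairwise_distances(coords):
    """Symmetric matrix of Euclidean distances between 3D points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def distogram_bins(dist_matrix, min_dist=2.0, max_dist=22.0, num_bins=10):
    """Replace each distance by the index of the equal-width bin it falls into.

    Distances below min_dist go to the first bin and distances at or above
    max_dist go to the last bin.
    """
    width = (max_dist - min_dist) / num_bins
    binned = []
    for row in dist_matrix:
        binned_row = []
        for d in row:
            index = int((d - min_dist) // width)
            binned_row.append(min(max(index, 0), num_bins - 1))
        binned.append(binned_row)
    return binned


# Hypothetical toy "residues" placed on a straight line, 3.8 angstrom apart.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
bins = distogram_bins(pairwise_distances(toy_coords))
for row in bins:
    print(row)
```

A structure predictor trained on such targets learns to infer which residue pairs are close in space, and the predicted distance distributions are then used to build 3D coordinates.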
An example of the effect of protein 3D structure prediction algorithms on de novo protein design is the development of the AlphaDesign framework. This method, which can design de novo monomeric proteins with ab initio structure prediction based on fragment assembly, was validated with molecular dynamics simulations and Rosetta approaches49. The ADesign method was then developed, which provides a large-scale benchmark and validation for designing amino acid sequences for protein structures from the AlphaFold database. In this method, the confidence-aware protein decoder (CPD) and the simplified graph transformer encoder (SGT) are introduced as new components for sequence determination from protein angles. When its performance was compared with geometric vector perceptrons (GVPs), GraphTrans and StructGNN (graph neural network-based models), the result was ADesign ≈ StructGNN > GraphTrans > GVP (Figure 3)61.
Figure 3. Schematic representation of the ADesign workflow61.
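The text above does not state which metric underlies this ranking. One metric that is commonly reported when sequence design models are compared is sequence recovery, the fraction of positions at which the designed sequence reproduces the native residue. The sketch below computes it; the two sequences are made-up examples and the function name is illustrative.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of aligned positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# Hypothetical 10-residue example with two substitutions.
native_seq = "MKTAYIAKQR"
designed_seq = "MKSAYLAKQR"
print(f"recovery = {sequence_recovery(native_seq, designed_seq):.2f}")
```

High recovery indicates that a model reproduces native-like sequences for a given backbone, although a design can fold correctly with a lower value.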
2. De novo Protein Design and Therapies
De novo protein design has generally led to the production of designer proteins that do not share homology with natural amino acid sequences18. So far, proteins with catalytic, binding and transmembrane functions have been successfully designed de novo62-64. Thanks to the algorithms used in design and to advances in biophysics, it has been possible to create protein structures with unique or theoretically superior biological and chemical properties, or with new functions20,65-67. With the contributions of computational sciences, de novo protein design has been used in many applications, especially in the biomedical field68. The de novo design process of a drug is illustrated (Figure 4)69. The de novo design of molecules aims to create the best model of a molecule's chemical profile. This method, also called generative chemistry, seeks standardization by optimizing pharmacokinetic properties, especially in drug discovery70. Today, medicine has reached a new stage thanks to therapeutics developed with biomacromolecules such as peptides, proteins or amino acids, including carriers for the transport of small molecules71-72. The biological and physicochemical properties of de novo designed proteins are shaped by the method applied by the researcher; the aim, however, is to adapt chemical reactions not yet understood in nature to organisms and to obtain more stable proteins. This approach covers not only biomolecules and artificial components but also biomolecular engineering, and sheds light on the future73. Proteins are structurally highly dynamic molecules. This is a significant challenge in the absence of templates that would allow the exact nature of proteins to be determined.
As a solution to this issue, a study by Sesterhenn and colleagues aimed to determine the topological motifs of proteins in order to develop vaccines with de novo protein design. The TopoBuilder application they developed aims at effective protein design by combining adaptable protein topologies with distinct functional motifs74. In vaccine studies, Correia et al. used the Rosetta method to produce topology-based de novo immunogens presenting stable functional motifs that trigger the production of neutralizing antibodies against respiratory syncytial virus (RSV)75. Against SARS-CoV-2 infection, Cao and colleagues designed small, stable proteins that prevent the viral spike protein from binding to the host cell. The synthesized de novo proteins bound the spike protein with high affinity and prevented SARS-CoV-2 from entering mammalian Vero E6 cells76.
Chevalier and colleagues created a large pool of de novo protein designs that act as inhibitors of influenza A H1 hemagglutinin and botulinum neurotoxin B. The researchers showed that a large number of designed proteins can be tested thanks to advanced experimental techniques and advances in bioinformatics tools. There is considerable research on the design of de novo proteins in cancer treatment77. The supply/demand imbalance, especially in immunotherapy agents, has driven efforts to replace these therapeutic agents with synthetic equivalents. For example, interleukin-2 (IL-2) has limited clinical use in cancer treatment due to its toxic side effects; a de novo protein called neoleukin-2/15 (Neo-2/15) was designed to mimic IL-2 and IL-15, which acts in the same signaling pathway, without these side effects. In mouse models of colorectal carcinoma and melanoma, Neo-2/15 was found to activate immune cells, slow tumor growth and, in a dose-independent manner, avoid the side effects of IL-2 78. There are also experimental studies on the design of various de novo proteins in cancer treatment, such as MAP kinase79, histone methyltransferase80, SHP2 81 and ROCK82 inhibitors. In addition, studies are being carried out to improve the biocompatibility of antimicrobial peptides (AMPs) used against antibiotic resistance83. With the logic gates called Co-LOCKR, it has been shown that the tissue specificity of CAR-T treatment can be increased84,85. Proteins are specific structures in terms of their curvature and conformation, which makes their individual characteristics difficult to resolve. At this point, the de novo technique, developed for both existing and newly discovered proteins, aims to produce protein designs by mimicking naturally occurring proteins. This approach, which will be used in many areas, continues to develop and sheds light on the future.
References:
- Stollar, E. J., & Smith, D. P. (2020). Uncovering protein structure. Essays in Biochemistry, 64(4), 649–680. https://doi.org/10.1042/EBC20190042
- Finkelstein, A. V., Badretdin, A. J., Galzitskaya, O. V., Ivankov, D. N., Bogatyreva, N. S., & Garbuzynskiy, S. O. (2017). There and back again: Two views on the protein folding puzzle. Physics of Life Reviews, 21, 56–71. https://doi.org/10.1016/J.PLREV.2017.01.025
- Numata, K. (2020). How to define and study structural proteins as biopolymer materials. Polymer Journal, 52(9), 1043–1056. https://doi.org/10.1038/s41428-020-0362-5
- Bongirwar, V., & Mokhade, A. S. (2022). Different methods, techniques and their limitations in protein structure prediction: A review. Progress in Biophysics and Molecular Biology, 173, 72–82. https://doi.org/10.1016/J.PBIOMOLBIO.2022.05.002
- Rajasekaran, N., & Kaiser, C. M. (2022). Co-Translational Folding of Multi-Domain Proteins. Frontiers in Molecular Biosciences, 9, 349. https://doi.org/10.3389/FMOLB.2022.869027
- Bhattacharyya, M., Ghosh, S., & Vishveshwara, S. (2016). Protein Structure and Function: Looking through the Network of Side-Chain Interactions. Current Protein & Peptide Science, 17(1), 4–25. https://doi.org/10.2174/1389203716666150923105727
- Sykes, J., Holland, B. R., & Charleston, M. A. (2023). A review of visualisations of protein fold networks and their relationship with sequence and function. Biological Reviews, 98(1), 243–262. https://doi.org/10.1111/BRV.12905
- Prabantu, V. M., Gadiyaram, V., Vishveshwara, S., & Srinivasan, N. (2022). Understanding structural variability in proteins using protein structural networks. Current Research in Structural Biology, 4, 134. https://doi.org/10.1016/J.CRSTBI.2022.04.002
- Banerjee, A., Saha, S., Tvedt, N. C., Yang, L. W., & Bahar, I. (2023). Mutually beneficial confluence of structure-based modeling of protein dynamics and machine learning methods. Current Opinion in Structural Biology, 78, 102517. https://doi.org/10.1016/J.SBI.2022.102517
- Peng, Z., Wang, W., Han, R., Zhang, F., & Yang, J. (2022). Protein structure prediction in the deep learning era. Current Opinion in Structural Biology, 77, 102495. https://doi.org/10.1016/J.SBI.2022.102495
- Jisna, V. A., & Jayaraj, P. B. (2021). Protein Structure Prediction: Conventional and Deep Learning Perspectives. The Protein Journal, 40(4), 522–544. https://doi.org/10.1007/S10930-021-10003-Y
- Zhou, J., Panaitiu, A. E., & Grigoryan, G. (2020). A general-purpose protein design framework based on mining sequence-structure relationships in known protein structures. Proceedings of the National Academy of Sciences of the United States of America, 117(2), 1059–1068. https://doi.org/10.1073/PNAS.1908723117
- Ding, W., Nakai, K., & Gong, H. (2022). Protein design via deep learning. Briefings in Bioinformatics, 23(3), 1–16. https://doi.org/10.1093/BIB/BBAC102
- Takahashi, T., Chikenji, G., & Tokita, K. (2021). Lattice protein design using Bayesian learning. Physical Review E, 104(1), 014404. https://doi.org/10.1103/PHYSREVE.104.014404
- Höcker, B., & Zielonka, S. (2022). Protein engineering & design: hitting new heights. Biological Chemistry, 403(5–6), 453. https://doi.org/10.1515/HSZ-2022-0139
- Woolfson, D. N. (2021). A Brief History of De Novo Protein Design: Minimal, Rational, and Computational. Journal of Molecular Biology, 433(20), 167160. https://doi.org/10.1016/J.JMB.2021.167160
- Ferrando, J., & Solomon, L. A. (2021). Recent Progress Using De Novo Design to Study Protein Structure, Design and Binding Interactions. Life, 11(3), 225. https://doi.org/10.3390/LIFE11030225
- Huang, P. S., Boyken, S. E., & Baker, D. (2016). The coming of age of de novo protein design. Nature, 537(7620), 320–327. https://doi.org/10.1038/nature19946
- Harteveld, Z., Bonet, J., Rosset, S., Yang, C., Sesterhenn, F., & Correia, B. E. (2022). A generic framework for hierarchical de novo protein design. Proceedings of the National Academy of Sciences of the United States of America, 119(43), e2206111119. https://doi.org/10.1073/PNAS.2206111119
- Marcos, E., & Silva, D. A. (2018). Essentials of de novo protein design: Methods and applications. Wiley Interdisciplinary Reviews: Computational Molecular Science, 8(6), e1374. https://doi.org/10.1002/WCMS.1374
- Kuhlman, B., & Bradley, P. (2019). Advances in protein structure prediction and design. Nature Reviews Molecular Cell Biology, 20(11), 681–697. https://doi.org/10.1038/s41580-019-0163-x
- Baker, D. (2019). What has de novo protein design taught us about protein folding and biophysics? Protein Science, 28(4), 678. https://doi.org/10.1002/PRO.3588
- Koepnick, B., Flatten, J., Husain, T., Ford, A., Silva, D. A., Bick, M. J., Bauer, A., Liu, G., Ishida, Y., Boykov, A., Estep, R. D., Kleinfelter, S., Nørgård-Solano, T., Wei, L., Players, F., Montelione, G. T., DiMaio, F., Popović, Z., Khatib, F., … Baker, D. (2019). De novo protein design by citizen scientists. Nature, 570(7761), 390. https://doi.org/10.1038/S41586-019-1274-4
- Korendovych, I. V., & DeGrado, W. F. (2020). De novo protein design, a retrospective. Quarterly Reviews of Biophysics, 53, e3. https://doi.org/10.1017/S0033583519000131
- Gutte, B., Däumigen, M., & Wittschieber, E. (1979). Design, synthesis and characterisation of a 34-residue polypeptide that interacts with nucleic acids. Nature, 281(5733), 650–655. https://doi.org/10.1038/281650a0
- Pan, X., & Kortemme, T. (2021). Recent advances in de novo protein design: Principles, methods, and applications. The Journal of Biological Chemistry, 296. https://doi.org/10.1016/J.JBC.2021.100558
- Walsh, S. T. R., Cheng, H., Bryson, J. W., Roder, H., & Degrado, W. F. (1999). Solution structure and dynamics of a de novo designed three-helix bundle protein. Proceedings of the National Academy of Sciences of the United States of America, 96(10), 5486. https://doi.org/10.1073/PNAS.96.10.5486
- Hill, R. B., Raleigh, D. P., Lombardi, A., & Degrado, W. F. (2000). De Novo Design of Helical Bundles as Models for Understanding Protein Folding and Function. Accounts of Chemical Research, 33(11), 745. https://doi.org/10.1021/AR970004H
- Lombardi, A., Summa, C. M., Geremia, S., Randaccio, L., Pavone, V., & DeGrado, W. F. (2000). Retrostructural analysis of metalloproteins: Application to the design of a minimal model for diiron proteins. Proceedings of the National Academy of Sciences of the United States of America, 97(12), 6298. https://doi.org/10.1073/PNAS.97.12.6298
- Kaplan, J., & DeGrado, W. F. (2004). De novo design of catalytic proteins. Proceedings of the National Academy of Sciences of the United States of America, 101(32), 11566. https://doi.org/10.1073/PNAS.0404387101
- Summa, C. M., Rosenblatt, M. M., Hong, J. K., Lear, J. D., & DeGrado, W. F. (2002). Computational de novo design, and characterization of an A2B2 diiron protein. Journal of Molecular Biology, 321(5), 923–938. https://doi.org/10.1016/S0022-2836(02)00589-2
- Anand-Achim, N., Eguchi, R. R., Derry, A., Altman, R. B., & Huang, P.-S. (2020). Protein sequence design with a learned potential. BioRxiv, 2020.01.06.895466. https://doi.org/10.1101/2020.01.06.895466
- Liu, Y., Zhang, L., Wang, W., Zhu, M., Wang, C., Li, F., Zhang, J., Li, H., Chen, Q., & Liu, H. (2022). Rotamer-free protein sequence design based on deep learning and self-consistency. Nature Computational Science, 2(7), 451–462. https://doi.org/10.1038/s43588-022-00273-6
- Liu, Y., & Kuhlman, B. (2006). RosettaDesign server for protein design. Nucleic Acids Research, 34(Web Server issue), W235. https://doi.org/10.1093/NAR/GKL163
- Dantas, G., Kuhlman, B., Callender, D., Wong, M., & Baker, D. (2003). A Large Scale Test of Computational Protein Design: Folding and Stability of Nine Completely Redesigned Globular Proteins. Journal of Molecular Biology, 332(2), 449–460. https://doi.org/10.1016/S0022-2836(03)00888-X
- Bermeo, S., Favor, A., Chang, Y. T., Norris, A., Boyken, S. E., Hsia, Y., Haddox, H. K., Xu, C., Brunette, T. J., Wysocki, V. H., Bhabha, G., Ekiert, D. C., & Baker, D. (2022). De novo design of obligate ABC-type heterotrimeric proteins. Nature Structural & Molecular Biology, 29(12), 1266–1276. https://doi.org/10.1038/s41594-022-00879-4
- Kuhlman, B., Dantas, G., Ireton, G. C., Varani, G., Stoddard, B. L., & Baker, D. (2003). Design of a Novel Globular Protein Fold with Atomic-Level Accuracy. Science, 302(5649), 1364–1368. https://doi.org/10.1126/SCIENCE.1089427
- Shapovalov, M. V., & Dunbrack, R. L. (2011). A Smoothed Backbone-Dependent Rotamer Library for Proteins Derived from Adaptive Kernel Density Estimates and Regressions. Structure, 19(6), 844–858. https://doi.org/10.1016/J.STR.2011.03.019
- Mortensen, J. C., Damjanovic, J., Miao, J., Hui, T., & Lin, Y. S. (2022). A backbone-dependent rotamer library with high (ϕ, ψ) coverage using metadynamics simulations. Protein Science, 31(12), e4491. https://doi.org/10.1002/PRO.4491
- Morin, A., Meiler, J., & Mizoue, L. S. (2011). Computational design of protein-ligand interfaces: Potential in therapeutic development. Trends in Biotechnology, 29(4), 159–166. https://doi.org/10.1016/j.tibtech.2011.01.002
- Huang, P. S., Ban, Y. E. A., Richter, F., Andre, I., Vernon, R., Schief, W. R., & Baker, D. (2011). RosettaRemodel: A Generalized Framework for Flexible Backbone Protein Design. PLoS ONE, 6(8), 24109. https://doi.org/10.1371/JOURNAL.PONE.0024109
- Leman, J. K., Weitzner, B. D., Lewis, S. M., Adolf-Bryfogle, J., Alam, N., Alford, R. F., Aprahamian, M., Baker, D., Barlow, K. A., Barth, P., Basanta, B., Bender, B. J., Blacklock, K., Bonet, J., Boyken, S. E., Bradley, P., Bystroff, C., Conway, P., Cooper, S., … Bonneau, R. (2020). Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nature Methods, 17(7), 665. https://doi.org/10.1038/S41592-020-0848-2
- Koga, N., Koga, R., Liu, G., Castellanos, J., Montelione, G. T., & Baker, D. (2021). Role of backbone strain in de novo design of complex α/β protein structures. Nature Communications, 12(1), 1–12. https://doi.org/10.1038/s41467-021-24050-7
- Eguchi, R. R., & Huang, P. S. (2020). Multi-scale structural analysis of proteins by deep semantic segmentation. Bioinformatics, 36(6), 1740. https://doi.org/10.1093/BIOINFORMATICS/BTZ650
- Konagurthu, A. S., Subramanian, R., Allison, L., Abramson, D., Stuckey, P. J., Garcia de la Banda, M., & Lesk, A. M. (2021). Universal Architectural Concepts Underlying Protein Folding Patterns. Frontiers in Molecular Biosciences, 7. https://doi.org/10.3389/FMOLB.2020.612920
- Carbery, A., Skyner, R., Von Delft, F., & Deane, C. M. (2022). Fragment Libraries Designed to Be Functionally Diverse Recover Protein Binding Information More Efficiently Than Standard Structurally Diverse Libraries. Journal of Medicinal Chemistry, 65(16), 11404–11413. https://doi.org/10.1021/ACS.JMEDCHEM.2C01004
- Agha, X., Fu, N., & Hu, J. (2022). Designing novel protein structures using sequence generator and AlphaFold2. https://doi.org/10.48550/arxiv.2208.14526
- Wang, J., Cao, H., Zhang, J. Z. H., & Qi, Y. (2018). Computational Protein Design with Deep Learning Neural Networks. Scientific Reports, 8(1), 1–9. https://doi.org/10.1038/s41598-018-24760-x
- Jendrusch, M., Korbel, J. O., & Sadiq, S. K. (2021). AlphaDesign: A de novo protein design framework based on AlphaFold. BioRxiv, 2021.10.11.463937. https://doi.org/10.1101/2021.10.11.463937
- Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
- Baek, M., Anishchenko, I., Park, H., Humphreys, I. R., & Baker, D. (2021). Protein oligomer modeling guided by predicted interchain contacts in CASP14. Proteins: Structure, Function, and Bioinformatics, 89(12), 1824–1833. https://doi.org/10.1002/PROT.26197
- Senior, A. W., Evans, R., Jumper, J., Kirkpatrick, J., Sifre, L., Green, T., Qin, C., Žídek, A., Nelson, A. W. R., Bridgland, A., Penedones, H., Petersen, S., Simonyan, K., Crossan, S., Kohli, P., Jones, D. T., Silver, D., Kavukcuoglu, K., & Hassabis, D. (2019). Protein structure prediction using multiple deep neural networks in the 13th Critical Assessment of Protein Structure Prediction (CASP13). Proteins, 87(12), 1141. https://doi.org/10.1002/PROT.25834
- AlQuraishi, M. (2020). A watershed moment for protein structure prediction. Nature, 577(7792), 627–628. https://doi.org/10.1038/D41586-019-03951-0
- Senior, A. W., Evans, R., Jumper, J., Kirkpatrick, J., Sifre, L., Green, T., Qin, C., Žídek, A., Nelson, A. W. R., Bridgland, A., Penedones, H., Petersen, S., Simonyan, K., Crossan, S., Kohli, P., Jones, D. T., Silver, D., Kavukcuoglu, K., & Hassabis, D. (2020). Improved protein structure prediction using potentials from deep learning. Nature, 577(7792), 706–710. https://doi.org/10.1038/S41586-019-1923-7
- Bouatta, N., Sorger, P., & AlQuraishi, M. (2021). Protein structure prediction by AlphaFold2: are attention and symmetries all you need? Acta Crystallographica Section D: Structural Biology, 77(8), 982–991. https://doi.org/10.1107/S2059798321007531
- Yang, J., Anishchenko, I., Park, H., Peng, Z., Ovchinnikov, S., & Baker, D. (2020). Improved protein structure prediction using predicted interresidue orientations. Proceedings of the National Academy of Sciences of the United States of America, 117(3), 1496–1503. https://doi.org/10.1073/PNAS.1914677117
- Anishchenko, I., Chidyausiku, T. M., Ovchinnikov, S., Pellock, S. J., & Baker, D. (2020). De novo protein design by deep network hallucination. BioRxiv, 2020.07.22.211482. https://doi.org/10.1101/2020.07.22.211482
- Baek, M., DiMaio, F., Anishchenko, I., Dauparas, J., Ovchinnikov, S., Lee, G. R., Wang, J., Cong, Q., Kinch, L. N., Dustin Schaeffer, R., Millán, C., Park, H., Adams, C., Glassman, C. R., DeGiovanni, A., Pereira, J. H., Rodrigues, A. V., Van Dijk, A. A., Ebrecht, A. C., … Baker, D. (2021). Accurate prediction of protein structures and interactions using a 3-track neural network. Science, 373(6557), 871. https://doi.org/10.1126/SCIENCE.ABJ8754
- Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., Costa, A. dos S., Fazel-Zarandi, M., Sercu, T., Candido, S., & Rives, A. (2022). Language models of protein sequences at the scale of evolution enable accurate structure prediction. BioRxiv, 2022.07.20.500902. https://doi.org/10.1101/2022.07.20.500902
- Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., Smetanin, N., Verkuil, R., Kabeli, O., Shmueli, Y., Santos Costa, A. Dos, Fazel-Zarandi, M., Sercu, T., Candido, S., & Rives, A. (2022). Evolutionary-scale prediction of atomic level protein structure with a language model. BioRxiv, 2022.07.20.500902. https://doi.org/10.1101/2022.07.20.500902
- Gao, Z., Tan, C., & Li, S. Z. (2022). AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB. https://doi.org/10.48550/arxiv.2202.01079
- Jenkins, J. M. X., Noble, C. E. M., Grayson, K. J., Mulholland, A. J., & Anderson, J. L. R. (2021). Substrate promiscuity of a de novo designed peroxidase. Journal of Inorganic Biochemistry, 217. https://doi.org/10.1016/J.JINORGBIO.2021.111370
- Lombardi, A., Pirro, F., Maglio, O., Chino, M., & DeGrado, W. F. (2019). De Novo Design of Four-Helix Bundle Metalloproteins: One Scaffold, Diverse Reactivities. Accounts of Chemical Research, 52(5), 1148. https://doi.org/10.1021/ACS.ACCOUNTS.8B00674
- Niitsu, A., & Sugita, Y. (2023). Towards de novo design of transmembrane α-helical assemblies using structural modelling and molecular dynamics simulation. Physical Chemistry Chemical Physics, 25(5). https://doi.org/10.1039/D2CP03972A
- Kuhlman, B., & Bradley, P. (2019). Advances in protein structure prediction and design. Nature Reviews Molecular Cell Biology, 20(11), 681–697. https://doi.org/10.1038/s41580-019-0163-x
- Dawson, W. M., Rhys, G. G., & Woolfson, D. N. (2019). Towards functional de novo designed proteins. Current Opinion in Chemical Biology, 52, 102–111. https://doi.org/10.1016/J.CBPA.2019.06.011
- Khoury, G. A., Smadbeck, J., Kieslich, C. A., & Floudas, C. A. (2014). Protein folding and de novo protein design for biotechnological applications. Trends in Biotechnology, 32(2), 99. https://doi.org/10.1016/J.TIBTECH.2013.10.008
- Qing, R., Hao, S., Smorodina, E., Jin, D., Zalevsky, A., & Zhang, S. (2022). Protein Design: From the Aspect of Water Solubility and Stability. Chemical Reviews, 122(18), 14085. https://doi.org/10.1021/ACS.CHEMREV.1C00757
- Liu, X., IJzerman, A. P., & van Westen, G. J. P. (2021). Computational Approaches for De Novo Drug Design: Past, Present, and Future. Methods in Molecular Biology, 2190, 139–165. https://doi.org/10.1007/978-1-0716-0826-5_6
- Meyers, J., Fabian, B., & Brown, N. (2021). De novo molecular design and generative models. Drug Discovery Today, 26(11), 2707–2715. https://doi.org/10.1016/J.DRUDIS.2021.05.019
- Zhou, W., Šmidlehner, T., & Jerala, R. (2020). Synthetic biology principles for the design of protein with novel structures and functions. FEBS Letters, 594(14), 2199–2212. https://doi.org/10.1002/1873-3468.13796
- Li, Y., & Champion, J. A. (2022). Self-assembling nanocarriers from engineered proteins: Design, functionalization, and application for drug delivery. Advanced Drug Delivery Reviews, 189, 114462. https://doi.org/10.1016/J.ADDR.2022.114462
- Grayson, K. J., & Anderson, J. L. R. (2018). Designed for life: biocompatible de novo designed proteins and components. Journal of the Royal Society Interface, 15(145). https://doi.org/10.1098/RSIF.2018.0472
- Sesterhenn, F., Yang, C., Bonet, J., Cramer, J. T., Wen, X., Wang, Y., Chiang, C. I., Abriata, L. A., Kucharska, I., Castoro, G., Vollers, S. S., Galloux, M., Dheilly, E., Rosset, S., Corthésy, P., Georgeon, S., Villard, M., Richard, C. A., Descamps, D., … Correia, B. E. (2020). De novo protein design enables the precise induction of RSV-neutralizing antibodies. Science, 368(6492). https://doi.org/10.1126/SCIENCE.AAY5051
- Correia, B. E., Bates, J. T., Loomis, R. J., Baneyx, G., Carrico, C., Jardine, J. G., Rupert, P., Correnti, C., Kalyuzhniy, O., Vittal, V., Connell, M. J., Stevens, E., Schroeter, A., Chen, M., MacPherson, S., Serra, A. M., Adachi, Y., Holmes, M. A., Li, Y., … Schief, W. R. (2014). Proof of principle for epitope-focused vaccine design. Nature, 507(7491), 201–206. https://doi.org/10.1038/nature12966
- Cao, L., Goreshnik, I., Coventry, B., Case, J. B., Miller, L., Kozodoy, L., Chen, R. E., Carter, L., Walls, A. C., Park, Y. J., Strauch, E. M., Stewart, L., Diamond, M. S., Veesler, D., & Baker, D. (2020). De novo design of picomolar SARS-CoV-2 miniprotein inhibitors. Science, 370(6515). https://doi.org/10.1126/SCIENCE.ABD9909
- Chevalier, A., Silva, D. A., Rocklin, G. J., Hicks, D. R., Vergara, R., Murapa, P., Bernard, S. M., Zhang, L., Lam, K. H., Yao, G., Bahl, C. D., Miyashita, S. I., Goreshnik, I., Fuller, J. T., Koday, M. T., Jenkins, C. M., Colvin, T., Carter, L., Bohn, A., … Baker, D. (2017). Massively parallel de novo protein design for targeted therapeutics. Nature, 550(7674), 74–79. https://doi.org/10.1038/nature23912
- Silva, D. A., Yu, S., Ulge, U. Y., Spangler, J. B., Jude, K. M., Labão-Almeida, C., Ali, L. R., Quijano-Rubio, A., Ruterbusch, M., Leung, I., Biary, T., Crowley, S. J., Marcos, E., Walkey, C. D., Weitzner, B. D., Pardo-Avila, F., Castellanos, J., Carter, L., Stewart, L., … Baker, D. (2019). De novo design of potent and selective mimics of IL-2 and IL-15. Nature, 565(7738), 186–191. https://doi.org/10.1038/s41586-018-0830-7
- Park, H., Lee, S., & Hong, S. (2015). Structure-based de novo design and synthesis of aminothiazole-based p38 MAP kinase inhibitors. Bioorganic & Medicinal Chemistry Letters, 25(18), 3784–3787. https://doi.org/10.1016/J.BMCL.2015.07.094
- Smadbeck, J., Peterson, M. B., Zee, B. M., Garapaty, S., Mago, A., Lee, C., Giannis, A., Trojer, P., Garcia, B. A., & Floudas, C. A. (2014). De Novo Peptide Design and Experimental Validation of Histone Methyltransferase Inhibitors. PLoS ONE, 9(2), e90095. https://doi.org/10.1371/JOURNAL.PONE.0090095
- Liu, W. S., Jin, W. Y., Zhou, L., Lu, X. H., Li, W. Y., Ma, Y., & Wang, R. L. (2019). Structure based design of selective SHP2 inhibitors by De novo design, synthesis and biological evaluation. Journal of Computer-Aided Molecular Design, 33(8), 759–774. https://doi.org/10.1007/S10822-019-00213-Z
- Arya, H., & Coumar, M. S. (2020). Design of novel ROCK inhibitors using fragment-based de novo drug design approach. Journal of Molecular Modeling, 26(9). https://doi.org/10.1007/S00894-020-04493-3
- Zeng, P., Cheng, Q., Yi, L., Shui Yee Leung, S., Chen, S., Chan, K. F., & Wong, K. Y. (2023). C-terminal modification of a de novo designed antimicrobial peptide via capping of macrolactam rings. Bioorganic Chemistry, 130, 106251. https://doi.org/10.1016/J.BIOORG.2022.106251
- Lajoie, M. J., Boyken, S. E., Salter, A. I., Bruffey, J., Rajan, A., Langan, R. A., Olshefsky, A., Muhunthan, V., Bick, M. J., Gewe, M., Quijano-Rubio, A., Johnson, J. L., Lenz, G., Nguyen, A., Pun, S., Correnti, C. E., Riddell, S. R., & Baker, D. (2020). Designed protein logic to target cells with precise combinations of surface antigens. Science, 369(6511), 1637. https://doi.org/10.1126/SCIENCE.ABA6527
- Xie, M., & Lu, P. (2020). When de novo-designed protein logics meet CAR-T therapies. Cell Research, 30(11), 946. https://doi.org/10.1038/S41422-020-00419-Z