Protein modeling: Unlocking the secrets of the building blocks of life.
Proteins drive life and are essential to nearly every biological process. They speed up chemical reactions in the body as enzymes, and they provide structural support inside cells. But here is the tricky part: a protein’s function depends on its 3D structure, which can be challenging to determine. Today it is relatively easy for scientists to determine the amino acid sequence of a protein, but working out how that sequence folds into a 3D structure is much harder.
Curious minds are always eager to find keys to unknown locks, and scientists have found ways to look into the 3D structures of proteins. This is where protein modelling steps in: it offers computational techniques to predict and examine protein structures without having to rely on laboratory methods.
What is Protein Modelling?
Imagine you are looking at the unassembled pieces of a puzzle, trying to figure out how they will come together. Protein modelling works the same way: it helps scientists predict how a protein folds, how it works, how it interacts with other molecules, and what tasks it performs.
There are several computational methods; we are going to discuss the most popular ones.
The Significance of Protein Modelling:
So why do we care about a protein’s structure and shape?
Here is why:
- Drug discovery: Misfolded proteins are a major cause of some diseases. If we know the structure of a protein, we can design drugs that bind to it in a specific way, blocking harmful activities and promoting beneficial ones. A drug needs to fit its target protein the way a key fits a lock, so a model of the protein’s structure helps us design target-specific drugs that influence the protein’s activity.
- Understanding diseases: Studying protein structures through modelling helps us understand diseases that result from misfolded proteins, such as Alzheimer’s disease.
- Biotechnology: Protein structures can be modified to create more efficient enzymes for preventing disease and for industrial applications, such as breaking down waste to make biofuels.
Protein Modelling Methods and Algorithms
There are three main protein modelling methods that scientists use most often.
1. Homology Modelling (Comparative Modelling)
This is generally the most reliable and easiest method to use, provided a suitable template is available. Homology modelling predicts the structure of a protein by using the known structure of a related protein as a template. Because proteins with similar sequences tend to have similar structures, we can use a kind of “copy and paste” technique to model the new structure.
How does it work?
- First, find a template: search for proteins of known structure that are related to your target sequence, using the PDB or NCBI web servers.
- Choose a template with at least 40% sequence identity to the target.
- Align the target and template sequences and build the model using SWISS-MODEL.
- Once the model is ready, evaluate its quality and structure.
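The template check in the steps above can be sketched in a few lines of Python. This is only an illustration: the function names `percent_identity` and `is_usable_template` and the two aligned sequences are made up for this example, and identity is computed here over non-gap columns, which is just one of several conventions in use. Tools such as SWISS-MODEL report their own identity values.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identity over the non-gap columns of a pairwise alignment.

    Both inputs are assumed to be already aligned, with '-' marking gaps.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def is_usable_template(aligned_target, aligned_template, threshold=40.0):
    """Apply the 40% rule of thumb from the text (the threshold is adjustable)."""
    return percent_identity(aligned_target, aligned_template) >= threshold


# Hypothetical aligned fragments, for illustration only.
target = "MKT-AYIAKQR"
template = "MKTWAYLAKQK"
print(percent_identity(target, template))    # 8 matches in 10 compared columns -> 80.0
print(is_usable_template(target, template))  # True
```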
2. Ab Initio Modelling (De Novo Modelling)
Ab initio is a Latin phrase that literally means “from the beginning”. As the name suggests, the protein structure is predicted purely from its amino acid sequence. No reference structure is needed; the method relies solely on chemical and physical principles to predict the 3D structure.
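To get a feel for what “predicting from physical rules alone” means, here is a toy sketch based on the classic HP lattice model, a teaching simplification and not the method used by real servers such as Robetta or I-TASSER. Each residue is labelled hydrophobic (H) or polar (P), the chain is laid out on a 2D grid, and every pair of H residues that touch without being chain neighbours lowers the energy by 1. The function names and the sequence are assumptions made for this illustration. The code tries every possible fold and keeps the one with the lowest energy.

```python
from itertools import product

# Unit moves on a 2D square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_coordinates(moves):
    """Place residues on the lattice; return None if the chain crosses itself."""
    x, y = 0, 0
    coords = [(x, y)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = x + dx, y + dy
        if (x, y) in coords:
            return None
        coords.append((x, y))
    return coords


def hp_energy(sequence, coords):
    """Score -1 for each pair of H residues that touch without being chain neighbours."""
    index_at = {coord: i for i, coord in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        # Look only right and up so that each contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is None or abs(i - j) == 1:
                continue
            if sequence[i] == "H" and sequence[j] == "H":
                energy -= 1
    return energy


def best_fold(sequence):
    """Exhaustively search all self-avoiding folds; return (energy, moves)."""
    best_energy, best_moves = None, None
    for moves in product("UDLR", repeat=len(sequence) - 1):
        coords = fold_coordinates(moves)
        if coords is None:
            continue
        energy = hp_energy(sequence, coords)
        if best_energy is None or energy < best_energy:
            best_energy, best_moves = energy, "".join(moves)
    return best_energy, best_moves


# For "HPPH" the lowest energy is -1: the chain bends so the two H ends touch.
energy, moves = best_fold("HPPH")
print(energy, moves)
```

Exhaustive search like this only works for very short chains, because the number of possible folds grows exponentially with length. That is a large part of why real ab initio prediction is computationally expensive.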
How does it work?
- Retrieve the desired amino acid sequence from NCBI or UniProt.
- Upload the query sequence to a web server that offers ab initio modelling, such as I-TASSER or Robetta.
- The server returns a predicted model of the protein.
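Before uploading a query sequence, it can help to check that the FASTA text is well formed and contains only the 20 standard amino acid letters. The sketch below is a minimal example under stated assumptions: the helper names `read_fasta_record` and `unexpected_residues` and the example record are hypothetical, and individual servers may accept additional symbols (such as X for an unknown residue) or impose their own length limits.

```python
# One-letter codes of the 20 standard amino acids.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def read_fasta_record(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header = None
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second header starts the next record; stop here
            header = line[1:]
        elif header is not None:
            parts.append(line.upper())
    if header is None:
        raise ValueError("no FASTA header line (starting with '>') found")
    return header, "".join(parts)


def unexpected_residues(sequence):
    """List any characters that are not standard amino acid codes."""
    return sorted(set(sequence) - VALID_RESIDUES)


# A made-up record, for illustration only.
fasta_text = """>example_protein hypothetical
MKTAYIAKQR
QISFVKSHFS
"""
header, sequence = read_fasta_record(fasta_text)
print(header)                         # example_protein hypothetical
print(len(sequence))                  # 20
print(unexpected_residues(sequence))  # []
```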
3. Threading (Fold recognition)
This method takes the amino acid sequence of the target protein and compares it against the known structures in the databases. Even if the sequences differ, compatible folds are detected and scored, and the best-matching fold is used to build the 3D structure.
Note that structural similarity may exist even when sequence similarity is low.
How does it work?
- First, obtain the protein’s amino acid sequence in FASTA format from a database such as NCBI or UniProt.
- Submit the sequence to software or a web server that supports threading, such as I-TASSER, Phyre2 or HHpred, which scans a database of known protein structures.
- It will return a list of protein structures that align with the query sequence.
- Select a template with a low E-value, high probability and good alignment.
- Go to SWISS-MODEL, upload the FASTA sequence of the protein and build the 3D structure from the selected template.
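Choosing a template with a low E-value, high probability and good alignment can be expressed as a simple filter and sort. The sketch below assumes the hit list has already been copied out of the server’s results page. The `TemplateHit` record, the template names, the numbers and the cut-off values are all hypothetical and chosen only to show the idea; suitable thresholds depend on the tool and the protein.

```python
from dataclasses import dataclass


@dataclass
class TemplateHit:
    template_id: str    # hypothetical identifier of the template structure
    e_value: float      # lower is better
    probability: float  # percent, higher is better
    coverage: float     # fraction of the query covered by the alignment


def rank_hits(hits, max_e_value=1e-3, min_probability=90.0, min_coverage=0.7):
    """Keep hits that pass all three cut-offs, best first.

    The default cut-offs are illustrative assumptions, not recommended values.
    """
    kept = [
        h for h in hits
        if h.e_value <= max_e_value
        and h.probability >= min_probability
        and h.coverage >= min_coverage
    ]
    # Sort by E-value first, then by higher probability and higher coverage.
    return sorted(kept, key=lambda h: (h.e_value, -h.probability, -h.coverage))


# Made-up hits, for illustration only.
hits = [
    TemplateHit("tmplB", 1e-5, 92.0, 0.60),   # fails the coverage cut-off
    TemplateHit("tmplD", 1e-12, 98.0, 0.85),
    TemplateHit("tmplC", 0.5, 40.0, 0.90),    # fails E-value and probability
    TemplateHit("tmplA", 1e-30, 99.8, 0.95),
]
for hit in rank_hits(hits):
    print(hit.template_id, hit.e_value, hit.probability, hit.coverage)
```

With these example values, only `tmplA` and `tmplD` pass the filter, and `tmplA` is ranked first because it has the lowest E-value.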
_____________________________________________________________________________________
Conclusion:
Drug development and disease research have already benefited greatly from protein modeling, and its potential is far from exhausted. Advances in machine learning and artificial intelligence have made modeling tools more accurate and efficient. This translates into faster drug development, more effective treatments, and new biotechnological innovations.
Protein modeling is one of the most important tools in computational biology. As we continue to unravel the mysteries of protein structure, the potential to improve human health and advance scientific research is immense.
References:
Blundell, T., Carney, D., Gardner, S., Hayes, F., Howlin, B., Hubbard, T., Overington, J., Singh, D. A., Sibanda, B. L., & Sutcliffe, M. (1988). Knowledge‐based protein modelling and design. European Journal of Biochemistry, 172(3), 513–520. https://doi.org/10.1111/j.1432-1033.1988.tb13917.x
Muhammed, M. T., & Aki‐Yalcin, E. (2018). Homology modeling in drug discovery: Overview, current applications, and future perspectives. Chemical Biology & Drug Design, 93(1), 12–20. https://doi.org/10.1111/cbdd.13388
Schwede, T. (2003). SWISS-MODEL: an automated protein homology-modeling server. Nucleic Acids Research, 31(13), 3381–3385. https://doi.org/10.1093/nar/gkg520
Xu, D., Zhang, J., Roy, A., & Zhang, Y. (2011). Automated protein structure modeling in CASP9 by I‐TASSER pipeline combined with QUARK‐based ab initio folding and FG‐MD‐based structure refinement. Proteins Structure Function and Bioinformatics, 79(S10), 147–160. https://doi.org/10.1002/prot.2311
Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., & Zhang, Y. (2014). The I-TASSER Suite: protein structure and function prediction. Nature Methods, 12(1), 7–8. https://doi.org/10.1038/nmeth.3213
Zhang, H., & Shen, Y. (2020). Template-based prediction of protein structure with deep learning. BMC Genomics, 21(S11). https://doi.org/10.1186/s12864-020-07249-8
Peitsch, M. C. (1996). ProMod and Swiss-Model: Internet-based Tools for Automated Comparative Protein Modelling. https://www.infona.pl/resource/bwmeta1.element.elsevier-eb84b2af-4fe2-37b9-8204-e401c3d3aed2