ModFinder: Mapping Post-Translational Modifications (PTMs) in Protein Structures from PDB
Protein structures are being determined at a rapid pace using experimental techniques such as X-ray crystallography, NMR spectroscopy, and cryo-EM. These experimentally determined structures are continuously deposited in the Protein Data Bank (PDB), which is publicly available at: https://www.rcsb.org
The PDB is one of the most valuable resources in structural biology. Researchers worldwide use PDB structures for:
- Molecular docking studies
- Drug discovery workflows
- Protein engineering and mutagenesis
- Structural comparisons and homology modeling
- Protein-ligand interaction analysis
- Functional and evolutionary studies
However, one key aspect is often overlooked in basic structural studies:
Proteins are not static molecules — they are chemically modified in living cells.
What are Post-Translational Modifications (PTMs)?
Proteins undergo multiple chemical modifications that can drastically change their behavior, function, and interaction patterns. These modifications are commonly known as:
Post-Translational Modifications (PTMs)
PTMs are dynamic processes that occur:
- during translation (co-translational modifications)
- after protein synthesis
These modifications influence protein stability, localization, signaling behavior, and biological function.
Common Types of Protein Modifications
Some of the most widely studied PTMs include:
- Phosphorylation (common in signaling pathways)
- Glycosylation (important in membrane proteins and immune recognition)
- Acetylation (often linked with transcription regulation)
- Sulfation
- Methylation
- Ubiquitination
Many of these modifications can act like a molecular switch that turns protein activity on or off.
Why PTM Mapping is Important in Bioinformatics
Studying PTMs is extremely important because:
- PTMs are linked with disease mechanisms (cancer, neurodegeneration, immune disorders)
- Many PTMs regulate enzymatic activity
- PTMs affect binding pockets and docking results
- PTMs influence protein folding and stability
- PTMs determine cellular localization of proteins
In short:
If you ignore PTMs, you may miss the real biological function of a protein.
Where are PTM Annotations Stored?
To help researchers, PTM information is carefully curated and stored in databases.
The most widely used database for protein annotations is:
UniProtKB
UniProtKB provides PTM annotations based on:
- scientific literature evidence
- experimental structural evidence
- homology-based inference
- curated annotations from specialized databases
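As a rough illustration of what such annotations look like in practice, the Python sketch below pulls PTM-type features out of a UniProtKB-style JSON record. The record is a made-up example (the accession P00000, the positions, and the descriptions are hypothetical), and the field names ("features", "type", "location") reflect my understanding of the UniProt REST JSON layout, so check them against the current UniProt documentation before relying on them.

```python
import json

# Hypothetical record, shaped like a UniProtKB JSON entry (not real data).
SAMPLE_ENTRY = json.loads("""
{
  "primaryAccession": "P00000",
  "features": [
    {"type": "Modified residue",
     "location": {"start": {"value": 15}, "end": {"value": 15}},
     "description": "Phosphoserine"},
    {"type": "Domain",
     "location": {"start": {"value": 20}, "end": {"value": 120}},
     "description": "Example domain"},
    {"type": "Glycosylation",
     "location": {"start": {"value": 42}, "end": {"value": 42}},
     "description": "N-linked (GlcNAc...) asparagine"}
  ]
}
""")

# Feature types treated as PTM annotations in this sketch (an assumption).
PTM_FEATURE_TYPES = {"Modified residue", "Glycosylation", "Lipidation",
                     "Cross-link", "Disulfide bond"}


def ptm_features(entry):
    """Return (type, start, end, description) for each PTM-type feature."""
    hits = []
    for feature in entry.get("features", []):
        if feature.get("type") not in PTM_FEATURE_TYPES:
            continue  # skip domains, regions, and other non-PTM features
        location = feature["location"]
        hits.append((feature["type"],
                     location["start"]["value"],
                     location["end"]["value"],
                     feature.get("description", "")))
    return hits


for ptm in ptm_features(SAMPLE_ENTRY):
    print(ptm)
```

The positions returned here are sequence positions, which is exactly why a separate step is needed to relate them to residues in a 3D structure.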
Other Major PTM Databases
Several specialized databases also exist to store PTM-related data, including:
RESID
A database dedicated to known protein modifications.
PSI-MOD Ontology
A standardized ontology for representing protein modifications in biological data.
Unimod
Widely used in mass spectrometry workflows for protein modification identification.
These databases provide valuable resources, but one problem remains:
PTMs are often difficult to visualize directly on protein 3D structures.
This is where a powerful tool called ModFinder becomes highly useful.
ModFinder: A Breakthrough Tool for PTM Mapping in PDB Structures
In 2017, Gao et al. introduced a software tool designed specifically to identify PTMs in protein structures:
BioJava-ModFinder (ModFinder)
ModFinder is a BioJava module developed to automatically detect and annotate protein modifications directly from 3D protein structures available in the PDB.
🔗 GitHub Source: https://github.com/biojava/biojava/tree/master/biojava-modfinder
How ModFinder Works
ModFinder analyzes protein structures in the PDB and identifies modified residues by detecting:
- non-standard amino acids
- chemically altered residues
- modification patterns recognized in structural coordinates
It can map modifications such as phosphorylation or glycosylation directly onto the protein structure.
This is extremely valuable because it bridges the gap between:
✅ sequence-level PTM annotation
and
✅ structure-level PTM visualization
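To make the idea of spotting modified residues in a structure file more concrete, here is a minimal Python sketch. It is a much simpler stand-in and not ModFinder's algorithm: it only reads the MODRES records that legacy PDB-format files use to declare modified residues, assuming the standard fixed-column layout. The entry ID "XXXX" and the residue numbers are placeholders, and the helper names are hypothetical.

```python
from collections import namedtuple

ModifiedResidue = namedtuple(
    "ModifiedResidue", "chain resseq modified_name standard_name comment")


def modres_line(pdb_id, mod_name, chain, resseq, std_name, comment):
    """Build a fixed-column MODRES record; used only to create demo input."""
    return (f"MODRES {pdb_id:<4} {mod_name:>3} {chain:1} {resseq:>4}  "
            f"{std_name:>3}  {comment}")


def parse_modres(pdb_text):
    """Collect modified residues declared in MODRES records of PDB-format text."""
    found = []
    for line in pdb_text.splitlines():
        if not line.startswith("MODRES"):
            continue
        line = line.ljust(80)  # pad so fixed-column slices never run short
        found.append(ModifiedResidue(
            chain=line[16],                      # column 17
            resseq=int(line[18:22]),             # columns 19-22
            modified_name=line[12:15].strip(),   # columns 13-15
            standard_name=line[24:27].strip(),   # columns 25-27
            comment=line[29:70].strip(),         # columns 30-70
        ))
    return found


# Placeholder input: SEP and PTR are the component IDs for phosphoserine
# and phosphotyrosine; the entry ID and positions are invented.
demo = "\n".join([
    modres_line("XXXX", "SEP", "A", 10, "SER", "PHOSPHOSERINE"),
    modres_line("XXXX", "PTR", "B", 205, "TYR", "O-PHOSPHOTYROSINE"),
])
for residue in parse_modres(demo):
    print(residue)
```

A header-based scan like this depends on the depositor having declared the modification, whereas the approach described above works from the residues and coordinates themselves.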
Weekly Updates: How RCSB PDB Uses ModFinder
One of the strongest points of ModFinder is that it is not just a research prototype.
It is integrated into the PDB update process.
What happens every week?
- PDB releases new structures
- ModFinder is run automatically
- Newly detected modifications are extracted
- These PTMs are loaded as annotations into the RCSB PDB database
This means researchers can access updated PTM information regularly without manually checking every structure.
How to Search PTMs in RCSB PDB
The best feature is that these modifications can be searched directly through the RCSB interface.
You can find PTM annotations using:
📌 Advanced Search → Sequence Features
This makes PTM identification easier even for beginners.
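For scripted access, RCSB also offers a Search API alongside the web interface. The Python sketch below only builds a query payload and does not send it. The attribute name is my assumption about the RCSB search schema for non-standard (modified) monomers and should be verified against the current schema. SEP (phosphoserine) is just an example component ID, and the function name is hypothetical.

```python
import json

SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"

# Assumed attribute for non-standard monomers in a polymer entity;
# verify against the current RCSB search schema before use.
ASSUMED_ATTRIBUTE = (
    "rcsb_polymer_entity_container_identifiers.chem_comp_nstd_monomers")


def build_modified_residue_query(comp_id, rows=10):
    """Build a Search API payload for entries containing a given component ID."""
    return {
        "query": {
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": ASSUMED_ATTRIBUTE,
                "operator": "exact_match",
                "value": comp_id,
            },
        },
        "return_type": "entry",
        "request_options": {"paginate": {"start": 0, "rows": rows}},
    }


payload = build_modified_residue_query("SEP")
print(json.dumps(payload, indent=2))
```

If the attribute checks out, the payload could be sent as a JSON POST body to SEARCH_URL with any HTTP client.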
Why ModFinder is a Big Deal in Structural Bioinformatics
ModFinder offers three major advantages:
1. Identification
It automatically detects modified residues in protein structures.
2. Annotation
It integrates modification annotations into PDB records.
3. Visualization
Researchers can trace modifications directly inside the 3D structure view.
This makes ModFinder highly valuable for:
- docking studies involving modified residues
- understanding protein regulation
- drug design pipelines
- PTM-driven disease research
Future Possibility: Similar Annotation Mapping for DNA or Gene Sequences?
ModFinder maps PTMs in protein structures, and PTMs themselves are specific to proteins. A possible future direction is extending similar automated annotation mapping systems to DNA and genomic sequence datasets.
This could be extremely useful for automated functional annotation pipelines in genomics.
Final Conclusion
ModFinder is a powerful addition to modern bioinformatics because it helps scientists identify and visualize post-translational modifications directly within protein 3D structures from the PDB.
Since PTMs are essential for understanding real biological function, tools like ModFinder strengthen research accuracy in:
- structural bioinformatics
- drug discovery
- protein engineering
- disease biology
If you're analyzing protein structures, ModFinder is a tool worth knowing.
References
- Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., & Bourne, P. E. (2000). The Protein Data Bank. Nucleic Acids Research, 28, 235–242.
- Farriol-Mathis, N., Garavelli, J. S., Boeckmann, B., Duvaud, S., Gasteiger, E., Gateau, A., … & Bairoch, A. (2004). Annotation of post-translational modifications in the Swiss-Prot knowledge base. Proteomics, 4(6), 1537–1550.
- Apweiler, R., Bairoch, A., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., … & Martin, M. J. (2004). UniProt: The universal protein knowledgebase. Nucleic Acids Research, 32(suppl 1), D115–D119.
- Garavelli, J. S. (2004). The RESID Database of protein modifications as a resource and annotation tool. Proteomics, 4, 1527–1533.
- Montecchi-Palazzi, L., et al. (2008). The PSI-MOD community standard for representation of protein modification data. Nature Biotechnology, 26, 864–866.
- Creasy, D. M., & Cottrell, J. S. (2004). Unimod: Protein modifications for mass spectrometry. Proteomics, 4(6), 1534–1536.
- Gao, J., Prlić, A., Bi, C., Bluhm, W. F., Dimitropoulos, D., Xu, D., Bourne, P. E., & Rose, P. W. (2017). BioJava-ModFinder: Identification of protein modifications in 3-D structures from the Protein Data Bank. Bioinformatics. https://doi.org/10.1093/bioinformatics/btx101
- Prlić, A., et al. (2012). BioJava: An open-source framework for bioinformatics in 2012. Bioinformatics, 28, 2693–2695.
- Rose, P. W., et al. (2011). The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Research, 39, D392–D401.
- Rose, P. W., et al. (2015). The RCSB Protein Data Bank: views of structural biology for basic and applied research and education. Nucleic Acids Research, 43, D345–D356.


