Decoding the Blueprint of Life for Healthier Future


A Bioinformatics Tool: Pairwise Sequence Alignment

Three types of biomolecular sequences are crucial to life. Together they decide the fate of a cell and of the organism. The first is the DNA sequence, which determines the genetic makeup of an organism. It is the basis of heredity and gives shape to life. The monomers of DNA are nucleotides, which differ in the nitrogenous base they contain: A, T, G, or C. The second is the RNA sequence, which carries the genetic code and takes part in protein synthesis. The third is the protein sequence, a chain of amino acids that keeps the cell functioning properly and therefore alive. These biological sequences dictate the structure of proteins, and the structure of a protein defines its function (Jackson & Aluru, 2005).

Sequence Alignment 

Sequence alignment is a computational method for arranging two or more DNA, RNA, or protein sequences so that their similar regions line up.

Need for aligning DNA sequences:

Evolution proceeds through insertions, deletions, and substitutions of one or more nucleotides in a DNA sequence. These small changes accumulate and give rise to variation among species. That is why we need sequence alignment algorithms to determine the similarities and differences between two or more DNA sequences. By aligning sequences, we can infer phylogenetic relationships.

Need for aligning protein sequences:

  • To identify conserved regions. Stretches of a protein sequence that stay conserved across an alignment are often responsible for a specific function, such as binding sites and active sites.

  • To predict the function of a protein. Aligning a protein of unknown function with a protein of known function gives an idea of what it does, provided the two sequences are similar.

Types of sequence alignment:

Depending on the number of sequences we align, sequence alignment is of two types:

PAIRWISE SEQUENCE ALIGNMENT (PSA): This method aligns two sequences (DNA, RNA, or protein).

MULTIPLE SEQUENCE ALIGNMENT (MSA): This method aligns more than two sequences.

Pairwise sequence alignment:

Pairwise sequence alignment algorithms use various methods to align two sequences. One of them is dynamic programming.

Dynamic programming:

Dynamic programming is a computational problem-solving approach used widely in bioinformatics, including in many pairwise sequence alignment algorithms. It finds an optimal solution in two parts. In the first part, it breaks the problem into smaller, overlapping subproblems and solves each one, storing the result so that it can be reused instead of recomputed. In this way it covers all possible solutions without having to list each one separately. In the second part, it combines the stored results and selects the optimal answer (Giegerich, 2000).
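To make the idea of storing and reusing subproblem results concrete, here is a minimal Python sketch. It is not taken from the cited references: it uses the longest common subsequence problem, a simpler relative of alignment, and the function name `lcs_length` is only an illustrative choice.

```python
from functools import lru_cache


def lcs_length(a: str, b: str) -> int:
    """Return the length of the longest common subsequence of a and b."""

    @lru_cache(maxsize=None)  # stores each subproblem result so it is solved once
    def solve(i: int, j: int) -> int:
        # Subproblem: the answer for the prefixes a[:i] and b[:j].
        if i == 0 or j == 0:
            return 0  # an empty prefix shares nothing with anything
        if a[i - 1] == b[j - 1]:
            # Last characters agree: extend the answer for the shorter prefixes.
            return solve(i - 1, j - 1) + 1
        # Otherwise drop one character from either prefix and keep the better result.
        return max(solve(i - 1, j), solve(i, j - 1))

    return solve(len(a), len(b))


print(lcs_length("GATTACA", "GCATGCU"))
```

The recursive form is kept for readability and suits short sequences only. Alignment algorithms usually fill an explicit table instead, which avoids deep recursion.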

Dynamic programming algorithms for pairwise sequence alignments:

Algorithms are sets of instructions that a computer follows to solve a problem, in this case aligning sequences, and produce an output. Algorithms that use a dynamic programming approach to align two sequences are called dynamic programming algorithms. They use scoring matrices and gap penalties to find the best possible alignment. A scoring matrix assigns a value to each pair of aligned residues, depending on whether they match or mismatch. Researchers also set a gap penalty, which is applied wherever an insertion or deletion opens a gap in one of the sequences. The gap penalties are subtracted from the alignment score to give the final score.
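The scoring described above can be sketched in a few lines of Python. This is only an illustration: the function name `score_alignment` and the default values (match +1, mismatch -1, gap penalty 2) are assumptions chosen for the example, not values from the cited references. It scores two sequences that have already been aligned, with `-` marking a gap.

```python
def score_alignment(row_a: str, row_b: str, match: int = 1,
                    mismatch: int = -1, gap_penalty: int = 2) -> int:
    """Score two already-aligned rows of equal length ('-' marks a gap)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    score = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            score -= gap_penalty   # insertion or deletion: subtract the penalty
        elif x == y:
            score += match         # identical residues
        else:
            score += mismatch      # substitution (mismatch is a negative value)
    return score


print(score_alignment("AC-T", "ACGT"))
```

For protein sequences, the fixed match and mismatch values would normally be replaced by a lookup in a substitution matrix.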

Dynamic programming algorithms for pairwise sequence alignment include:

  • Needleman-Wunsch (NW):

This algorithm was developed in 1970 and performs GLOBAL alignment. In global alignment, the algorithm aligns the whole of both sequences from start to end. It fills a matrix with scores for matched residues, mismatched residues, and gaps. Global alignment is used for finding phylogenetic relationships. For protein sequences, the match and mismatch scores are commonly taken from the PAM or BLOSUM substitution matrices (Chan, 2007).
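A minimal Python sketch of global alignment in the Needleman-Wunsch style is given here for illustration. It assumes a simple scoring scheme (match +1, mismatch -1, and a linear gap penalty of 1) rather than a PAM or BLOSUM matrix, and the function name and parameters are hypothetical choices for this example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap_penalty: int = 1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = -i * gap_penalty   # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = -j * gap_penalty   # b[:j] aligned against gaps only

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),   # align two residues
                              score[i - 1][j] - gap_penalty,      # gap in b
                              score[i][j - 1] - gap_penalty)      # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] - gap_penalty:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


print(needleman_wunsch("ACGT", "AGT"))
```

When several alignments share the best score, this sketch returns only one of them, preferring a residue pair over a gap during traceback.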

  • Smith-Waterman (SW):

It was developed by Smith and Waterman in 1981 and performs LOCAL alignment. In local alignment, the algorithm aligns only the regions of the sequences that are similar. It picks out conserved regions wherever they occur and leaves the non-conserved stretches around them, such as divergent ends, unaligned. The algorithm fills a scoring matrix much like Needleman-Wunsch but scores it slightly differently: any cell that would be negative is set to zero, and the alignment is traced back from the highest-scoring cell rather than from the corner. It is mostly used to identify regions of sequences that have remained conserved over time (Chan, 2007).
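Local alignment in the Smith-Waterman style can be sketched in Python in much the same way. As before, this is an illustration under assumed scoring values (match +2, mismatch -1, linear gap penalty 1), and the function name and parameters are hypothetical rather than taken from the cited references.

```python
def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap_penalty: int = 1):
    """Local alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # The first row and column stay at zero: a local alignment may start anywhere.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(0,                                   # never go negative
                              score[i - 1][j - 1] + pair(i, j),    # align two residues
                              score[i - 1][j] - gap_penalty,       # gap in b
                              score[i][j - 1] - gap_penalty)       # gap in a
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    row_a, row_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + pair(i, j):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] - gap_penalty:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(row_a)), "".join(reversed(row_b))


print(smith_waterman("TTTACGTTT", "GGACGGG"))
```

In this example the two sequences share only the short stretch `ACG`, so the local alignment reports that stretch and ignores the unrelated flanking residues.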

Conclusion:

Pairwise sequence alignment is an important bioinformatics tool that researchers use to find differences and similarities between sequences. As large amounts of gene and protein sequence data are discovered and stored, alignment tools help biologists work out the evolutionary relationships between organisms. Sequence alignment reveals the mutations that have occurred in sequences over time. It also shows which regions have remained conserved during evolution. The conserved regions of protein sequences mostly perform important functions. Moreover, alignment helps researchers design drugs and estimate the functions of novel proteins.

References:

Chan, A. (2007). An analysis of pairwise sequence alignment algorithm complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST. Biochemistry–Final Project.

Giegerich, R. (2000). A systematic approach to dynamic programming in bioinformatics. Bioinformatics, 16(8), 665-677. 

Jackson, B. N., & Aluru, S. (2005). Pairwise sequence alignment. Handbook of computational molecular biology, 1-1. 

 

Where biomolecules meet bytes, I find my curiosity sparked. As a biochemistry student at the University of Punjab, Lahore, navigating the exciting field of bioinformatics, I will share my learning with you.
