Genomic signal processing: An introduction

After my biophysics class, I became very interested in biomolecular programming. After some googling, I found a paper, and I want to share some quotes from it.

You can find the paper here: https://ieeexplore.ieee.org/document/939833/

Here are some notes from it:

Genomic information is digital in a very real sense; it is represented in the form of sequences of which each element can be one out of a finite number of entities. Such sequences, like DNA and proteins, have been mathematically represented by character strings, in which each character is a letter of an alphabet. In the case of DNA, the alphabet is size 4 and consists of the letters A, T, C and G; in the case of proteins, the size of the corresponding alphabet is 20.

Genomic signal processing, Dimitris Anastassiou

The main reason that the field of signal processing does not yet have significant impact in the field is because it deals with numerical sequences rather than character strings. However, if we properly map a character string into one or more numerical sequences, then digital signal processing (DSP) provides a set of novel and useful tools for solving highly relevant problems.

Genomic signal processing, Dimitris Anastassiou
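
To make the idea of mapping a character string into numerical sequences concrete, here is a minimal Python sketch. It assumes one simple choice of mapping: four binary "indicator" sequences, one per letter, each holding 1 where that letter occurs and 0 elsewhere. The function name `indicator_sequences` and the example string are my own and are only for illustration.

```python
def indicator_sequences(dna):
    """Map a DNA string to four binary sequences, one per letter.

    Assumption: the input contains only the letters A, T, C and G.
    """
    dna = dna.upper()
    # For each letter, mark the positions where it occurs with 1.
    return {base: [1 if ch == base else 0 for ch in dna] for base in "ATCG"}


seqs = indicator_sequences("ATGCA")
for base, values in seqs.items():
    print(base, values)
```

Each position of the string has exactly one letter, so the four sequences add up to 1 at every position. Once the string is in this numerical form, ordinary DSP tools can be applied to it.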

Protein molecules tend to fold into complex three-dimensional (3-D) structures forming weak bonds between their own atoms, and they are responsible for carrying out nearly all of the essential functions in the living cell by properly binding to other molecules with a number of chemical bonds connecting neighboring atoms. 

Genomic signal processing, Dimitris Anastassiou

A particular triplet, ATG, serves as the START codon and it also codes for the M amino acid (methionine); thus, methionine appears as the first amino acid of proteins, but it may also appear in other locations. We also see that there are three STOP codons (TAA, TAG, TGA) indicating termination of amino acid chain synthesis, and the last amino acid is the one generated by the codon preceding the STOP codon.
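
The START and STOP rules above can be sketched in a few lines of Python. This is a deliberately simplified illustration: it assumes the coding region begins at the first ATG in the string and reads triplets from there until the first STOP codon, which is not how real gene finding works. The function name `find_coding_region` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna):
    """Return the codons from the first ATG up to, but not including,
    the first in-frame STOP codon.

    Returns None if there is no ATG or no in-frame STOP codon.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None

    codons = []
    # Step through the sequence three letters at a time, starting at ATG.
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    return None  # ran off the end without meeting a STOP codon


print(find_coding_region("CCATGGCTTGGTAACC"))
```

For this made-up sequence, the result starts with ATG (methionine) and ends with TGG, the codon just before the STOP codon TAA.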

In a DNA sequence of length N, assume that we assign the numbers a, t, c, g to the characters A, T, C, G, respectively. A proper choice of the numbers a, t, c and g can provide potentially useful properties to the numerical sequence x[n].
For example, if we choose complex conjugate pairs t = a* and g = c*, then the complementary DNA strand is represented by

x̃[n] = x*[−n + N − 1],    n = 0, 1, …, N − 1
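
The relation above can be checked numerically with a short Python sketch. Here x[n] is taken to be the number assigned to the character at position n. The specific values below are an illustrative assumption; any choice with t = a* and g = c* should behave the same way. The complementary strand is built by swapping A with T and C with G and reading the result in the opposite direction. The names `VALUES`, `to_numeric` and `complementary_strand` are mine.

```python
# Illustrative assignment satisfying t = a* and g = c* (an assumption).
VALUES = {"A": 1 + 1j, "T": 1 - 1j, "C": -1 - 1j, "G": -1 + 1j}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def to_numeric(dna):
    """Map each character of a DNA string to its complex number."""
    return [VALUES[ch] for ch in dna.upper()]


def complementary_strand(dna):
    """Complement every base and reverse the reading direction."""
    return "".join(COMPLEMENT[ch] for ch in reversed(dna.upper()))


dna = "ATGGCA"
x = to_numeric(dna)
N = len(x)

# Left-hand side: numerical sequence of the complementary strand.
x_tilde = to_numeric(complementary_strand(dna))

# Right-hand side: conjugate of x, read backwards, i.e. x*[-n + N - 1].
predicted = [x[N - 1 - n].conjugate() for n in range(N)]

print(x_tilde == predicted)
```

The conjugate takes care of swapping each base with its partner, and the index −n + N − 1 takes care of reversing the reading direction.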