Saturday, August 23, 2008

Coding with Class Sequence

Ok… time to get down to business! This time, we’re going to see how to code with class Sequence. To use class Sequence in Python, download the source code file here and remember to import it into your Python source file or interactive session. The following examples assume that you're working at the Python interactive prompt. Input (what you type in...!) and output (what Python throws back at you) have been coloured blue and green respectively to help distinguish them from the post's text. Python keywords are coloured orange.

First, let’s create a Sequence object. To do this, type the following at the Python command prompt (‘>>>’):

>>> my_sequence = Sequence(name = 'Sequence 1', seq = 'This is a sequence')

Notice that the name of the sequence and the actual sequence are written inside quotes (you could use either single or double quotes). Anything written inside quotes is taken by the Python interpreter to be a string.
Now, type

>>> print my_sequence

The Python interpreter should print the following:

>Sequence 1

Ok… so it works! Now let’s try to add two Sequence objects…

>>> another_sequence = Sequence(name = 'Sequence 2',seq = ' of english characters')
>>> joined_sequences = my_sequence + another_sequence
>>> print joined_sequences
>Sequence 1+Sequence 2

Note that the variable ‘joined_sequences’ is also a Sequence object. Adding two Sequence objects creates a new Sequence object whose name reflects the fact that it is the sum of two Sequences. You may then change the name of a Sequence object if you wish:

>>> joined_sequences.setname('New Name')
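To see how addition and renaming like this could work under the hood, here is a minimal sketch. It is not the actual Sequence source: the class name SequenceSketch is made up for illustration, and storing the sequence in upper case is an assumption based on the search results shown later in this post. The print calls are written so they run on both Python 2 and Python 3.

```python
class SequenceSketch(object):
    """Hypothetical stand-in for class Sequence (not the real source code)."""

    def __init__(self, name, seq):
        self.name = name
        # Assumption: the sequence is stored upper-cased.
        self.seq = seq.upper()

    def __add__(self, other):
        # '+' builds a brand-new object; neither operand is modified.
        return SequenceSketch(self.name + '+' + other.name,
                              self.seq + other.seq)

    def setname(self, name):
        self.name = name

    def getname(self):
        return self.name

    def getseq(self):
        return self.seq


first = SequenceSketch(name='Sequence 1', seq='This is a sequence')
second = SequenceSketch(name='Sequence 2', seq=' of english characters')

joined = first + second
print(joined.getname())   # Sequence 1+Sequence 2

joined.setname('New Name')
print(joined.getname())   # New Name
print(first.getname())    # Sequence 1 (the original is untouched)
```

Returning a new object from `__add__` rather than changing the left-hand operand is what lets ‘my_sequence’ keep its own name and sequence after the addition.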

One can also obtain the name or the sequence contained within a Sequence object if one wishes:

>>> name = joined_sequences.getname()
>>> seq = joined_sequences.getseq()
>>> type(name)

Notice that the name and sequence are themselves just Python strings. If you want to get just the 6th letter in the sequence, type:

>>> sixth_char = joined_sequences[5]
>>> print sixth_char

Note that to access the sixth letter we used ‘joined_sequences[5]’. That’s because the first character in a Python string is numbered zero, so the sixth character sits at index 5! We can also search for the first ‘I’ in joined_sequences:

>>> position_first_I = joined_sequences.find(motif = 'I')
>>> print position_first_I.start()
>>> print position_first_I.end()
>>> print position_first_I.span()
(2, 3)

The ‘find’ function can find not only single characters but entire substrings!

>>> pos_subs = joined_sequences.find(motif = 'SEQUENCE')
>>> print pos_subs.span()
(10, 18)
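The start(), end() and span() calls above look a lot like the match objects from Python's standard re module. The following sketch reproduces the same numbers on a plain string. Whether ‘find’ really uses re internally is an assumption, and so is the upper-case text used here.

```python
import re

# Assumption: the joined sequence is stored as this upper-case string.
text = 'THIS IS A SEQUENCE OF ENGLISH CHARACTERS'

# Indexing starts at zero, so index 5 is the sixth character.
print(text[5])            # I

# re.search returns a match object for the first occurrence.
first_i = re.search('I', text)
print(first_i.start())    # 2
print(first_i.end())      # 3
print(first_i.span())     # (2, 3)

# The same call works for a whole substring.
subs = re.search('SEQUENCE', text)
print(subs.span())        # (10, 18)
```

Note that end() is one past the last matched character, which is why a single ‘I’ at index 2 gives the span (2, 3).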

Finally, the ‘fragment’ function extracts part of a sequence as a new Sequence object:

>>> fragment = joined_sequences.fragment(my_start = 10, my_stop = 18)
>>> print fragment
>New Name(10,18)
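A rough way to picture ‘fragment’ is as a string slice whose coordinates get appended to the name. The helper below, fragment_sketch, is hypothetical and only illustrates the idea. In particular, treating my_stop as exclusive (like a normal Python slice) is an assumption and the real method may count differently.

```python
def fragment_sketch(name, seq, my_start, my_stop):
    """Hypothetical helper: slice seq and record the coordinates in the name."""
    new_name = '%s(%d,%d)' % (name, my_start, my_stop)
    # Assumption: my_stop is exclusive, as in an ordinary Python slice.
    return new_name, seq[my_start:my_stop]


name, piece = fragment_sketch(
    'New Name', 'THIS IS A SEQUENCE OF ENGLISH CHARACTERS', 10, 18)
print('>' + name)   # >New Name(10,18)
print(piece)        # SEQUENCE
```

With these assumptions, the coordinates returned by span() in the earlier search can be passed straight in as my_start and my_stop.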

Next time we’ll see how the more biologically relevant classes DNA, preMRNA, mRNA and Protein work.


Blogger Snigdha said...

Read ‘About the Genepython’ and also how to ‘code with the class sequence’. ….. Realized how easily the concept of the ‘central dogma’ can be handled and worked on and used in silico.
Especially the blog on ‘the wet option’ with Craig Venter on the TEDTALK was superb….went completely crazy while seeing a whole alien chromosome being able to replace the original one in a biological system…ACTUALLY….artificial evolution..can we say that.

The Genepython appears simple, seems to be like ‘C programming’ , in fact simpler….

Want to get back to the class room…and hear SPM talk on the central dogma…wish!!!.i wish..!!!

September 4, 2008 4:44 AM  
