Reverse complement business card

Exercise

Modify the function so that it can return either the complement, the reverse, or the reverse complement

Solution

This is an easy one – we just have to add the relevant optional arguments to the function. Let’s start by making the reversal part of the behaviour optional. We’ll add a reverse argument to the function and only reverse the input string if it’s True:

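def revcomp(dna, reverse):
    # map each base to its complement: A<->T, G<->C
    bases = 'ATGCTACG'
    complement_dict = {bases[i]:bases[i+4] for i in range(4)}
    if reverse:
        # reversed() returns an iterator, which the comprehension below consumes
        dna = reversed(dna)
    result = [complement_dict[base] for base in dna]
    return ''.join(result)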

If we try running it both ways, we can see that it’s working:

my_dna = 'AAATTTCGCGCG'
print(revcomp(my_dna, True))
print(revcomp(my_dna, False))

Here’s the output:

CGCGCGAAATTT
TTTAAAGCGCGC
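One detail is worth a quick look: reversed() does not return a string but an iterator, which is why the function can loop over it but could not, for example, index into it. The short sketch below illustrates this and shows slicing with [::-1] as an alternative that gives a string directly; the variable names are just for illustration.

```python
my_dna = 'AAATTTCGCGCG'

# reversed() gives back an iterator, not a string
rev_iter = reversed(my_dna)
print(type(rev_iter).__name__)      # reversed

# joining consumes the iterator and gives us a string
joined = ''.join(rev_iter)
print(joined)                       # GCGCGCTTTAAA

# an iterator can only be consumed once, so a second join is empty
print(''.join(rev_iter) == '')      # True

# slicing with a step of -1 returns a reversed string directly
sliced = my_dna[::-1]
print(sliced == joined)             # True
```

Either approach works inside revcomp, because the function only iterates over the sequence once.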

Now we can add the code to make the complement part optional. We’ll add another argument called complement, and only carry out the replacement of bases if it’s True:

def revcomp(dna, reverse, complement):
    bases = 'ATGCTACG'
    complement_dict = {bases[i]:bases[i+4] for i in range(4)}
    if reverse:
        dna = reversed(dna)
    if complement:
        # swap each base for its complement
        result_as_list = [complement_dict[base] for base in dna]
    else:
        # keep the bases unchanged
        result_as_list = list(dna)
    return ''.join(result_as_list)

As before, let’s test it with a couple of examples:

my_dna = 'AAATTTCGCGCG'
print(revcomp(my_dna, False, True))
print(revcomp(my_dna, False, False))

and check that the output looks like we expect:

TTTAAAGCGCGC
AAATTTCGCGCG
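The two calls above only cover the cases where reverse is False. As a rough sanity check, we could loop over all four combinations of the two flags; the results dict here is just a convenience for this sketch, not part of the solution itself.

```python
def revcomp(dna, reverse, complement):
    bases = 'ATGCTACG'
    complement_dict = {bases[i]: bases[i+4] for i in range(4)}
    if reverse:
        dna = reversed(dna)
    if complement:
        result_as_list = [complement_dict[base] for base in dna]
    else:
        result_as_list = list(dna)
    return ''.join(result_as_list)

my_dna = 'AAATTTCGCGCG'

# try every combination of the two flags
results = {}
for reverse in (False, True):
    for complement in (False, True):
        results[(reverse, complement)] = revcomp(my_dna, reverse, complement)
        print(reverse, complement, results[(reverse, complement)])

# prints:
# False False AAATTTCGCGCG
# False True TTTAAAGCGCGC
# True False GCGCGCTTTAAA
# True True CGCGCGAAATTT
```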

If we’re going to use this function in real-life programs, it’s probably a good idea to have some sensible defaults for the arguments. Since the name of the function is revcomp, we’ll set the default value to True for both reverse and complement:

def revcomp(dna, reverse=True, complement=True):
    bases = 'ATGCTACG'
    complement_dict = {bases[i]:bases[i+4] for i in range(4)}
    if reverse:
        dna = reversed(dna)
    if complement:
        # swap each base for its complement
        result_as_list = [complement_dict[base] for base in dna]
    else:
        # keep the bases unchanged
        result_as_list = list(dna)
    return ''.join(result_as_list)

Now if we run the function with just a single DNA sequence as the argument, it returns the reverse complement, just as the function’s name suggests.
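Here is a quick sketch of the defaults in action, using the same test sequence as before. The variable names default_result and explicit_result are only there for illustration.

```python
def revcomp(dna, reverse=True, complement=True):
    bases = 'ATGCTACG'
    complement_dict = {bases[i]: bases[i+4] for i in range(4)}
    if reverse:
        dna = reversed(dna)
    if complement:
        result_as_list = [complement_dict[base] for base in dna]
    else:
        result_as_list = list(dna)
    return ''.join(result_as_list)

my_dna = 'AAATTTCGCGCG'

# with no optional arguments, both defaults apply
default_result = revcomp(my_dna)
print(default_result)                      # CGCGCGAAATTT

# spelling out the defaults gives the same answer
explicit_result = revcomp(my_dna, True, True)
print(default_result == explicit_result)   # True
```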

One of the nice things we can do with a flexible function like this is to use it to define shortcut functions that work by calling the main function with some of the options filled in. Here are two shortcut functions which just reverse or complement the input sequence, but which use our main function to carry out their jobs:

def reverse_only(dna):
    return revcomp(dna, complement=False)

def complement_only(dna):
    return revcomp(dna, reverse=False)

my_dna = 'AAATTTCGCGCG'
print(reverse_only(my_dna))
print(complement_only(my_dna))

The output confirms that these work as expected:

GCGCGCTTTAAA
TTTAAAGCGCGC

This idea – taking a flexible function, and setting some of its parameters in order to create a new function which takes fewer arguments – is a familiar one in functional programming and is known as partial function application (take a look at the functional programming chapter in Advanced Python for Biologists for the background to this idea).

Because it’s so useful, there’s a built-in Python tool for doing it. The partial() function is contained in the functools module, and its job is to take an existing function and a set of arguments (keyword arguments, in our case), and return a new callable that behaves like the existing function with those arguments already set. Here’s how to copy our example above using functools.partial:

import functools
reverse_only_partial = functools.partial(revcomp, complement=False)
complement_only_partial = functools.partial(revcomp, reverse=False)

my_dna = 'AAATTTCGCGCG'
print(reverse_only_partial(my_dna))
print(complement_only_partial(my_dna))

The output reassures us that reverse_only_partial and complement_only_partial work just the same as our previous functions:

GCGCGCTTTAAA
TTTAAAGCGCGC
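One difference from the hand-written shortcuts is that an object created by functools.partial remembers what it wraps. The sketch below, which redefines revcomp so that it runs on its own, looks at the func and keywords attributes and shows that a keyword argument supplied at call time takes precedence over the stored one.

```python
import functools

def revcomp(dna, reverse=True, complement=True):
    bases = 'ATGCTACG'
    complement_dict = {bases[i]: bases[i+4] for i in range(4)}
    if reverse:
        dna = reversed(dna)
    if complement:
        result_as_list = [complement_dict[base] for base in dna]
    else:
        result_as_list = list(dna)
    return ''.join(result_as_list)

reverse_only_partial = functools.partial(revcomp, complement=False)

# the partial object keeps a reference to the original function...
print(reverse_only_partial.func is revcomp)    # True
# ...and to the keyword arguments that were filled in
print(reverse_only_partial.keywords)           # {'complement': False}

# a keyword passed at call time overrides the stored value
overridden = reverse_only_partial('AAATTTCGCGCG', complement=True)
print(overridden)                              # CGCGCGAAATTT
```

Whether that override is a convenience or a trap depends on the program; the def-based shortcuts above accept only the dna argument, so they cannot be overridden this way.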
