Five things I hate about teaching Python

I’ve just finished teaching an intensive Python programming course and, as usual, spending a week thinking about how best to introduce my students to programming has given me something to write about. I realised that, while I’ve spent a lot of time talking about why Python is a great language, I have a number of pet peeves that I’ve never written down.

I’m not talking about the usual problems, like Python’s relative lack of performance or lack of compile-time type checking – these things are deliberate design trade-offs and changing them would involve making Python not-Python. I’m talking about the small things that cause friction, especially in a teaching environment.

Note: I realize that there are good reasons for all these things to be the way they are, so don’t take this too seriously….

1. Floating point vs. integer division

Anyone who’s written in Python for any length of time probably types this line automatically without really thinking about it:

from __future__ import division

but take a moment to consider how you would explain what’s going on in this piece of code to a beginner. In order to really understand what’s happening here, you have to know about:

  • Python’s system for importing modules
  • Python’s system for grouping modules into packages
  • the fact that there are different versions of Python with slightly different behaviour
  • the difference between floating-point and integer numbers
  • the mechanisms of operator overloading, whereby we can define the behaviour of things like + and / for different types
  • the concept of polymorphic functions and operators, which allow us to treat different classes the same, some of the time

Explaining all this to someone who has never written a line of code before is unlikely to be productive, but none of the alternatives are particularly attractive either. We can just present this as a magic piece of code and save the explanation for later (this is normally what I do). We can instruct students to use explicit floating point numbers:

answer = float(4)/3
answer = 4.0/3

but eventually they will forget, use integers, and find that it works some of the time. We can carefully craft our examples and exercises to avoid the need for floating point division, but this is setting students up for pain further down the line. We can use the Python 2 command-line option -Qnew to force floating-point division, or just use Python 3 for teaching, but both of these options will cause confusion once the student goes back to their own environment.
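To make the difference concrete, here is a minimal sketch of the two division operators. It assumes Python 3, or Python 2 with the __future__ import as the first statement in the file; the variable names are purely illustrative.

```python
# Assumes Python 3, or Python 2 with "from __future__ import division"
# as the first statement in the file.

true_quotient = 4 / 3       # true division: always gives a float
floor_quotient = 4 // 3     # floor division: rounds down to 1

# Explicit floats give the same answer in Python 2 and Python 3.
explicit_quotient = float(4) / 3

print(true_quotient, floor_quotient, explicit_quotient)
```

The // operator is available in both versions, so students who genuinely want integer division can ask for it explicitly.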

2. split() vs. join()

“OK class, this is how we take a string and split it up into a list of strings using a fixed delimiter:”

sentence = "The all-England summarize Proust competition"
words = sentence.split(" ")

“So I guess, logically, to put the words back together again we just say:

sentence = words.join(" ")

right? Look at that elegant symmetry…… Wait a minute, you’re telling me it doesn’t work like that? The list and the delimiter actually go the other way around, so that we have to write this ugly line?

sentence = " ".join(words)

Wow, that just looks wrong.”

Yes, I know that there are good reasons for collection classes to only have methods that are type-agnostic, but would it really be so bad to just str() everything?
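For reference, a small sketch of what join() does and does not do for you. The lists here are made up for illustration; the point is that join() takes any iterable of strings but will not convert other types on its own.

```python
words = ["The", "all-England", "summarize", "Proust", "competition"]
sentence = " ".join(words)

# join() accepts any iterable of strings, including tuples and generators.
from_tuple = "-".join(("a", "b", "c"))

# Non-string items must be converted explicitly; join() will not call str().
numbers = [1, 2, 3]
try:
    ", ".join(numbers)
    join_error = None
except TypeError as error:
    join_error = error

numbers_text = ", ".join(str(n) for n in numbers)
print(sentence)
print(numbers_text)
```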

3. Exhaustible files

It’s perfectly logical that you shouldn’t be able to iterate through a file object twice without re-opening it… once you know a fair bit about how iteration is actually implemented in Python. As a beginner, though, it’s a bit like Python is giving with one hand and taking away with the other – you can use an opened file object just like a list, except in this one specific but very important way:

my_list = [1,2,3,4]
for number in my_list:
    do_something(number)
# second loop works just as you'd expect
for number in my_list:
    do_something_else(number)

my_file = open("some.input")
for line in my_file:
    do_something(line)
# second loop silently never runs
for line in my_file:
    do_something_else(line)

This problem also rears its ugly head when students try to iterate over a file having already consumed its contents using read():

my_file = open("some.input")
my_contents = my_file.read()
....
# this loop silently never runs
for line in my_file:
    do_something(line)

That second line can be difficult to spot for student and teacher alike when there are many intervening lines between it and the loop.
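Two common workarounds are sketched below, assuming Python 3: rewinding the file with seek(0), or reading the lines into a list once. The temporary file and its contents are stand-ins so that the example runs on its own.

```python
import os
import tempfile

# Create a small throwaway input file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".input", delete=False) as handle:
    handle.write("first\nsecond\nthird\n")
    path = handle.name

my_file = open(path)
first_pass = [line for line in my_file]
second_pass = [line for line in my_file]   # empty: the file is exhausted

my_file.seek(0)                            # rewind to the start
third_pass = [line for line in my_file]    # works again

my_file.seek(0)
lines = my_file.readlines()                # or read once into a list...
my_file.close()
os.remove(path)

# ...which can then be looped over as many times as needed.
print(len(first_pass), len(second_pass), len(third_pass), len(lines))
```

Reading into a list is the simpler of the two to explain, but it is only sensible when the file fits comfortably in memory.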

4. Lambda expressions

OK, this one is more annoying when writing code than when teaching it, since I rarely get round to talking about functional programming in introductory courses. I totally get why there should be a big, obvious flag when we are doing something clever (which lambda expressions generally are). Nevertheless, it seems a shame to have a style of coding that lends itself to elegant brevity marred by so many unnecessary keystrokes.

I think that the reason this bugs me so much is that I first got into functional programming by way of Groovy, which has (to me) a very pleasing syntax for anonymous functions (actually closures):

{x,y -> x**y}

compared to Python:

lambda x,y : x**y

Of course, Python lessens the sting of having to type lambda with its various comprehensions:

squares = map(lambda x : x**2, range(10))
squares = [x**2 for x in range(10)]

so I can’t complain too loudly.
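As a quick sanity check, the sketch below puts a lambda next to the equivalent named function and the comprehension. It assumes Python 3, where map() returns a lazy iterator rather than a list; the function name power is only for illustration.

```python
def power(x, y):
    return x ** y

power_lambda = lambda x, y: x ** y   # same behaviour, no name of its own

# In Python 3, map() returns a lazy iterator, so list() is needed to get a list.
squares_map = list(map(lambda x: x ** 2, range(10)))
squares_comprehension = [x ** 2 for x in range(10)]

print(power(2, 5), power_lambda(2, 5))
print(squares_map == squares_comprehension)
```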

5. Variables aren’t declared

It’s just way too easy for beginners to make a typo that brings their progress to a screeching halt. Consider this real-life example from my most recent course:

positions = [0]
for pos in [12,54,76,103]:
    postions  = positions + [pos]
print(positions) # prints [0] rather than [0,12,54,76,103]

Leaving aside that this particular example could have been salvaged by using positions.append(), it took way too long for us to track down the typo. In real-life code, this is the kind of thing that would ideally be caught by unit testing. This is one (rare!) case in which I pine for the old days of teaching Perl – use strict and my would have taken care of this type of problem.
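A rough sketch of why append() would have salvaged it: assigning to a misspelled name quietly creates a new variable, whereas calling a method on a misspelled name fails immediately. The two function names are hypothetical wrappers, used only to keep the two versions separate.

```python
def build_with_rebinding():
    positions = [0]
    for pos in [12, 54, 76, 103]:
        postions = positions + [pos]   # deliberate typo: silently creates a new name
    return positions

def build_with_append():
    positions = [0]
    for pos in [12, 54, 76, 103]:
        postions.append(pos)           # same typo: the name was never assigned
    return positions

print(build_with_rebinding())          # the typo goes unnoticed

try:
    build_with_append()
    typo_error = None
except NameError as error:
    typo_error = error
print(typo_error)
```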

10 Responses to Five things I hate about teaching Python

  1. Tim Hoffman March 1, 2015 at 11:30 am #

    Just a suggestion on the exhaustible files.

    I know it does introduce a new statement (with), but by using with (a context manager), it would be more obvious that later attempts to read outside the with block would have a problem. Of course a second set of reads inside the “with” would still have the same problem. Though I am not sure any I/O allows continued reads once EOF is reached without a rewind/seek back to the beginning of the file. Maybe file I/O semantics should be a precursor to this section.

    • martin March 2, 2015 at 7:58 pm #

      You’re absolutely right – context managers completely solve this problem, but often when teaching complete beginners I want to introduce basic file IO before we start dealing with indented blocks. As you point out, this is common behaviour across programming languages, so not really Python’s fault at all 🙂

  2. Moppers March 1, 2015 at 12:47 pm #

    Point 1 is a pain but will go away when we all move to Python 3. You don’t really have to teach what it means: students can just copy-paste it for now. The same as the `if __name__ == '__main__'` line that you need in order to load a file as a module.

    Point 3 is pretty standard for most programming langs. You could perhaps use the `with` context manager to open and close the file each time it’s needed.

    with open('myfile', 'r') as f:
        for line in f.readlines():
            pass  # whatever
    # when we get here, f is automatically closed

    Point 5 – your IDE or linter will catch this. Most times. Pylint actually works!

  3. Rnhmjoj March 1, 2015 at 11:08 pm #

    I will share my opinions.

    1. Floating point vs. integer division
    This is one of the reasons you should stop teaching Python 2, which has not been the main development branch since 2008.
    The Python 3 / operator produces a float, as everyone would expect.

    2. split() vs. join()
    You are right. It looks strange that this method is a method of the string class. However, if it had been part of the list class you would be able to join lists just by doing words.join(" "), but you would have to implement a join method in every class that makes use of it, not just lists. You don’t just join lists of strings: there are plenty of objects that can be concatenated with str.join: tuples, file objects,… any iterable.

    3. Exhaustible files
    In the case you need to iterate over a file several times and you don’t care about all the content or it’s too big to read it entirely you can iterate over and reset the cursor position with file.seek(0) and start again.

    4. Lambda expressions
    I agree. Lambda expressions in Python are really limited, since you can write just one expression and you can’t even define a constant without using all sorts of hacks (see http://hastebin.com/idohiqugow.py), and their syntax is ugly.
    However you rarely need them. In my opinion the only case in which lambdas are really necessary is when you create events like this:
    menu.add_command(label='Reload', command=lambda: sidebar.delete(0, 'end'))

    At least the lambda keyword is explicit and easier to find out what it does (see http://stackoverflow.com/q/16242041).

    5. Variables aren’t declared
    There exists a list.extend method which is exactly what you want in this example.
    positions = [0]
    positions.extend([12,54,76,103])
    print(positions)

    There are tools, like pylint or pyflakes, integrable with text editors that can check for runtime errors without need to run the module. However in this example even these will miss the typo.

  4. Anselm March 2, 2015 at 10:16 pm #

    Here are my 2 cents:

    Not going to repeat the replies on 1), 2) and 3) by my predecessors.

    on 4):
    I agree it’s an ugly syntax and “lambda” is an unnecessarily long keyword for a supposedly anonymous function. Consider however the BDFL even wanted to remove it in Python 3 completely because of its ugliness but could be swayed to keep it because of the usefulness of anonymous functions in certain circumstances. Maybe you can come up with a better (less ugly, more useful) syntax and put it in a PEP? Might be worth the effort.

    on 5)
    positions = [0]
    for pos in […]:
        positions += [pos]
    print(positions)

    simpler, more general approach (+=, *= etc. works on many classes, is clear to the reader and often is faster than the explicit lookup and assignment) than using the special .extend method – that’s exactly the case for what it’s invented!

  5. David March 3, 2015 at 12:13 pm #

    A typical set of responses;
    “But there is x that does y”. As ever, a programming solution when what is needed is a teaching solution.

    Have you ever tried explaining these x to someone who has done no programming before? I will lay bets that in your first day of learning to program you were not inundated with closures, context managers and so on.

    A teaching solution to the file issue. Don’t describe it as a list. Describe the filehandle as a bookmark that points to a certain place in the file. Then the explanations of what is going on become much easier. I describe the filehandle as like the finger of a toddler learning to read – we can then jointly work out as a class how to deal with the stupidity of python.

  6. ED March 7, 2015 at 9:18 pm #

    Are you using Python 2.7 or 3.x?
    Point 1 seems to indicate 2.7.
    Point 5 uses print() and thus implies 3.x, unless you did another __future__ import that needs to be explained.

    Point 2 is really about what it means to be a string instance. A string can be split and it can be used to join other strings. Quite clear IMHO.

    Point 3 is tough. But take this example:
    my_file = open("some.input")
    my_contents = my_file.read()
    for lineno, line in enumerate(my_file, 1):
        do_something(line)
        if lineno == 7:
            break
    do_something_else
    for line in my_file:
        do_something(line)
    Where is the second for loop supposed to start?

    Point 4 is … Well, as various LISP dialects were among my first programming languages it seems very natural to me. And the syntax acknowledges lambda calculus as the one true source of functional programming so it offers an opportunity for some history of CS in a training.

    Point 5 I cannot reproduce. Both 2.7 and 3.4 print the expected answer.

    Man, are you blessed with just five complaints! But OK, your audience is biologists. Flog them with Java, C#, or Fortran and they will love you for these inconveniences.

    • martin June 23, 2015 at 11:49 am #

      I often teach in courses where people bring their own laptops, so I end up dealing with a mixture of 2.7 and 3. print() works fine in both 2.7 and 3.x, so I try to get everyone into the habit of using it.

      You’re right, biologists do come running gratefully to Python after Java!

  7. Darragh McCurragh March 23, 2015 at 9:12 pm #

    “different versions of Python with slightly different behaviour” Well, still better than when I had a short brush with Ada where everything had to be vouched by the DoD … Dr. Phillip M. Feldman has another list of “Python Limitations and Design Flaws”.

  8. Naib June 18, 2015 at 11:46 am #

    #1 is going on about it’s not intuitive to import the futures statement (a py2 thing).
    Now I agree, as an engineer this is not only annoying when I forget (and thus I got into the habit of 1//2) but equally it’s part of the problem where python thinks it is a strongly typed language and it thinks it’s not. With a strongly typed language int/int = int, it can’t change its type.

    However… #5 is using py3 print() and in py3 the import futures and the typecasting of an int to a float automatically occurs
