Monday, November 11, 2013

The history of bool, True and False

Writing up the reasons why True and False, when introduced, weren't reserved words, I realized there's another interesting lesson in the history of Python's bool type. It was formally introduced in Python 2.3, as a new type with two constants, and the type was introduced in PEP 285 ("Adding a bool type").

But bool, True and False were also introduced in Python 2.2.1 (a bugfix release!). The Misc/NEWS file said:
What's New in Python 2.2.1 final?
Release date: 10-Apr-2002

Core and builtins

- Added new builtin function bool() and new builtin constants True and
  False to ease backporting of code developed for Python 2.3.  In 2.2,
  bool() returns 1 or 0, True == 1, and False == 0.

This was the last (and the most criticized) time we added a new feature in a bugfix release -- we'd never do that these days. Also note that bool/True/False in 2.2.1 were different from 2.3: in 2.3, bool is a new type; in 2.2.1, bool() is a built-in function and the constants are just ints.

The chronology is also interesting: the proper new bool type was introduced in 2.3a1, released on Dec 31 2002, well after the above-mentioned 2.2.1 release. And the final 2.3 release didn't come out until July 29 2003. And yet, the above comment talks about backporting from 2.3 to 2.2.1. PEP 285 was created on March 8, 2002, accepted on April 3, and declared final on April 11 (i.e. after Python 2.2.1 was released). I'm assuming that by then the proper bool implementation had landed in the 2.3 branch. That's a breakneck pace compared to how we do things a decade later!

Sunday, November 10, 2013

The story of None, True and False (and an explanation of literals, keywords and builtins thrown in)

I received an interesting question in the mail recently:
What is the difference between keywords and literals? Why are True and False keywords rather than literals in python3?
I was horrified recently to find that assigning to True/False works in python2. So I went digging, and found that True and False were created to be 'constants' like None in PEP 285. Assignment to None was disallowed in 2.4, but not to True/False until python3. Was there a reason None was originally built as a variable rather than a literal?
Let's start with the first question: keywords and literals.

A keyword, in the context of defining the syntax of a language, also known as a reserved word, is something that looks like an identifier in the language, but from the parser's point of view act like a token of the language. An identifier is defined as a sequence of one or more letters, digits and underscores, not starting with a digit. (This is Python's definition, but many languages, like C or Java, use the same or a very similar definition.)

The important thing to remember about keywords is that a keyword cannot be used to name a variable (or function, class, etc.). Some well-known keywords in Python include 'if', 'while', 'for', 'and', 'or'.

A literal, on the other hand, is an element of an expression that describes a constant value. Examples of literals are numbers (e.g. 42, 3.14, or 1.6e-10) and strings (e.g. "Hello, world"). Literals are recognized by the parser, and the exact rules for how literals are parsed are often quite subtle. For example, these are all numeric literals in Python 3:
but these are not:
. (dot)
e10 (identifier)
0y12 (the literal 0 followed by the identifier y12)
0xffe+10 (the literal 0xffe followed by a plus sign and and the number 10)
Note the distinction between a constant and a literal. We often write code defining "constants", e.g.
Here, 15 is a literal, but MAX_LEVELS is not -- it is an identifier, and the all-caps form of the name suggests to the reader that it is probably not changed anywhere in the code, which means that we can consider it a constant -- but this is just a convention, and the Python parser doesn't know about that convention, nor does it enforce it.

On the other hand, the parser won't let you write
This is because the left-hand side of the assignment operator (=) must be a variable, and a literal is not a variable. (The exact definition of variable is complex, since some things that look like expressions are also considered to be variables, such as d[k], (a, b), and -- but not f() or () or 42. This definition of variable is also used by the "del" statement.)

Now on to None, True and False.

Let's begin with None, because it has always been in the language. (True and False were relatively recent additions -- they first made their appearance in Python 2.2.1, to be precise.) None is a singleton object (meaning there is only one None), used in many places in the language and library to represent the absence of some other value. For example, if d is a dictionary, d.get(k) will return d[k] if it exists, but None if d has no key k. In earlier versions of Python, None was just a "built-in name". The parser had no special knowledge of None -- just like it doesn't have special knowledge of built-in types like int, float or str, or built-in exceptions like KeyError or ZeroDivisionError. All of these are treated by the parser as identifiers, and when your code is being interpreted they are looked up just like any other names (e.g. the functions and variables you define yourself). So from the parser's perspective, the following are treated the same, and the parse tree it produces (<name> = <name>) is the same in each case:
x = None
x = int
x = foobar
On the other hand, the following produce different parse trees (<name> = <literal>):
x = 42
x = 'hello'
because the parser treats numeric and string literals as different from identifiers. Combining this with the earlier MAX_LEVEL examples, we can see that if we swap the left and right hand sides, the first three will still be accepted by the parser (<name> = <name>), while the swapped version of the second set will be rejected (<literal> = <name> is invalid).

The practical consequence is that, if you really want to mess with your readers, you can write code that reassigns built-ins; for example, you could write:
int = float
def parse_string(s):
    return int(s)
print(parse_string('42'))    # Will print '42.0'
Some of you may respond to this with "So what? Reasonable programmers don't write such code." Others may react in the opposite way, saying "Why on earth does the language allow assignment to a built-in name like 'int' at all?!"

The answer is subtle, and has to do with consistency and evolution of the language. I bet that without looking it up you won't be able to give a complete list all built-in names defined by Python. (I know I can't.) Moreover, I bet that many of you won't recognize every single name on that list. (To see the list, try typing dir(__builtins__) at the Python command prompt.)

Take for example the weird built-ins named copyright, credits or license. They exist so that we can mention them in the greeting shown when you start Python interactively:
Python 3.4.0a4+ (default:0917f6c62c62, Oct 22 2013, 10:55:35)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> credits
Thanks to CWI, CNRI,, Zope Corporation and a cast of thousands
for supporting Python development.  See for more information.
 In order for this to work, we made them built-ins. But does this mean you shouldn't be allowed to use 'credits' as a variable or parameter name? I think not. Certainly many people don't realize that these esoteric built-ins even exist, and they would be surprised if they were prevented from using them as variable names. From here, it's just a gradual path. Many people write functions or methods with arguments named str or len, or with names like compile or format. Moreover, suppose you wrote some Python 2.5 code where you used bytes as a variable name. In Python 2.6, we added a built-in function named 'bytes' (it's an alias for str, actually). Should your code now be considered invalid? There's no reason for that, and in fact your code will be fine. (Even in Python 3, where bytes is one of the fundamental types.)

On the other hand, you cannot have a variable named 'if' or 'try', because these are reserved words (keywords) that are treated special by the parser. Because you cannot use these as variable or function names anywhere, ever, in any Python program, everyone using Python has to know about all the reserved words in the language, even if they don't have any need for them. For this reason, we try to keep the list of reserved words small, and the core developers hem and haw a lot before adding a new reserved word to the language.

In fact, many proposed new features have been killed because they would require a new keyword; others have been modified to avoid that need. Also, when we do decide to add a new keyword, we start a deprecation campaign at least one release before the new keyword is introduced, warning developers to choose a different name for their variables. (There's also a trick to allow developers to choose to use the new keyword right away; this is why we have e.g. "from __future__ import with_statement".)

There's no such concern for built-ins. Code that happens to use the name of a new built-in as a variable or function name will continue to function (as long as you don't also try to use the new built-in in the same function). While we still try to be conservative with the introduction of new built-ins, at least we don't have to worry about breaking working code by merely adding something to the language. The (small) price we pay for this is the possibility that some joker intentionally redefines a built-in just to confuse others. But there are tons of other ways to write unreadable code, and I don't see this as a particularly bad problem.

So, after this long detour about built-ins vs. keywords, back to None. Why did we eventually make None a reserved word? Frankly, the reasons were perhaps mostly social. Unlike some built-ins and many exceptions, None is so central to using Python that you really can't be using Python without knowing about None. So people were (like our question-asker) "horrified" when they found that assignment to None was actually allowed at all. Worse, there was the concern (whether founded or not) that the way name lookup in Python works, "evaluating" the expression None is slow, because it requires at least two dictionary lookups (all names are looked up in the globals dict before being looked up in the built-ins dict).

In the end we decided that there was no downside to making None a keyword (there is no code that actually assigns to it) and it might make some code a tiny bit faster, or catch rare typos. There was still a one-time cost to the developer community (changes to the parser and documentation) but this was small enough that we din't hesitate very long.

The situation for True/False is a little different. They weren't always part of the language, and many people had invented their own convention. People would define constants named true and false, True and False, or TRUE and FALSE, and use those consistently throughout their code. I don't recall which spelling was most popular, but when we introduced True and False into the language, we definitely did not want to break any packages that were defining their own True and False constants. (One concern was that those packages would have to have a way to continue to run on previous Python versions.)

So, essentially our hand was forced in this case, and we had to introduce True and False as built-in constants, not as keywords. But over time, code defining its own versions of True and False (by whichever name) became more and more frowned upon, and by the time Python 3 came around, when we looked at opportunities for cleaning up the language, we found that it was logical to make True and False keywords, by analogy to None.

And there you have it. It's all completely logical, once you understand the context. :-) Sorry for the long response; I hope it's been educational.

UPDATE: I still forgot to answer whether None/True/False are literals or keywords. My answer is that they are both. They are keywords because that's how the parser recognizes them. They are literals because that's their role in expressions and because they stand for constant values. One could argue about whether things like {'foo': 42} are literals; personally I'd prefer to give these some other name, because otherwise what would you call {'foo': x+1}? The language reference calls both of these "displays".

Thursday, October 24, 2013

Origin of metaclasses in Python

There was some speculation on python-ideas today on whether Python's metaclass design came from Ruby. It did not. And as long as we are speculating about the origins of language features, I feel the need to set the record straight.

I was not inspired by Ruby at that point (or ever :-). Ruby was in fact inspired by Python. Mats once told me that his inspiration was 20% Python, 80% Perl, and that Larry Wall is his hero.
I wrote about metaclasses in Python in 1998:
New-style classes were just the second or third iteration of the idea.
I was inspired to implement new-style classes by a very specific book, "Putting Metaclasses to Work" by Ira Forman and Scott Danforth (
But even Python's original design (in 1990, published in 1991) had the notion that 'type' was itself an object. The type pointer in any object has always been a pointer to a special object, whose "data" was a bunch of C function pointers implementing the behavior of other objects, similar to a C++ vtable. The type of a type was always a special type object, which you could call a meta-type, to be recognized because it was its own type.
I was only vaguely aware of Smalltalk at the time; I remember being surprised by its use of metaclasses (which is quite different from that in Python or Ruby!) when I read about them much later. Smalltalk's bytecode was a bigger influence of Python's bytecode though. I'd read about it in a book by Adele Goldberg and others, I believe "Smalltalk-80: The Language and its Implementation" (

Why Python uses 0-based indexing

I was asked on Twitter why Python uses 0-based indexing, with a link to a new (fascinating) post on the subject ( I recall thinking about it a lot; ABC, one of Python's predecessors, used 1-based indexing, while C, the other big influence, used 0-based. My first few programming languages (Algol, Fortran, Pascal) used 1-based or variable-based. I think that one of the issues that helped me decide was slice notation.

Let's first look at use cases. Probably the most common use cases for slicing are "get the first n items" and "get the next n items starting at i" (the first is a special case of that for i == the first index). It would be nice if both of these could be expressed as without awkward +1 or -1 compensations.

Using 0-based indexing, half-open intervals, and suitable defaults (as Python ended up having), they are beautiful: a[:n] and a[i:i+n]; the former is long for a[0:n].

Using 1-based indexing, if you want a[:n] to mean the first n elements, you either have to use closed intervals or you can use a slice notation that uses start and length as the slice parameters. Using half-open intervals just isn't very elegant when combined with 1-based indexing. Using closed intervals, you'd have to write a[i:i+n-1] for the n items starting at i. So perhaps using the slice length would be more elegant with 1-based indexing? Then you could write a[i:n]. And this is in fact what ABC did -- it used a different notation so you could write a@i|n.(See

But how does the index:length convention work out for other use cases? TBH this is where my memory gets fuzzy, but I think I was swayed by the elegance of half-open intervals. Especially the invariant that when two slices are adjacent, the first slice's end index is the second slice's start index is just too beautiful to ignore. For example, suppose you split a string into three parts at indices i and j -- the parts would be a[:i], a[i:j], and a[j:].

So that's why Python uses 0-based indexing.

Friday, July 8, 2011

Karin Dewar, Indentation and the Colon

In a recent post on my other blog I mentioned a second-hand story about how Python's indentation was invented by the wife of Robert Dewar. I added that I wasn't very sure of the details, and I'm glad I did, because the truth was quite different. I received a long email from Lambert Meertens with the real story. I am going to quote it almost completely here, except for some part which he requested not to be quoted. In summary: Karin Dewar provided the inspiration for the use of the colon in ABC (and hence in Python) leading up to the indentation, not for indentation itself. Here is Lambert's email:

The Dewar story is not about indentation, but about the invention of the colon.

First about indentation in B. Already B0, the first iteration in the B0, B1, B2, ... sequence of designs leading to ABC, had non-optional indentation for grouping, supplemented by enclosing the group between BEGIN and END delimiters. This can be seen in [GM76], section 4.1 (Layout). The indentation was supposed to be added, like pretty printing, by a dedicated B editor, and the user had no control over this; they were not supposed to be able to turn this off or otherwise modify the indentation regime.

Given the mandatory indentation, separate BEGIN and END delimiters are of course superfluous; in B1 we had no BEGIN, but only END IF, END FOR, and so on, and then the latter delimiters were also dropped in B2, leaving pure indentation as the sole indicator of grouping. See [ME81], Section 4 (STATEMENT SYNTAX).

The origin of indentation in ABC is thus simply the desire to have the program text look neat and be suggestive of the meaning, codifying what was already common practice but not enforced. The decision to remove the noise of BEGIN ... END may have been influenced by [PL75], which actually advocated using pure indentation for grouping. Since occam came later (1983), the feature can't have been copied from that language. Same for Miranda (1985). As far as I am aware, B was the first actually published (and implemented) language to use indentation for grouping.

Now the Dewar story, which is how I got the idea of the colon, as I wrote it down in a memoir of the ABC design rationale:

And here I will paraphrase, at Lambert's request.

In 1978, in a design session in a mansion in Jabłonna (Poland), Robert Dewar, Peter King, Jack Schwartz and Lambert were comparing various alternative proposed syntaxes for B, by comparing (buggy) bubble sort implementations written down in each alternative. Since they couldn't agree, Robert Dewar's wife was called from her room and asked for her opinion, like a modern-day Paris asked to compare the beauty of Hera, Athena, and Aphrodite. But after the first version was explained to her, she remarked: "You mean, in the line where it says: 'FOR i ... ', that it has to be done for the lines that follow; not just for that line?!" And here the scientists realized that the misunderstanding would have been avoided if there had been a colon at the end of that line.

Lambert also included the following useful references:

[PL75] P. J. Plauger. Signal and noise in programming language. In J. D. White, editor, Proc. ACM Annual Conference 1975, page 216. ACM, 1975.

[GM76] Leo Geurts and Lambert Meertens. Designing a beginners' programming language. In S.A. Schuman, editor, New Directions in Algorithmic Languages 1975, pages 1–18. IRIA, Rocquencourt, 1976.

[ME81] Lambert Meertens. Issues in the design of a beginners' programming language. In J.W. de Bakker and J.C. van Vliet, editors, Algorithmic Languages, pages 167–184. North-Holland Publishing Company, Amsterdam, 1981.

Tuesday, August 24, 2010

Why Python's Integer Division Floors

I was asked (again) today to explain why integer division in Python returns the floor of the result instead of truncating towards zero like C.

For positive numbers, there's no surprise:

>>> 5//2

But if one of the operands is negative, the result is floored, i.e., rounded away from zero (towards negative infinity):

>>> -5//2
>>> 5//-2

This disturbs some people, but there is a good mathematical reason. The integer division operation (//) and its sibling, the modulo operation (%), go together and satisfy a nice mathematical relationship (all variables are integers):

a/b = q with remainder r

such that

b*q + r = a and 0 <= r < b

(assuming a and b are >= 0).

If you want the relationship to extend for negative a (keeping b positive), you have two choices: if you truncate q towards zero, r will become negative, so that the invariant changes to 0 <= abs(r) < otherwise, you can floor q towards negative infinity, and the invariant remains 0 <= r < b. [update: fixed this para]

In mathematical number theory, mathematicians always prefer the latter choice (see e.g. Wikipedia). For Python, I made the same choice because there are some interesting applications of the modulo operation where the sign of a is uninteresting. Consider taking a POSIX timestamp (seconds since the start of 1970) and turning it into the time of day. Since there are 24*3600 = 86400 seconds in a day, this calculation is simply t % 86400. But if we were to express times before 1970 using negative numbers, the "truncate towards zero" rule would give a meaningless result! Using the floor rule it all works out fine.

Other applications I've thought of are computations of pixel positions in computer graphics. I'm sure there are more.

For negative b, by the way, everything just flips, and the invariant becomes:

0 >= r > b.

So why doesn't C do it this way? Probably the hardware didn't do this at the time C was designed. And the hardware probably didn't do it this way because in the oldest hardware, negative numbers were represented as "sign + magnitude" rather than the two's complement representation used these days (at least for integers). My first computer was a Control Data mainframe and it used one's complement for integers as well as floats. A pattern of 60 ones meant negative zero!

Tim Peters, who knows where all Python's floating point skeletons are buried, has expressed some worry about my desire to extend these rules to floating point modulo. He's probably right; the truncate-towards-negative-infinity rule can cause precision loss for x%1.0 when x is a very small negative number. But that's not enough for me to break integer modulo, and // is tightly coupled to that.

PS. Note that I am using // instead of / -- this is Python 3 syntax, and also allowed in Python 2 to emphasize that you know you are invoking integer division. The / operator in Python 2 is ambiguous, since it returns a different result for two integer operands than for an int and a float or two floats. But that's a totally separate story; see PEP 238.

Tuesday, June 29, 2010

From List Comprehensions to Generator Expressions

List comprehensions were added in Python 2.0. This feature originated as a set of patches by Greg Ewing with contributions by Skip Montanaro and Thomas Wouters. (IIRC Tim Peters also strongly endorsed the idea.) Essentially, they are a Pythonic interpretation of a well-known notation for sets used by mathematicians. For example, it is commonly understood that this:
{x | x > 10}

refers to the set of all x such that x > 10. In math, this form implies a universal set that is understood by the reader (for example, the set of all reals, or the set of all integers, depending on the context). In Python, there is no concept of a universal set, and in Python 2.0, there were no sets. (Sets are an interesting story, of which more in a future blog post.)

This and other considerations led to the following notation in Python:
[f(x) for x in S if P(x)]

This produces a list containing the values of the sequence S selected by the predicate P and mapped by the function f. The if-clause is optional, and multiple for-clauses may be present, each with their own optional if-clause, to represent nested loops (the latter feature is rarely used though, since it typically maps a multi-dimensional entity to a one-dimensional list).

List comprehensions provide an alternative to using the built-in map() and filter() functions. map(f, S) is equivalent to [f(x) for x in S] while filter(P, S) is equivalent to [x for x in S if P(x)]. One would think that list comprehensions have little to recommend themselves over the seemingly more compact map() and filter() notations. However, the picture changes if one looks at a more realistic example. Suppose we want to add 1 to the elements of a list, producing a new list. The list comprehension solution is [x+1 for x in S]. The solution using map() is map(lambda x: x+1, S). The part “lambda x: x+1” is Python’s notation for an anonymous function defined in-line.

It has been argued that the real problem here is that Python’s lambda notation is too verbose, and that a more concise notation for anonymous functions would make map() more attractive. Personally, I disagree—I find the list comprehension notation much easier to read than the functional notation, especially as the complexity of the expression to be mapped increases. In addition, the list comprehension executes much faster than the solution using map and lambda. This is because calling a lambda function creates a new stack frame while the expression in the list comprehension is evaluated without creating a new stack frame.

Given the success of list comprehensions, and enabled by the invention of generators (of which more in a future episode), Python 2.4 added a similar notation that represents a sequence of results without turning it into a concrete list. The new feature is called a “generator expression”. For example:
sum(x**2 for x in range(1, 11))

This calls the built-in function sum() with as its argument a generator expression that yields the squares of the numbers from 1 through 10 inclusive. The sum() function adds up the values in its argument resulting in an answer of 385. The advantage over sum([x**2 for x in range(1, 11)]) should be obvious. The latter creates a list containing all the squares, which is then iterated over once before it is thrown away. For large collections these savings in memory usage are an important consideration.

I should add that the differences between list comprehensions and generator expressions are fairly subtle. For example, in Python 2, this is a valid list comprehension:
[x**2 for x in 1, 2, 3]

However this is not a valid generator expression:
(x**2 for x in 1, 2, 3)

We can fix it by adding parentheses around the "1, 2, 3" part:
(x**2 for x in (1, 2, 3))

In Python 3, you also have to use these parentheses for the list comprehension:
[x**2 for x in (1, 2, 3)]

However, in a "regular" or "explicit" for-loop, you can still omit them:
for x in 1, 2, 3: print(x**2)

Why the differences, and why the changes to a more restrictive list comprehension in Python 3? The factors affecting the design were backwards compatibility, avoiding ambiguity, the desire for equivalence, and evolution of the language. Originally, Python (before it even had a version :-) only had the explicit for-loop. There is no ambiguity here for the part that comes after 'in': it is always followed by a colon. Therefore, I figured that if you wanted to loop over a bunch of known values, you shouldn't be bothered with having to put parentheses around them. This also reminded me of Algol-60, where you can write:
for i := 1, 2, 3 do Statement

except that in Algol-60 you can also replace each expression with step-until clause, like this:
for i := 1 step 1 until 10, 12 step 2 until 50, 55 step 5 until 100 do Statement

(In retrospect it would have been cool if Python for-loops had the ability to iterate over multiple sequences as well. Alas...)

When we added list comprehensions in Python 2.0, the same reasoning applied: the sequence expression could only be followed by a close bracket ']' or by a 'for' or 'if' keyword. And it was good.

But when we added generator expressions in Python 2.4, we ran into a problem with ambiguity: the parentheses around a generator expression are not technically part of the generator expression syntax. For example, in this example:
sum(x**2 for x in range(10))

the outer parentheses are part of the call to sum(), and a "bare" generator expression occurs as the first argument. So in theory there would be two interpretations for something like this:
sum(x**2 for x in a, b)

This could either be intended as:
sum(x**2 for x in (a, b))

or as:
sum((x**2 for x in a), b)

After a lot of hemming and hawing (IIRC) we decided not to guess in this case, and the generator comprehension was required to have a single expression (evaluating to an iterable, of course) after its 'in' keyword. But at the time we didn't want to break existing code using the (already hugely popular) list comprehensions.

Then when we were designing Python 3, we decided that we wanted the list comprehension:
[f(x) for x in S if P(x)]

to be fully equivalent to the following expansion using the built-in list() function applied to a generator expression:
list(f(x) for x in S if P(x))

Thus we decided to use the slightly more restrictive syntax of generator expressions for list comprehensions as well.

We also made another change in Python 3, to improve equivalence between list comprehensions and generator expressions. In Python 2, the list comprehension "leaks" the loop control variable into the surrounding scope:
x = 'before'
a = [x for x in 1, 2, 3]
print x # this prints '3', not 'before'

This was an artifact of the original implementation of list comprehensions; it was one of Python's "dirty little secrets" for years. It started out as an intentional compromise to make list comprehensions blindingly fast, and while it was not a common pitfall for beginners, it definitely stung people occasionally. For generator expressions we could not do this. Generator expressions are implemented using generators, whose execution requires a separate execution frame. Thus, generator expressions (especially if they iterate over a short sequence) were less efficient than list comprehensions.

However, in Python 3, we decided to fix the "dirty little secret" of list comprehensions by using the same implementation strategy as for generator expressions. Thus, in Python 3, the above example (after modification to use print(x) :-) will print 'before', proving that the 'x' in the list comprehension temporarily shadows but does not override the 'x' in the surrounding scope.

And before you start worrying about list comprehensions becoming slow in Python 3: thanks to the enormous implementation effort that went into Python 3 to speed things up in general, both list comprehensions and generator expressions in Python 3 are actually faster than they were in Python 2! (And there is no longer a speed difference between the two.)

UPDATE: Of course, I forgot to mention that Python 3 also supports set comprehensions and dictionary comprehensions. These are straightforward extensions of the list comprehension idea.