Tuesday, April 21, 2015

Type hinting in Python

If you missed it, it seems right now there's a long thread going on related to the type-hinting PEP:

https://mail.python.org/pipermail/python-dev/2015-April/139221.html

It seems there's a lot of debate and I think the outcome will shape how Python code will look some years from now, so, I thought I'd give one more opinion to juice things up :)

The main thing going for the proposal is getting errors earlier by doing type-checking (which I find a bit odd in the Python world with duck-typing, where many times a docstring could say it's expecting a list but I could pass a list-like object and get by just fine with it).

The main point raised against it is that with the current proposal, the code becomes harder to read.

Example:

def zipmap(f: Callable[[int, int], int], xx: List[int], yy: List[int]) -> List[Tuple[int, int, int]]:

Sure, IDEs -- such as PyDev :) -- will probably have to improve their syntax highlighting so that you can distinguish things better, but I agree that this format is considerably less readable. Note that Python 3 already has the syntax in place, but currently it seems to be very seldom used -- http://mypy-lang.org/ seems to be the exception :)
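To make the "syntax is already in place" point concrete, here is a minimal sketch, assuming a Python version where the typing module is available in the standard library (3.5+). The function body is an assumption added for illustration, since only the signature is shown above. It shows that the interpreter stores the annotations but does not enforce them:

```python
from typing import Callable, List, Tuple


def zipmap(f: Callable[[int, int], int], xx: List[int], yy: List[int]) -> List[Tuple[int, int, int]]:
    # Assumed body (the post only gives the signature): pair up the items
    # and add the result of f as a third element.
    return [(x, y, f(x, y)) for x, y in zip(xx, yy)]


# The annotations are just stored on the function object.
annotated_names = sorted(zipmap.__annotations__)
print(annotated_names)

# Tuples are passed where lists are declared; nothing complains at runtime,
# so duck-typing still works unless an external checker is run.
result = zipmap(lambda a, b: a + b, (1, 2), (10, 20))
print(result)
```

Any actual checking has to come from a separate tool that reads those annotations.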

Now, for me personally, the current status quo in this regard is reasonable: Python doesn't go into Java or C++ land and keeps working with duck-typing, and Python code can still be documented so that clients know what kinds of objects are expected as parameters.

I.e.: the code above would be:

def zipmap(f, xx, yy):
    '''
    :type f: callable(int, int)->int
    :type xx: list(int)
    :type yy: list(int)
    :rtype: list(tuple(int, int, int))
    '''

Many IDEs already understand that format and can give you proper code-completion from it -- at least I know PyDev does :)

The only downsides I see in the current status quo are that the format of the docstrings isn't standardized and that there's no static check for it (if you want to opt in to the type-checking world), but both are fixable: standardize it (there could be a grammar for docstrings which defines types) and have a tool which parses the docstrings and does runtime checks (using the profiler hook for that shouldn't be that hard... and the overhead should be within reasonable limits -- if you're doing type checking with types from annotations, it should have a comparable overhead anyway).

Personally, I think that this approach has the same benefits without the argument against it (which is a harder-to-read syntax)...

If more people share this view of the Python world, I may even try to write a runtime type-checking tool based on docstrings in the future :)
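As a rough idea of what such a tool could look like, here is a minimal sketch under several assumptions: the names watch, declared_types, KNOWN_TYPES and violations are hypothetical, only the outer type name is checked (so list(int) is treated as plain list), unknown names such as callable are skipped, and functions have to be registered explicitly. It uses the standard sys.setprofile hook and records mismatches instead of raising:

```python
import re
import sys

# Hypothetical mapping from docstring type names to classes.
KNOWN_TYPES = {'int': int, 'float': float, 'str': str,
               'list': list, 'tuple': tuple, 'dict': dict}

# Matches ':type name: typename' and captures only the outer type name.
TYPE_LINE = re.compile(r':type\s+(\w+):\s*(\w+)')

_watched = {}     # code object -> {parameter name: expected class}
violations = []   # (function name, parameter name, actual type name)


def declared_types(doc):
    """Return {parameter: class} for the ':type' lines found in a docstring."""
    found = {}
    for name, type_name in TYPE_LINE.findall(doc or ''):
        cls = KNOWN_TYPES.get(type_name)
        if cls is not None:  # unknown type names are silently skipped
            found[name] = cls
    return found


def watch(func):
    """Register func for checking; returns it unchanged (usable as a decorator)."""
    _watched[func.__code__] = declared_types(func.__doc__)
    return func


def _profile_hook(frame, event, arg):
    # 'call' fires when a Python function is entered, with arguments bound.
    if event != 'call':
        return
    expected = _watched.get(frame.f_code)
    if not expected:
        return
    for name, cls in expected.items():
        if name in frame.f_locals and not isinstance(frame.f_locals[name], cls):
            actual = type(frame.f_locals[name]).__name__
            violations.append((frame.f_code.co_name, name, actual))


@watch
def first(xx):
    '''
    :type xx: list(int)
    :rtype: int
    '''
    return xx[0]


sys.setprofile(_profile_hook)
try:
    first([1, 2, 3])   # matches the docstring
    first((1, 2, 3))   # works through duck-typing, but gets recorded
finally:
    sys.setprofile(None)

print(violations)
```

Note that the second call is exactly the list-like case mentioned earlier: it runs fine, yet a strict check flags it, so a real tool would need some way to express "anything list-like". The hook also runs on every function call in the thread, which is where the overhead would come from.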

2 comments:

Anonymous said...

What you want is exactly what they are doing: standardizing the format. But instead of leaving the syntax inside a docstring they are putting it outside.

Also (and related to doing it outside docstrings), they are creating a new extension to allow the type hints to exist separately from the code.

And, everything is optional.

Seems pretty well rounded from my point of view.

Anonymous said...

One of the main weaknesses of Python - maybe the main weakness - is that a lot of errors are not caught at compile time, while they would be in other languages.

Type hints will help to catch a lot more errors, in IDEs and other tools, before the code is executed. This may not seem like much, but for big projects where robustness matters, this is... huge!

Please do not underestimate the importance of type hints; they are relevant.