Sunday, January 18, 2015

Using tail-recursion on plain Python (hack)

Ok, just to note, I don't think this should be actually used anywhere, but I thought it was a nice hack to share :)

The idea is using the Python debug utilities to go back to the start of the frame... it probably won't win any speed competition and will break your debugger (so, really, don't use it).

The idea is setting "frame.f_lineno" to set the next line to be executed (but that can only be done inside a tracing function, so, we do have to set the tracing just for that and remove it right afterwards).

As a note, setting the next line to execute is  available in the PyDev debugger, so, you could jump back to some place backward in the code through Ctrl+Alt+R to see how some code behaved -- with the possibility of changing local vars, this is usually pretty handy.

So, now on to the code -- I think it's pretty straightforward:

import sys
from functools import partial

    
def trisum(n, csum):
    f = sys._getframe()
    
    if n == 0:
        return csum
    else:
        n -= 1
        csum += 1
        print(n, csum)
        
        f.f_trace = retrace_to_trisum
        sys.settrace(retrace_to_trisum)
        raise AssertionError('Never gets here!')
        
        # Originally the 3 lines above would be the recursion call
        # It's possible to see that commenting the above lines
        # and executing the code will make Python throw a
        # RuntimeError: maximum recursion depth exceeded.
        trisum(n, csum)

def reuse_frame(line, frame, *args):
    # Reusing a frame means setting the lineno and stopping the trace.
    frame.f_lineno = line + 2 # +2 to Skip the f = sys._getframe()
    sys.settrace(None)
    frame.f_trace = None
    return None

retrace_to_trisum = partial(reuse_frame, trisum.__code__.co_firstlineno)
print(trisum(1000, 0))

Friday, January 09, 2015

Creating safe cyclic reference destructors (without requiring __del__)

Well, it seems common for people to use __del__ in Python, but that should be a no-go mainly for the reasons below:

1. If there's a cycle, the Python VM won't be able to decide in what order elements should be deleted and will keep them alive forever (unless you manually clear that cycle from the gc module)... Yes, you shouldn't create a cycle in the first place, but it's hardly guaranteed some client of your library does a cycle when he shouldn't.

2. There are caveats related to reviving self during the __del__ (i.e.: say making a new reference to it somewhere else during its __del__ -- which you should definitely not do...).

3. Not all Python VMs work the same, so, unless you explicitly do some release on the object, some resource may be alive much longer than it should (i.e.: Pypy, Jython...)

4. If an exception in the context is thrown, all the objects may stay alive for much longer than antecipated (because the exception keeps a reference to the frame that has thrown the exception).

Now, if you still want to manage things that way (say, to play safe if the user forgets to do a context manager on some case), at least there's a relatively easy solution for points 1 and 2: instead of using __del__, use the weakref module to have a callback when the object dies to make the needed clearing...

The only thing to make sure here is that you don't use 'self' directly inside the callback, only the things it has to clear (otherwise you'd create a cycle to 'self', which is something you want to avoid here).

The example below shows what I mean (StreamWrapperDel is the __del__ based solution which shouldn't be used and StreamWrapperNoDel is the solution you should use):

import weakref

class StreamWrapperDel(object):
    
    def __init__(self, stream):
        self.stream = stream

    def __del__(self):
        print('__del__')
        self.stream.close()
        
class StreamWrapperNoDel(object):
    
    def __init__(self, stream):
        self.stream = stream
        def on_die(killed_ref):
            print('on_die')
            stream.close()
        self._del_ref = weakref.ref(self, on_die)


if __name__ == '__main__':
    class Stub(object):
        def __init__(self):
            self.closed = False
        
        def close(self):
            self.closed = True
            
    s = Stub()
    w = StreamWrapperDel(s)
    del w
    assert s.closed
    
    s = Stub()
    w = StreamWrapperNoDel(s)
    del w
    assert s.closed

Given that, personally I think Python shouldn't allow __del__ at all as there's another way to do it which doesn't have the related caveats.

For some real-world code which uses that approach, see: https://code.activestate.com/recipes/578998-systemmutex (recipe for a system wide mutex).

p.s.: Thanks to Raymond Hettinger the code above is colorized: https://code.activestate.com/recipes/578178-colorize-python-sourcecode-syntax-highlighting

Thursday, January 08, 2015

PyDev 3.9.1 released

PyDev 3.9.1 has just been released.

There are some noteworthy improvements done:
  • Preferences may now be saved and persisted per project or to the user settings.

For configuring the preferences, the approach is a bit different from most other Eclipse plugin, as it extends the existing preferences pages instead of creating project property pages and allows saving the options to multiple projects or to the user settings from there.







  • The pytest integration had some critical issues fixed (related to expected failures no longer being reported as failures and conftest loading fixed by automatically running from the proper folder).


  • The attach to process is now working in Mac OS.
See: http://pydev.org for more details on the release.