2006-07-06

PyPy Musings

I've been fiddling with PyPy this week. I thought I would share a few things I know, and relate my experience with getting the Twisted unit tests running under PyPy. So far I've only gotten twisted.trial, the unit testing framework, to run its own tests.

Many people seem to misunderstand the PyPy project, so I'm going to explain it in very simple terms for those who have not heard of it, or have only heard a brief blurb.

PyPy is a project that aims to completely implement Python, using Python, much like GCC is written in the C programming language and FreePascal is written in the Pascal programming language.

The structure of PyPy is interesting once you delve into it. It has the ability to run 'untranslated', that is, with PyPy's interpreter itself running under CPython[1]. This is *slow*. I used this technique to run 250 tests on twisted.trial and it took nearly 11 hours of CPU time. The same battery of tests runs in CPython in 25 seconds.

PyPy can also be translated to a number of backends. Backends for C, LLVM, Common Lisp and JavaScript all exist. This is exciting because it means that instead of running at a speed that a snail would be embarrassed to be compared to, it runs Python approximately 3-10x slower than CPython. Benchmarks of various builds here.
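
To put those numbers side by side, here is a quick back-of-the-envelope calculation. It only uses the figures quoted above, rounds "nearly 11 hours" to 11, and assumes the 3-10x range would apply to this particular test suite, which I haven't measured.

```python
# Figures quoted in the post; "nearly 11 hours" is rounded to 11 here.
untranslated_seconds = 11 * 60 * 60   # CPU time under untranslated PyPy
cpython_seconds = 25                  # the same tests under CPython

# How many times slower the untranslated run was.
slowdown = untranslated_seconds / cpython_seconds
print("untranslated slowdown: about %dx" % slowdown)

# Assumption: the 3-10x range for translated builds holds for this suite too.
translated_low = cpython_seconds * 3
translated_high = cpython_seconds * 10
print("translated estimate: %d-%d seconds" % (translated_low, translated_high))
```

So the untranslated run is roughly three orders of magnitude slower than CPython, while a translated build would be expected to finish in a few minutes at most.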

Translating to a backend requires that platform calls be implemented correctly. One way to do this is to say that (for instance) os.stat should proxy the call through to the CPython implementation in untranslated PyPy, call C code when translated to C, call LLVM code when translated to LLVM, and simply not exist when translated to JavaScript.

This is a bit nasty, because you've got to implement things a few times over, so there's a new way based on ctypes that's completely awesome.

When using ctypes, you can implement a Python version of the call to the system library, and when translating to C, the calls to ctypes will be translated to C as well. This means it's possible to have a single implementation of a function that works in untranslated and translated PyPy, without writing a line of actual C code.
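
For anyone who hasn't used ctypes, this is roughly what calling into the C library looks like from ordinary Python. It's a minimal sketch that assumes a POSIX system and has nothing PyPy-specific in it; getpid is just a convenient libc function to demonstrate with.

```python
import ctypes
import ctypes.util
import os

# Locate and load the platform's C library (assumes a POSIX system).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Call getpid(2) directly through ctypes; no C code was written for this.
pid_from_libc = libc.getpid()

# The standard library wraps the same system call, so the two should agree.
print(pid_from_libc == os.getpid())
```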

Here is an example of some actual code I had to write in order to get the Twisted tests to run:
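from os import R_OK
from ctypes import cdll, util
from pypy.interpreter.baseobjspace import ObjSpace

# Find and load the C library once, at import time
dllname = util.find_library('c')
libc = cdll.LoadLibrary(dllname)

def access(space, path, mode=R_OK):
  # access(2) returns 0 on success and non-zero (-1) on failure
  if libc.access(path, mode):
    return space.wrap(0)
  else:
    return space.wrap(1)
access.unwrap_spec = [ObjSpace, str, int]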

There's some funky stuff there for wrapping and unwrapping values passed to and from the Application Level, because this is Interpreter Level code: you're not allowed to give the application level any of the underlying interpreter level objects. The really interesting bit is libc.access(path, mode), which results in a call to access(2). Since access(2) returns 0 on success, a zero result is wrapped as 1 (true) and a non-zero result as 0 (false).
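
The same idea can be tried out in plain CPython, without any object space. This is only a sketch: can_access is a name I've made up for illustration, it assumes a POSIX system, and it returns an ordinary bool instead of a wrapped object.

```python
import ctypes
import ctypes.util
import os

# Load the C library and declare the signature of access(2).
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.access.argtypes = [ctypes.c_char_p, ctypes.c_int]
libc.access.restype = ctypes.c_int

def can_access(path, mode=os.R_OK):
    """Hypothetical helper: True if access(2) returns 0, False otherwise."""
    # access(2) returns 0 on success and -1 on failure, so compare with 0.
    return libc.access(os.fsencode(path), mode) == 0

print(can_access("/"))
print(can_access("/no/such/path"))
```

The result should match what os.access reports for the same path and mode.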

And the best thing is, this code will run untranslated or translated. I hope to iron out a few of the bugs I have in modules like fcntl (thanks Lawrence Oluyede for implementing this in ctypes!) that stop them from being translated, then I'll be able to run a translation and run the tests at a reasonable speed. :)

[1] CPython is the name used to unambiguously refer to Python implemented in C, and to distinguish it from Jython, PyPy, IronPython or Stackless Python. /usr/bin/python is CPython.



2 comments:

Lawrence Oluyede said...

you can take a look at the fcntl module in ctypes that I've translated for my SoC 2006 project.

http://codespeak.net/svn/user/rhymes/modules/fcntl/

Let me know if it's useful or if you encounter bugs (l.oluyede [at] gmail.com)

Stephen Thorne said...

I just realised I didn't mention that I am in fact using your cfcntl.py module in my experiments, Lawrence. I've updated my article to mention you and link to your svn repo. :)