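Instead of rendering 256x256 tiles directly, you could render 2176x2176 metatiles, discard the outer buffer area (the numbers imply 128 pixels per axis in total, i.e. 64 on each side, since 2176 = 8 x 256 + 2 x 64; the buffer is there for labeling reasons), cut the remaining 2048x2048 area into an 8x8 grid of 256x256 tiles, and save those to the file system. I think the Python Imaging Library can do the cropping easily.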
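A rough sketch of the cutting step, not a tested production script. It assumes a 64-pixel buffer on each side (which is what 2176 works out to) and a hypothetical `out_dir/<x>/<y>.png` layout; the names `tile_boxes`, `split_metatile`, `x0` and `y0` are made up for illustration. The image passed in is expected to be a PIL image, for example the result of `Image.open()`.

```python
import os

TILE = 256    # edge length of one final tile, in pixels
GRID = 8      # tiles per metatile edge
BUFFER = 64   # assumed buffer per side: 2176 = 8 * 256 + 2 * 64


def tile_boxes(tile=TILE, grid=GRID, buffer=BUFFER):
    """Yield (col, row, box) per tile; box is (left, upper, right, lower)."""
    for row in range(grid):
        for col in range(grid):
            left = buffer + col * tile
            upper = buffer + row * tile
            yield col, row, (left, upper, left + tile, upper + tile)


def split_metatile(metatile, out_dir, x0=0, y0=0):
    """Crop a PIL image into tiles and save them as out_dir/<x>/<y>.png.

    x0 and y0 are the (hypothetical) tile coordinates of the top-left tile.
    """
    expected = GRID * TILE + 2 * BUFFER
    if metatile.size != (expected, expected):
        raise ValueError("expected a %dx%d metatile" % (expected, expected))
    for col, row, box in tile_boxes():
        path = os.path.join(out_dir, str(x0 + col), "%d.png" % (y0 + row))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        # Image.crop() takes the box as a 4-tuple and returns a new image.
        metatile.crop(box).save(path)
```

With PIL installed, something like `split_metatile(Image.open("metatile.png"), "tiles")` would write the 64 tiles; the buffer pixels are simply never copied.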

Or take a look at mod_tile or TileCache, both of which can do metatiling.