Q: I used the word Pickle in a conversation, and (Glyph and/or JP)
shotgunned a handful of Prozac and washed it down with whiskey. Is he
/ are they okay?
A: Pickle is baaaad news, kids. It's *great* if your tiny little
application needs to save some data in a hurry. It generates some
BIIIIG problems if you use it on a large scale.
Your data tends to calcify along with your application code. It's
very difficult to find the links between them in either direction;
you don't know what application code your data references, you
don't know what data your application code is producing. The fact
that it's easy to manage this on a framework level is deceptive -
once you start saving real data the rules all change.
Explicit is better than implicit. *EVERYTHING* in Pickle is
implicit.
It is possible - nay, *easy* - to have some totally random part of
your application stick temporary data onto an in-database
object. Once objects 'get dirty' in this fashion, it's nearly
impossible to find them, and when you do, the usual way to detect
the problem is by having a 'load' operation explode!
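A minimal sketch of how an object 'gets dirty', assuming Python 3 and the standard pickle module. The class names (User, RequestCache) and the _cache attribute are made up for illustration; removing the class with del stands in for a later refactoring.

```python
import pickle


class User:  # hypothetical persistent class
    def __init__(self, name):
        self.name = name


class RequestCache:  # hypothetical helper that was never meant to be saved
    pass


user = User("alice")

# Some unrelated part of the application stashes temporary data on the object.
user._cache = RequestCache()

# Saving succeeds without complaint; the temporary data is now in the "database".
blob = pickle.dumps(user)

# Later the helper class is refactored away (simulated here by deleting it).
del RequestCache

# Only now does anything go wrong, and it goes wrong at load time.
try:
    pickle.loads(blob)
    load_error = None
except AttributeError as exc:
    load_error = exc

print("load failed:", load_error)
```

Nothing at save time hints that User now depends on RequestCache; the first sign of trouble is the failed load.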
As a corollary to that, since there are no explicit schemas, if
you write an otherwise valid upgrader which does not delete an
obsoleted attribute, that attribute will silently sit around
forever, bloating your database until the end of time (or until
objects you THOUGHT were gone from your data are deleted from your
code, only to surprise you by then not loading).
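To make the forgotten-attribute case concrete, here is a small sketch assuming Python 3, with a plain __setstate__ method standing in for the upgrader. The Profile class and its attribute names are hypothetical.

```python
import pickle


class Profile:  # hypothetical persistent class, "version 2"
    def __init__(self, full_name):
        self.full_name = full_name

    def __setstate__(self, state):
        # The upgrader: build the new attribute from the old ones...
        if "full_name" not in state:
            state["full_name"] = state["first"] + " " + state["last"]
            # ...but nobody remembered to delete state["first"] and state["last"].
        self.__dict__.update(state)


# Simulate an object that was saved by "version 1" of the class.
old = Profile.__new__(Profile)
old.__dict__.update({"first": "Ada", "last": "Lovelace"})
blob_v1 = pickle.dumps(old)

# Load (which runs the upgrader), save again, and load once more.
upgraded = pickle.loads(blob_v1)
blob_v2 = pickle.dumps(upgraded)
resaved = pickle.loads(blob_v2)

# The upgrade "worked", yet the obsolete attributes are still being stored.
print(sorted(vars(resaved)))
```

The upgrader is perfectly valid and every load succeeds, which is exactly why the leftover attributes go unnoticed.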
cPickle can be coerced to coredump for certain inputs, both on
store and on load. Some of these inputs are valid, some are the
result of programming bugs. Heaven help you if you ever create a
bug in a reduce method.
Also, 'regular' Pickle is too slow even for a joke.
There are no tools, besides a raw Python prompt, to investigate
the contents of a pickle, or to profile the disk or memory usage
of different parts of it. They are completely opaque blobs.
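For what it's worth, this is roughly what poking at a pickle from a Python prompt looks like, assuming Python 3 and the standard library's pickletools module. The Account class is hypothetical, and protocol 2 is chosen only because it records each class reference in a single GLOBAL opcode, which keeps the sketch short.

```python
import pickle
import pickletools


class Account:  # hypothetical application class
    def __init__(self, owner):
        self.owner = owner


blob = pickle.dumps(Account("alice"), protocol=2)

# Walk the opcode stream and collect every "module name" pair the blob refers to.
referenced = sorted(
    {arg for opcode, arg, pos in pickletools.genops(blob) if opcode.name == "GLOBAL"}
)
print(referenced)

# Dump a full opcode-by-opcode disassembly to stdout.
pickletools.dis(blob)
```

This tells you which classes a single blob names, but it is still a disassembly of one pickle at a time, not a way to browse or profile a whole database.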
We have experienced all of these issues in production and some of them
still give us nightmares. While we've never lost any user data, it has
certainly made the process of upgrading and enhancing our production
server... challenging.