Never trust doctrine:data-dump…

…and especially not if you get the impression that the dump will afterwards be readable by the `doctrine:data-load` command of symfony.

It was a costly lesson today when I tried to reimport a dump of a couple of Sympal tables. One of them, the one which models the menu items, has a nested set behaviour, and apparently this one cannot be restored properly by doctrine:

[Doctrine_Record_UnknownPropertyException]                                    
  Unknown record property / related component "children" 
  on "sfSympalMenuItem"

Apparently this particular issue popped up a couple of times in the past for other people as well (Google for it) and while the help of `doctrine:data-dump` still (Doctrine 1.2) blatantly states

The doctrine:data-dump task dumps database data:

./symfony doctrine:data-dump

The task dumps the database data in data/fixtures/%target%.

The dump file is in the YML format and can be reimported
by using the doctrine:data-load task.

./symfony doctrine:data-load

(emphasis on “can be reimported”)

the author of Doctrine, Jonathan Wage, told me today on Sympal’s IRC (shortened):

<jonwage> we don’t want people to think you can dump and then restore
<jonwage> that is not what the data fixtures are for
<jonwage> b/c dumping and then loading will never work
<jonwage> an ORM modifies data on the way in and the way out
<me> I mean the least thing doctrine could do there is that if it detects the nested set behaviour it should error out clearly on dump
<jonwage> so you can’t dump the data through an ORM and then try and reload it
<jonwage> i.e. hashed passwords
<me> if dumping is “never” going to work – why do you support dumping into yaml at all?!
<jonwage> if we do that then we would have to throw errors in sooooooo many other cases too
<jonwage> because it is at least a little bit of a convenience
<me> its like a half-baked feature then
<jonwage> we dump the raw data
<jonwage> and you can tweak it
<jonwage> thats my point though, it will ALWAYS be a half baked feature thats why we document it that way
<jonwage> it can NEVER work 100% the way you want it to
<jonwage> so if we fix that one thing, a million other things will be reported that we cannot fix
<jonwage> bc an ORM is not a backup and restore tool
<jonwage> it is impossible

Now I know that as well. My only problem was that I struggled with “what is wrong with my fixtures” the whole time and never dared to ask “what is wrong with Doctrine”…
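To make the “hashed passwords” remark from the chat concrete, here is a minimal sketch in plain PHP. `ToyUser` and its methods are hypothetical stand-ins and not Doctrine code; the only assumption is that the record hashes the password in its setter and that a fixture load pushes the dumped value through the same setter again.

```php
<?php
// Hypothetical stand-in for an ORM record; this is NOT Doctrine code.
class ToyUser
{
    private $data = array();

    // Mutator: the value is modified (hashed) on the way in.
    public function setPassword($plain)
    {
        $this->data['password'] = hash('sha256', $plain);
    }

    // What a dump sees: the stored, already hashed value.
    public function toArray()
    {
        return $this->data;
    }

    // What a fixture load does in this sketch: it calls the setter again.
    public static function fromFixture(array $row)
    {
        $user = new self();
        $user->setPassword($row['password']);
        return $user;
    }
}

$original = new ToyUser();
$original->setPassword('secret');

$dump     = $original->toArray();          // "dump"
$restored = ToyUser::fromFixture($dump);   // "load"

$before = $original->toArray();
$after  = $restored->toArray();

// The restored value is a hash of a hash, so it no longer matches.
var_dump($before['password'] === $after['password']); // bool(false)
```

The dump itself is fine as raw data; it is the second pass through the model's own logic that changes it, which is why the fixtures need tweaking by hand before they can be loaded.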

Tip: Logging with Symfony >= 1.2

Imagine you have a business method in your model which is called from two environments: once from a symfony task and once from the web. So far so good, but what if this method should log visibly in both cases: to the console and to a file when it runs in the command line task, and through the default logging mechanisms when it runs in the web application?

Getting the logger in web context is easy, all you have to do is

$logger = sfContext::getInstance()->getLogger();

but it’s a little harder to do for the command line task.

By default no symfony context is created for a command line task, and even if one is created, the above call returns an instance of sfNoLogger. Logging in command line applications happens through the sfTask::logSection() method, which basically throws an event at the dispatcher created in SYMFONYDIR/lib/command/cli.php. There you can also see that an instance of sfCommandLogger is created, but there is no way to get your hands on this instance, because it’s purely local.

So what can we do? Parameterizing the business method with the sfTask instance and using logSection() is obviously no solution, because this would break in web context where no such sfTask instance exists…

My solution was a bit more straightforward – I simply decided not to use the task-supplied logging scheme at all, but created my own logger like this:

// a dispatcher of our own, shared by all loggers below
$dispatcher = new sfEventDispatcher();
$logger = new sfAggregateLogger($dispatcher);
// console output
$logger->addLogger(new sfCommandLogger($dispatcher));
// optionally add another file logger
if ($logToFile)
{
    $logger->addLogger(
        new sfFileLogger($dispatcher, ...)
    );
}
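How the logger then reaches the business method is not spelled out above. One possible wiring, sketched in plain PHP so it runs without symfony: the method accepts “a logger” and does not care which context built it. All `Toy*` classes and `importRecords()` are hypothetical stand-ins; in a real project the argument would be an sfLogger instance (the aggregate logger in the task, the context logger on the web).

```php
<?php
// Hypothetical stand-ins; in symfony these would be sfLogger subclasses.
interface ToyLogger
{
    public function log($message);
}

// Collects messages in memory so the example is easy to inspect.
class ToyMemoryLogger implements ToyLogger
{
    public $lines = array();
    private $prefix;

    public function __construct($prefix)
    {
        $this->prefix = $prefix;
    }

    public function log($message)
    {
        $this->lines[] = $this->prefix . $message;
    }
}

// Forwards every message to all registered loggers.
class ToyAggregateLogger implements ToyLogger
{
    private $loggers = array();

    public function addLogger(ToyLogger $logger)
    {
        $this->loggers[] = $logger;
    }

    public function log($message)
    {
        foreach ($this->loggers as $logger)
        {
            $logger->log($message);
        }
    }
}

// The business method only knows that it was handed a logger.
function importRecords(array $records, ToyLogger $logger)
{
    foreach ($records as $record)
    {
        $logger->log('imported ' . $record);
    }
    return count($records);
}

// Task side: console and file logger combined.
$console = new ToyMemoryLogger('[console] ');
$file    = new ToyMemoryLogger('[file] ');
$taskLogger = new ToyAggregateLogger();
$taskLogger->addLogger($console);
$taskLogger->addLogger($file);
$count = importRecords(array('a', 'b'), $taskLogger);

// Web side: a single logger is enough.
$web = new ToyMemoryLogger('[web] ');
importRecords(array('c'), $web);
```

The business method stays free of any reference to sfTask or sfContext, so it behaves the same in both environments.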

Hope this helps somebody.

monotone 0.46 released

The monotone developers are proud to announce the release of version 0.46. The highlights in this release are bisection support – thanks to Derek Scherger! – and the possibility to call the automation interface over the network – thanks to Timothy Brownawell!

Please note that the stdio interface has been changed in a backwards-incompatible way. More information can be found in the documentation and in an earlier blog post of mine.

Thanks again to everybody who made this release possible! Grab it while it’s hot – MacPorts already has the new version and other binaries should follow shortly after this announcement.

Doctrine Horror

My latest Symfony project uses Doctrine as ORM, which is considered to be a lot better than Propel by many people…

Well, not by me. Doctrine seems to have a couple of very good concepts, amongst them built-in validators, a powerful query language, and last but not least, an easy schema language. (Though to be fair, Propel will gain most of these useful things in the future as well or already has, e.g. with its `PropelQuery` feature.)

But Doctrine also fails in many areas; the massive use of overloads everywhere makes it very hard to debug, and even worse, it tries to outsmart you (the developer) in many places, which makes it even harder to debug the things Doctrine doesn’t get right.

A simple example – consider this schema:

Foo:
  columns:
     id: { type: integer(5), primary: true, autoincrement: true }
     name: { type: string }

Bar:
  columns:
     id: { type: integer(5), primary: true, autoincrement: true }
     name: { type: string }

FooBarBaz:
  columns:
     foo_id: { type: integer(5), primary: true }
     bar_id: { type: integer(5), primary: true }
     name: { type: string }

(I’ll skip the relation setup here, Doctrine should find them all with an additional `detect_relations: true`)

So what do you expect to see when you call this?

$obj = new FooBarBaz();
print_r($obj->toArray());

Well, I expected to get an empty object, with a `NULL`ed `foo_id` and `bar_id`, but I didn’t! For me `foo_id` was filled with a 1. Wait, where does this come from?

After digging deep enough into Doctrine_Record, I saw that this value was automatically assigned in the constructor, coming from a statically incremented `$_index` variable. I could revert this by using my own constructor and calling `assignIdentifier()` like this:

class FooBarBaz extends BaseFooBarBaz 
{
   public function __construct()
   {
      parent::__construct();
      $this->assignIdentifier(false);
   }
}

but now this object could no longer be added to a `Doctrine_Collection` (which is a bummer, because if you want to extend object lists with “default” empty objects, you most likely stumble upon a Doctrine_Collection, which is the default data structure returned for every SQL query).

So you might ask “Why the hell does all this pose a problem for you?”

Well, if you work with the `FooForm` created by the doctrine plugin for you in Symfony and you want to add `FooBarBazForm` via `sfForm::embedFormForEach` a couple of times (similar to the use case described here), you suddenly have the problem that your embedded form for the appended new `FooBarBaz` object “magically” gets a foo_id of a wrong (maybe not existing) `Foo` object and you wonder where the heck this comes from…

I’ve learned my lesson over the last one and a half days. I promise I’ll never *ever* create a table in Doctrine with a multi-key primary key again, and I’m returning to Propel for my next project.

monotone automate stdio overhauled [Update]

Yesterday my “automate-out-of-band” branch finally made it into monotone’s trunk. This is a prerequisite for the support of netsync commands in guitone I’ve blogged about earlier, as it makes in-stream informational, error and ticker messages possible, even for remote connections!

While I was at it, several other small things have been changed in stdio, e.g. the error code is now only issued once, as payload of the ‘l’ stream. Unfortunately this means that monotone 0.46 will break compatibility with clients which only understand the pre-0.46 output format. To avoid another hard break like this in the future, a new header section has been added to both stdio variants, and the first header issued there is the “format-version” header:

$ mtn au stdio
format-version: 2

[...actual output...]

`format-version` is promised to stay constant as long as the output format doesn’t change and will be incremented by 1 if there is another major dealbreaker in the future. This is different from the `interface_version` number we also have in the automate interface, whose major number is raised every time an incompatible change is made to any automate command. We’ll possibly change this behaviour to something more client-friendly in the future, but there is no ETA on this yet, as the current system is still good enough.
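For client authors, a version check could look roughly like the following sketch. It assumes, based only on the example output above, that the header section consists of `name: value` lines terminated by an empty line; the function names are made up for illustration and are not part of monotone or guitone.

```php
<?php
// Splits raw stdio output into a header map and the remaining payload.
// Assumption: headers are "name: value" lines, ended by an empty line.
function splitStdioHeaders($raw)
{
    $parts = explode("\n\n", $raw, 2);
    $headers = array();
    foreach (explode("\n", $parts[0]) as $line)
    {
        if (strpos($line, ': ') === false)
        {
            continue; // not a header line, ignore it
        }
        list($name, $value) = explode(': ', $line, 2);
        $headers[$name] = $value;
    }
    $body = isset($parts[1]) ? $parts[1] : '';
    return array($headers, $body);
}

// Refuses to continue if the format is not the one this client was written for.
function assertSupportedFormat(array $headers, $supported)
{
    if (!isset($headers['format-version'])
        || (int) $headers['format-version'] !== $supported)
    {
        throw new RuntimeException('unsupported stdio format');
    }
}

list($headers, $body) = splitStdioHeaders("format-version: 2\n\n[...actual output...]");
assertSupportedFormat($headers, 2);
```

Failing early on an unknown version is the point of the header: a client written for one format stops with a clear error instead of misparsing a later one.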

All changes for (remote) stdio will be clearly documented in the manual once 0.46 is out. Before this release happens though, I plan to finish the “automate-netsync” branch as well… Holiday time is hacking time 🙂

[Update: It was decided to name the version header “format-version” instead of “stdio-version”; I’ve updated my example accordingly]

Configure Thunderbird 3’s indexing behaviour

The current version of Thunderbird comes with a terrific global search functionality, but sometimes it’s cumbersome to watch it reindex the whole email history because something corrupted the database, or to get emails in the search results which you’re absolutely not interested in (commit messages, e.g.).

Unfortunately Thunderbird 3 only has a global option to enable or disable the search database and the indexer, but a smart guy has filled this gap with his extension GlodaQuilla. After you’ve installed it you can configure so-called “inherited properties” for every account and every folder, easily overridable by toggling the “inherit” option.

I’d prefer the Thunderbird developers to build this right into the product itself, but until that has been done this add-on is a life saver!

They’re not enough yet (Update)

I’m watching the climate conference in Copenhagen with great anger, mostly because it confirms my feeling that mankind is simply stupid. Not individuals on their own, but as soon as people are organized and have to move in the one, the only right direction, they’re completely dumb. I mean, how many homeless, how many dead people do we have to count before everyone, and I really mean everyone, acts in concert?

Sadly, this won’t happen in one, five or even ten years. Maybe by the middle of this century, when enough tsunamis, hurricanes and typhoons have killed people, and enough floods, soil erosion and slash-and-burn have wiped out enough farmland to let the remaining poor die of hunger, some leaders will open their eyes and come to the conclusion “We’ve fucked up in Copenhagen ’09”.

Update: I’ve stumbled across a noteworthy article in The Guardian by Mark Lynas about the conference and the blocking position of China:

[…] China’s growth, and growing global political and economic dominance, is based largely on cheap coal. China knows it is becoming an uncontested superpower […] Its coal-based economy doubles every decade, and its power increases commensurately. Its leadership will not alter this magic formula unless they absolutely have to. […]

History tells us that a single human life – or even hundreds or thousands of them – doesn’t count much in China. I expect this won’t change in the future.

openSUSE build service client ported

I used to create packages for a couple of open source projects for the openSUSE Linux distribution. They have this really nice build service running on build.opensuse.org, on which you can – despite its name – also build packages for other Linux distributions like Fedora, Gentoo or Debian.

While the web-based interface of the service is nice, some configurations and local builds require their command line client osc, which is Python-based and works similarly to Subversion. This client, however, was only packaged for the main distros the build service itself supports and was unavailable for others such as Mac OS X, so I created a MacPort for it today (installable via sudo port install osc).

Of course local Linux builds are not possible with it, as we’re missing the complete environment, but I think it’s still useful for maintaining and managing remote builds on the service itself. Have fun with it!

monotone-viz [updated]

I’ve recently packaged monotone-viz 1.0.2 for MacPorts (and soon also for openSUSE), a program to display monotone’s DAG of revisions and their properties. This becomes very handy if you need to do a complex (asynchronous) merge or you want to know what exactly monotone has merged together for you. One example is the graph of the “merge fest” we’ve had in spring 2008 for the last summit you see on the right.

[Figure: complex merge in monotone. Source: monotone website]

Merging in monotone is actually quite robust; while I’ve had a lot of “fuzzy” feelings in the past when doing complex merges with Subversion or even CVS, merging in monotone is a no-brainer. Most of the time it does exactly what you want it to do. One exception is the handling of deleted files, also known as “die-die-die” merge fallout: if you merge two distinct development lines where a file has been edited on the left side and deleted on the right side, the deletion always wins over the edit, and there is absolutely nothing you can do about it (well, apart from re-adding the file after the merge and losing the file’s previous history). Thankfully this is not such a common use case, and keeping an “Attic” directory where deleted, but possibly revivable files reside is the medium-term solution, until someone picks up the topic again.

But back to monotone-viz, I couldn’t fix one problem with monotone-viz on MacPorts: It doesn’t properly draw the arrows on the graph, but rather puts them above the revisions, like this:

[Figure: monotone-viz drawing bug]

I’ve already asked the author about it, but he couldn’t find out what’s wrong, so I suspect something is wrong with my gtk+ setup. If you have a hint for me where to look, give me a pointer, I’d be very thankful. And if you can tell me that it works correctly for you, then even better, drop me a note as well. I’ve uploaded a test monotone database with a simple merge to test the behaviour. Thanks!

[Update: As this bug points out, the render problem comes from Graphviz’ dot program – hopefully the patch will make it into a new release shortly.]