Welcome to End Point’s blog

Ongoing observations by End Point people

Subverting Subversion for Fun and Profit

One of our clients recently discovered a bug in a little-used but vital portion of the admin functionality of their site. (Stay with me here...) After traditional debugging techniques failed on the code in question, it was time to look to the VCS for identifying the regression.

We fortunately had the code for their site in version control, which is obviously a big win. Unfortunately (for me, at least), the repository was stored in Subversion, which means that my bag o' tricks was significantly diminished compared to my favorite VCS, git. After attempting to use 'svn log/svn diff -c' to help identify the culprit based on what I *thought* the issue might be, I realized that svn was just not up to the task.

Enter git-svn. Using git svn clone file://path/to/repository/trunk, I was able to acquire a git-ized version of the application's repository. For this client, we use DevCamps exclusively, so the entire application stack is stored in the local directory and run locally, including apache instance and postgres cluster. These pieces are necessarily unversioned, and are ignored in the repository setup. I was able to stop all camp services in the old camp directory (svn-based), rsync over all unversioned files to the new git repository (excluding the .svn metadata), replace the svn-based camp with the new git-svn based one, and fire up the camp services again. Started up immediately and worked like a charm. I now had git installed and working in what had previously only been svn-capable before.

Now that I had a git installation, I was able to pull one of my favorite tools from my toolbox when fighting regressions: git-bisect. In my previous svn contortions, I had located a previous revision several hundred commits back which did not exhibit the regression, so I was able to start the bisect with the following command: git bisect start bad good. In this case, bad was master and good was the revision I had found previously. Using git svn find-rev rnumber, I found the SHA1 commit for the good ref as git saw it.

From this point, I was able to quickly identify the commit which introduced the regression. In reviewing the diff, there was nothing that I would have expected to cause the issue at hand; the code did not touch any of the affected area of the admin. But git had never lied to me before. I compared the code currently in master with that introduced in the implicated commit and saw that most of it was still in place. I began selectively commenting out pieces of the code the commit introduced, and was able to enable/disable the bug with increasingly fine granularity. Finally, I was able to identify the single line which when removed caused the issue to evaporate. This was a line in an innocuous template which had a simple variable interpolation (inside an HTML comment, nonetheless); however, this line (which was in a file which was included with every document, added in the implicated commit) revealed a bug in the parser of the app-server which was causing the symptoms in the unrelated admin area.

It's certain that I would never have been able to find the source of this issue without git-bisect, as manual bisection with svn would have been too tedious to even consider. I am able to happily interact with the rest of the development team with git being my secret weapon; git svn dcommit enables me to push my commits upstream, and git svn fetch/git svn rebase enable me to pull in the upstream changes. I'll never need to tell my subversive secret (except, you know, on the company blog), and my own happiness and productivity has increased. Profit!!11 all around.

1 comment:

Jon Jensen said...

Nice find, David.

Since I know about the code you removed, I will explain what I think was actually wrong:

(1) First, for the record, you know that the offending code being in an HTML comment is irrelevant and doesn't make the bug any more or less strange. :)

(2) In Interchange, @_SOMEVAR_@ (as well as __SOMEVAR__) isn't as innocuous as it may seem. See &Vend::Interpolate::vars_and_comments where the Pragma dynamic_variables is checked for.

(3) Pragma dynamic_variables makes variable interpolation get handled by &Vend::Interpolate::dynamic_var which checks your VariableDatabase for any changes to variables on the fly. (So you don't have to reload the catalog to pick up changes.)

(4) The VariableDatabase variable is a GDBM table, and when that's opened by dynamic_var as the first access of a request, it ends up in read-only mode.

(5) Later attempts to write to the variable GDBM table fail because it's read-only.

I think this may be fixable for a given page by explicitly flagging the variable table as writable, but I'm not sure.

Since you were using MV_PAGE, which is always global and isn't a real developer-set variable anyway, you can use the standard ITL method Mark mentioned to get it: [var MV_PAGE 1].

But it's actually easier than that, now that I review the source: You can use @@MV_PAGE@@ and it'll work fine, because global variables are never dynamically checked, and MV_PAGE is always global.

So it was actually a bug in the custom catalog code that used @_MV_PAGE_@ which needlessly checks catalog variable space for an always-global variable. It's a twisted explanation, so I can't imagine how you overlooked it before. :)

Perhaps a good argument for putting the variable table into an SQL database instead of GDBM, but then you don't get the easy version-control of the tab-delimited export file.