Welcome to End Point’s blog

Ongoing observations by End Point people

Code Debt-Free

Every now and then, the opportunity arises to write debt-free code (meaning free of technical debt). When such opportunities come, we must seize them.

I recently had the distinct pleasure of cranking out some Perl modules in the following order:

  1. Write documentation for the forthcoming functionality
  2. Implement unit tests for the aforementioned forthcoming functionality
  3. Verify that the unit tests fail
  4. Implement the awaited functionality
  5. Verify (jumping back to step 4 as necessary) that the unit tests pass

Timelines, interruptions, and other pressures often get in the way of this short-term development cycle. The cycle can feel tedious; it makes the task of implementing even simple functions seem unpleasantly large and drawn out. When an implementation approach flashes into the engineer's mind, leaping to step 4 (implementation) feels natural and immediately gratifying. The best-intentioned of us can fall into this out of habit, out of inertia, out of raw enthusiasm.

Documentation, though, demonstrates that you know what you're trying to achieve. It is not a nicety; it is proof that you understand the problem at hand. Unit tests, as hard as they can sometimes be to implement, offer proof that you in fact achieved the documented aspirations. Both serve to illustrate intent and purpose. That intent, and the thinking that informed it, is arguably more important than the code itself. It is the code's reason for being.

Many engineers at many companies in many circumstances have sung, do sing, and will sing the praises of test-driven development, unit testing, and the Path to Software Engineering Enlightenment. Few of those engineers actually do it. It is not natural for human beings to expend energy on a multi-step process when they believe, falsely, that a one-step process would achieve the same ends.

Here is what is necessary to ensure that we don't let our weaker nature subvert our perfect plans:

  • self-discipline
That is all.

MySQL vs. PostgreSQL mailing list activity

My co-worker, Greg Sabino Mullane, noted this writeup on the MarkMail blog comparing the amount of traffic on the various MySQL and PostgreSQL mailing lists.

I suppose you could pessimistically say that PostgreSQL users need more community help than MySQL users do, but reviewing the content of the traffic (and going from years of personal experience) doesn't support such a view. The PostgreSQL community seems to have more long-term, deeply involved users who are also contributors.

But let's hope the competition in the free database world picks up. It looks like the new Drizzle project has a good chance of growing a new community around MySQL.

In any case, the MarkMail mailing list archive and search service is an excellent resource. Thanks, MarkMail folks!

Git push: know your refspecs

The ability to push and pull commits to/from remote repositories is obviously one of the great aspects of Git. However, if you're not careful with how you use git-push, you may find yourself in an embarrassing situation.

When you have multiple remote tracking branches within a Git repository, a bare git push invocation will attempt to push all of those remote branches out. If you have commits stacked up that you weren't quite ready to publish, this can be somewhat unfortunate.

There are a variety of ways to accommodate this:

  • use local branches for your commits, only merging those commits into your remote tracking branches when you're ready to push them out;
  • push remote tracking branches out whenever you have something worth committing.

However, even with sensible branch management practices, it's worthwhile to know exactly what it is you're pushing. Therefore, if you want a sense of what you're potentially doing in calling a bare git push, always call it with the --dry-run option first. This will show you what the push would send out, where the conflicts are, and so on, all without actually performing the push.
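To see --dry-run in action without touching any real project, you can wire up a throwaway pair of repositories (all paths and names below are invented) and ask Git what a push would do:

```shell
set -e
demo=/tmp/push_dryrun_demo    # throwaway location, invented for this sketch
rm -rf "$demo"; mkdir -p "$demo"

# A bare "shared" repository and a working clone of it.
git init -q --bare "$demo/shared.git"
git clone -q "$demo/shared.git" "$demo/work" 2>/dev/null
cd "$demo/work"
git config user.email dev@example.com
git config user.name "Dev"

echo 'hello' > file.txt
git add file.txt
git commit -qm 'first commit'

# Report what would be sent to origin, without sending anything:
git push --dry-run origin HEAD
```

The dry run reports the branch it would create on the remote, while the remote itself stays empty.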

It is ultimately best, though, to understand the different ways of invoking git push so you can control things precisely and only change exactly what you want to change.

 git push some_repo some_branch

This will identify the ref named some_branch within your repository and push it out to the some_repo repository. If you are good about having your remote tracking branches use the same name as the source branch in the relevant remote ref, this is a simple, effective way of ensuring that you're pushing out one branch and only one branch. However, it does require that you know the purpose of some_repo; it doesn't do any magic for deciding what the "right" repository to push to is based on some_branch.

To be extremely precise, you can use a full refspec in your push call:

 git push some_repo local_branch:refs/heads/new_branch

This takes the local branch local_branch and pushes it out to the repository identified by some_repo, under the branch name new_branch on the remote side. This is a very useful invocation to understand in order to create new branches in bare repositories shared between developers/repositories. While both examples shown here will create the branch in some_repo if it does not already exist, the second example gives you full control over the remote branch name.

If you're sharing your work with multiple developers/repositories, it can become unwieldy if not impossible to keep your tracking branch names consistent with the source branch names in your remote refs. In that case, knowing these invocations of git push is an absolute necessity.

Check out the documentation on git push for a full explanation, and for an example of how to delete a branch in a remote ref. There are considerably more options for the command than what is explained here, but the refspec documentation can be a bit confusing to newcomers, in which case hopefully this discussion provides a bit more clarity. (Then again, perhaps it doesn't.)

Perl incompatibility moving to 5.10

We're preparing to upgrade from Perl 5.8.7 to 5.10.0 for a particular project, and ran into an interesting difference between the two versions.

Consider the following statement for some hashref $attrib:

  use strict;
  my ($a, $b, $c) = @{%{$attrib}}{qw(a b c)};

In 5.8.7, the @{...} construct will return a slice of the hash referenced by $attrib, meaning that $a gets $attrib->{a}, $b gets $attrib->{b}, and so on.

In 5.10.0, the same construct will result in an error complaining about using a string for a hashref.

I suspect it's due to the hash dereference (%{$attrib}) being fully executed prior to applying the hash-slice operation (@{...}{qw(a b c)}), meaning that you're not operating against a hashref anymore.

Fortunately, the fix is wonderfully simple and significantly more readable:

  my ($a, $b, $c) = @$attrib{qw( a b c )};

The "fix" -- which is arguably how it should have been constructed in the first place, but this is software we're talking about -- works in both versions of Perl.

Signs of a too-old Git version

When running git clone, if you get an error like this:

Couldn't get http://some.domain/somerepo.git/refs/remotes/git-svn for remotes/git-svn
The requested URL returned error: 404
error: Could not interpret remotes/git-svn as something to pull

You're probably using a really old version of Git that can't handle some things in the newer repository. The above example came from the very old version of Git included with Debian Etch. The best way to handle that is to use Debian Backports to upgrade to Git 1.5.5.

On Red Hat Enterprise Linux, Fedora, or CentOS, the Git maintainers' RPMs usually work (though you may need to get a dependency, the perl-Error package from RPMforge).

If all else fails, grab the Git source and build it. I've never had a problem building the code anywhere, though building the docs requires a newer version of asciidoc than is easy to get on RHEL 3.

Building Perl on 64-bit RHEL/Fedora/CentOS

When building Perl from source on 64-bit Red Hat Enterprise Linux, Fedora, CentOS, or derivatives, Perl's Configure command needs to be told about the "multilib" setup Red Hat uses.

The multilib arrangement allows both 32-bit and 64-bit libraries to exist on the same system, and leaves the "non-native" 32-bit libraries in /lib and /usr/lib while the "native" 64-bit libraries go in /lib64 and /usr/lib64. That allows the same 32-bit RPMs to be used on either i386 or x86_64 systems. The downside of this is that 64-bit applications have to be told where to look for, and put, libraries, or they usually won't work.

For Perl, to compile from a source tarball with the defaults:

./Configure -des -Dlibpth="/usr/local/lib64 /lib64 /usr/lib64"

Then build as normal:

make && make test && sudo make install

I hope this information will come in handy for someone. I believe I learned it from Red Hat's source RPM for Perl.