Welcome to End Point’s blog

Ongoing observations by End Point people

Insidious List Context

Recently, I fell into a deep pit. Not literally, but a deep pit of Perl debugging. As a result, I'm here to warn you and yours about "Insidious List Context(TM)".

(Note: this is a fairly elementary discussion, for people early in their Perl wizardry training.)

Perl has two contexts for evaluating expressions: list and scalar. (All who know this stuff cold can skip down a ways.) "Scalar" context is what non-Perl languages just call "normal reality", but Perl likes to do things ... differently ... so we have more than one context.

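In scalar context, a scalar is a scalar is a scalar, but an array becomes a scalar that represents the number of items in the array. (A literal comma-separated list is different: in scalar context the comma operator yields its last element.) Thus,

@x = (1, 1, 1);  # @x is an array of three 1s
# vs.
$n = @x;         # $n is 3, the array size
$x = (1, 1, 1);  # $x is 1, the last element of the list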

In list context, a list of things is still a list of things. That's pretty simple, but when you are expecting a scalar and you get a list, your world can get pretty confused.
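A minimal sketch of how the same expression can land in either context. The names here (@items, context()) are made up for illustration; wantarray is the Perl built-in that reports which context a sub was called in.

```perl
use strict;
use warnings;

my @items = ('a', 'b', 'c');

my $count   = @items;   # scalar context: the array's length
my ($first) = @items;   # parentheses on the left give list context

# Hypothetical helper: reports the context it was called in.
sub context { return wantarray ? 'list' : 'scalar' }

my @in_list   = context();   # called in list context
my $in_scalar = context();   # called in scalar context

print "count=$count first=$first\n";        # count=3 first=a
print "$in_list[0] vs. $in_scalar\n";       # list vs. scalar
```

The caller, not the sub, decides the context, which is why the same sub can behave differently from one call site to the next.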

Okay, now the know-it-alls have rejoined us. I had a Perl hashref being initialized with code something like this:

my $hr = {
  KEY1 => $value1,
  KEY2 => $value2,
  KEY_TROUBLE => (defined($foo) ? mysub($foo) : 1),
  KEY3 => $value3,
};

So here is the issue: if mysub() returns a list, then the hashref will get extra data. Remember, Perl n00bs, "=>" is not a magical operator, it's just a "fat comma". So a construction like this:

1 => (2, 3, 4)
is really the same as:
1, 2, 3, 4
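To see the flattening for yourself, here is a small sketch with throwaway names (@flat and %h are not from the original code):

```perl
use strict;
use warnings;

# The inner parentheses do not create a nested structure; the list is flattened.
my @flat = (1 => (2, 3, 4));

# Same effect inside a hash: the "extra" items become the next key and value.
my %h = (a => (1, 'b', 2));

print scalar(@flat), "\n";                                 # 4
print join(',', map { "$_=$h{$_}" } sort keys %h), "\n";   # a=1,b=2
```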

Here's a complete example to illustrate just what size and shape hole I fell into:

use strict;
use Data::Dumper;

my($value1,$value2,$value3,$foo) = qw(value1 value2 value3 foo);

my $hr = {
  KEY1 => $value1,
  KEY2 => $value2,
  KEY_TROUBLE => (defined($foo) ? mysub($foo) : 1),
  KEY3 => $value3,
};

print Dumper($hr);  # Dumper is exported by default; call it as a plain function

sub mysub {
  return qw(junk extrajunk);
}

This outputs (key order may vary between runs):

$VAR1 = {
          'extrajunk' => 'KEY3',
          'KEY2' => 'value2',
          'KEY1' => 'value1',
          'value3' => undef,
          'KEY_TROUBLE' => 'junk'
        };
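One way out of the hole, sketched under the assumption that you want exactly one value per key: wrap the expression in scalar(). If you would rather keep everything the sub returns, an anonymous array works too. KEY_ALL is a hypothetical key added only for this illustration.

```perl
use strict;
use warnings;

sub mysub { return qw(junk extrajunk) }

my $foo = 'foo';

my $hr = {
  KEY1        => 'value1',
  # scalar() forces scalar context, so exactly one value lands here
  KEY_TROUBLE => scalar(defined($foo) ? mysub($foo) : 1),
  # or keep the whole list, safely wrapped in an array reference
  KEY_ALL     => [ defined($foo) ? mysub($foo) : 1 ],
  KEY3        => 'value3',
};

print join(',', sort keys %$hr), "\n";   # KEY1,KEY3,KEY_ALL,KEY_TROUBLE
```

Turning on warnings also helps here: the broken version above builds a hash from an odd number of items, which Perl reports when warnings are enabled.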

Now, the actual subroutine involved in my little adventure was even more insidious: it returned a list because its return expression, a regular-expression match, was being evaluated in list context. Its actual source:

sub is_yes {
   return( defined($_[0]) && ($_[0] =~ /^[yYtT1]/));
}
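Here is a sketch of what that sub does when called in list context, next to one possible fix. is_yes_scalar is a hypothetical name, not part of the original code; the ternary makes it return a plain 1 or 0 no matter how it is called.

```perl
use strict;
use warnings;

# The original sub, unchanged.
sub is_yes {
   return( defined($_[0]) && ($_[0] =~ /^[yYtT1]/));
}

# Hypothetical variant: always returns exactly one value.
sub is_yes_scalar {
   return( defined($_[0]) && $_[0] =~ /^[yYtT1]/ ? 1 : 0 );
}

my @no  = is_yes('no');    # failed match in list context: empty list
my @yes = is_yes('yes');   # successful match with no captures: (1)

# Built as plain lists so the element counts are easy to inspect.
my @broken = (FLAG => is_yes('no'),        NEXT => 'value');
my @fixed  = (FLAG => is_yes_scalar('no'), NEXT => 'value');

print scalar(@no), " ", scalar(@yes), "\n";        # 0 1
print scalar(@broken), " ", scalar(@fixed), "\n";  # 3 4
```

With the original sub, a "no" answer makes the value disappear entirely, so NEXT slides into its place as the value for FLAG.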

So watch those expression-evaluation contexts; they can turn fairly harmless expressions into code-busters.

2 comments:

Mike Pomraning said...

Neat. Been there.

In a scalar context, an *array* evaluates to its length, whereas a *list* evaluates to its final element. So in your example `$x=(1,1,1)`, $x is 1, not 3.

Now, subs don't "[return] a list context," but they may be evaluated in a list context.

(Was your actual problem that is_yes() was returning an empty list? That would flummox your hashkeys, and at first blush I don't see how is_yes() can return other than a zero or one element list.)

David Christensen said...

Hi Mike,

Yes, is_yes was returning an empty list. Since it was called in the hash assignment, the result of the regexp match was being evaluated in list context, so when there was no match it returned zero elements, which shifted the hash keys.