DBD::Pg prepared statement change

One of the changes in the recently released DBD::Pg version 3 (in addition to the big utf8 change) is the addition of a new attribute, pg_switch_prepared. This accompanies a behavior change in the use of prepare/execute: DBD::Pg will now postpone creating a server-side PREPARE statement until the second time a query is run via the execute() method.

Technically, DBD::Pg will use PQexecParams (part of the underlying libpq system that DBD::Pg uses) the first time a statement is executed, and switch to using PQexecPrepared the second time the statement is executed (by calling PQprepare first). When it actually switches is controlled by the pg_switch_prepared attribute, which defaults to 2 (the behavior above). You can set it to 0 or 1 to always use PQexecPrepared (as the older versions did), or you can set it to -1 to always use PQexecParams and avoid creating prepared statements entirely.
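
If the default does not suit your application, this is just an attribute on the database handle. A minimal sketch (the connection parameters and query are placeholders):

#!perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('DBI:Pg:dbname=postgres', '', '',
    {AutoCommit=>0, RaiseError=>1, PrintError=>0});

## Never create server-side prepared statements: always use PQexecParams
$dbh->{pg_switch_prepared} = -1;

## Or act like the 2.x versions and prepare on the very first execute():
## $dbh->{pg_switch_prepared} = 1;

my $sth = $dbh->prepare('SELECT count(*) FROM pg_class WHERE relname = ?');
$sth->execute('foobar1');   ## runs via PQexecParams because of the -1 above
$sth->execute('foobar2');   ## still PQexecParams; no PREPARE is ever sent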

The typical flow of events in a DBI script is to create a statement handle via the prepare() method, then call the execute() method with varying arguments as many times as needed.

#!perl

use strict;
use warnings;
use DBI;

my $DSN = 'DBI:Pg:dbname=postgres';
my $dbh = DBI->connect($DSN, '', '', {AutoCommit=>0,RaiseError=>1,PrintError=>0})
  or die "Connection failed!\n";
print "DBI is version $DBI::VERSION, DBD::Pg is version $DBD::Pg::VERSION\n";

## We do this so we can see the version number in the logs
my $SQL = 'SELECT ?::text';
$dbh->do($SQL, undef, "DBD::Pg version $DBD::Pg::VERSION");

my $sth = $dbh->prepare('SELECT count(*) FROM pg_class WHERE relname = ?');
$sth->execute('foobar1');
$sth->execute('foobar2');
$sth->execute('foobar3');

When the script above is run on DBD::Pg versions 2.19.1 and 3.0.0, you can see the difference:


LOG:  execute <unnamed>: SELECT $1::text
DETAIL:  parameters: $1 = 'DBD::Pg version 2.19.1'
LOG:  execute dbdpg_p30462_1: SELECT count(*) FROM pg_class WHERE relname = $1
DETAIL:  parameters: $1 = 'foobar1'
LOG:  execute dbdpg_p30462_1: SELECT count(*) FROM pg_class WHERE relname = $1
DETAIL:  parameters: $1 = 'foobar2'
LOG:  execute dbdpg_p30462_1: SELECT count(*) FROM pg_class WHERE relname = $1
DETAIL:  parameters: $1 = 'foobar3'


LOG:  execute <unnamed>: SELECT $1::text
DETAIL:  parameters: $1 = 'DBD::Pg version 3.0.0'
LOG:  execute <unnamed>: SELECT count(*) FROM pg_class WHERE relname = $1
DETAIL:  parameters: $1 = 'foobar1'
LOG:  execute dbdpg_p30618_1: SELECT count(*) FROM pg_class WHERE relname = $1
DETAIL:  parameters: $1 = 'foobar2'
LOG:  execute dbdpg_p30618_1: SELECT count(*) FROM pg_class WHERE relname = $1
DETAIL:  parameters: $1 = 'foobar3'

As you can see, the do() method always uses PQexecParams (this is what creates the "<unnamed>" statement seen in the logs). For the prepare/execute section, the older versions issued an implicit prepare right away, while 3.0.0 uses an unnamed statement for the first execution and switches to a named prepared statement only when the query is executed more than once. A single PQexecParams call is faster than a PQprepare plus a PQexecPrepared, but if you are going to execute the same query a number of times, it is more efficient to simply send the arguments via PQexecPrepared and absorb the one-time cost of creating the statement via PQprepare.

What does this mean for users of DBD::Pg? Probably nothing, as the new default is already a decent compromise, but it's good to know the pg_switch_prepared knob is there if you need it.

MySQL, ASCII Null, and Data Migration

Data migrations always have a wide range of challenges. I recently took on a request to determine the difficulty of converting an ecommerce shop's MySQL 5.0 database to PostgreSQL 9.3. The first (presumably "easier") step was just getting the schema converted and the data imported, before tackling the more challenging work of assessing the site's query base and rewriting the large number of custom queries that leverage MySQL-specific language elements into their PostgreSQL counterparts.

During the course of this first part, which presented a number of difficulties I had anticipated, I hit one that I definitely had not anticipated:

ERROR:  value too long for type character varying(20)

Surely, the error message is absolutely clear, but how could this possibly be? The obvious answer--that the varchar definitions were different lengths between MySQL and PostgreSQL--was sadly quite wrong (which you knew, or I wouldn't have written this).

After isolating the row in question, the first thing that stood out was the presence of an ASCII null character in the data:

'07989409006\007989409'

OK, sure, that's weird (and I have a hunch why it's there), but I was still puzzled how the "too long" data could be in MySQL in the first place. Then things became both weirder and clearer when I went to look at the record as it exists in the source database of the dump. The table in question has many columns, so naturally I looked at the results with the left-justified output of \G. Without displaying the full record, for a number of reasons, here's what I saw for that field:

*************************** 1. row ***************************
xxxxx: 07989409006
1 row in set (0.00 sec)

Wow! OK, so in MySQL, even though the ASCII null is clearly preserved in the data (the row came from a mysqldump directly from that same database), the data were somehow truncated to the string before the ASCII null, so the string length must be determined by the same process that does the truncating. To "prove" this to myself, I added "AND xxxxx = '07989409006'" to the where clause of the above query, and sure enough the record was not found.

I discussed all of my findings with a colleague, who was equally bewildered and recommended I post something describing the issue. I agreed, and to make sure I was talking about an existing problem (this was MySQL 5.0 after all) I went to reproduce the issue on a more modern release (v 5.5 to be exact).

I created a test case, with a new table having a single field into which I would insert a too-long string that had an ASCII null at a point such that the string to that point was less than the field length. My testing seemed to indicate the problem had been fixed sometime between 5.0 and 5.5:

mysql> select version();
+-----------------+
| version()       |
+-----------------+
| 5.5.35-33.0-log |
+-----------------+
1 row in set (0.00 sec)

mysql> create table foo (foo_var varchar(5));
Query OK, 0 rows affected (0.60 sec)

mysql> insert into foo values ('123\0456');
Query OK, 1 row affected, 1 warning (0.93 sec)

mysql> select * from foo;
+---------+
| foo_var |
+---------+
| 123 4   |
+---------+
1 row in set (0.00 sec)

Now to run the exact same test on my MySQL 5.0 instance, prove the deviation in behavior, and we'll be good to go:

mysql> select version();
+------------+
| version()  |
+------------+
| 5.0.95-log |
+------------+
1 row in set (0.00 sec)

mysql> create table foo (foo_var varchar(5));
Query OK, 0 rows affected (0.00 sec)

mysql> insert into foo values ('123\0456');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> select * from foo;
+---------+
| foo_var |
+---------+
| 123 4   |
+---------+
1 row in set (0.00 sec)

So, my contrived test also worked on the original MySQL server that had dumped out the data that led to this problem in the first place. What was going on here?

By now, I'm pretty seriously flailing about in indignation. What was so special about the data from the dump, and its behavior in the original table, that I had failed to understand and reproduce? I begin devising tests that seemed too absurd to be believed, as though I had two apples that, if I repositioned them just right, would prove 1+1 is actually 3. Then, by a stroke of fortune, I hit on the missing clue. It was a bug, all right, but not anything like what I was thinking:

mysql> select version();
+------------+
| version()  |
+------------+
| 5.0.95-log |
+------------+
1 row in set (0.00 sec)

mysql> select * from foo;
+---------+
| foo_var |
+---------+
| 123 4   |
+---------+
1 row in set (0.00 sec)

mysql> select * from foo \G
*************************** 1. row ***************************
foo_var: 123
1 row in set (0.00 sec)

Eureka! The bug was in rendering the output when using the \G query terminator; whether it is restricted just to ASCII null or also affects other backslash-escaped characters, I do not know. Finally, to confirm whether the issue still existed or had been fixed over the intervening versions, I ran my new test on 5.5:

mysql> select version();
+-----------------+
| version()       |
+-----------------+
| 5.5.35-33.0-log |
+-----------------+
1 row in set (0.00 sec)

mysql> select * from foo;
+---------+
| foo_var |
+---------+
| 123 4   |
+---------+
1 row in set (0.00 sec)

mysql> select * from foo \G
*************************** 1. row ***************************
foo_var: 123 4
1 row in set (0.00 sec)

Now I had a clear explanation for the behavior of the data with ASCII nulls, but it threw me back to starting over on the original length problem. However, without the red herring of the "too long" data being truncated to explain the size problem, I looked more closely at the specific data causing the problem and realized that if \ and 0 are counted as literal characters, instead of as the escape representation of ASCII null, the width of the data ends up exceeding the field length. Testing this hypothesis was simple enough:

mysql> select length('07989409006\007989409') AS len;
+-----+
| len |
+-----+
|  20 |
+-----+
1 row in set (0.00 sec)

#####

postgres=# select length('07989409006\007989409') AS len;
 len
-----
  21
(1 row)

This answer leaves some confusion on a couple of fronts. Why doesn't PostgreSQL treat \0 as a representation of ASCII null? The answer is itself mixed. PostgreSQL treats a string as a literal unless explicitly told to do otherwise. To tell the database a string has escape sequences in it, you have to open the string with the "E" (or "e") identifier. However, even doing that, we still don't match the behavior we see in MySQL:

postgres=# select length(E'07989409006\007989409') AS len;
 len
-----
  18
(1 row)

The escape sequence consumed two additional characters because PostgreSQL interpreted the escape as octal \007:

\o, \oo, \ooo (o = 0 - 7) -> octal byte value [1]

The other potentially confusing point, then, is why didn't MySQL interpret the escape sequence as octal \007? The simple answer there is, it just doesn't do that; MySQL makes a special case of recognizing the ASCII null character as \0:

MySQL recognizes the escape sequences shown in Table 9.1, "Special Character Escape Sequences". For all other escape sequences, backslash is ignored. [2]

If you look at Table 9.1, you'll see the list of escape sequences is rather short and does not include general handling for octal or hex numeric representations.

Refs:

  1. http://www.postgresql.org/docs/9.3/static/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE
  2. http://dev.mysql.com/doc/refman/5.6/en/string-literals.html#character-escape-sequences

Monitoring Productivity with RescueTime


RescueTime.com

I'm always on the lookout for tools that help me be more productive in my daily work. Time tracking is always one of those difficult tasks for those who bill time by the hour. I'll admit that there are days here and there where I know I've worked all day, but at the end of the day I can't remember where I spent my time and how long I spent on different tasks. Last week I discovered a new tool that I love. It's called RescueTime and it has changed how I approach my days.

RescueTime is a web service that is free to sign up for. You download a little app that runs in the background and monitors your productivity throughout the day. You can then log into the RescueTime website and see many different breakdowns of how you spent your day.

So how does it track productivity? It monitors how much time you spend in different applications on your computer. Then it tries to tag that time on a scale from distracting to very productive. You might be thinking that if you spend your whole day in a browser that wouldn't be very helpful. But RescueTime is pretty smart and it also monitors which web sites you visit and lets you categorize that time as well.



The nicest thing about RescueTime is how little time you spend actually tending the app. It's always there, and its automatic categorization gets it right almost all the time. I simply love logging into the dashboard on the site at the end of the day and seeing how my time has been spent that day. So far it has helped motivate me to spend as little time as possible on the distracting items of my day and focus on what is truly productive and important.



RescueTime also gives you the opportunity to set goals for yourself and tracks your progress over time. I haven't used these features yet, but I plan to once I've used the app long enough to collect enough information about my days to better inform my goal setting. Lifehacker.com has a great article on how to optimize your time using RescueTime called "How to Reclaim Your Time in Seven Days for a More Productive and Stress-Free Workday" and I'd recommend giving it a quick read.

RescueTime also provides a premium service that not only monitors but actually interacts with your computer as well. For example, you can tell it to block certain web sites during certain times of the day. For what I'm doing that's a bit of overkill, but I can see how the value could play out if you're really having a hard time kicking some distracting online habit.

Accepting Bitcoin with BitPay

End Point recently began discussions with long-time Interchange client FrozenCPU about their desire to accept bitcoin payments. Because of the newness of bitcoin, as well as its non-traditional nature as a cryptocurrency, traditional payment gateways do not support it. While bitcoin is ideally suited to function as an exchange medium for ecommerce, we had to seek solutions that allowed for its integration into the checkout process.

BitPay Payment Gateway

After investigating their options, FrozenCPU elected to partner with BitPay to handle bitcoin payments. BitPay provides merchants with a solution similar in function to PayPal, where users are either redirected to the BitPay servers to complete payment, or the interaction with BitPay can be embedded directly into the merchant site via an iframe. In either case, all communication between customer and payment processor is direct.

Customers with a digital wallet then approve a transfer for the converted amount of bitcoin. Merchants need not display or maintain conversion rates; they can continue to use their native currency and BitPay will manage the exchange rate directly.

The payment negotiation is managed by the creation of an invoice which tracks the state of the payment through a sequence of status updates. Each status update can be thought of as analogous to a credit card transaction, but instead of a number of independent transactions related through references, the same invoice and identifier are used throughout.

The BitPay Process

When it comes time to finalize an order, a number of steps must be taken to complete the transaction using bitcoin. Again, the general process is most similar to PayPal, although no component of address management is included.

  • At the point where an authorization is normally issued for a credit card payment, instead the merchant generates a new invoice from BitPay, which starts in the status "new".
  • Upon successful invoice generation, the merchant can either redirect the customer to BitPay's server to present the payment screen, or embed the same through an iframe to keep the customer on site, if preferred.
  • Customer authorizes payment to transfer bitcoin from their digital wallet at the exchange rate provided by BitPay.
  • Once transferred, the invoice status moves to "paid" and the customer can then be presented with the order receipt, or in the case of redirect can click back on the merchant-provided backlink to the site to produce the receipt.
  • BitPay then negotiates a validation process over the bitcoin network, where the payment is "complete" after 6 block confirmations. Merchants can choose how quickly they want to accept the payment as valid, where a higher speed increases the odds of failing validation later in the process:
    • High speed: immediately consider the payment "confirmed" once customer authorizes payment.
    • Medium speed: consider payment "confirmed" once BitPay receives 1 block confirmation. Typically 5-10 minutes.
    • Low speed: wait for all 6 block confirmations before considering the payment "confirmed". Typically 45-60 minutes.
  • All payments must ultimately receive 6 block confirmations within 1 hour to complete. If this threshold cannot be met, the invoice status becomes "invalid" and the merchant must decide how to proceed.
  • Once the invoice status is "complete", the transaction cannot be reversed. No chargebacks. Funds can be received by the merchant in the local currency or in bitcoin.

  • Invoices can be created with flags to either send notifications to an API the merchant creates, a designated email address, or both. This allows the merchant to keep on top of the transaction lifecycle in real time and minimize delays or disruptions on fulfillment.

Bumps in the Road

This was our first venture into an emerging field with a young technology. We hit a few difficulties on our path to implementation:

  • Not every status change on the invoice generates a notice, even when the fullNotifications flag is set to true. Customers have 15 minutes to pay an invoice, and if that time elapses the invoice status moves to "expired". That left us with a dangling order because the IPN does not fire on that status update. We had to compensate by developing a monitoring script to look for those dangling orders older than 15 minutes, make an API call to BitPay to confirm the invoice status of "expired", and cancel the order.
  • We initially received no update messages on status changes to the invoice, which traced back to problems in BitPay's proxy software. However, BitPay was diligent in their efforts to troubleshoot the issue with us and they were able to upgrade the packages causing the problem. Their responsiveness in this matter certainly separates BitPay from other payment processors, who are either entirely unresponsive to system bugs or are content to let problems persist until their regular release schedule.
  • Development and testing was complicated by the lack of a development environment at BitPay.

We are looking forward to the anticipated success of accepting bitcoin for payment on FrozenCPU and the opportunity to explore this offering with our many other ecommerce clients.

More jQuery confusion: hidden, disabled fields

As you may have read in my earlier post, I have a proclivity for stumbling into oddball bits of jQuery behavior. (Maybe it's not just me.) Here's another I discovered recently. I was debugging an odd AJAX bug in a form, one of those dynamically updating form widgets in a page:
 <select name="make">...</select>
 <select name="model">...</select>
where the first selector sends off an AJAX query to retrieve data to fill the second selector, and it in turn fires off a query to get data for the next, and so on down the line. Squirreled away in the form was:
 <input type="hidden" name="limitation" disabled="disabled" value="">
Supposedly, this input was filled in to restrict the query on some pages, but not all. I think the original author's intent was to put data in here as it became available, and to toggle the field's disabled setting when it was pertinent to limit the query. (Mostly, it was pertinent when the second or third selector was firing off its AJAX query.) However, a recent change to the code behind this AJAX query created a bug, because the "limitation" parameter was showing up where it wasn't wanted. The AJAX call was assembling parameters "by hand" (as opposed to using the .serialize() method), and that's where the problem lies! Consider this simplified form:
<form>
<input type="hidden" name="secret" disabled="disabled" value="Sue Richards">
<input type="submit">
</form>
If you submit this form, the data received by the server won't include the "secret" field. If you .serialize() it, likewise the data won't include "secret"'s value. But if you build up the parameters "by hand", e.g.,
var data = {};
data.secret = $('input[name=secret]').val();
...
And that's the bug. The short answer is:
data.secret = $('input[name=secret]').not(':disabled').val();
because that will at least suppress the unwanted data from being transmitted.

Python decorator basics, part II

This is a continuation of my previous post: Python decorator basics. Here I'll talk about a decorator with optional arguments. Let's say we want to pass an optional argument to the same debug decorator:

def debug(msg=None):
    def actual_decorator(f):
        def wrapper(*args):
            if msg:
                print msg
            return f(*args)
        return wrapper
    return actual_decorator

@debug("Let's multiply!")
def mul(x, y):
    return x*y

Calling mul:

mul(5, 2)
Let's multiply!
10

Excellent. Now let's decorate without a msg and call mul:

@debug
def mul(x, y):
    return x*y

mul(5, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: actual_decorator() takes exactly 1 argument (2 given)

Uh oh. Let's see what happens at the time of decoration:

mul = debug(mul)

Hmmm, mul gets passed to debug as its argument, and then the arguments (5, 2) are passed to actual_decorator, since debug returns actual_decorator. To resolve this we need to always call the decorator as a function:

@debug()
def mul(x, y):
    return x*y

mul(5, 2)
10

Assuming that we always expect the msg parameter to be a non-callable, another option would be to check the type of argument passed to the debug decorator:

def debug(msg=None):
    def actual_decorator(f):
        def wrapper(*args):
            if msg:
                print msg
            return f(*args)
        return wrapper
    if callable(msg):
        # debug decorator called without an argument so
        # msg is the function being decorated
        return debug()(msg)
    return actual_decorator

@debug
def mul(x, y):
    return x*y

mul(5, 2)
10

@debug("Let's multiply!")
def mul(x, y):
    return x*y

mul(5, 2)
Let's multiply!
10

DBD::Pg 3.0.0 and the utf8 flag

One of the major changes in the recently released 3.0 version of DBD::Pg (the Perl driver for PostgreSQL) was the handling of UTF-8 strings. Previously, you had to make sure to always set the mysterious "pg_enable_utf8" attribute. Now, everything should simply work as expected without any adjustments.

When using an older DBD::Pg (version 2.x), any data coming back from the database was treated as a plain old string. Perl strings have an internal flag called "utf8" that tells Perl that the string should be treated as containing UTF-8. The only way to get this flag turned on was to set the pg_enable_utf8 attribute to true before fetching your data from the database. When this flag was on, each returned string was scanned for high bit characters, and if found, the utf8 flag was set on the string. The Postgres server_encoding and client_encoding values were never consulted, so this one attribute was the only knob available. Here is a sample program we will use to examine the returned strings. The handy Data::Peek module will help us see if the string has the utf8 flag enabled.

#!perl
use strict;
use warnings;
use utf8;
use charnames ':full';
use DBI;
use Data::Peek;
use lib 'blib/lib', 'blib/arch';

## Do our best to represent the output faithfully
binmode STDOUT, ":encoding(utf8)";

my $DSN = 'DBI:Pg:dbname=postgres';
my $dbh = DBI->connect($DSN, '', '', {AutoCommit=>0,RaiseError=>1,PrintError=>0})
  or die "Connection failed!\n";                                            
print "DBI is version $DBI::VERSION, DBD::Pg is version $DBD::Pg::VERSION\n";

## Create some Unicode strings (perl strings with the utf8 flag enabled)
my %dm = (
    dotty  => "\N{CADUCEUS}",
    chilly => "\N{SNOWMAN}",
    stuffy => "\N{DRAGON}",
    lambie => "\N{SHEEP}",
);

## Show the strings both before and after a trip to the database
for my $x (sort keys %dm) {
    print "\nSending $x ($dm{$x}) to the database. Length is " . length($dm{$x}) . "\n";                                                                    
    my $SQL = qq{SELECT '$dm{$x}'::TEXT};             
    my $var = $dbh->selectall_arrayref($SQL)->[0][0];
    print "Database gave us back ($var) with a length of " . length($var) . "\n";
    print DPeek $var;
    print "\n";
}

Let's check out an older version of DBD::Pg and run the script:

$ cd dbdpg.git; git checkout 2.18.1; perl Makefile.PL; make
$ perl dbdpg_unicode_test.pl
DBI is version 1.628, DBD::Pg is version 2.18.1

Sending chilly (☃) to the database. Length is 1
Database gave us back (☃) with a length of 3
PV("\342\230\203"\0)

Sending dotty (☤) to the database. Length is 1
Database gave us back (☤) with a length of 3
PV("\342\230\244"\0)

Sending lambie (🐑) to the database. Length is 1
Database gave us back (🐑) with a length of 4
PV("\360\237\220\221"\0)

Sending stuffy (🐉) to the database. Length is 1
Database gave us back (🐉) with a length of 4
PV("\360\237\220\211"\0)

The first thing you may notice is that not all of the Unicode symbols appear as expected. They should be tiny but legible versions of a snowman, a caduceus, a sheep, and a dragon. The fact that they do not appear properly everywhere indicates we have a way to go before the world is Unicode ready. When writing this, only chilly and dotty appeared correctly on my terminal. The blog editing textarea showed chilly, dotty, and lambie. The final blog in Chrome showed only chilly and dotty! Obviously, your mileage may vary, but all of those are legitimate Unicode characters.

The second thing to notice is how badly the length of the string is computed once it comes back from the database. Each string is one character long, and goes in that way, but comes back longer, which means the utf8 flag is off - this is confirmed by the lack of a UTF8 section in the DPeek output. We can get the correct output by setting the pg_enable_utf8 attribute after connecting, like so:

...
my $dbh = DBI->connect($DSN, '', '', {AutoCommit=>0,RaiseError=>1,PrintError=>0})
  or die "Connection failed!\n";
## Needed for older versions of DBD::Pg.
## This is the same as setting it to 1 for DBD::Pg 2.x - see below
$dbh->{pg_enable_utf8} = -1;
...

Once we do that, DBD::Pg will add the utf8 flag to any returned string, regardless of the actual encoding, as long as there is a high bit in the string. The output will now look like this:


Sending chilly (☃) to the database. Length is 1
Database gave us back (☃) with a length of 1
PVMG("\342\230\203"\0) [UTF8 "\x{2603}"]

Now our snowman has the correct length, and Data::Peek shows us that it has a UTF8 section. However, it's not a great solution, because it ignores client_encoding, has to scan every single string, and requires you to always remember to set an obscure attribute in your code every time you connect. Version 3.0.0 and up will check your client_encoding, and as long as it is UTF-8 (and it really ought to be!), it will automatically return strings with the utf8 flag set. Here is our snowman test on 3.0.0 with no explicit setting of pg_enable_utf8:

$ git checkout 3.0.0; perl Makefile.PL; make
$ perl dbdpg_unicode_test.pl
DBI is version 1.628, DBD::Pg is version 3.0.0

Sending chilly (☃) to the database. Length is 1
Database gave us back (☃) with a length of 1
PVMG("\342\230\203"\0) [UTF8 "\x{2603}"]

This new automatic detection is the same as setting pg_enable_utf8 to -1. Setting it to 0 will prevent the utf8 flag from ever being set, while setting it to 1 will cause the flag to always be set. Setting it to anything but -1 should be extremely rare in production and used with care.
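
If you do need to override the automatic behavior, it is just an attribute on the database handle. A minimal sketch of the three settings, shown here on a $dbh connected as in the scripts above:

## -1 is the default: check client_encoding and set the utf8 flag automatically
$dbh->{pg_enable_utf8} = -1;

## 0: never set the utf8 flag on returned strings
## $dbh->{pg_enable_utf8} = 0;

## 1: always set the utf8 flag, regardless of client_encoding
## $dbh->{pg_enable_utf8} = 1;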

Common Questions

What happens if I set pg_enable_utf8 = -1 on older versions of DBD::Pg?

Prior to DBD::Pg 3.0.0, the pg_enable_utf8 attribute was a simple boolean, so setting it to anything other than 0 sets it to true. In other words, setting it to -1 is the same as setting it to 1. If you must support older versions of DBD::Pg, setting it to -1 is a good choice.

Why does DBD::Pg flag everything as utf8, including simple ASCII strings with no high bit characters?

The lovely thing about the UTF-8 scheme is that ASCII data fits nicely inside it with no changes. A bare ASCII string is still valid UTF-8; it simply doesn't have any high-bit characters. So rather than read each string as it comes back from the database and determine if it *must* be flagged as utf8, DBD::Pg simply flags every string as utf8 because it *can*. In other words, every string may or may not contain actual non-ASCII characters, but either way we simply flag it because it *may* contain them, and that is good enough. This saves us a bit of time and effort, as we no longer have to scan every single byte coming back from the database. This decision to mark everything as utf8, instead of only non-ASCII strings, was the most contentious decision when this new version was being developed.

Why is UTF-8 the only client_encoding that is treated specially?

There are two important reasons why we only look at UTF-8. First, the utf8 flag is the only flag Perl strings have, so there is no way of marking a string as any other type of encoding. Second, UTF-8 is unique inside Postgres as it is the universal client_encoding, which has a mapping from nearly every supported server_encoding. In other words, no matter what your server_encoding is set to, setting your client_encoding to UTF-8 is always a safe bet. It's pretty obvious at this point that UTF-8 has won the encoding wars, and is the de-facto encoding standard for Unicode.

When is the client_encoding checked? What if I change it?

The value of client_encoding is only checked when DBD::Pg first connects. Rechecking this seldom-changed attribute would be quite costly, but there is a way to signal DBD::Pg. If you really want to change the value of client_encoding after you connect, just set the pg_enable_utf8 attribute to -1, and it will cause DBD::Pg to re-read the client_encoding and start setting the utf8 flags accordingly.
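
For example, a small sketch (again assuming a connected $dbh):

## Change the client_encoding after connecting...
$dbh->do(q{SET client_encoding TO 'UTF8'});

## ...then poke DBD::Pg so it re-reads client_encoding and adjusts its utf8 flag behavior
$dbh->{pg_enable_utf8} = -1;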

What about arrays?

Arrays are handled as expected too: they are unwrapped and turned into an array reference, and the individual strings within it have the utf8 flag set. Example code:

...
for my $x (sort keys %dm) {

    print "\nSending $x ($dm{$x}) to the database. Length is " . length($dm{$x}) . "\n";
    my $SQL = qq{SELECT ARRAY['$dm{$x}']};
    my $var = $dbh->selectall_arrayref($SQL)->[0][0];
    print "Database gave us back ($var) with a length of " . length($var) . "\n";
    $var = pop @$var;
    print "Inner array ($var) has a length of " . length($var) . "\n";
    print DPeek $var;
    print "\n";
}

DBI is version 1.628, DBD::Pg is version 3.0.0

Sending chilly (☃) to the database. Length is 1
Database gave us back (ARRAY(0x90c555c)) with a length of 16
Inner array (☃) has a length of 1
PVMG("\342\230\203"\0) [UTF8 "\x{2603}"]

Why is Unicode so hard?

Partly because human languages are a vast and complex system, and partly because we painted ourselves into a corner a bit in the early days of computing. Some of the statements presented above have been over-simplified. Unicode is much more than just using UTF-8 properly. The utf8 flag in Perl strings does not mean quite the same thing as a UTF-8 encoding. Interestingly, Perl even makes a distinction between "UTF8" and "UTF-8". It's quite a mess, but at the end of the day, Unicode support is far better in Perl than in any other language.

a git and symlink mistake

A couple of times, I've accidentally created an infinite symlink loop and lost the files in that directory because of a single git commit. Here's how it happened:

  1. First, on the production app, images tied to products in a database were uploaded to the server. Let's say I was working on a Rails app, so these images were uploaded to the RAILS_ROOT/public/system/ directory. I added "public/system/" to my .gitignore file, and all appeared to be good.
  2. Next on a camps instance, I created a symlink from CAMP_ROOT/public/system pointing to the production app public/system directory. This is common practice in End Point's camps setup because we often don't need the redundancy of uploaded files on our dev camp instance, and we don't want the extra disk space used up for these types of files. The make camp script is designed to allow a user to toggle symlink functionality on and off for various directories during the make camp process.
  3. Next, I accidentally committed and pushed the public/system symlink from my development instance.
  4. Finally, I pulled the commit onto my production instance. The pull resulted in public/system symlinking to itself, and all of the files vanished (poof). Since they were git-ignored in the first place, I couldn't recover them from git.

This is a pretty simple mistake to mitigate or avoid completely:

Backups

Have backups! Have backups! Did I say have backups?! Our awesome hosting team here at End Point was able to recover the lost files from the nightly backups when this happened.

Gitignore Update

Second, .gitignore should be modified to ignore "public/system", which covers both the symlink and the directory, instead of only the directory with "public/system/". In the case of Rails, the app will automatically create the "system" directory if it does not exist, so that directory does not need to be committed with a hidden file (e.g. .empty). But in other cases where the directory is not automatically created, it might make sense to include "public/some_folder", "public/some_folder/*", and "!public/some_folder/.empty" in your .gitignore so that the symlink and the directory contents (except for .empty) are ignored.

It seems silly that a single character (a "/" in this case) can wreak havoc on websites for me more than once, but such is life in programming. Backups and attention to detail are important!

Java Web app error: "Your security settings have blocked a self-signed application from running"

There's a growing number of people complaining that Java does not seem secure enough and that they feel vulnerable every time they open (and confirm to "Trust") Java Web applications.

Since the last Java update at the end of January, this shouldn't be a problem anymore: as you can read in the Java changelog for the latest release, there have been many efforts toward making "the Web" a safer place for everyone.

Unfortunately, it is also well known that security and usability often fight against each other, so it's quite possible that after installing the latest Java update, and when trying to use web apps that worked until a few minutes earlier, you found yourself facing the following error: "Your security settings have blocked a self-signed application from running".

What happened is that Oracle changed how the Java browser plugin deals with self-signed web applications: it now denies their execution by default, since they are considered harmful.

In order to fix this situation you'll need to launch the jcontrol command. Most Linux distributions install it as /usr/bin/jcontrol, while others place the binary elsewhere; for example, the Arch Linux package puts it in /opt/java/bin/jcontrol. If you can't find it, simply run which jcontrol and, if it's in your PATH, its location will be shown.

Open jcontrol as your regular user and use it to whitelist the site you're trying to use. Once the main window opens, move to the Security tab and press the "Edit Site List..." button. Add a new row and enter the site hosting the Java Web application you want to use. It's important to enter the complete URL (scheme://domain:port/path), otherwise it simply won't be detected as whitelisted.

Press OK to put the new settings into effect and reload the offending page. Everything should be working again now.

Sometimes things work quickly and easily for Linux users too.

Spree Active Shipping Gem "We are unable to calculate shipping rates for the selected items." Error

I was recently working on a Spree site, setting up the Spree Active Shipping Gem along with the UPS API. For those not familiar, the Spree Active Shipping Gem interacts with various shipping APIs like UPS, USPS, and FedEx. Because Spree does so much for you, and because the interaction between the Active Shipping Gem and a shipping API is also 'auto-magic', it is often difficult to debug. While setting this up I found a few 'gotchas' that I hope, through this blog post, may save others a lot of time.

I found that there wasn't a lot of instruction available for setting up the Active Shipping Gem with a shipping carrier API like the UPS Shipping API. Ostensibly, there isn't much to it: the Active Shipping Gem handles much of the interaction between the shipping API of your choice and Spree.

First, you're going to go to the Spree Active Shipping Gem GitHub repo and follow the instructions for installing the Active Shipping Gem. It is very straightforward, but do proceed in the order mentioned in the Spree Active Shipping Gem documentation, as some steps depend on the successful completion of others.

Second, you're going to go to the shipper of your choice, in this case UPS, and follow their directions for using their API. I do recommend actually reading, or at least skimming, the pages and pages of documentation. Why? Because there are some important agreements explaining how the API is to be used (including legal requirements for use of the UPS logo).

The Active Shipping Gem makes a call to the API, the API returns a hash of various shipping methods and prices based on the parameters you've sent it (such as shipment origin and destination), and then it automatically displays in the UI as an adjustment. How great is that?!

Well, it would be great if it all worked out exactly as planned. However, if you are running Spree 2-0-stable you may find yourself battling an unusual circumstance. Namely, Spree 2-0-stable will create your core/app/views/spree/checkout/edit.html.erb as

<% content_for :head do %>
     <%= javascript_include_tag '/states' %>
   <% end %>

This will provide the incorrect path. It is intended to hit the StatesController, so update it like so:

<% content_for :head do %>
     <%= javascript_include_tag states_url %>
   <% end %>

Now, once this correction has been made you may find that you are still having an error, "We are unable to calculate shipping rates for the selected items."

At this point Chrome Dev Tools will not show any errors. When I had this error, a number of Google searches returned results of the kind "make sure you have set up your shipping zones correctly and added shipping methods to these zones". I verified this again and again in the console, as did many others who were equally perplexed by this message on StackOverflow as well as in Google Groups, like here and here. Some got this error when they added an item to the cart that had a count_on_hand of 0 and backorderable equal to false, like you can see here at Spree GitHub issues. If a 0 count_on_hand is what is giving you this error, but you want a product to be backorderable, make sure to also check the 'Propagate All Variants' in the Spree admin as seen below. This will loop through all of the product's variants with a count_on_hand of 0, and allow them to be backorderable.

After a long while of searching and wondering (is it the API? Is it the Active Shipping Gem? Is it a blacklisted zip code?), I went through and changed one setting at a time in the Spree admin until finally arriving at the source of this error for me: a missing product weight. Because UPS needs the product weight in order to calculate shipping charges, make sure this is set.

The "We are unable to calculate shipping rates for the selected items" error message is misleading. If you encounter this error after correcting the javascript_include_tag, the cause is most likely a setting in the admin. Check for how insufficient inventory is handled, missing product weights, or incorrectly setup up or non-existent shipping zones & associated methods. I hope if this error message is what brought you here that this post has saved you some time.

A Brief Retrospective of Spree

SpreeConf NYC 2014 starts on the 26th and it's hard to believe that Spree is almost 7 years old! Here's a retrospective showing some notable Spree moments and major releases.

July 15, 2007 RailsCart, the precursor to Spree, is created by Sean Schofield and gets its first commit as a Google Code project.

February 1, 2008 Sean is hired by End Point Corporation to work on RailsCart along with its other Rails developers. End Point sponsors Spree for the next year and a half.

February 15, 2008 RailsCart project moves from Google Code to GitHub and is renamed Spree.

June 1, 2008 Spree 0.0.9 released: sophisticated data model for inventory.

June 3, 2008 Product Properties added.

June 5, 2008 Spree 0.2.0 released: adds Spree extensions.

July 4, 2008 Zones introduced.

Sept 8, 2008 Refactored to REST routing; added state machine to orders.

October 1, 2008 New Taxonomy system introduced.

October 2, 2008 Spree 0.4.0 released: Taxonomy, and VAT-inclusive pricing.

November 26, 2008 Volume pricing introduced as an extension.

November 24, 2008 Spree 0.5.0 released: new shipping framework.

December 3, 2008 SEO friendly URLs.

December 4, 2008 Switched from attachment_fu to paperclip for image attachments.

February 3, 2009 Spree 0.6.0 released: friendly urls, attachment_fu to paperclip, paranoid deletion of products, standardized security model, rake tasks for translations.

March 5, 2009 Searchlogic gem added.

March 10, 2009 Spree 0.7.0 released: switches to ajax-based Magento-style single page checkout, order number as permalinks, French, Russian and Norwegian translations added.

April 2, 2009 Authlogic gem added.

April 23, 2009 Spreecommerce.com launched.

May 4, 2009 Spree 0.8.0 released: CSS based layout using compass, sass, and blueprint, Authlogic, guest checkout, expanded functionality for prototype products, improved upgrade rake task, improved order security using tokens.

May 18, 2009 Spree core team formalized.

May 28, 2009 Sean Schofield leaves End Point and founds Rails Dog.

September 22, 2009 Spree 0.9.0 released: Coupon and discount support, improved system of calculators, Thai, Hebrew, Dutch, Finnish, and Mexican Spanish translations.

September 5, 2009 Spree now 100% jQuery for all JavaScript functionality.

March 13, 2010 Spree 0.10.0 released: Named scopes and product groups, pluggable search (Xapian, Sphinx, and Solr), theming, switch to multipage checkout, improved gateway configuration, multiple payment methods, refunds and credits, RESTful API.

June 14, 2010 Spree 0.11.0 released: new look for Spree, Rails 2.3.8 compatibility.

November 1, 2010 New extension registry.

November 9, 2010 Spree 0.30.0 released: Rails 3 support, reorganization into multiple gems: spree_core, spree_auth, spree_api, spree_dash, spree_sample, Rails engines introduced, site extension removed, extensions become gems, improved payments with addition of state machine, simplification of adjustments, new promotion functionality, no more "vendor mode".

December 10, 2010 Spree 0.40.0 released: switched from Authlogic to Devise for authentication.

March 23, 2011 Spree 0.50.0 released: improved test coverage, replaced Searchlogic gem with Meta search gem.

May 13, 2011 Spree 0.60.0 released: removal of Resource Controller.

October 10, 2011 Announcement that Spree Commerce Inc. is formed and raised $1.5M in seed funding.

November 7, 2011 Spree 0.70.0 Integration with rails asset pipeline, theming support, themes as engines, extension generator, promotions extended.

January 31, 2012 Spree analytics with Jiraffe introduced.

February 9, 2012 Spree 1.0 released, namespacing introduced, Spree referenced routes, mounting Spree engine in routes, Spree analytics, Spree dash gem, command line tool, default payment gateway Spree/Skrill Spree_usa_epay, moved gateways out of core to Spree gateway gem, Refactored preferences, polymorphic adjustments, removed helpers and JavaScript related to VAT, removed sales tax and VAT calculators.

February 18, 2012 First SpreeConf held in New York City.

April 30, 2012 Spree 1.1.0 released: support for Rails 3.2, Ransack replaces meta search, Spree product groups is a standalone extension, theme support deprecated in favor of deface, major rewrite of credit card model, API rewrite, stronger mass assignment protection, clearer separation between Spree components, easier Spree spec testing.

August 31, 2012 Spree 1.2.0 released: Auth component completely removed from Spree and placed in separate Spree auth devise extension. Customizing state machine no longer requires completely overriding it.

December 19, 2012 Spree 1.3.0 released: Admin section redesigned, currency support.

December 20, 2012 New Spree "Fancy" theme introduced.

April 15, 2013 New documentation site launched.

May 19, 2013 Spree 2.0.0 released: Removed support for Ruby 1.8.7, Backend and Frontend split from core, Split shipments introduced, I18n namespaced translations, New API endpoints, instance-level permissions in API, Custom API templates, adjustment state changes, Order Populator in its own class, coupon applicator in its own class, product duplicator moved to its own class, new helpers to modify checkout flow steps, API supports "checking out" an order, Auto-rotation of images, Unique payment identifier added to payments, removal of state callback in checkout controller, tracking URL for shipments, SSLRequirement deprecated in favor of ForceSSL, MailMethod model no longer exists.

May 21, 2013 Second SpreeConf held in Washington DC.

September 16, 2013 Spree 2.1.0 released: Rails 4 compatibility, breaking API changes, better Spree PayPal Express extension.

Perl PostgreSQL driver DBD::Pg 3.0.0 released

I am happy to announce that version 3.0.0 of DBD::Pg, the Perl interface to Postgres, was released on February 3, 2014. This represents a major release, mostly due to the way it now handles UTF-8. I will try to blog soon with more details about that and some other major changes in this version.

The new version is available from CPAN. Please make sure that this is the latest version, as new versions may have come out since this post was written.


Checksums for 3.0.0:

58c2613bcb241279aca4c111ba16db48 DBD-Pg-3.0.0.tar.gz
03ded628d453718cbceaea906da3412df5a7137a DBD-Pg-3.0.0.tar.gz

The complete list of changes is below. Thank you to everyone who sent in patches, helped debug, wrote bug reports, and helped me get this version out the door!

Version 3.0.0  Released February 3, 2014 (git commit 9725314f27a8d65fc05bdeda3da8ce9c251f79bd)

  - Major change in UTF-8 handling. If client_encoding is set to UTF-8, 
    always mark returned Perl strings as utf8. See the pg_enable_utf8 docs
    for more information.
    [Greg Sabino Mullane, David E. Wheeler, David Christensen]

  - Bump DBI requirement to 1.614

  - Bump Perl requirement to 5.8.1

  - Add new handle attribute, switch_prepared, to control when we stop 
    using PQexecParams and start using PQexecPrepared. The default is 2: 
    in previous versions, the effective behavior was 1 (i.e. PQexecParams 
    was never used).
    [Greg Sabino Mullane]

  - Better handling of items inside of arrays, particularly bytea arrays.
    [Greg Sabino Mullane] (CPAN bug #91454)

  - Map SQL_CHAR back to bpchar, not char
    [Greg Sabino Mullane, reported by H.Merijn Brand]

  - Do not force oids to Perl ints
    [Greg Sabino Mullane] (CPAN bug #85836)

  - Return better sqlstate codes on fatal errors
    [Rainer Weikusat]

  - Better prepared statement names to avoid bug
    [Spencer Sun] (CPAN bug #88827)

  - Add pg_expression field to statistics_info output to show 
    functional index information
    [Greg Sabino Mullane] (CPAN bug #76608)

  - Adjust lo_import_with_oid check for 8.3
    (CPAN bug #83145)

  - Better handling of libpq errors to return SQLSTATE 08000
    [Stephen Keller]

  - Make sure CREATE TABLE .. AS SELECT returns rows in non do() cases

  - Add support for AutoInactiveDestroy
    [David Dick] (CPAN bug #68893)

  - Fix ORDINAL_POSITION in foreign_key_info
    [Dagfinn Ilmari Mannsåker] (CPAN bug #88794)

  - Fix foreign_key_info with unspecified schema
    [Dagfinn Ilmari Mannsåker] (CPAN bug #88787)

  - Allow foreign_key_info to work when pg_expand_array is off
    [Greg Sabino Mullane and Tim Bunce] (CPAN bug #51780)

  - Remove math.h linking, as we no longer need it
    (CPAN bug #79256)

  - Spelling fixes
    (CPAN bug #78168)

  - Better wording for the AutoCommit docs
    (CPAN bug #82536)

  - Change NOTICE to DEBUG1 in t/02attribs.t test for handle attribute "PrintWarn":
    implicit index creation is now quieter in Postgres.
    [Erik Rijkers]

  - Use correct SQL_BIGINT constant for int8
    [Dagfinn Ilmari Mannsåker]

  - Fix assertion when binding array columns on debug perls >= 5.16
    [Dagfinn Ilmari Mannsåker]

  - Adjust test to use 3 digit exponential values
    [Greg Sabino Mullane] (CPAN bug #59449)

  - Avoid reinstalling driver methods in threads
    [Dagfinn Ilmari Mannsåker] (CPAN bug #83638)

  - Make sure App::Info does not prompt for pg_config location 
    if AUTOMATED_TESTING or PERL_MM_USE_DEFAULT is set
    [David E. Wheeler] (CPAN bug #90799)

  - Fix typo in docs for pg_placeholder_dollaronly
    [Bryan Carpenter] (CPAN bug #91400)

  - Cleanup dangling largeobjects in tests
    [Fitz Elliott] (CPAN bug #92212)

  - Fix skip test counting in t/09arrays.t
    [Greg Sabino Mullane] (CPAN bug #79544)

  - Explicitly specify en_US for spell checking
    [Dagfinn Ilmari Mannsåker] (CPAN bug #91804)

Long-term Benefits from RailsAdmin

Sometimes setting up an admin tool with user authorization at the outset of a new Rails project can take a bit of work, and it's not until later that the long-term benefits of this framework can be realized. This came up recently when I was asked to write an export method to allow website admins to export customer and order data (for a specific date range) on a platform that was already using RailsAdmin in combination with CanCan for user authorization.

A normal Rails development process for this (from scratch) might look like this:

  1. Create controller.
  2. Create route to map to controller.
  3. Create method in controller to render initial view and send data.
  4. Create view which allows user to select date range.
  5. Add user authorization to allow specific users to access functionality.

This isn't that much work, but here's how the same functionality was built using RailsAdmin:

RailsAdmin Action

First, I create a RailsAdmin action, which inherits from RailsAdmin::Config::Actions::Base. Inheriting from RailsAdmin::Config::Actions::Base brings in many class methods, such as defining the action's http_methods (get, post, etc.), defining whether the action applies to a single instance or a set of instances (which influences whether the action shows as a tab or on individual item listings), and defining the icon that a user will see in the admin for this action. This bit of code also contains the controller method. In my case, this method checks whether the request is a get or post to determine whether to render the view or send the exported data. Here's what the code looks like:

module RailsAdmin
  module Config
    module Actions
      class SpecialExport < RailsAdmin::Config::Actions::Base
        register_instance_option :collection do
          true
        end

        register_instance_option :http_methods do
          [:get, :post]
        end

        register_instance_option :controller do
          Proc.new do
            if request.post?
              # generate CSV data
              send_data csv_data, :filename => "export.csv", :type => "application/csv"
            end
            # renders view otherwise
          end
        end
        register_instance_option :link_icon do
          'icon-share'
        end
      end
    end
  end
end

Register Action

Next, I register and activate the action in the RailsAdmin configuration. Here's what this looks like:

module RailsAdmin
  module Config
    module Actions
      class SpecialExport < RailsAdmin::Config::Actions::Base
        RailsAdmin::Config::Actions.register(self)
      end
    end
  end
end
RailsAdmin.config do |config|
  config.actions do
    # existing actions
    special_export
  end
end

CanCan Modifications

Next up, I modify CanCan's abilities to specify that the action can be performed on a specific class, and not others. Here's what my code looks like for this:

class Ability
  include CanCan::Ability
  def initialize(user)
    if user && user.is_admin?
      #existing user authentication
      
      cannot :special_export, :all
      can :special_export, Order
    end
  end
end

Create view

After the above steps, I create a view to allow the user to select a date range. This is fairly simple, and uses the already included jQuery UI date select functionality for the calendar UI.

Conclusion

Although, line for line, the amount of code in the RailsAdmin approach may be about the same as the former option, the API for hooking custom functionality into RailsAdmin is a win in terms of inherited functionality and maintenance. My "Special Export" method shows as a tab that can be performed on Orders only in the admin, and I wrote very little code for that specific behavior. Another maintenance gain here is that my custom methods are all consistent, which makes it easier for another developer to jump in and make changes, especially someone who is already familiar with RailsAdmin.


Example "Special Export" tab in RailsAdmin interface.

I'm not married to RailsAdmin in particular, but it happens to be a tool I've become familiar with, and it integrates well with existing user authorization and user authentication tools. But I do always suggest using an existing admin interface in new Rails applications, whether it's RailsAdmin, ActiveAdmin, or another. These tools have a great foundation and have settled on a consistent API that encourages efficient customization and maintenance.

Rails ActiveRecord with Database Column Defaults

I had an interaction with a coworker recently that made me take stock of the occasions and situations in which I use database column defaults. I realized that far and away my primary use is for booleans. I commonly set a default on my boolean columns when I'm defining a new migration. I do this primarily to minimize the potential for three states--true, false, and null--when I usually want a boolean to be limited to either true or false.

Alongside the distillation down to the classically defined values, another perk of defaults in general is that Rails uses the table's definition within the database to pre-fill attributes that are not included in the initialization params for an object. For example, a table with columns defined as follows:

create_table :my_objects do |t|
  t.string :column_one, default: 'foo'
  t.string :column_two
end
Will result in:
$> m = MyObject.new
  => #<MyObject id: nil, column_one: "foo", column_two: nil>
Note that column_two has no default and so is initialized to nil. But column_one is set to "foo" because no other value was supplied. This behavior can be quite handy for boolean attributes such as :published or :unread. Published can be a good example of a value that would start as false while unread is a good candidate to start out true.

It's worth mentioning that defaults aren't absolutely enforced. It is still your prerogative to override should you so choose. For example:

$> m = MyObject.new(column_one: nil)
  => #<MyObject id: nil, column_one: nil, column_two: nil>
$> m.save
  => true
$> m
  => #<MyObject id: 1, column_one: nil, column_two: nil>

I mentioned my primary use for defaults is pairing them with booleans. Considering that overriding a default could defeat the purpose of having that default, pairing the default with a non-null constraint is one additional level of security you can provide yourself. For example, you can modify the original table definition as follows:

create_table :my_objects do |t|
  t.boolean :column_one, default: true, null: false
  t.string :column_two
end

Attempting to save an instance of the above object with column_one set to nil will raise a database-specific error. If you would rather not rescue an error, you can go one step further and add a validation to your Rails model:

class MyObject < ActiveRecord::Base
  validates :column_one, inclusion: {in: [true, false]}
end

It's worth noting the use of an inclusion validation instead of a presence validation: using presence would disallow setting the boolean to false.
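To make the distinction concrete, here is a quick illustration; it relies only on ActiveSupport's blank?, which the presence validation is built on:

# Rails' presence validation rejects any value that is blank?,
# and false counts as blank:
false.blank?   # => true
nil.blank?     # => true
true.blank?    # => false

# So `validates :column_one, presence: true` would reject an explicit false,
# while the inclusion validation above accepts both true and false.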

Personally, I usually stick with:

t.boolean :column_one, default: <pick one>, null: false

I don't find much need for the model validation, since if I explicitly write the line of code:

my_object.boolean_column = nil

I'm confident that I won't be doing it by accident. That only leaves the potential for:

my_object.boolean_column = some_method

where some_method might return nil. I find that almost any method I assign to a boolean will be of the form "some_method?", which by convention should return either true or false: again, strictly the two values I'm interested in representing.

In conclusion, though I don't find myself using them on every project, I am happy to have this particular tool in my repertoire for the fairly common occasions where it provides a benefit.

The more the merrier? Not always...

Recently we were working on an image manipulation function for a customer that used ImageMagick's convert command. We had spent tons of time working on the program and the associated system that was going to use it. Everything was working fine. The whole system was nice and snappy. Great job everyone!

Until we moved it into production, where the page load times were 8 to 10 times slower than in our camps development system...

We all sprang into action to try to figure out why the page load times were so slow now. We reviewed system settings, configuration files for the database and application, and anything else we could think of. We made sure the OS packages, ImageMagick version, and various other things were the same, and everything looked right. The production hardware has twice the RAM and 4 times as many processors as the development hardware. So what the heck was going on?

To distract ourselves and hopefully gain more insight, we tried to optimize the code a bit; while that made things somewhat better, production was still 6 to 8 times slower than development. We deactivated that section of the site overnight so we could sleep on it. Luckily this was a new product line, so it wasn't tragic to turn it off.

The next morning, while we were discussing the problem, a co-worker mentioned that the larger number of processors in production could be relevant. At first I was a bit taken aback, because this goes against everything we assume about more being better. How could having 4 times the number of processors be a bad thing?!

Well, it turns out that ImageMagick is a threaded program and will use as many processors in parallel as it can to accomplish whatever task it is asked to do. Instead of splitting the convert work over 8 processors as it did in development, it was now splitting it over 32 processors in production. The overhead of managing the work split across those extra 24 processors actually slowed the processing down dramatically!

Luckily, ImageMagick respects an environment variable that limits the maximum number of threads it will use. We set the OMP_NUM_THREADS environment variable to 4, re-ran the code, and it performed as expected in production. It could be set to more or fewer threads, but we found 4 worked well; aside from keeping performance high, it keeps this one process from dominating CPU usage while it runs.
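As an illustration only, here is a minimal sketch of the idea when the convert call is shelled out from Ruby; the filenames and resize geometry are made up, and in a plain shell wrapper you would simply export the variable instead:

# Cap ImageMagick's OpenMP thread pool; child processes inherit this setting.
ENV['OMP_NUM_THREADS'] = '4'

# Any convert invocation from here on is limited to 4 threads.
ok = system('convert', 'input.jpg', '-resize', '800x600', 'output.jpg')
warn 'convert failed' unless ok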

It was a very interesting riddle to solve so I figured I'd share it.

HTML Doctypes Make A Difference

If you are like me, you may not have given much thought to an HTML doctype other than "yes, it needs to be there" and "there are a few different types, but I don't really understand the differences between them." Well, if you are like me, then you also recently discovered why doctypes matter and how they actually affect your web page.

For those not familiar with HTML doctypes: a doctype is the very first line of an HTML document, and it looks something like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" 
  "http://www.w3.org/TR/html4/strict.dtd">

As I mentioned before, there are a few different document types. Each one corresponds to a different rule set of HTML syntax for the web browser to follow, such as HTML 4.0, HTML 4.01, XHTML 1.0, or XHTML 1.1.

"The HTML syntax requires a doctype to be specified to ensure that the browser renders the page in standards mode. The doctype has no other purpose." [1]

When you specify an official doctype, your web browser should follow the standard behavior for rendering that specific rule set for the given HTML syntax, and you should get predictable results when it renders your web page. There are official doctypes defined by the W3C for both "strict" and "transitional" rule sets. A "strict" document type adheres strictly to the specification for that syntax, and any legacy or unsupported tags found in your document will cause rendering problems. A "transitional" document type follows the specification for the given syntax, but also allows legacy tags to exist in the document. The W3C also defines "frameset" doctypes, which are transitional documents that additionally allow frame-related tags.
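For comparison, the transitional counterpart of the strict HTML 4.01 doctype shown above looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" 
  "http://www.w3.org/TR/html4/loose.dtd">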

The doctype for HTML 5 is a lot simpler and shorter than the other doctypes, and looks like this:

<!DOCTYPE html>

When you declare an unofficial doctype (or none at all), the browser will not know which tag syntax rule set to use and will not render the page in a standard way. This is called quirks mode: the browser basically falls back to an older rules engine that supports all the legacy tags it knows about and does its best to handle the page. This also means that your web page may not render or behave as expected, especially if you use newer tags or features in your document.

Besides the HTML tag syntax, JavaScript is also affected by the doctype, since it is tied to the DOM engine the browser uses to render the page. For example, with a strict doctype you will have the native JSON parser object available, but in quirks mode it may not even exist, and calls to JSON.parse() or JSON.stringify() could fail.

If you are not sure whether you are using an official doctype, or whether you are using tags that are not supported by the doctype you have chosen, you can check your page with http://validator.w3.org/. The whole point is to get your web page to render and behave as you expect, providing a better experience for you and your users.

phpMemcachedAdmin: Graphical/Web Administration for memcached

Memcached is an awesome tool, though it doesn't offer the best interactive administration experience out there, with its commands manually run over a telnet/nc connection.

Well, luckily enough, there's an alternative to all that pain, and its name is phpMemcachedAdmin.

While development stopped at the end of 2012 (if we don't consider minor forks), the features offered are quite interesting.

Quoting directly from the main site:

[...]
This program allows to see in real-time (top-like) or from the start of the server, stats for get, set, delete, increment, decrement, evictions, reclaimed, cas command, as well as server stats (network, items, server version) with googlecharts and server internal configuration
You can go further to see each server slabs, occupation, memory wasted and items (key & value).
Another part can execute commands to any memcached server : get, set, delete, flush_all, as well as execute any commands (like stats) with telnet
[...]

That is incredible news, even more so considering the complete lack of similar features in the native memcached code.

Since all the code basically revolves around a single PHP application, the setup phase is just a matter of creating a new VirtualHost section in a working web server that can serve PHP files, and placing the uncompressed code tarball inside the docroot. For more details and configuration info, please consult the code homepage.

It's important to highlight that phpMemcachedAdmin has absolutely no layer of security. Unless you're setting up phpMemcachedAdmin purely for teaching or testing purposes, it's strongly advisable to put it behind at least a simple Basic Auth mechanism, and possibly HTTPS, in order to protect your memcached cluster from potential malicious attackers.

Once you have a working instance you'll see something similar to the following screenshots:

Every memcached administrator or heavy user has, at least once, wished for such a powerful tool, and now you can finally put your favourite terminal to rest... at least for the next minute.