End Point

News

Welcome to End Point's blog

Ongoing observations by End Point people.

CSSConf US 2014 — Part One

Geeks in Paradise


Today I was very lucky to once again attend CSSConf US here in Amelia Island, Florida. Nicole Sullivan and crew did an excellent job of organizing and curating a wide range of talks specifically geared toward CSS developers. Although I work daily on many aspects of the web stack, I feel like I'm one of the (seemingly) rare few who actually enjoy writing CSS so it was a real treat to spend the day with many like-minded folks.

Styleguide Driven Development

Nicole Sullivan started things off with her talk on Style Guide Driven Development (SGDD). She talked about the process and challenges she and the team at Pivotal Labs went through when they redesigned the Cloud Foundry Developer Console and how they overcame many of them with the SGDD approach. The idea behind SGDD is to catalog all of the reusable components used in a web project so developers use what's already there rather than reinventing the wheel for each new feature. The components are displayed in the style guide next to examples of the view code and CSS that make up each component. Check out the Developer Console Style Guide for an example of this in action. The benefits of this approach include enabling a short feedback loop for project managers and designers and encouraging developers who may not be CSS experts to follow the "blessed" path to build components that are consistent and cohesive with the design. In Nicole's project they were also able to significantly reduce the amount of unused CSS and layouts once they had broken down the app into reusable components.

Nicole also shared Hologram, an interesting tool for generating style guides that is definitely worth checking out.

Sara Soueidan — Styling and Animating Scalable Vector Graphics with CSS

Sara talked to us about using SVG with CSS and included some really neat demos. Adobe Illustrator, Inkscape, and Sketch 3 are the tools most commonly used to create SVG images. Once you have your SVG image you can use the SVG Editor by Peter Collingridge or SVGO (a Node.js-based tool) to clean up and optimize the SVG code. After the cleanup and optimization you can replace the generic CSS class names from your SVG creation app with more semantic CSS class names.
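
For example, SVGO is distributed as an npm package with a simple command-line interface; a rough sketch (the filenames here are just illustrative):

$ npm install -g svgo
$ svgo logo.svg -o logo.min.svg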

There are a variety of ways to include SVG on a page and Sara went over the pros and cons of each. The method that seemed most interesting to me was to use an <object> tag, which allows for a fallback image for browsers that do not support SVG. Sara also mapped out the subset of CSS selectors that can be used to target SVG elements, showed how to "responsify" SVGs, and demonstrated how to animate SVG paths. Be sure to check out her slides from the talk.
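
A minimal sketch of the <object> approach with an image fallback (filenames are hypothetical):

<object type="image/svg+xml" data="logo.svg">
  <!-- Shown only by browsers that cannot render the SVG -->
  <img src="logo.png" alt="Logo">
</object>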

Lea Verou — The Chroma Zone: Engineering Color on the Web

Lea's talk was about color on the web. She detailed the history of how color has been handled up to this point, how it works today, and some of the interesting color-related CSS features coming in the future. She demonstrated how each of the color spaces has a geometric representation (e.g. RGB can be represented as a cube and HSL as a double cone), which I found neat. RGB is very unintuitive when it comes to choosing colors. HSL is much more intuitive but has some challenges of its own. The new and shiny CSS color features Lea talked about included:

  • filters
  • blending modes
  • CSS variables
  • gray()
  • color() including tint, shade and other adjusters
  • the #rgba and #rrggbbaa notation
  • hwb()
  • named hues and <angle> in hsl()

Some of these new features can be used already via libs like Bourbon and Myth. Check out the Chroma Zone: Engineering Color on the Web slides to learn more.
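
As a taste, here is a small hypothetical sketch of the variable, filter, and blending-mode syntax (depending on the browser, some of this needs a preprocessor such as Myth); gray(), the color() adjusters, hwb(), and the new hex notations were still proposals at the time:

:root {
  --brand-hue: 210;
  --brand: hsl(var(--brand-hue), 70%, 45%);
}

.banner {
  background-color: var(--brand);   /* CSS variables */
  filter: saturate(1.4);            /* filters */
  mix-blend-mode: multiply;         /* blending modes */
}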

C$$

I will write up more of the talks soon but wanted to thank Jenn Schiffer for keeping us all laughing throughout the day in her role as MC and topping it off with a hilarious, satirical talk of her own. Thanks also to Alex and Adam for curating the music and looking after the sound.

Supporting Apple Retina displays on the Web

Apple's Retina displays (on Mac desktop & laptop computers, and on iPhones and iPads) have around twice the pixel density of traditional displays. Most recent Android phones and tablets have higher-resolution screens as well.

I was recently given the task of adding support for these higher-resolution displays to our End Point company website. Our imagery had been created prior to Retina displays being commonly used, but even now many web developers still overlook supporting high-resolution screens because it hasn't traditionally been part of the website workflow, because high-resolution images aren't simple to cope with, and because most people don't notice any lack of sharpness without comparing low & high-resolution images side by side.

Most images which are not designed for Retina displays look blurry on them, like this:

The higher-resolution image is on the left, and the lower-resolution image is on the right.

To solve this problem, you need to serve a larger, higher-quality image to Retina displays. There are several ways to do this; I'll cover a few of them, and explain how I implemented it for our site.

Retina.js

As I was researching ways to implement support for Retina displays, I found that a popular suggestion is the JavaScript library Retina.js. Retina.js automatically detects Retina screens, and then for each image on the page, it checks the web server for a Retina image version under the same name with @2x before the suffix. For example, when fetching the image background.jpg on a Retina-capable system, it would afterward look for background@2x.jpg and serve that if it's available.
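
In practice the markup stays untouched; you drop in the script and provide the @2x files. A rough sketch (the script path is wherever you serve retina.js from):

<img src="/images/background.jpg" alt="">
<!-- retina.js will request /images/background@2x.jpg on high-density screens -->
<script src="/js/retina.min.js"></script>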

Retina.js makes it relatively painless to deal with serving Retina images to the correct people, but it has a couple of large problems. First, it fetches and replaces the Retina image after the default image, serving both the normal and Retina images to Retina users, greatly increasing download size and time.

Second, Retina.js does not use the correct image if the browser window is moved from a Retina display to a non-Retina display or vice versa when using multiple monitors. For example, if an image is loaded on a standard 1080p monitor and then the browser is moved to a Retina display, it will show the incorrect, low-res image.

Using CSS for background images

Doesn't the "modern web" have a way to handle this natively in HTML & CSS? For sites using CSS background images, CSS media queries will do the trick:

@media only screen and (-webkit-min-device-pixel-ratio: 2), (min-resolution: 192dpi) {
  .icon {
    background-image: url(icon@2x.png);
    background-size: 20px 20px;
  }
}

But this method only works with CSS background images, so for our site and a lot of other sites, it will only be useful for a small number of images.

Take a look at this CSS-Tricks page for some excellent examples of Retina (and other higher-res display) support.

Server-side checks for Retina images

A very efficient way to handle all types of images is to have browser JavaScript set a cookie that tells the web server whether to serve Retina or standard images. That keeps data transfer to a minimum, with only a little trickery required in the browser. You'll still need to create an extra Retina-resolution image for every standard image on the server, and you'll need a dynamic web process to run for every image served. The Retina Images open source PHP program shows how to do this.
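
A minimal sketch of the browser half of that approach; the cookie name is an assumption and needs to match whatever your server-side code reads:

// Record the device pixel ratio in a cookie so the server can choose which
// image files to send on subsequent requests. Cookie name is hypothetical.
if (window.devicePixelRatio > 1) {
  document.cookie = 'device_pixel_ratio=' + window.devicePixelRatio + '; path=/';
}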

Why we didn't use these methods

There is one reason common to all of these methods which made us decide against them: All of them require you to maintain multiple versions of each image. This ends up taking a lot of time and effort. It also means your content distribution network (CDN) or other HTTP caches will have twice as many image files to load and cache, increasing cache misses and data transfer. It also uses more disk space, which isn't a big problem for the small number of images on our website, but on an ecommerce website with many thousands of images, it adds up quickly.

We would have felt compelled to maintain separate images if it were necessary, that is, if the Retina images were much larger and would slow down the browsing experience for non-Retina users for no benefit. But instead we decided on the following solution, which we saw others describe.

Serving Retina images to everybody (how we did it)

We read that you can serve Retina images to everyone, but we immediately thought that wouldn't work out well. We were sure that the Retina images would be several times larger than the normal images, wasting a ton of bandwidth for anyone not using a Retina screen. We were very pleasantly surprised to find out that this wasn't the case at all.

After testing on a few images, I found I could get Retina images within 2-3 KB of the normal images while keeping the visual fidelity, by dropping the JPEG quality setting. How? Because the images were being displayed at a smaller size than their actual pixel dimensions, the compression artifacts weren't nearly as noticeable.
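
One way to produce such an image, assuming ImageMagick and hypothetical filenames and dimensions: render the master photo at twice the display size with a much lower JPEG quality, and leave the display size unchanged in the HTML or CSS so the browser scales it down.

# 2x the display dimensions, much lower JPEG quality (values are illustrative)
$ convert portrait_master.jpg -resize 320x426 -quality 45 greg_davidson.jpg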

These are the total file sizes for each image on our team page:

Retina  Normal  Filename
 10K    9.3K    adam_spangenthal.jpg
 13K     13K    adam_vollrath.jpg
 12K     11K    benjamin_goldstein.jpg
7.6K    4.2K    bianca_rodrigues.jpg
 14K     13K    brian_buchalter.jpg
 13K     15K    brian_gadoury.jpg
7.5K    8.0K    brian_zenone.jpg
9.8K    6.6K    bryan_berry.jpg
 12K     11K    carl_bailey.jpg
6.9K     15K    dave_jenkins.jpg
 13K     13K    david_christensen.jpg
7.7K     21K    emanuele_calo.jpg
 16K     16K    erika_hamby.jpg
 13K     11K    gerard_drazba.jpg
 14K     14K    greg_davidson.jpg
 14K     12K    greg_sabino_mullane.jpg
 14K     15K    jeff_boes.jpg
 14K     12K    jon_jensen.jpg
 13K     12K    josh_ausborne.jpg
 13K     14K    josh_tolley.jpg
 13K     11K    josh_williams.jpg
8.9K    9.5K    kamil_ciemniewski.jpg
 13K     21K    kent_krenrich.jpg
 15K     12K    kiel_christofferson.jpg
9.9K     11K    kirk_harr.jpg
7.7K     13K    marco_manchego.jpg
 12K     13K    marina_lohova.jpg
 14K     11K    mark_johnson.jpg
7.3K     13K    matt_galvin.jpg
 15K     12K    matt_vollrath.jpg
6.6K     14K    miguel_alatorre.jpg
 13K     14K    mike_farmer.jpg
7.1K     19K    neil_elliott.jpg
9.9K    9.0K    patrick_lewis.jpg
 13K    5.6K    phin_jensen.jpg
 12K     14K    richard_templet.jpg
 12K    9.9K    rick_peltzman.jpg
 14K     13K    ron_phipps.jpg
9.7K     14K    selvakumar_arumugam.jpg
9.3K     15K    spencer_christensen.jpg
 12K     12K    steph_skardal.jpg
 15K     18K    steve_yoman.jpg
6.7K     15K    szymon_guz.jpg
7.5K    6.8K    tim_case.jpg
 15K     21K    tim_christofferson.jpg
9.3K     12K    will_plaut.jpg
 12K     14K    wojciech_ziniewicz.jpg
 12K    9.9K    zed_jensen.jpg

TOTALS

Retina: 549.4K
Normal: 608.8K

This is where I found the biggest, and best, surprise. The cumulative size of the Retina image files was less than that of the original images. So now we have support for Retina displays, making our website look nice on modern screens, while actually using less data transfer. We don't need JavaScript, cookies, or any extra server-side trickery to do this. And best of all, we don't have to maintain a separate set of Retina images.

Once you've seen the difference in quality on a Retina screen or a new Android phone, you'll wonder how you ever were able to tolerate the lower-resolution images. And at least for our selection of JPEG images, there's not even a file size penalty to pay!

Reference reading

DBD::Pg, array slices, and pg_placeholder_nocolons

New versions of DBD::Pg, the Perl driver for PostgreSQL, have been recently released. In addition to some bug fixes, the handling of colons inside SQL statements has been improved in version 3.2.1, and a new attribute named pg_placeholder_nocolons was added by Graham Ollis in version 3.2.0. Before seeing it in action, let's review the concept of placeholders in DBI and DBD::Pg.

Placeholders allow you to store a dummy representation of a value inside your SQL statement. This means you can prepare a SQL statement in advance without specific values, and fill in the values later when it is executed. The two main advantages to doing it this way are to avoid worrying about quoting, and to re-use the same statement with different values. DBD::Pg allows for three styles of placeholders: question mark, dollar sign, and named parameters (aka colons). Here's an example of each:

$SQL = 'SELECT tbalance FROM pgbench_tellers WHERE tid = ? AND bid = ?';
$sth = $dbh->prepare($SQL);
$sth->execute(12,33);

$SQL = 'SELECT tbalance FROM pgbench_tellers WHERE tid = $1 AND bid = $2';
$sth = $dbh->prepare($SQL);
$sth->execute(12,33);

$SQL = 'SELECT tbalance FROM pgbench_tellers WHERE tid = :teller AND bid = :bank';
$sth = $dbh->prepare($SQL);
$sth->bind_param(':teller', 10);
$sth->bind_param(':bank', 33);
$sth->execute();

One of the problems with placeholders is that the symbols used are not exclusive to DBI, but can be valid SQL characters as well, with their own special meaning. For example, question marks are used by geometric operators, dollar signs are used in Postgres for dollar quoting, and colons are used for both type casts and array slices. DBD::Pg has a few ways to solve these problems.

Question marks are the preferred style of placeholders for many users of DBI (as well as some other systems). They are easy to visualize and great for simple queries. However, question marks can be used as operators inside of Postgres. To get around this, you can use the handle attribute pg_placeholder_dollaronly, which will ignore any placeholders other than dollar signs:

## Fails:
$SQL="SELECT ?- lseg'((-1,0),(1,0))' FROM pg_class WHERE relname = \$1";
$sth = $dbh->prepare($SQL);
## Error is: Cannot mix placeholder styles "?" and "$1"

## Works:
$dbh->{pg_placeholder_dollaronly} = 1;
$sth = $dbh->prepare($SQL);
$sth->execute('foobar');
## For safety:
$dbh->{pg_placeholder_dollaronly} = 0;

Another good form of placeholder is the dollar sign. Postgres itself uses dollar signs for its prepared queries. DBD::Pg will actually transform the question mark and colon versions to dollar signs internally before sending the query off to Postgres to be prepared. A big advantage of using dollar sign placeholders is the re-use of parameters. Dollar signs have two problems: first, Perl uses them as a sigil; second, Postgres uses them for dollar quoting. However, DBD::Pg is smart enough to tell the difference between dollar quoting and dollar-sign placeholders, so dollar signs as placeholders should always simply work.
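
For example, the same dollar-sign placeholder can appear more than once and be bound to a single value:

$SQL = 'SELECT tbalance FROM pgbench_tellers WHERE tid = $1 OR bid = $1';
$sth = $dbh->prepare($SQL);
$sth->execute(12);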

The final form of placeholder is 'named parameters' or simply 'colons'. In this format, an alphanumeric string comes right after a colon to "name" the parameter. The main advantage to this form of placeholder is the ability to bind variables by name in your code. The downside is that colons are used by Postgres for both type casting and array slices. The type casting (e.g. 123::int) is detected by DBD::Pg and is not a problem. The detection of array slices was improved in 3.2.1, such that a number-colon-number sequence is never interpreted as a placeholder. However, there are many other ways to write array slices. Therefore, the pg_placeholder_nocolons attribute was invented. When activated, it effectively turns off the use of named parameters:

## Works:
$SQL = q{SELECT relacl[1:2] FROM pg_class WHERE relname = ?};
$sth = $dbh->prepare($SQL);
$sth->execute('foobar');

## Fails:
$SQL = q{SELECT relacl[1 :2] FROM pg_class WHERE relname = ?};
$sth = $dbh->prepare($SQL);
## Error is: Cannot mix placeholder styles ":foo" and "?"

## Works:
$dbh->{pg_placeholder_nocolons} = 1;
$SQL = q{SELECT relacl[1 :2] FROM pg_class WHERE relname = ?};
$sth = $dbh->prepare($SQL);
$sth->execute('foobar');

Which placeholder style you use is up to you (or your framework / supporting module!), but there should be enough options now between pg_placeholder_dollaronly and pg_placeholder_nocolons to support your style peacefully.

Liquid Galaxy at MundoGeo in Brazil

End Point reached another milestone last week when we hosted one of our own Liquid Galaxies at a conference in Brazil for the first time. End Point brought our new Brazilian Liquid Galaxy to MundoGeo Connect in São Paulo, which ran from May 7th to 9th, 2014. MundoGeo Connect is the largest geospatial solutions event in Latin America, which this year drew almost five thousand visitors, among them businessmen, specialists, and government agencies from all over the continent. It was the perfect opportunity to showcase the Liquid Galaxy as the powerful and immersive GIS interaction tool that it is.

Similar to last year, Liquid Galaxy was the runaway hit of the show, and no one in the building left without coming around for a test flight in the Google Earth spaceship. But it was when we started showcasing the GIS functionality, the KML layers, along with other applications like panoramic imagery, embedded video, and live streaming video, that visitors really started understanding its potential for data visualizations, corporate marketing, public outreach, conference communications, and sheer immersive entertainment.


As the Latin American market continues to grow with commodities such as mining and petroleum, GIS visualization becomes more important than ever. Brazil in particular, with the World Cup and Olympics scheduled, has a huge need for urban planning, metro-mapping, and multi-video integration. The Liquid Galaxy can play a pivotal role in enabling companies to fully exploit their data and multimedia content.


Kamelopard version 0.0.15 released

The Camelopardalis constellation, as shown in Urania's Mirror

I've just pushed version 0.0.15 of Kamelopard to RubyGems. As described in several previous blog posts, Kamelopard is a Ruby gem designed to create KML documents quickly and easily. We use it to create content for our Liquid Galaxy customers. This release doesn't include any major new features; rather, it provides a number of small but very helpful modifications that, taken together, make life much easier.

New Spline Types

Perhaps the most useful of these new features relate to spline functions, introduced in Kamelopard version 0.0.12. The original spline function interface described here accepts a series of equi-dimensional vectors as control points, and returns vectors as results, but there's no indication of what each dimension in the vector means. This is convenient in that you can use splines to make nice paths through any number of dimensions and use them however you'd like, but in practice we most commonly want to make tours which fly through sets of either KML Points or AbstractViews (Points include just a latitude, longitude, and altitude, but AbstractViews include a direction vector, so they describe cameras and their positions). Two new classes, called PointSplineFunction and ViewSplineFunction, accept Points and AbstractViews respectively as their control points, and return those types when evaluated, freeing the user from having to map each control point's coordinates to a simple vector.

Often when creating tours, we'll use Google Earth to find a set of views we like, save them to a KML file, and then write a script to make a tour out of those placemarks. With these new classes, that becomes much easier. Here's an example which ingests a KML file containing several placemarks, and creates a simple spline-based tour through them, in the order in which they appear in the KML file.

sp = Kamelopard::Functions::ViewSpline.new
each_placemark(XML::Document.file('waypoints.kml')) do |p, v|
  sp.add_control_point(v, 10)
end
 
(1..30).each do |i|
  fly_to sp.run_function(i.to_f/30.0), :duration => 0.8, :mode => :smooth
end

This uses the each_placemark function Kamelopard has had for quite a while to iterate through the file's placemarks and create control points, and then calculates the value of the spline along 30 points to create the flight path. This is such a common idiom when making tours that this Kamelopard version makes it even easier with a new fly_placemarks helper function. Using the new function, the code above becomes simply this:

fly_placemarks XML::Document.file('waypoints.kml')

More Flexible Geocoding

Kamelopard tries to make it easy to use geocoding services, which allow users to convert things like street addresses into latitudes and longitudes. This has its difficulties, as service providers regularly change formats or requirements, or quit the business altogether. Kamelopard has supported various geocoders in its time; this version finally adds support for Google's service. I'd left it out in previous versions because of an incorrect understanding of Google's licensing terms. It became important now because for the data we had from one particular client, Google's geocoding was significantly more accurate than the MapQuest geocoder I had been using previously. For different data sets, of course, some other service might get the most accurate results, but geocoding accuracy is a big concern for the work we do. No client wants to ask their shiny new Liquid Galaxy to zoom in on the corporate headquarters and see instead a seven screen panorama of the neighboring grocery store.

For geocoding projects of your own, there are a few considerations to keep in mind. First, geocoding services often impose usage limits. We'll sometimes find when geocoding a list of addresses that the service rejects every third or fourth query simply because we're querying faster than it wants to allow. They generally limit the number of queries allowed in one day, too, so debug your scripts using a small list of addresses before trying out a whole bunch at once. Having a project delayed simply because the geocoding service has stopped talking to you for the day is frustrating. Finally, it's often a good idea to geocode in one step, save the results somewhere, and make the tour in a second step using the saved results. This frees you from dependence on access to the service, and allows manual tweaking of the geocoded results. Note, however, that some services' licenses forbid saving the results anywhere.

Whichever service you end up using, Kamelopard code for geocoding generally looks like this:

g = GoogleGeocoder.new('my-api-key')
results = {}

Addresses.each do |t|
  results[t] = g.lookup(t)
end

Most services require an API key used for authorization, and return a large JSON structure which includes latitude and longitude, a status code, result quality, and any other useful information the service provider thinks you should have.
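
To follow the two-step approach mentioned above, one simple option (a sketch, assuming the lookup results serialize cleanly to JSON and the service's license permits saving them) is to dump the results to a file and build the tour from that file in a separate script:

require 'json'

# Step one: save the raw geocoder output for later use and manual tweaking
File.write('geocoded.json', results.to_json)

# Step two, in the tour-building script: read from the file instead of the service
saved = JSON.parse(File.read('geocoded.json'))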

Other Updates

It seems it has always been hard for newcomers to get used to Kamelopard. This version includes a number of updates to the documentation, which will hopefully make it easier for them. It also includes new helper functions for writing out all KML documents in memory at once, and creating KMZ files automatically where desired.

Please give the new version a try. We'd love to hear how it's being used.

git checkout at a specific date

There are times when you need to view a git repository as it was at a certain point in time. For example, someone sends your project an error report and says they were using the git head version from around January 17, 2014. The short (and wrong!) way to do it is to pass the date to the checkout command like so:

$ git checkout 'HEAD@{Jan 17 2014}'

While I used to rely on this, I no longer do so, as I consider it somewhat of a footgun. To understand why, you first have to know that the ability to checkout using the format above only works for a short window of time, as defined by the git parameter gc.reflogExpire. This defaults to a measly 90 days. You can view yours with git config gc.reflogExpire. The problem is that when you go over the 90 day limit, git outputs a warning, but then spews a mountain of output as it performs the checkout anyway! It uses the latest entry it has in the reflog (i.e. from about 90 days ago). This commit has no relation at all to the date you requested, so unless you catch the warning, you have checked out a repository that is useless to your efforts.

For example, the Bucardo project can be cloned via:

$ git clone git://bucardo.org/bucardo.git/

Now let's say we want to examine the project as it looked on January 17, 2014. As I am writing this, the date is May 19, 2014, so that date occurred about four months ago: well over 90 days. Watch what happens:

$ git checkout 'HEAD@{Jan 17 2014}'
warning: Log for 'HEAD' only goes back to Sat, 22 Feb 2014 11:47:33 -0500.
Note: checking out 'HEAD@{Jan 17 2014}'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at d7f89dd... Bucardo now accepts pg_service for databases

So, we get the warning that HEAD only goes back to Feb 22, but then git goes ahead and checks us out anyway! If you were not paying attention - perhaps because you only glanced over that perfectly ordinary looking last line - you might not realize that the checkout you received is not what you requested.

Since this behavior cannot, to my knowledge, be turned off, I avoid this method and use other ways to checkout the repo as it existed on a certain date. The simplest is to find the closest commit by viewing the output of git log. In smaller projects, you can simply do this in a text editor and search for the date you want, then find a good commit sha-1 hash to checkout (i.e. git log > log.txt; emacs log.txt). Another somewhat canonical way is to use git-rev-list:

$ git checkout `git rev-list -1 --before="Jan 17 2014" master`

This command works fine, although it is a little clunky and hard to remember. It requests a list of all commits on the master branch that happened before the given date, ordered by date, stopping once a single row has been output. Since I deal with SQL all day, I think of this as:

SELECT repository WHERE commit_id = 
  (SELECT commit
   FROM rev-list
   WHERE commit_date <= 'Jan 17, 2014'
   AND branch = 'master'
   ORDER BY commit_date DESC
   LIMIT 1
  );

This is one of the cases where the date IS inclusive. With git, you should always test when using date ranges if the given date is inclusive or exclusive, as reading the fine manual does not always reveal this information. Here is one way to prove the date is inclusive for the rev-list command:

$ git rev-list -1 --before="Jan 17 2014" master --format=medium
commit d4b565bf46b6f478b969a378578b0cff3b24e82d
Author: Greg Sabino Mullane 
Date:   Fri Jan 17 10:49:09 2014 -0500

    Make our statement_chunk_size default match up.

As a final nail in the coffin for doing a checkout via the reflog date, the reflog actually is local to you and will pull the date of the repo as it existed for you at that point in time. This may or may not line up with the commits, depending on how often you are syncing with other people via git pull or other methods! So play it safe and request a specific commit by sha-1 hash, or use the rev-list trick.
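
Using the commit found above, that is simply:

$ git checkout d4b565bf46b6f478b969a378578b0cff3b24e82d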

Highlights of OpenWest conference 2014

Last week I was able to not only attend the OpenWest conference, held in Orem, UT at the UVU campus, but I was also able to be a presenter. This year's conference had their largest attendance to date and also their most diverse list of tracks and presentations. Many of the presentations are posted on YouTube.

My own presentation was on Git Workflows That Work (previous blog post, talk slides). It was part of the "Tools" track and went pretty well. Rather than recap my own presentation, I'd like to briefly touch on a few others I enjoyed.

Building a Scalable Codebase using Domain Driven Design

The folks at Insidesales.com presented on Domain Driven Design. This was something I had heard of but wasn't very familiar with. The main principles of DDD are: understand the business purpose (not just the app requirements), rigorous organization (of code, services, design, etc.), distinct layers, and functional cohesion. The biggest point I got was that all business logic should be located in a "domain" separated from the application and separated from the data. The application is then just a thin wrapper around API calls to the business logic within the domain. DDD is different from MVC though because each layer is distinct and could have its own design. Thus MVC could be a smaller piece within a given layer.

As I listened to this presentation I remembered a project I worked on at a previous job building a Feeds Admin to create and manage product feeds to third parties (search engines, shopping portals, etc). I had built that using DDD without realizing it at the time. It came naturally as I wanted to cleanly organize and separate the logic for each feed (with different data pulled from the database, different feed formats, different transport methods, etc.). So I definitely have seen the benefits and principles of DDD in practice.

Mentoring Devs into DevOps

This presentation by Justin Carmony discussed how his team has been going through a transition of empowering developers to also do some Ops work. They had roughly 30 developers, 2 operations admins, and over 300 servers in their infrastructure. They also had several different tools and ways of doing things between the different teams of developers: for example, some used Capistrano and some used Ansible, some used Jenkins and others Travis CI, some used Vagrant and some didn't, and so on. They considered hiring for a new DevOps position to own and mentor everyone in standardizing tools and processes, but instead promoted one of their star developers to that role. This was a tough decision because it meant that they would not be able to use his skills as a developer. But Justin said it was definitely worth it because within just 3 months this developer had helped everyone standardize on tools and release procedures, making the entire team more productive and efficient.

They also switched to using Salt for configuration management for all servers at this same time. And to empower the developers and expose them to the world of Ops, they switched to using Vagrant managed by Salt for all development environments. Developers were able to make some Salt changes if needed to their own environment, and then submit a git pull request to Ops for peer review of the Salt configuration and Ops would then merge it and deploy it to production if they accepted the changes.

Justin mentioned that it is still an on-going transition for them. Ultimately the main points he presented are:

  • There is a long continuum or gradient of skill level and access level between Dev and Ops and it is better to think of DevOps as a continuum as well and not a strict role with strict access levels. Some Devs can have access to dashboards and graphs of systems while others can manage build tools and configuration. "DevOps" can be spread out.
  • Team culture matters. It can be very difficult to improve processes and get people to embrace change for the better, so positive influences by everyone involved can make a big difference. Business owners need to be supportive as well.

Retrospectus

Daniel Evans gave an interesting presentation on retrospective meetings. For those not familiar with these meetings, they are for discussing "what went well" and "what could be improved" over the last project/sprint/time period/etc. I've participated in these types of meetings for many years and have seen the benefits that can come from them. Daniel also talked about what these meetings are not: they are not meant for deep dives into problems, blame, or griping. They should be focused on the good that has been done and on solutions to fix problems going forward. And even then, if the solutions are not quickly apparent then you should not spend too much time trying to find them. Those conversations should happen outside of this meeting.

His presentation was quite short and didn't really cover much that I didn't know already. However, at the end he turned the rest of the time into a retrospective meeting reviewing his presentation, with everyone participating. This turned out to be pretty fun and was a good exercise, and covered a lot more than a regular Q & A session would have.

Other Links

Interchange form pitfalls

I'll warn you going in: this is obscure Interchange internal form-wrangling voodoo, and even if you are familiar with that, it's not going to be an easy trip to get from here to the end of the story. Fair warning!

Interchange form-handling has many, many features. Some of them are pretty obscure. One such is the concept of a "click map": a snippet of named code, stored on the server. Specifically, it is stored within the user's session, which is a chunk of storage associated with a browser cookie, and can contain all sorts of stuff -- but which is primarily used for things like the shopping cart, the user's identity, and so on.

But the tricky business here is when we put something into the session that says, in effect, "when we do this, do that, too".

(Note: you don't have to store your click map code in the session. In fact, it's probably better not to, for a number of good reasons. You can put it in a sort of global configuration file, in which each individual snippet needs a unique name – which is precisely the requirement that would have prevented this bug, had I or one of my predecessors chosen to embrace that restriction.)

An Interchange page is both code and presentation (that is, variable assignments, ITL and/or Perl, plus HTML). No, this isn't a great design feature -- but it was considered so in an earlier, more innocent time.

I had one application that continued to present bugs that resisted solution; in fact, they were virtually impossible to reproduce. Until one day, when I was testing a theory (which was totally wrong, but led me into the right spot): I landed on a page, then refreshed the page -- and saw a totally different page. Here's what happened, as simply as I can represent it.

Page1.html:
<form ...>
[set Continue]
mv_nextpage=Page2.html
[/set]
<input type="submit" name="mv_click" value="Continue">
</form>

Page2.html:
<form ...>
[set Continue]
mv_nextpage=Page3.html
[/set]
<input type="submit" name="mv_click" value="Continue">
</form>

When you request Page1.html, before the page is delivered to the browser, the session variable "Continue" is initialized with the code:

mv_nextpage=Page2.html

In Interchange terms, this tells Interchange where to go when form processing is complete. When we click on the "Continue" button, form processing happens on the server, and the "mv_click" button's code is referenced by name as "Continue". That takes us to "Page2.html". When that page is prepared, Interchange overwrites the "Continue" variable with its new data, which is also a next-page setup.

Now, if we click the button on Page1.html, but then refresh the page on the resulting Page2, form processing is run again -- but this time, our call-by-name to "Continue" executes the code that Page2 set up for its Continue button -- not the Page1 setup. In other words: clicking the button on Page1.html sends us to page 2, but refreshing Page2.html sends us to Page3.html.

I can't tell you what a relief this was to finally reproduce this bug. The simple fix was to just change the names involved, to "Continue1" and "Continue2". There's certainly a lot of power in Interchange form processing, but hoo boy -- with great power comes great potential to shoot yourself in the foot.

MediaWiki extensions and wfLoadExtensionMessages


Image by Flickr user Susan Drury

Upgrading MediaWiki can be a challenging task, especially if you use a lot of extensions. While the core upgrade process usually goes smoothly, it's rare you can upgrade a major version or two without having to muddle with your collection of extensions. Extensions are bits of code that extend what MediaWiki can do. Only a few are packaged with and maintained alongside MediaWiki itself - the great majority are written by third-party developers. When the MediaWiki API changes, it is up to those developers to update their extension so it works with the new version of MediaWiki. This does not always happen. Take for example one of the more common errors seen on a MediaWiki upgrade since 1.21 was released:


[Tue May 06 11:21:52 2014] [error] [client 12.34.56.78] PHP Fatal error: Call to undefined function wfLoadExtensionMessages() in /home/beckett/mediawiki/extensions/PdfExport/PdfExport.php on line 83, referer: http://test.ziggy.com/wiki/Main_Page

This is because the wfLoadExtensionMessages function, which many extensions use, has been deprecated since MediaWiki version 1.16 and was finally removed in 1.21, resulting in the error seen above. Luckily, this function has been a no-op since 1.16, so it is safe to comment it out and/or make a dummy function in your LocalSettings.php file (see below).

Sadly, the release notes for 1.21 make no mention of this fairly major change. Let's walk through as if we didn't know anything about it and see how we could solve the given error with the help of git. For this example, we'll use the Pdf Export extension, which allows you to export your wiki pages into PDF form. A pretty handy extension, and one which completely fails to work in MediaWiki version 1.21 or better.

First, let's verify that wfLoadExtensionMessages does not exist at all in version 1.21 of MediaWiki. For these examples, I've checked out the MediaWiki code via git, and am relying on the fact that lightweight git tags were made for all the versions we are interested in.

$ git clone https://github.com/wikimedia/mediawiki.git mediawiki
$ cd mediawiki
$ git grep wfLoadExtensionMessages 1.21.0
1.21.0:HISTORY:* (bug 12880) wfLoadExtensionMessages does not use $fallback from MessagesXx.php

A nice feature of git-grep is the ability to simply use a tag after the search string. In this case, we see that the only mention of wfLoadExtensionMessages in the entire codebase is an old mention of it in the history file. Let's see what version that bug is from:

$ git grep -n wfLoadExtensionMessages 1.21.0
1.21.0:HISTORY:5280:* (bug 12880) wfLoadExtensionMessages does not use $fallback from MessagesXx.php
$ git show 1.21.0:HISTORY | head -5280 | tac | grep '===' -m1
=== Bug fixes in 1.12 ===

That message is from way back in version 1.12, and doesn't concern us. Let's take a look at what tags exist in the 1.20 branch so we can scan the latest one:

$ git tag | grep '^1\.20'
1.20.0
1.20.0rc1
1.20.0rc2
1.20.1
1.20.2
1.20.3
1.20.4
1.20.5
1.20.6
1.20.7
1.20.8

Now we can peek inside version 1.20.8 and see what that function did before it was removed. By using the -A and -B (after and before) arguments to grep, we can see the entire function in context:

$ git grep wfLoadExtensionMessages 1.20.0
1.20.0:HISTORY:* (bug 12880) wfLoadExtensionMessages does not 
  use $fallback from MessagesXx.php
1.20.0:includes/GlobalFunctions.php:function wfLoadExtensionMessages() {
$ git show 1.20.8:includes/GlobalFunctions.php | \
  grep -B6 -A2 LoadExtensionMessages
/**
 * Load an extension messages file
 *
 * @deprecated since 1.16, warnings in 1.18, remove in 1.20
 * @codeCoverageIgnore
 */
function wfLoadExtensionMessages() {
    wfDeprecated( __FUNCTION__, '1.16' );
}

Thus wfLoadExtensionMessages was basically a no-op in MediaWiki version 1.20, with the caveat that it will write a deprecation warning to your error log (or, in modern versions, the debug log unless $wgDevelopmentWarnings is set). Next we want to find the last time this function did something useful - which should be version 1.15 according to the comment above. Thus:

$ git show 1.15.0:includes/GlobalFunctions.php | \
  grep -A4 LoadExtensionMessages
function wfLoadExtensionMessages( $extensionName, $langcode = false ) {
    global $wgExtensionMessagesFiles, $wgMessageCache, $wgLang, $wgContLang;

    #For recording whether extension message files have been loaded in a given language.
    static $loaded = array();

So, it's a pretty safe bet that unless you are upgrading from 1.15.0 or earlier, it should be completely safe to remove it. When was 1.16.0 released? There are no dates in the HISTORY file (shame), but the date it was tagged should be a good guess:

$ git show 1.16.0 | grep -m1 Date
Date:   Wed Jul 28 07:11:03 2010 +0000

So what should you do with extensions that are still using this deprecated function? There are two quick solutions: comment it out inside the extension, or add a dummy function to your version of MediaWiki.

Changing the extension itself is certainly quick and easy. To get the PdfExport extension to work, we only have to comment out two calls to wfLoadExtensionMessages inside the file PdfExport.php, and one inside PdfExport_body.php. The diff:

$ git difftool -y -x "diff -u1"
--- /tmp/7YqvXv_PdfExport.php 2014-05-08 12:45:03 -0400
+++ PdfExport.php             2014-05-08 12:34:39 -0400
@@ -82,3 +82,3 @@
   if ($img_page > 0 || $img_page === false) {
-        wfLoadExtensionMessages('PdfPrint');
+        //wfLoadExtensionMessages('PdfPrint');
                $nav_urls['pdfprint'] = array(
@@ -92,3 +92,3 @@
 function wfSpecialPdfToolbox (&$monobook) {
-          wfLoadExtensionMessages('PdfPrint');
+          //wfLoadExtensionMessages('PdfPrint');
           if (isset($monobook->data['nav_urls']['pdfprint']))
--- /tmp/7gO8Hz_PdfExport_body.php   2014-05-08 12:45:03 -0400
+++ PdfExport_body.php               2014-05-08 12:34:44 -0400
@@ -44,3 +44,3 @@
            // For backwards compatibility
-             wfLoadExtensionMessages('PdfPrint');
+             //wfLoadExtensionMessages('PdfPrint');

A better way is to add a dummy function to LocalSettings.php. This ensures that any extension we add in the future will continue to work unmodified. Just throw this at the bottom of your LocalSettings.php:

function wfLoadExtensionMessages() { }

Probably the best overall solution is to not only add that to your LocalSettings.php, but to try to get the extension changed as well. You can notify the author, or try to fix it yourself and release a new version if the extension has been abandoned. You might also look to see if the extension has been superseded by a different extension, as sometimes happens.

While there may be other compatibility issues when upgrading MediaWiki, for some extensions (such as PdfExport), this is the only change needed to make it work again on newer versions of MediaWiki!

Drupal Commerce for Fun and Profit

Drupal has for years been known as an open source content management system (CMS) for easily creating content-driven websites on any topic. In addition to the CMS, there is a whole host of tools built upon this foundation to extend its features using modules. One module in particular I worked with recently for a client was Drupal Commerce, which augments the CMS with an eCommerce platform.

For normal content items within Drupal, you create distinct articles, which have a predefined model of fields reflecting the different aspects of an article: author, title, tags, etc. Drupal Commerce uses these same mechanisms, but also gives you the capability to define SKUs, product categories, and rules and procedures to be carried out when adding a particular item to your cart. In much the same way that articles are published to the Drupal taxonomy so users can browse through them, product categories and individual products are published, and the administrator can customize the layout used to display those products.

Getting Started with Drupal Commerce

One tool that is extremely helpful for getting a Drupal Commerce site up and running is Drupal Commerce Kickstart. This tool is maintained by a group of Drupal consultants and enthusiasts, and bundles a working Drupal install together with the Commerce module and a set of supporting modules already installed and configured. The installation is quite simple: you extract the install archive into your web server document root, and then follow an install procedure to configure your database, the name of the site, and any other information needed.
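
On a typical setup that amounts to something like the following (paths and the archive name are hypothetical; use the release you actually downloaded):

$ cd /var/www/html
$ tar xzf ~/commerce_kickstart-7.x-2.x-core.tar.gz --strip-components=1
# then visit http://example.com/install.php in a browser and follow the installer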

Once the install is complete, you will have a copy of Drupal Commerce installed, and will have a basic product display model on the front page with a few example products available. From here the next steps would be to define your own products and categories and input those model objects.

Creating Products

Internally to Drupal, when referring to articles, there are distinct units for each article and category known as nodes, and a hierarchical taxonomy that is defined by the administrator for how these items fit together. Within Drupal Commerce there is a similar designation, but the items are known as 'entities' and could refer to a whole host of possible objects in Drupal Commerce. Here is a chart to show the relationships:
These "entities" extend the original concept of a Drupal node to allow for different products, product attributes, and customers and their attributes. The product reference is a pointer object that allows a product to be added to an article node for display on the front end of the site. The reference points back to the products that were added, so if those product objects are updated, the content template displaying them stays the same while the product data it shows is updated dynamically. In this way, the content templates for displaying products are abstracted from the actual product SKUs themselves.

When defining each product there are a number of relationships that are created which relate to the possible categories that a product would live in, as well as fields and properties implied by that product. Here is a visual representation of these relationships:
Every product entity would by definition be part of the "Product" bundle, and would also fall into a custom bundle for the "T-shirt" product category. In this way, you can create relationships between products and categories that are as complex or as simple as the actual business requires, and the product definitions remain in place even if you later redefine the layout used to display those products.

Conclusions

Drupal Commerce is a hulking tool, with a number of learning curves and a bit of terminology to get straight, but once in place and working it gives you a very robust commerce platform. Any Drupal developer familiar with creating custom taxonomies for article display and categorization will be able to use the same skills to create a slick eCommerce site.

Git Workflows That Work

There was a significant blog post some years ago. It introduced a “successful” workflow for using Git. This workflow was named Gitflow. One of the reasons this blog post was significant is that it was the first structured workflow that many developers had been exposed to for using Git. Before Gitflow was introduced, most developers didn’t work with Git like that. And if you remember when it was introduced back in 2010, it created quite a buzz. People praised it as “the” way to work with Git. Some adopted it so quickly and wholeheartedly that they dismissed any other way to use Git as immature or childish. It became, in a way, a movement.

I start with this little bit of history to talk about the void that was filled by Gitflow. There was clearly something that drew people to it that wasn’t there before. It questioned the way they were working with Git and offered something different that worked “successfully” for someone else. I suppose many developers didn’t have much confidence or strong feelings about their use of Git before they heard of Gitflow. And so they followed someone who clearly did have confidence and strong feelings about a particular workflow. Some of you may be questioning your current Git workflow now and can relate to what I’m describing. However, I’m not going to prescribe a particular workflow for you as “the” way to do it.

Instead, let’s talk about the purpose of a workflow. Let’s reword that so we’re clear- the purpose of a software development workflow using Git. What is the purpose? Let’s back up and ask what is the purpose of software? The purpose of software is to help people. Period. Yes it can help servers, and networks, and robots, and telephones, etc. But help them do what? Help people. They are tools to help us (people) do things better, faster, simpler, etc. I submit to you that the purpose of a software development workflow using Git should be the same. It should help people release software. Specifically, it should help match the software development process with business expectations for the people responsible for the software. That list of people responsible for the software should include more than just the developers. It also includes operations engineers, project managers, and certainly business owners.

Does your Git workflow help your business owners? Does it help your project managers or the Operations team? These are questions you should be thinking about. And by doing so, you should realize that there is no “one size fits all” workflow that will do all that for every case. There are many different workflows based on different needs and uses. Some are for large complex projects and some are extremely simple. What you need to ask is- what will best help my team/project/organization to develop, release, and maintain software effectively? Let’s look at a few workflow examples and see.

GitHub Flow

GitHub’s own workflow, their internal workflow, is quite different from what everyone else does who uses GitHub. It is based on a set of simple business choices:

  • Anything in the master branch is deployable
  • To work on something new, create a descriptively named branch off of master (ie: new-oauth2-scopes)
  • Commit to that branch locally and regularly push your work to the same named branch on the server
  • When you need feedback or help, or you think the branch is ready for merging, open a pull request
  • After someone else has reviewed and signed off on the feature, you can merge it into master
  • Once it is merged and pushed to ‘master’ on the origin, you can and should deploy immediately

They release many times per day to production using this model. They branch off master for every change they make; hot fixes and features are treated the same. Then they merge back into master and release. They have even automated their releases using an IRC bot.
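
As a rough sketch of what that cycle looks like at the command line (branch name borrowed from the list above):

$ git checkout -b new-oauth2-scopes master
# ...work, commit locally, and push regularly...
$ git push -u origin new-oauth2-scopes
# open a pull request; after review and sign-off:
$ git checkout master
$ git merge --no-ff new-oauth2-scopes
$ git push origin master    # and deploy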

Skullcandy’s workflow

When I worked for Skullcandy we used a workflow loosely based on the GitHub Flow model, but altered a bit. We used a Scrum Agile methodology with well defined sprints of work and deliverables at the end of each sprint. The workflow followed these business choices:

  • A userstory or defect in our tracking system represented a single deliverable, and a Git branch was created for each userstory or defect. We used a naming convention for branches (skdy/schristensen/US1234-cool-new-feature, for example). Yes, you can use ‘/’ characters in branch names.
  • Everything branches off master. Features and hot fixes are treated the same.
  • After code review, then the branch was merged into a QA branch and deployed to the QA environment where business owners tested and approved the changes.
  • The QA branch is just another branch off master and can be blown away and recreated when needed at any time.
  • We released once a week, and only those changes that have been approved by the business owners in QA got merged into master and released.
  • Since branch names and items in our issue tracking system were tied together we could easily verify the status of a change, the who, when, and what, and why of it, and even automate things- like auto merging of approved branches and deployment, auto updating tickets in the tracking system, and notifying developers of any merge issues or when their branch got released.

Master only workflow

Not every team or project is going to work like this. And it may be too complicated for some. It may be appropriate to just work on master without branching and merging. I do this now with some of the clients I work with.

  • Each feature or hot fix is worked on in a dev environment that is similar to production and that allows the business owner direct access for testing and approval. Changes are committed locally.
  • Once approved by the business owner, commit and push changes to master on origin, and then deploy to production immediately.

You may not be working for a business, and so the term “business owner” may not fit your situation. But there should always be someone who approves the changes as acceptable for release. That person should be the same one who requested the change in the first place.

Gitflow

On the other end of the spectrum from a master only workflow, is Gitflow. Here there are at least three main branches: develop (or development), release, and master. There are other branches as well for features and hot fixes. Many of these are long running. For example, you merge develop into the release branch but then you continue working on develop and add more commits. The workflow looks like this:

  • All work is done in a branch. Features are branched off develop. Hot fixes are treated different and are branched off master.
  • Features are merged back into develop after approval.
  • Develop is merged into a release branch.
  • Hot fixes are merged back into master, but also must be merged into develop and the release branch.
  • The release branch is merged into master.
  • Master is deployed to production.

Backcountry workflow

When I worked for Backcountry.com we used a similar workflow, however we used different names for the branches. All development happened on master, feature branches were branched off and then merged back into master. Then we branched master to create a new release branch. And then we merged the release branch into a branch called “production”. And since master is just a branch and doesn’t have to be special, you could use a branch named whatever you want for your production code.

Guidelines

There are many other examples we could go over and discuss, but these should be enough to get you thinking about different possibilities. There are a few guidelines that you should consider for your workflow:

  • Branches should be used to represent a single deliverable request from the business- like a single user story or bug fix. Something that can be approved by the business that contains everything needed for that single request to be released- and nothing more!
  • The longer a feature branch lives without getting merged in for a release, the greater risk for merge conflicts and challenges for deployment. Short lived branches merge and deploy cleaner.
  • Business owner involvement in your workflow is essential. Don’t merge, don’t deploy, don’t work without their input. Otherwise pain and tears will ensue (or worse).
  • Avoid reverts. Test, test, test your branch before a merge. When merging use git merge --no-ff, which will ease merge reverts if really needed.
  • Your workflow should fit how you release. Do you release continually, multiple times a day? Do you have 2 week sprints with completed work to release on a regular schedule? Do you have a business Change Control Board where all released items must get reviewed and approved first? Does someone else run your releases, like the Operations team or a Release manager? Your branching and merging strategy needs to make releasing easier.
  • Complicated workflows drive people crazy. Make it simple. Review your workflow and ask how you can simplify it. In actively making things more simple, you will also make them easier to understand and work with as well as easier for others to adopt and maintain.

These should help you adjust your software development workflow using Git to fulfill its purpose of helping people. Helping you.

Further Reading

There is a lot more you can read about on this topic, and here are several good places to start: