
Impressions from Open Source work with Elixir

Some time ago I started working on an Elixir library that would let me send emails as easily as ActionMailer does in the Ruby world.

The beginnings were exciting — I got to play with Elixir, a very clean and elegant new language. I also quickly learned about the openness of the Elixir community. After hacking together a first, draft-like version and posting it on GitHub and Google Groups, I got a very warm and thorough code review from the language’s author, José Valim! That’s just impressive, and it made me even more motivated to help out the community by getting my early code into better shape.

Coding an ActionMailer-like library in a language that was born three years ago doesn’t sound like a few hours’ job — there’s lots of functionality to cover. An email’s body has to be compiled from a template, and the message has to be transformed into a form that the SMTP server can digest and relay. It’s also great if the message body can be encoded as “quoted-printable” — this makes even the oldest SMTP server happy. And there’s lots more: connecting to external SMTP servers, using the local in-Elixir implementation, the ability to test, etc.

Fortunately, Elixir is built on top of Erlang’s virtual machine, BEAM, which lets you use Erlang’s libraries — and there are a lot of them. For a huge part of the functionality I needed to cover, I chose the great gen_smtp library (https://github.com/Vagabond/gen_smtp). This allowed me to actually send emails to SMTP servers and have them properly encoded. With Elixir’s focus on developer productivity, I was able to come up with a nice set of other features that you can check out here: https://github.com/kamilc/mailman

This serves as a shout-out blog post for the Elixir ecosystem and community. The friendliness it radiates makes open source work like this very rewarding. I invite you to make your own contributions as well — you’ll like it.

Simple cross-browser communication with ROS

ROS and RobotWebTools have been extremely useful in building our latest crop of distributed interactive experiences. We're continuing to develop browser-fronted ROS experiences very quickly based on their huge catalog of existing device drivers. Whether a customer wants their interaction to use a touchscreen, joystick, lights, sound, or just about anything you can plug into the wall, we now say with confidence: "Yeah, we can do that."

A typical ROS system is made out of a group ("graph") of nodes that communicate with (usually TCP) messaging. Topics for messaging can be either publish/subscribe namespaces or request/response services. ROS bindings exist for several languages, but C++ and Python are the only supported direct programming interfaces. ROS nodes can be custom logic processors, aggregators, arbitrators, command-line tools for debugging, native Arduino sketches, or just about any other imaginable consumer of the data streams from other nodes.

The rosbridge server, implemented with rospy in Python, is a ROS node that provides a web socket interface to the ROS graph with a simple JSON protocol, making it easy to communicate with ROS from any language that can connect to a web socket and parse JSON. Data is published to a messaging topic (or topics) from any node in the graph and the rosbridge server is just another subscriber to those topics. This is the critical piece that brings all the magic of the ROS graph into a browser.

A handy feature of the rosbridge JSON protocol is the ability to create topics on the fly. For interactive exhibits that require multiple screens displaying synchronous content, topics that are only published and subscribed between web socket clients are a quick and dirty way to share data without writing a "third leg" ROS node to handle input arbitration and/or logic. In this case, rosbridge will act as both a publisher and a subscriber of the topic.

To develop a ROS-enabled browser app, all you need is an Ubuntu box with ROS, the rosbridge server and a web socket-capable browser installed. Much has been written about installing ROS (indigo), and once you've installed ros-indigo-ros-base, set up your shell environment, and started the ROS core/master, a rosbridge server is two commands away:

$ sudo apt-get install ros-indigo-rosbridge-suite
$ rosrun rosbridge_server rosbridge_websocket

While rosbridge is running, you can connect to it via ws://hostname:9090 and access the ROS graph using the rosbridge protocol. Interacting with rosbridge from a browser is best done via roslibjs, the JavaScript companion library to rosbridge. All the JavaScripts are available from the roslibjs CDN for your convenience.

<script type="text/javascript"
  src="http://cdn.robotwebtools.org/EventEmitter2/current/eventemitter2.min.js">
</script>
<script type="text/javascript"
  src="http://cdn.robotwebtools.org/roslibjs/current/roslib.min.js">
</script>

From here, you will probably want some shared code to declare the Ros object and any Topic objects.

// The Ros object, wrapping a web socket connection to rosbridge.
var ros = new ROSLIB.Ros({
  url: 'ws://localhost:9090' // url to your rosbridge server
});

// A topic for messaging.
var exampleTopic = new ROSLIB.Topic({
  ros: ros,
  name: '/com/endpoint/example', // use a sensible namespace
  messageType: 'std_msgs/String'
});

The messageType of std_msgs/String means that we are using a message definition from the std_msgs package (which ships with ROS) containing a single string field. Each topic can have only one messageType that must be used by all publishers and subscribers of that topic.

A "proper" ROS communication scheme will use predefined message types to serialize messages for maximum efficiency over the wire. When using the std_msgs package, this means each message will contain a value (or an array of values) of a single, very specific type. See the std_msgs documentation for a complete list. Other message types may be available, depending on which ROS packages are installed on the system.
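
For instance, a topic carrying a plain integer could use std_msgs/Int32, whose single field is named data. Here is a minimal sketch; the topic name is made up for illustration:

// A topic carrying a single 32-bit integer value.
var counterTopic = new ROSLIB.Topic({
  ros: ros,
  name: '/com/endpoint/example/counter', // hypothetical topic name
  messageType: 'std_msgs/Int32'
});

// std_msgs/Int32 messages have exactly one field: data.
counterTopic.advertise();
counterTopic.publish(new ROSLIB.Message({ data: 42 }));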

For cross-browser application development, a bit more flexibility is usually desired. You can roll your own data-to-string encoding and pack everything into a single string topic or use multiple topics of appropriate messageType if you like, but unless you have severe performance needs, a JSON stringify and parse will pack arbitrary JavaScript objects as messages just fine. It will only take a little bit of boilerplate to accomplish this.

/**
 * Serializes an object and publishes it to a std_msgs/String topic.
 * @param {ROSLIB.Topic} topic
 *       A topic to publish to. Must use messageType: std_msgs/String
 * @param {Object} obj
 *       Any object that can be serialized with JSON.stringify
 */
function publishEncoded(topic, obj) {
  var msg = new ROSLIB.Message({
    data: JSON.stringify(obj)
  });
  topic.publish(msg);
}

/**
 * Decodes an object from a std_msgs/String message.
 * @param {Object} msg
 *       Message from a std_msgs/String topic.
 * @return {Object}
 *       Decoded object from the message.
 */
function decodeMessage(msg) {
  return JSON.parse(msg.data);
}

All of the above code can be shared by all pages and views, unless you want some to use different throttle or queue settings on a per-topic basis.
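
For example, roslibjs lets you pass optional per-topic tuning parameters when constructing a Topic. The sketch below assumes the throttle_rate (milliseconds) and queue_length options described in the roslibjs documentation, and uses a hypothetical topic name:

// A throttled subscription: rosbridge sends this client at most one
// message every 500 ms and keeps only the latest message queued.
var slowTopic = new ROSLIB.Topic({
  ros: ros,
  name: '/com/endpoint/example/slow', // hypothetical topic name
  messageType: 'std_msgs/String',
  throttle_rate: 500,
  queue_length: 1
});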

On the receiving side, any old anonymous function can handle the receipt and unpacking of messages.

// Example of subscribing to a topic with decodeMessage().
exampleTopic.subscribe(function(msg) {
  var decoded = decodeMessage(msg);
  // do something with the decoded message object
  console.log(decoded);
});

The sender can publish updates at will, and all messages will be felt by the receivers.

// Example of publishing to a topic with publishEncoded().
// Explicitly declare that we intend to publish on this Topic.
exampleTopic.advertise();

setInterval(function() {
  var mySyncObject = {
    time: Date.now(),
    myFavoriteColor: 'red'
  };
  publishEncoded(exampleTopic, mySyncObject);
}, 1000);

From here, you can add another layer of data shuffling by writing message handlers for your communication channel. Re-using the EventEmitter2 class upon which roslibjs depends is not a bad way to go. If it feels like you're implementing ROS messaging on top of ROS messaging... well, that's what you're doing! This approach will generally break down when communicating with other non-browser nodes, so use it sparingly and only for application-layer messaging that needs to be flexible.

/**
 * Typed messaging wrapper for a std_msgs/String ROS Topic.
 * Requires decodeMessage() and publishEncoded().
 * @param {ROSLIB.Topic} topic
 *       A std_msgs/String ROS Topic for cross-browser messaging.
 * @constructor
 */
function RosTypedMessaging(topic) {
  this.topic = topic;
  this.topic.subscribe(this.handleMessage_.bind(this));
}
RosTypedMessaging.prototype.__proto__ = EventEmitter2.prototype;

/**
 * Handles an incoming message from the topic by firing an event.
 * @param {Object} msg
 * @private
 */
RosTypedMessaging.prototype.handleMessage_ = function(msg) {
  var decoded = decodeMessage(msg);
  var type = decoded.type;
  var data = decoded.data;
  this.emit(type, data);
};

/**
 * Sends a typed message to the topic.
 * @param {String} type
 * @param {Object} data
 */
RosTypedMessaging.prototype.sendMessage = function(type, data) {
  var msg = {type: type, data: data};
  publishEncoded(this.topic, msg);
};

Here's an example using RosTypedMessaging.

// Example usage of RosTypedMessaging.
var myMessageChannel = new RosTypedMessaging(exampleTopic);

myMessageChannel.on('fooo', function(data) {
  console.log('fooo!', data);
});

setInterval(function() {
  var mySyncObject = {
    time: Date.now(),
    myFavoriteColor: 'red'
  };
  myMessageChannel.sendMessage('fooo', mySyncObject);
}, 1000);

If you need to troubleshoot communications or are just interested in seeing how it works, ROS comes with some neat command line tools for publishing and subscribing to topics.

### show messages on /example/topicname
$ rostopic echo /example/topicname

### publish a single std_msgs/String message to /example/topicname
### the quotes are tricky, since rostopic pub parses yaml or JSON
$ export MY_MSG="data: '{\"type\":\"fooo\",\"data\":{\"asdf\":\"hjkl\"}}'"
$ rostopic pub -1 /example/topicname std_msgs/String "$MY_MSG"

To factor input, arbitration or logic out of the browser, you could write a roscpp or rospy node acting as a server. Also worth a look are ROS services, which can abstract asynchronous data requests through the same messaging system.
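
roslibjs can call those services through rosbridge as well. As a rough sketch, using the AddTwoInts service type from the ROS tutorials (substitute your own service name and type):

// Call a ROS service over rosbridge.
var addTwoInts = new ROSLIB.Service({
  ros: ros,
  name: '/add_two_ints',                    // hypothetical service name
  serviceType: 'rospy_tutorials/AddTwoInts' // service type from the ROS tutorials
});

var request = new ROSLIB.ServiceRequest({ a: 2, b: 3 });

addTwoInts.callService(request, function(result) {
  console.log('Sum: ' + result.sum);
});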

A gist of this example JavaScript is available, much thanks to Jacob Minshall.

Liquid Galaxy for Google.org at SXSW

End Point enjoyed an opportunity to work with Google.org, who bought a Liquid Galaxy to show their great efforts, at last week's SXSW conference in Austin. Google.org has a number of projects worldwide, all focused on how tech can bring about unique and inventive solutions for good. To showcase some of those projects, Google asked us to develop presentations for the Liquid Galaxy where people could fly to a given location, read a brief synopsis of the grantee organizations, and view presentations which included virtual flight animations, map overlays, and videos of the various projects.

Some of the projects included are as follows:
  • charity: water - The charity: water presentation included scenes featuring multi-screen video of Scott Harrison (Founder/CEO) and Robert Lee (Director of Special Programs), and an animated virtual tour of charity: water well sites in Ethiopia.
  • World Wildlife Fund - The World Wildlife Fund presentation featured a virtual tour of the Bouba N’Djida National Park, Cameroon putting the viewer into the perspective of a drone patrolling the park for poachers. Additional scenes in the presentation revealed pathways of transport for illegal ivory from the park through intermediate stops in Nigeria and Hong Kong before reaching China.
  • Samasource - The Samasource presentation showed the slums where workers start out and the technology centers they are able to move to while serving a global network of technology clients.

But the work didn’t stop there! Google.org was also sharing the space with the XPrize opening gala on Monday night. For this event, we drew up a simple game where attendees could participate in various events around the room, receive coded answers at each station, and then enter their code into the Liquid Galaxy touchscreen. If successful, the 7-screen display whisked them to the Space Port in the Mojave, then into orbit. If the wrong code was entered, the participant got splashed into the Pacific Ocean. It was great fun!

The SXSW engagement is the latest in an ongoing campaign to bring attention to global challenges and how Google.org is helping solve those challenges. End Point enjoys a close working relationship with Google and Google.org. We relish the opportunity to bring immersive and inviting displays that convey a wealth of information to viewers in such an entertaining and engrossing manner.

This event ran for 2 days at the Trinity Hall venue just 1 block from the main SXSW convention center in downtown Austin, Texas.

Simple AngularJS Page

The best thing about AngularJS is how well it automates keeping the data in the HTML up to date.

To show how easy Angular is to use, I will create a very simple page using AngularJS and Github.

Every Github user can get lots of notifications, all of which can be seen on the Github notifications page. There is also the Github API, which can be used to get the notification information through simple HTTP requests that return JSON.

I wanted to create a simple page with a list of notifications, showing whether each notification has been read (I used "!!!" to mark unread ones) and refreshing automatically every 10 minutes.

To access the Github API, I first generated an application token on the Github token page. Then I downloaded AngularJS from the AngularJS page, along with a Github API JavaScript wrapper.

Then I wrote a simple html file:

 
    <html>
      <head>
        <!-- Include AngularJS (downloaded from the AngularJS page),
             the Github API JavaScript wrapper, and the application
             script shown below. -->
      </head>

      <body ng-app="githubChecker">
        <div ng-controller="mainController">
          <h1>Github Notifications</h1>
          <div ng-repeat="n in notifications">
            <span ng-show="n.unread">!!!</span> {{ n.subject.title }}
          </div>
        </div>
      </body>
    </html>

This is the basic structure. Now we need some Angular code to ask Github for the notifications and fill them into the HTML above.

The code is also not very complicated:

  var githubChecker = angular.module('githubChecker', []);

  githubChecker.controller("mainController", ['$scope', '$interval', function($scope, $interval){

    $scope.notifications = [];

    var github = new Github({
      username: "USERNAME",
      token:    "TOKEN",
      auth:     "oauth"
    });
    var user = github.getUser();

    var getNotificationsList = function() {
      user.notifications(function(err, notifications) {
        $scope.notifications = notifications;
        $scope.$apply();
      });
    };

    getNotificationsList();

    $interval(getNotificationsList, 10*60*1000);

  }]);

First of all, I created an Angular application object. That application has one controller, in which I create a Github object that gives me a nice way to access the Github API.

The function getNotificationsList calls the Github API, gets a response, and just stores it in the $scope.notifications object.

Then Angular's magic comes into play. When the $scope fields are updated, Angular automatically updates all the bindings in the HTML page. Here it is not entirely automatic, as I had to call the $scope.$apply() function to trigger the update, since the Github wrapper's callback runs outside Angular's digest cycle. Angular then loops through $scope.notifications and updates the HTML.

For more information about Angular and the functions I used, you can check the AngularJS documentation.

Mobile-friendly sites or bust!

A few weeks ago, Google announced that starting on April 21 it will expand its “use of mobile-friendliness as a ranking signal” which “will have a significant impact in our search results”.

The world of search engine optimization and online marketing is aflutter about this announcement, given that even subtle changes in Google’s ranking algorithm can significantly improve or worsen any particular site’s ranking. And the announcement was made less than two months before the date of the change, so there is not much time to dawdle.

Google has lately been increasing its pressure on webmasters (is that still a real term‽) such as with its announcement last fall of an accelerated timetable for sunsetting SSL certificates with SHA-1 signatures. So far these accelerated changes have been a good thing for most people on the Internet.

In this case, Google provides an easy Mobile-Friendly Site Test that you can run on your sites to see whether you need to make changes.


 

So get on it and check those sites! I know we have a few that we can do some work on.

Advanced Product Filtering in Ecommerce

One of my recent projects for Paper Source has been to introduce advanced product filtering (or faceted filtering). Paper Source runs on Interchange, a Perl-based open source ecommerce platform that End Point has been involved with (as core developers & maintainers) for many years.

In the case of Paper Source, personalized products such as wedding invitations and save the dates have advanced filtering to filter by print method, number of photos, style, etc. Advanced product filtering is a very common feature in ecommerce systems with a large number of products that allows a user to narrow down a set of products to meet their needs. Advanced product filtering is not unlike faceted filtering offered by many search engines, which similarly allows a user to narrow down products based on specific tags or facets (e.g. see many Amazon filters on the left column). In the case of Paper Source, I wrote the filtering code layered on top of the current navigation. Below I'll go through some of the details with small code examples.

Data Model

The best place to start is the data model. A simplified existing data model that represents product taxonomy might look like the following:


Basic data model linking categories to products.

The existing data model links products to categories via a many-to-many relationship. This is fairly common in the ecommerce space – while looking at a specific category often identified by URL slug or id, the products tied to that category will be displayed.

And here's where we go with the filtering:


Data model with filtering layered on top of existing category to product relationship.

Some notes on the above filtering data model:

  • filters contains a list of all the filters. Examples of entries in this table include "Style", "Color", and "Size".
  • filters_categories links filters to categories, allowing fine-grained control over which filters show on which category pages, and in what order. For example, this table would link the category "Shirts" to the filters "Style", "Color", and "Size", and record the preferred sort order of those filters.
  • filter_options includes all the options for a specific filter. Examples of options include "Large", "Medium", and "Small", all linked to the "Size" filter.
  • filter_options_products links filter options to specific product ids with a many-to-many relationship.

Filter Options Exclusivity

One thing to consider before coding is the business rules pertaining to filter option exclusivity. If a product is assigned to one filter option, can it also have another filter option for that same filter type? I.e., if a product is marked as blue, can it also be marked as red? When a user filters by color, can they select products that are both blue and red? Or, if a product is blue, can it not have any other filter options for that filter? In the case of Paper Source product filtering, we went with the former, where filter options are not mutually exclusive.

A real-life example of filter non-exclusivity is how Paper Source filters wedding invitations. Products are filtered by print method and style. Because some products have multiple print methods and styles, non-exclusivity allows a user to narrow down to a specific combination of filter options, e.g. a wedding invitation that is both tagged as "foil & embossed" and "vintage".

URL Structure

Another thing to determine before coding is the URL structure. The URL must communicate the current category of products and current filter options (or what I refer to as active/activated filters).

I designed the code to recognize one component of the URL path as the category slug, and the remaining paths to map to the various filter option url slugs. For example, a URL for the category of shirts is "/shirts", a URL for large shirts "/shirts/large", and the URL for large blue shirts "/shirts/blue/large". The code not only has to accept this format, but it also must create consistently ordered URLs, meaning, we don't want both "/shirts/blue/large" and "/shirts/large/blue" (representing the same content) to be generated by the code. Here's what simplified pseudocode might look like to retrieve the category and set the activated filters:

my @url_paths = split('/', $request_url); #url paths is e.g. /shirts/blue/large
my $category_slug = shift(@url_paths);
# find Category where slug = $category_slug
# redirect if not found
# @url_paths is active filters

Applying the Filtering

Next, we need a couple things to happen:

  • If there is an activated filter for any filter option, apply it.
  • Generate URLs to toggle filter options.

First, all products are retrieved in this category with a query like this:

SELECT products.*,
  COALESCE((SELECT GROUP_CONCAT(fo.url_slug) FROM filter_options_products fop
    JOIN filter_options fo ON fo.id = fop.filter_option_id
    WHERE fop.product_id = products.id), '') AS filters
FROM products 
JOIN categories_products cp ON cp.product_id = products.id
JOIN categories c ON c.id = cp.category_id
WHERE c.url_slug = ?

Next is where the code gets pretty hairy, so instead I'll try to explain with pseudocode:

#@filters = all applicable filters for current category
# loop through @filters
  # loop through filter options for this filter
  # filter product results to include any selected filter options for this filter
  # if there are no filter options selected for this filter, include all products
  # build the url for each filter option, to toggle the filter option (on or off)
# loop through @filters (yes, a second time)
  # loop through filter options for this filter
  # count remaining products for each filter option, if none, set filter option to inactive
  # build the final url for each filter option, based on all filters turned on and off

My pseudocode shows that I iterate through the filters twice, first to apply the filter and determine the base URL to toggle each filter option, and second to count the remaining filtered products and build the URL to toggle each filter. The output of this code is a) a set of filtered products and b) a set of ordered filters and filter options with corresponding counts and links to toggle on or off.

Here's a more specific example:

  • Let's say we have a set of shirts, with the following filters & options: Style (Long Sleeve, Short Sleeve), Color (Red, Blue), Size (Large, Medium, Small).
  • A URL request comes in for /shirts/blue/large
  • The code recognizes this is the shirts category and retrieves all shirts.
  • First, we look at the style filter. No style filter is active in this request, so to toggle these filters on, the activation URLs must include "longsleeve" and "shortsleeve". No products are filtered out here.
  • Next, we look at the color filter. The blue filter option is active because it is present in the URL. In this first loop, products not tagged as blue are removed from the set of products. To toggle the red option on, the activation URL must include "red", and to toggle the blue filter off, the URL must not include "blue", which is set here.
  • Next, we look at the size filter. Products not tagged as large are removed from the set of products. Again, the large filter has to be toggled off in the URL because it is active, and the medium and small filter need to be toggled on.
  • In the second pass through filters, the remaining items applicable to each filter option are counted, for long sleeve, short sleeve, red, medium, and small options. And the URLs are built to turn on and off all filter options (e.g. applying the longsleeve filter will yield the URL "/shirts/longsleeve/blue/large", applying the red filter will yield the URL "/shirts/blue/red/large", turning off the blue filter will yield the URL "/shirts/large").

The important thing to note here is the double pass through filters is required to build non-duplicate URLs and to determine the product count after all filter options have been applied. This isn't simple logic, and of course changing the business rules like exclusivity will change the loop behavior and URL logic.

Alternative Approaches

Finally, a few notes regarding alternative approaches here:

  • Rather than going with the blacklist approach described here, one could go with a whitelist approach where a set of products is built up based on the filter options set.
  • Filtering could be done entirely via AJAX, in which case URL structure may not be a concern.
  • If the data is simple enough, products could potentially be filtered in the database query itself. In our case, this wasn't feasible since we generate product filter option details from a number of product attributes, not just what is shown in the simplified product filter data model above.

wroclove.rb a.k.a. "The best Java conference in Ruby world"

Date: March 13-15, 2015
Place: Wrocław University

Amongst the many Ruby and Rails talks at this conference, there were some not related to Ruby at all. Since 2013 I have observed a growing number of functional programming topics and concerns. Even the open discussion was devoted mainly not to OOP but to concepts like Event Sourcing or patterns similar to those promoted by Erlang or Clojure.

So let's enumerate the best moments.

First of all, the "ClojureScript + React.js" talk presented by Norbert Wójtowicz. The link below points to the version of the presentation that was recorded at Lambda Days. It is a fascinating talk about a gigantic technology leap that is going to have an impact on the whole OOP world.



Nicolas Dermine presented Sonic Pi, an Overtone-inspired tool implemented mostly in Ruby that basically lets you write code that plays music. It is a great tool for learning programming and for having fun at the same time. It is also a great tool for teachers (of both music and programming).

There was also a fascinating talk about some aspects of social engineering in sourcing the requirements from your customers. It was presented by Alberto Brandolini, and the technique is called Event Storming.


HTTP/2 is on the way!

HTTPS and SPDY

Back in August 2014, we made our websites www.endpoint.com and liquidgalaxy.endpoint.com HTTPS-only, which allowed us to turn on HTTP Strict Transport Security and earn a grade of A+ from Qualys’ SSL Labs server test.

Given the widely-publicized surveillance of Internet traffic and injection of advertisements and tracking beacons into plain HTTP traffic by some unscrupulous Internet providers, we felt it would be good to start using TLS encryption on even our non-confidential public websites.

This removed any problems switching between HTTP for most pages and HTTPS for the contact form and the POST of submitted data. Site delivery over HTTPS also serves as a ranking signal for Google, though presumably still a minor one.

Doesn’t SSL/TLS slow down a website? Simply put, not really these days. See Is TLS Fast Yet? for lots of details.

Moving to HTTPS everywhere on our sites also allowed us to take advantage of nginx’s relatively new SPDY (pronounced “speedy”) capability. SPDY is an enhancement to HTTPS created by Google to speed up web page delivery by compressing headers and multiplexing many requests in a single TCP connection. It is only available over HTTPS, so it also incentivizes sites to stop using unencrypted HTTP in order to get more speed, with security as a bonus. Whereas people once avoided HTTPS because SSL/TLS was slower, SPDY turned that idea around. We began offering SPDY for our sites in October 2014.

On the browser side, SPDY was initially only supported by Chrome and Firefox. Later support was added to Opera, Safari 8, and partially in IE 11. So most browsers can use it now.

There is only partial server support: In the open source world, nginx fully supports SPDY now, but Apache’s mod_spdy is incomplete and development on it has stalled.

Is SPDY here to stay? After all it was an experimental Google protocol. Instead of getting on track to become an Internet standard protocol as is, it was used as the starting point for the next version of HTTP, HTTP/2. That sounded like good news, except that the current version HTTP/1.1 was standardized in 1999 and hadn’t really changed since then. Many of us wondered if HTTP/2 would get mired in the standardization process and take years to see the light of day.

However, the skeptics were wrong! HTTP/2 was completed over about 3 years, and its official RFC form is now being finalized. Having it be the next version of HTTP will go a long way toward getting more implementation and adoption, since it is no longer a single company’s project. On the other hand, basing HTTP/2 on SPDY meant that there was a widely-used proof of concept out there already, so discussions didn’t get lost in the purely theoretical. The creators of SPDY at Google were heavily involved in the HTTP/2 standardization process, so their lessons were not lost, and it appears that HTTP/2 will be even better.

What is different in HTTP/2?

  • Request and response multiplexing in a single TCP connection (no need for 6+ connections to the same host!)
  • Stream prioritization (prioritizing files that the client most needs first)
  • Server push (of files the server expects the client to need, before the client knows it), and client stream cancellation (in case the server or the client is wrong and wants to abort a stream)
  • Binary framing (no more hand-typing requests via telnet, sadly)
  • Header compression (greatly reducing the bloat of large cookies)
  • Backward-compatibility with HTTP/1.1 and autodiscovery of HTTP/2 support (transparent upgrading for users)
  • When TLS is used, require TLS 1.2 and minimum acceptable cipher strength (to help retire weak TLS setups)

For front-end web developers, these back-end plumbing changes have some very nice consequences. As described in HTTP2 for front-end web developers, you will soon be able to stop using many of the annoying workarounds for HTTP/1.1’s weaknesses: no more sprites, combining CSS & JavaScript files, inlining images in CSS, sharding across many subdomains, etc.

This practically means that the web can largely go back to working the way it was designed, with different files for different things, independent caching of small files, and assets served from the same place.

What is not changing?

Most of HTTP/1.1’s basic semantics remain the same, with most of the changes being to the “wrapping” or transport of the data. All of this stays the same:

  • built on TCP
  • stateless
  • same request methods
  • same request headers (including cookies)
  • same response headers and body
  • may be unencrypted or layered on TLS (although so far, Chrome and Firefox have stated that they will only support HTTP/2 over TLS, and IE so far only supports HTTP/2 over TLS as well)
  • no changes in HTML, CSS, client-side scripting, same-origin security policy, etc.

The real point: speed

Speed and efficiency are the main advantages of HTTP/2. It will use less data transfer for both requests and responses. It will use fewer TCP connections, lightening the load on clients, servers, firewalls, and routers.

As clients adapt more to HTTP/2, it will probably provide a faster perceived experience as servers push the most important CSS, images, and JavaScript proactively to the client before it has even parsed the HTML.

See these simple benchmarks between HTTP/1.1, SPDY, and HTTP/2.

When can we use it?

Refreshingly, Google has announced that they are happy to kill their own creation SPDY: they will drop support for SPDY from Chrome in early 2016 in favor of HTTP/2.

Firefox uses HTTP/2 by default where possible, and Chrome has an option to enable HTTP/2. IE 11 for Windows 10 beta supports HTTP/2. You can see if your browser supports HTTP/2 now by using the Go language HTTP/2 demo server.

On the server side, Google and Twitter have already been opportunistically serving HTTP/2 for a while. nginx plans to add support this year, and an experimental Apache module, mod_h2, is available now. The H2O open-source C-based web server supports HTTP/2 now, as does Microsoft IIS in the Windows 10 beta.

So we probably have at least a year until the most popular open source web servers easily support HTTP/2, but by then most browsers will probably support it and it should be an easy transition, as SPDY was. As long as you’re ready to go HTTPS-only for your site, anyway. :)

I think HTTP/2 will be a good thing!

Give me more details!

I highly recommend that system administrators and developers read the excellent http2 explained PDF book by Daniel Stenberg, Firefox developer at Mozilla, and author of curl. It explains everything simply and well.


Cross Release APT Management, a.k.a. How to Watch Netflix on Debian 7 Wheezy

Native Netflix video streaming has come to GNU/Linux! ...if you have the correct library versions.

I am currently running GNU/Linux Debian 7 (Wheezy) with Openbox. I really enjoy this lightweight, speedy, and easily customized window manager (Openbox uses simple XML configuration files). So I was pretty excited when Netflix added HTML5 streaming support and I read that folks were reporting success in Google Chrome without needing any user agent masking workarounds.

However, I found I was still getting errors when attempting to stream video in Chrome. The forums I was reading reported that when using Chrome 36+, Netflix would allow Linux streaming, but almost all of those reports were based on Ubuntu 14.04+ environments. Nevertheless, I found a hint as to how to proceed on Debian after reading this article regarding libnss:

"Netflix streams its video in HTML5, but uses a technology called Encrypted Media Extensions to prevent piracy. These extensions in turn require a set of libraries called Network Security Services that the browser can access."

Debian Wheezy's repo list maxed out at libnss3==2:3.14 and I would need libnss3==2:3.16+ in order to pass the DRM tests and securely stream with Netflix's HTML5 option enabled. In order to allow this libnss upgrade, I would first need to provide APT with instructions to pull from the Debian "jessie" development branch.

This is accomplished by setting repo priorities. I created a jessie-specific APT sources list and added the Debian repo URLs for jessie:

$ cat /etc/apt/sources.list.d/jessie.list
## DEBIAN JESSIE
deb ftp://ftp.debian.org/debian/ jessie main
deb-src ftp://ftp.debian.org/debian/ jessie main

And set pin priorities for libnss3 to fetch jessie libraries over wheezy, while defining a lower priority for all other jessie packages:

$ cat /etc/apt/preferences
Package: *
Pin: release a=waldorf
Pin-Priority: 1001

Package: *
Pin: release a=wheezy
Pin-Priority: 500

Package: *
Pin: release a=jessie
Pin-Priority: 110

Package: libnss3
Pin: release n=jessie
Pin-Priority: 510

Now, update apt and confirm higher libnss3 installation candidates:

$ sudo apt-get update && sudo apt-cache policy libnss3
libnss3:
  Installed: 2:3.14.5-1+deb7u3
  Candidate: 2:3.17.2-1.1
  Package pin: 2:3.17.2-1.1
  Version table:
    2:3.17.2-1.1 510
        500 ftp://ftp.debian.org/debian/ jessie/main amd64 Packages
*** 2:3.14.5-1+deb7u3 510
        500 http://http.debian.net/debian/ wheezy/main amd64 Packages
        500 http://security.debian.org/ wheezy/updates/main amd64 Packages
        100 /var/lib/dpkg/status

Install new libnss3 candidate:

$ sudo apt-get install libnss3=2:3.17.2-1.1

Restart any Chrome instances (and upgrade to 36+ if you haven't yet) and enjoy Netflix streaming on Debian Linux!

On End Point's Development Environment

A few recent conversations have sparked my interest in writing up a blog post that summarizes the familiar elements of our development environments. The majority of End Pointers work remotely, but many of the tools listed below are common to many developers.

  • ssh/sftp: We do primarily remote development. Many of us are familiar with local development, but remote development with camps (see next point) is typically the most efficient arrangement in working with multiple development instances that are accessible to clients for testing and staging.
  • camps: DevCamps is a tool created by and specific to End Point; camps are development instances with an entire web server, database, and app server stack, similar to containers like Docker. Check out the DevCamps website for more information.
  • vim/emacs/nano: Most of our employees use vim or emacs as command-line editors; nano is an inefficient but easy-to-use editor that we can suggest to new developers. Few of us use IDEs, if any.
  • screen/tmux: screen and tmux are our preferred tools for terminal multiplexing and session sharing.
  • command-line database interaction (specifically psql and mysql ad-hoc querying): Working with an SQL database through an ORM like Active Record, DBIC, etc. is not enough for us.
  • *nix / basic command-line interaction: This topic could make up its own blog post, but some of the tools we use frequently are netstat/ss, ifconfig/ip, lsof, ps/top/htop/atop, free, df, nice/ionice, tail -f, sort, uniq -c, grep.
  • git & github: Not uncommon to devshops these days, git is the most popular version control system, and github an extremely popular host of both open source and private repositories.
  • IRC, Skype, Google Hangouts, appear.in, talky.io, glideroom.com, Google Voice, & *gasp* regular phones: As remote developers, we communicate often and there are a number of tools available in the communication space that we leverage.

It's interesting to note that if any new developers come in with a preference for a trendy new tool, while we are happy to let them work in an environment that allows them to be efficient, ultimately we can't provide support for those tools that we are unfamiliar with.

Postgres searchable release notes - one page with all versions

The inability to easily search the Postgres release notes has been a long-standing annoyance of mine, and a recent thread on the pgsql-general mailing list showed that others share the same frustration. One common example is when a new client comes to End Point with a mysterious Postgres problem. Since it is rare that a client is running the latest Postgres revision (sad but true), the first order of business is to walk through all the revisions to see if a simple Postgres update will cure the problem. Currently, the release notes are arranged on the postgresql.org web site as a series of individual HTML pages, one per version. Reading through them can be very painful - especially if you are trying to search for a specific item. I whipped up a Perl script to gather all of the information, reformat it, clean it up, and summarize everything on one giant HTML page. This is the result: https://bucardo.org/postgres_all_versions.html

Please feel free to use this page however you like. It will be updated as new versions are released. You may notice there are some differences from the original separate pages:

  • All 270 versions are now on a single page. Create a local greppable version with:
    links -dump https://bucardo.org/postgres_all_versions.html > postgres_all_versions.txt
  • All version numbers are written clearly. The confusing "E.x.y" notation was stripped out.
  • A table of contents at the top allows for jumping to each version (which has the release date next to it).
  • Every bulleted feature has the version number written right before it, so you never have to scroll up or down to see what version you are currently reading.
  • If a feature was applied to more than one version, all the versions are listed (the current version always appears first).
  • All CVE references are hyperlinks now.
  • All "mailtos" were removed, and other minor cleanups.
  • Replaced single-word names with the full names (e.g. "Massimo Dal Zotto" instead of "Massimo") (see below)

Here's a screenshot showing the bottom of the table of contents, and some of the items for Postgres 9.4:

The name replacements took the most time, as some required a good bit of detective work. Most were unambiguous: "Tom" became "Tom Lane", "Bruce" became "Bruce Momjian", and so on. For the final document, 3781 name replacements were performed! Some of the trickier ones were "Greg" - both myself ("Greg Sabino Mullane") and "Greg Stark" had single-name entries. Similar problems popped up with "Ryan", and with "Peter" *not* being the familiar Peter Eisentraut (but Peter T. Mount) threw me off for a second. The only one I was never able to figure out was "Clark", who is attributed (via Bruce) with "Fix tutorial code" in version 6.5. Pointers or corrections welcome.

Hopefully this page will be of use to others. It's a very large page, but not remarkably wasteful of space, like many HTML pages these days. Perhaps some of the changes will make their way to the official docs over time.

Working with Annotator: Part 2

A while back, I wrote about my work on Ruby on Rails based H2O with Annotator, an open source JavaScript library that provides annotation functionality. In that article, I discussed the history of my work with annotations, specifically touching on several iterations with native JavaScript functionality developed over several years to handle colored highlight overlapping on selected text.

Finally, last July we completed the transition to Annotator from custom JavaScript development, with the caveat that the application had quite a bit of customization hooked into Annotator's easily extensible library. But, just a few months after that, I revisited the customization, described in this post.

Separation of UI Concerns

In our initial work with Annotator, we had extended it to offer a few additional features:

  • Add multiple tags with colored highlights, where content can be tagged on the fly with a color chosen from a set of predefined colors assigned to the tag. Text would then be highlighted with opacity, and colors combined (using xColor) on overlapping highlights.
  • Interactive functionality to hide and show un-annotated text, as well as hide and show annotated text with specific tags.
  • Ability to link annotated text to other pieces of content.

But you know the paradox of choice? The extended Annotator, with so many additional features, was offering too many choices in a cluttered user interface, where the choices were likely not used in combination. See:

Too many annotation options: Should it be hidden? Should it be tagged?
Should it be tagged with more than one tag? Can text be tagged and hidden?

So, I separated concerns here (a common term in software) to intentionally separate annotation features. Once a user selects text, a popup is shown to move forward with adding a comment, adding highlights (with or without a tag name), adding a link, or hiding that text:


The new interface, where a user chooses the type of annotation they are saving, and only relevant fields are then visible.


After the user clicks on the highlight option, only highlight related fields are shown.

Annotator API

The functionality required intercepting and overriding Annotator's default behavior, including some core overrides, but the API has a few nice hooks that were leveraged to accommodate this functionality in the H2O plugin:

  • annotationsLoaded: called after annotation data loaded
  • annotationEditorSubmit: called after user saves annotation, before data sent to server
  • annotationCreated: after annotation created
  • annotationUpdated: after annotation updated
  • annotationDeleted: after annotation deleted

While these hooks don't mean much to someone who hasn't worked with Annotator, the point is that there are several ways to extend Annotator throughout the CRUD (create, read, update, destroy) actions on an annotation.
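
For illustration, subscribing to one of these hooks looks roughly like this; the container selector and handler body are hypothetical, not taken from the actual H2O plugin:

// Attach Annotator to a container element and react whenever an
// annotation is created, e.g. to refresh highlight colors.
$('#content').annotator()
  .annotator('subscribe', 'annotationCreated', function(annotation) {
    console.log('annotation created:', annotation);
  });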

Custom Data

On the backend, the four types of annotation are contained in a single table, as was the data model prior to this work. There are several additional data fields to indicate the type of annotation:

  • hidden (boolean): If true, text is not visible.
  • link (text): Annotation can link to any other URL.
  • highlight (color only, no tag): Annotation can be assigned a single colored highlight.
  • highlight + tag (separate table, using various Rails plugins (acts_as_taggable_on)): Annotation can be assigned a single tag with a corresponding colored highlight.

Conclusion

The changes here resulted in a less cluttered, more clear interface. To a user, each annotation has a single concern, while utilizing Annotator and saving to a single table.

SCaLE 13x

SCaLE Penguin

I recently went to the Southern California Linux Expo (SCaLE). It takes place in Los Angeles at the Hilton, and is four days of talks, classes, and more, all focused on Linux. SCaLE is the largest volunteer-run open source conference. The volunteers put a lot of work into the conference, from the nearly flawless wireless network to the AV team making it as easy as plugging in a computer to start a presentation.


One large focus of the conference was the growing DevOps community in the Linux world. The more DevOps-related talks drew the biggest crowds, and there was even a DevOps-focused room on Friday. There is a wide range of DevOps-related topics, but the two that seemed to draw the largest crowds were configuration management and containerization. I decided to attend the full-day talk on Chef (a configuration management solution) and the class on Docker (the new rage in containerization).


The Thursday Chef talk was so full that they decided to do an extra session on Sunday. The talk was more of an interactive tutorial than a lecture, so everyone was provided with an AWS instance to use as their Chef playground. The talk started with the basics of creating a file, installing a package, and running a service. It was all very interactive; there would be a couple of slides explaining a feature and then time provided to try it out. During the talk there was a comment from someone about a possible bug in Chef, concerning the suid bit being reset after a change of owner or group on a file. The presenter, who works for the company that creates Chef, wasn't sure what would happen and said, "Try it out." I did try it out, and there was indeed a bug in Chef. The presenter suggested I file an issue on GitHub, so I did, and I even wrote a patch and made a pull request later that weekend.


Containers were the other hot topic that weekend, with a half-day class on Friday and a few other talks throughout the weekend. The Docker talk was also set up in a learn-by-doing style. We learned the basics of downloading and running Docker images from the Docker Hub through the command line. We added our own tweaks on top of those images and created new images of our own. The speaker, Jerome Petazzoni, usually gives a two- or three-day class on the subject, so he picked the parts he thought most interesting to share with us. I really enjoyed writing a Dockerfile, which describes the creation of a new machine from a base image. I also found one of the use cases described for Docker very interesting: creating a development environment for employees at a company. There is usually some time wasted moving things from machine to machine, whether upgrading a personal machine or transferring a project from one employee to another, especially when they are using different operating systems. Docker can help create a unified state for all development machines in a company, to the point where setting a new employee up with a workspace can be accomplished in a matter of minutes. This also helps bring the development environment closer to the production environment.


One sentiment I heard reiterated in multiple DevOps talks was the treatment of servers as pets vs. cattle. Previously, servers were treated as pets: we gave servers names, we knew what they liked and didn't like, and when they got sick we'd nurse them back to health. This kind of treatment is time consuming and not manageable at the scale that many companies face. The new trend is to treat servers like cattle: each server is given a number, they do their job, and if they get sick they are "put down". Tools like Docker and Chef make this possible; servers can be set up so quickly that there's no reason to nurse them back to health anymore. This is great for large companies that need to manage thousands of servers, but it can save time for smaller companies as well.