
Elasticsearch: Give me object!

I'm currently working on a project where Elasticsearch is used to index copious amounts of data with sometimes deeply nested JSON. A recurring error I've experienced is caused by a field not conforming to the type listed in the mapping. Let's reproduce it on a small scale.

Assuming you have Elasticsearch installed, let's create an index and mapping:

$ curl -XPUT 'http://localhost:9200/test' -d '
{
    "mappings": {
        "item": {
            "properties": {
                "state": {
                    "properties": {
                        "name": {"type": "string"}
                    }
                }
            }
        }
    }
}
'
{"ok":true,"acknowledged":true}

Since we've defined properties for the "state" field, Elasticsearch will automatically treat it as an object.* Let's now add some documents:

$ curl -XPUT 'http://localhost:9200/test/item/1' -d '
{
    "state": {
        "name": "North Carolina"
    }
}
'
{"ok":true,"_index":"test","_type":"item","_id":"1","_version":1}

Success! Let's now get into trouble:

$ curl -XPUT 'http://localhost:9200/test/item/2' -d '
{
    "state": "California"
}
'
{"error":"MapperParsingException[object mapping for [state] tried to parse as object, but got EOF, has a concrete value been provided to it?]","status":400}

The solution: check the plain (non-object) values in your data against your mapping schema and you'll find a field that the mapping expects to be an object.

One thing to note is that the explicit creation of the mapping is unnecessary since Elasticsearch creates it using the first added document. Try this:

$ curl -XPUT 'http://localhost:9200/test2/item/1' -d '
{
    "state": {
        "name": "North Carolina"
    }
}
'
{"ok":true,"_index":"test2","_type":"item","_id":"1","_version":1}
$ curl -XGET 'http://localhost:9200/test2/_mapping'
{
    "test2": {
        "item": {
            "properties": {
                "state": {
                    "dynamic":"true",
                    "properties": {
                        "name": {"type":"string"}
                    }
                }
            }
        }
    }
}

So, this stays true to the statement: "Elasticsearch is schema-less, just toss it a typed JSON document and it will automatically index it." You can throw your car keys at Elasticsearch and it will index them; however, as noted above, just be sure to keep throwing nothing but car keys.

*Anything with one or more nested key-value pairs is considered an object in Elasticsearch. For more on the object type, see the Elasticsearch documentation on object mappings.

Estimating overlayfs File Space Usage on Ubuntu 12.04 LiveCD

End Point's Liquid Galaxy platform is a cluster of computers distributing the rendering load of Google Earth and other visualization applications. Each of these "display nodes" boots the same disk image distributed over the local network. Our disk image is based on the Ubuntu 12.04 LiveCD and uses the same overlayfs mechanism to combine a read-only ISO with a writable ramdisk. This union mount uses copy-on-write to copy files from the read-only "lower" filesystem to the virtual "upper" filesystem whenever those files or directories are opened for writing.

We often allocate 4GB of system memory to the ramdisk containing the "upper" filesystem. This allows 4GB of changed files on the / filesystem, most of which are often Google Earth cache files. But sometimes the ramdisk fills up with other changes and it's difficult to track down which files have changed unexpectedly.

The df command properly displays total usage for the overlayfs "upper" filesystem mounted at /.

$ df -h
Filesystem              Size  Used Avail Use% Mounted on
/cow                    3.9G  2.2G  1.8G  55% /
/dev/loop0              833M  833M     0 100% /cdrom
/dev/loop1              808M  808M     0 100% /rofs

But how can we identify which files are consuming that space? Because the root device has been pivoted by the casper scripts at boot, the /cow device is not readily available. We often use the du tool to estimate disk usage, but in this case it cannot tell the difference between files in the "upper" ramdisk and the "lower" read-only filesystem. To find the files filling our /cow device, we need a way to enumerate only the files in the "upper" filesystem, and then estimate only their disk usage.

The `mount` command shows the / filesystem is type "overlayfs".

$ mount
/cow on / type overlayfs (rw)
/dev/loop0 on /cdrom type iso9660 (ro,noatime)
/dev/loop1 on /rofs type squashfs (ro,noatime)

The find command does indicate that most directories exist in the filesystem of type "overlayfs", and most unmodified files are on the "lower" filesystem, in this case "squashfs".

$ sudo find / -printf '%F\t%D\t%p\n' | head -n 7 ### fstype, st_dev, filename
overlayfs 17 /
overlayfs 17 /bin
squashfs 1793 /bin/bash
squashfs 1793 /bin/bunzip2
squashfs 1793 /bin/bzcat
squashfs 1793 /bin/bzcmp
squashfs 1793 /bin/bzdiff

However, the modified files are reported to be on an "unknown" filesystem on device 16. These are the files that have been copied to the "upper" filesystem upon writing.

$ find /home/lg/ -printf '%F\t%D\t%p\n' | head -n 33
overlayfs 17 /home/lg/
squashfs 1793 /home/lg/.Xresources
unknown  16 /home/lg/.bash_history
squashfs 1793 /home/lg/.bash_logout
squashfs 1793 /home/lg/.bashrc
overlayfs 17 /home/lg/.config
overlayfs 17 /home/lg/.config/Google
unknown  16 /home/lg/.config/Google/GoogleEarthPlus.conf
unknown  16 /home/lg/.config/Google/GECommonSettings.conf

I couldn't quickly discern how find is identifying the filesystem type, but it can use the -fstype test to reliably identify files that have been modified and copied. (Unfortunately find does not have a test for device number so if you have more than one overlayfs filesystem this solution may not work for you.)

Now that we have a reliable list of which files have been written to and copied, we can see which are consuming the most disk space by piping that list to du. We'll pass it a null-terminated list of files to accommodate any special characters, then we'll sort the output to identify the largest disk space hogs.

$ sudo find / -fstype unknown -print0 | du --files0-from=- --total | sort -nr | head -n 4
2214228 total
38600 /var/cache/apt/pkgcache.bin
38576 /var/cache/apt/srcpkgcache.bin
17060 /var/log/daemon.log

We included a total in this output, and notice that the total matches the 2.2GB indicated by the df output above, so I believe this is measuring what we intend.

Of course five hundred 2MB Google Earth cache files consume more space than a single 38MB apt cache file, so we'd like to list the directories whose files are consuming the most ramdisk space. Unfortunately giving find and du depth arguments won't work: the "unknown" filesystem doesn't have any directories that we can see. We'll have to parse the output, and that's left as an exercise for the reader for now.

I just realized I could've simply looked for files modified after boot-time and gotten very similar results, but that's not nearly as fun. There may be a way to mount only the "upper" filesystem, but I was disappointed by the lack of documentation around overlayfs, which will likely be included in the mainline 3.10 Linux kernel.

SSH ProxyCommand with netcat and socat

Most of my day-to-day work is conducted via a terminal, using Secure Shell (SSH) to connect to various servers. I make extensive use of the local SSH configuration file, ~/.ssh/config, both to reduce typing by aliasing connections, and to allow me to seamlessly connect to servers, even when a direct connection is not possible, by use of the ProxyCommand option.

There are many servers I work on that cannot be directly reached, and this is where the ProxyCommand option really comes to the rescue. It allows you to chain two or more servers, such that your SSH connections bounce through the servers to get to the one that you need. For example, some of our clients only allow SSH connections from specific IPs. Rather than worry about which engineers need to connect, and what IPs they may have at the moment, engineers can access servers through certain trusted shell servers. Then our engineers can SSH to one of those servers, and from there on to the client's servers. As one does not want to actually SSH twice every time a connection is needed, the ProxyCommand option allows a quick tunnel to be created. Here's an example entry for a .ssh/config file:

Host proxy
User greg
HostName proxy.example.com

Host acme
User gmullane
HostName pgdev.acme.com
ProxyCommand ssh -q proxy nc -w 180 %h %p

So now when we run the command ssh acme, ssh actually first logs into proxy.example.com as the user "greg", runs the nc (netcat) command (after plugging in the host and port parameters for us), and then logs in to gmullane@pgdev.acme.com from proxy.example.com. We don't see any of this happening: one simply types "ssh acme" and gets a prompt on the pgdev.acme.com server.

Oftentimes more than one "jump" is needed, but it is easy to chain servers together, such that you can log into a third server by running two ProxyCommands. Recently, this situation arose but with a further wrinkle. There was a server, we'll call it calamity.acme.com, which was not directly reachable via SSH from the outside world, as it was a tightly locked-down production box. However, it was reachable by other boxes within the company's intranet, including pgdev.acme.com. Thus to log in as gmullane on the calamity server, the .ssh/config file would normally look like this:

Host proxy
User greg
HostName proxy.example.com

Host acme
User gmullane
HostName pgdev.acme.com
ProxyCommand ssh -q proxy nc -w 180 %h %p

Host acme_calamity
User gmullane
## This is calamity.acme.com, but pgdev.acme.com cannot resolve that, so we use the IP
HostName 192.168.7.113
ProxyCommand ssh -q acme nc -w 180 %h %p

Thus, we'd expect to run ssh acme_calamity and get a prompt on the calamity box. However, this was not the case. Although I was able to ssh from proxy to acme, and then from acme to calamity, things were failing because acme did not have the nc (netcat) program installed. Further investigation showed that it was not even available via packaging, which surprised me a bit as netcat is a pretty old, standard, and stable program. However, a quick check showed that the socat program was available, so I installed that instead. The socat program is similar to netcat, but much more advanced. I did not need any advanced functionality, however, just a simple bidirectional pipe which the SSH connections could flow over. The new entry in the config file thus became:

Host acme_calamity
User gmullane
HostName 192.168.7.113
ProxyCommand ssh -q acme socat STDIN TCP:%h:%p

After that, everything worked as expected. It's perfectly fine to mix socat and netcat as we've done here; at the end of the day, they are simple dumb pipes (although socat allows them to be not so simple or dumb if desired!). The arguments to socat are simply the two sides of the pipe. One is stdin (sometimes written as stdio or a single dash), and the other is a TCP connection to a specific host and port (which SSH will plug in). You may also see it written as TCP4, which simply forces IPv4 only, where TCP encompasses IPv6 as well.

The options to netcat are very similar, but shorter: it already defaults to using stdin for one side, and it defaults to a TCP connection, so we can leave both of those out. The "-w 180" simply establishes a three-minute timeout so the connection will close itself on a problem rather than hanging around until manually killed.

Even if both netcat and socat were not available, there are other solutions. In addition to other programs, it is easy enough to write a quick Perl script to create your own bidirectional pipe!
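As a rough illustration (not from the original suggestion, which was Perl), here is what such a pipe might look like in Ruby; the script path used in the ProxyCommand line below is hypothetical:

require 'socket'

# Usage: pipe.rb HOST PORT -- shuttle bytes between stdin/stdout and a TCP socket.
host, port = ARGV
sock = TCPSocket.new(host, port.to_i)

threads = [
  Thread.new { IO.copy_stream($stdin, sock); sock.close_write },
  Thread.new { IO.copy_stream(sock, $stdout) }
]
threads.each(&:join)

It could then be used just like the netcat and socat entries above, e.g. ProxyCommand ssh -q acme ruby /path/to/pipe.rb %h %p.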

Leap Motion Controller + Liquid Galaxy


We’re constantly trying out new things with our Liquid Galaxy at End Point: new content types, new remote management scripts, new hardware. The good people at Leap Motion sent over one of their hand motion controllers and we couldn't wait to get it unpacked and driving the Liquid Galaxy that sits in our office.

With access to developer tools, we were able to configure the Leap Motion and update the device firmware. Then, after a quick installation of the latest Google Earth build v 7.1.1.1550 that includes Leap Motion controller support, we were ready to test out the new device on our 7-screen Liquid Galaxy.

Watch the video and see for yourself-- we think the combination is fantastic! That superhero feeling of flying around the world with a Liquid Galaxy is only enhanced that much more by now commanding the movement of the planet with subtle flexing of your fingers. (That’s Kiel in the video-- personally, I think he looks like Magneto without the helmet.)

End Point continues to push the boundaries on what the Liquid Galaxy can do. We want to experiment some more and see what else might be possible with the Leap Motion controller for Google Street View, panoramic photos, videos, and maybe even games. If you've got an idea, or if you have a cool place where this might be deployed, please contact us at info@endpoint.com.

Creating Smooth Flight Paths in Google Earth with Kamelopard and Math

The major motivation for writing Kamelopard was that writing XML by hand is a pain in the neck. But there were other motivations as well. In particular we found some limitations of Google Earth's default FlyTo behavior, and wanted to be able to address them flexibly. Version 0.0.11, just released, does exactly that.

Bezier curve, similar to a cspline

Our clients often want Google Earth's camera to fly smoothly from one place to another, through a precisely defined set of waypoints. Earth does this with a FlyTo, one of Google's extensions to KML. It tells Earth to move from its current camera position to a new one, following a nice path Google Earth calculates automatically. Most of the time this works just fine, but on occasion, Earth's automatic path will run into buildings or mountains, or do other unexpected and strange things. There are a few KML tricks we've learned to handle those cases, but it would often be nice to have tighter control. Unfortunately getting that level of control means calculating the flight path ourselves. We've developed the smarts to do that, a little bit at a time.

The first version involved Catmull-Rom splines, a variant of a cubic spline (or "cspline") that gives nice results and is fairly simple to calculate. The idea of a spline is to build a curve that passes smoothly through a set of "control points". To achieve this, in essence we project those points into a vector space defined by a special set of cubic basis functions. In other words, we turn our control points into a matrix and multiply it by another matrix to end up with a function describing our path. The general cspline requires "tangents" in addition to the control points to define behavior at the ends of the curve; the Catmull-Rom variant derives the tangents from the control points, which limits flexibility but works well for our purposes. So if we want a path between several points on a globe, we use those points as the control points in the spline, and Kamelopard would make a nice path between them automatically. Better still, csplines can support any number of dimensions, so we can include factors such as altitude or heading in the generated curve. As it turned out, though, we didn't end up using this spline very much. It was built on top of some other code that later proved insufficient for what we wanted, and that was removed.
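For the curious, the uniform Catmull-Rom formula itself is quite small. Here is a rough per-dimension sketch of the idea (this is not Kamelopard's actual code, and the sample latitudes are made up):

# Evaluate a uniform Catmull-Rom segment at t in [0,1]. The curve runs from
# p1 to p2; p0 and p3 are the neighboring control points that supply the tangents.
def catmull_rom(p0, p1, p2, p3, t)
  0.5 * ((2 * p1) +
         (-p0 + p2) * t +
         (2 * p0 - 5 * p1 + 4 * p2 - p3) * t**2 +
         (-p0 + 3 * p1 - 3 * p2 + p3) * t**3)
end

# Sample a latitude path through four made-up control latitudes; the same
# function can be applied independently to longitude, altitude, heading, etc.
lats = [38.8, 39.2, 39.9, 40.3]
10.times do |i|
  t = i / 9.0
  puts catmull_rom(*lats, t)
end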

Parabola defined by three points

The next iteration, recently committed, allows Kamelopard scripts to define mathematical functions in terms of the same control points, and asks Kamelopard to use those functions to build smooth paths. Right now Kamelopard supports cubic, quadratic, and linear functions; others would of course be possible given sufficient reason to develop them, but the existing functions seem to work very well. The matrix math to interpolate these types of functions based on control points was straightforward, and they can define a wide variety of paths.

One limitation of the new code compared to the old: the spline functions would take any number of control points, whereas the mathematical function versions are more limited. Three points in a particular order, for instance, uniquely describe a quadratic curve. As a result, our quadratic function implementation only supports three control points. I plan to reintroduce splines in the future for situations where this limitation causes problems. These functions are especially flexible, though, in that they can determine not only the camera's latitude and longitude, but also its heading, altitude, and various other things including the duration of each FlyTo. Our cspline implementation was intended to handle those dimensions as well, but never got that far along.

So how does all this work? Here is a simple example, showing an interpolated path from a point about 10 km above one of the cows near my house, to another point a bit to the north. The KML itself is available for download.

Here's the code:

require 'rubygems'
require 'kamelopard'

include Kamelopard
include Kamelopard::Functions

a = make_function_path(10,
    :latitude => Line.interpolate(38.8, 40.3),
    :longitude => Cubic.interpolate(-112.4, -111.9, -0.5, -113, 0.5, -110),
    :altitude => Line.interpolate(10000, 2000),
    :heading => Line.interpolate(0, 90),
    :tilt => Line.interpolate(40.0, 90),
    :altitudeMode => :absolute,
    :show_placemarks => 1,
    :duration => Quadratic.interpolate(2.0, 4.0, 0.0, 1.0),
)

name_document 'Functions test'
name_folder 'Placemarks'
name_tour 'Function test'

write_kml_to 'doc.kml'

The make_function_path call does most of the work here. We've given it functions to interpolate the latitude, longitude, and other characteristics of the flight path we want, along with a few other options described in the gem documentation. We also tell it how many points to create in the path, in this case 10. The root of the flying is still Google Earth's FlyTo algorithm, but we set it to smooth mode to keep Earth doing more precisely what we want it to, and we create waypoints on the flight path frequently enough that we have tight control over where Earth actually flies.

Creating the functions themselves is relatively easy, but you need to remember the order of the arguments, which in this case at least can be confusing. I'll probably change it in a future version, once I come up with something better. This code defines the latitude function in terms of the beginning and ending latitude, and that's all, since we only need two points to define a line. The quadratic and cubic functions take three and four points, respectively. Although not demonstrated here, make_function_path can also take a code block for more complex behaviors at each point.

In the end this generates a Google Earth Tour, which flies from the start point to the end point very smoothly. This code has demonstrated that it works well for flights over large areas; my next goal is to use it to navigate between 3D buildings on a precisely defined path. That will be for another article, though.

Liquid Galaxy in GSoC 2013!

Once again The Liquid Galaxy Project has been accepted as a mentoring organization for the Google Summer of Code program! The Google Summer of Code program (AKA GSoC) provides a tremendous opportunity for talented undergraduate and graduate students to work on developing Open Source software guided by a mentor. Students receive $5000 stipends for successfully completing their summer projects. In each of the past two years The Liquid Galaxy Project has had three GSoC students successfully complete their projects. This year we are hoping to increase that number.

Right now we are in the "would-be student participants discuss application ideas with mentoring organizations" phase of the program. Interested students should contact the project's mentors and admins (which includes a goodly number of End Pointers) by emailing lg-gsoc@endpoint.com or by jumping into the #liquid-galaxy Freenode IRC channel. Applicants are well advised to take advantage of the opportunity to consult with project mentors in developing their applications. Student applications are being accepted starting April 22 and must be submitted by May 3. Interested students should consult the definitive timeline for the Google Summer of Code program for the exact time of the student application deadline. Applications for the Liquid Galaxy Project's GSoC program should be emailed to lg-gsoc@endpoint.com.

The Liquid Galaxy GSoC 2013 Ideas Page is at http://code.google.com/p/liquid-galaxy/wiki/GSoC2013Ideas. We are interested in project proposals based on all the topics listed there and in other ideas from students for projects that will advance the capabilities of Liquid Galaxy. This past year has seen wonderful advances on the Liquid Galaxy, with great improvements in the display of Street View, panoramic photography and video, Google Maps, and more. This summer we will have weekly Hangouts for all students and mentors and other interested community members. The Hangouts will provide concrete support for our students in their projects.

The FAQ for the 2013 Google Summer of Code program is located at https://google-melange.appspot.com/gsoc/document/show/gsoc_program/google/gsoc2013/help_page

If you are a student with programming chops who likes Open Source software you should look at the Google Summer of Code program and apply to one or more of the great projects in the program. If you know of students who you think would be a good fit for the program you'll be doing a good deed by encouraging them to check it out. And by all means, if you are interested in the Liquid Galaxy Project jump in and contact us!

Making SSL Work with Django Behind an Apache Reverse Proxy

Bouncing Admin Logins

We have a Django application that runs on Gunicorn behind an Apache reverse proxy server. I was asked to look into a strange issue with it: after a successful login to the admin interface, the browser was redirected to the http (non-SSL) version of the interface.

After some googling and investigation I determined the issue was likely due to our specific server arrangement. Although the login requests were made over https, the requests proxied by Apache to Gunicorn used http (securely on the same host). Checking the Apache SSL error logs quickly affirmed this suspicion. I described the issue in the #django channel on freenode IRC and received some assistance from Django core developer Carl Meyer. As of Django 1.4 there was a new setting Carl had developed to handle this particular scenario.

Enter SECURE_PROXY_SSL_HEADER

The documentation for the SECURE_PROXY_SSL_HEADER variable describes how to configure it for your project. I added the following to the settings.py config file:

SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')

Because this setting tells Django to trust the X-Forwarded-Proto header coming from the proxy (Apache) there are security concerns which must be addressed. The details are described in the Django documentation and this is the Apache configuration I ended up with:

# strip the X-Forwarded-Proto header from incoming requests
RequestHeader unset X-Forwarded-Proto

# set the header for requests using HTTPS
RequestHeader set X-Forwarded-Proto https env=HTTPS

With SECURE_PROXY_SSL_HEADER in place and the Apache configuration updated, logins to the admin site began to work correctly.

This is standard practice for web applications that reside behind an HTTP reverse proxy, but if the application was initially set up using only plain HTTP, it can be easy to overlook this part of the setup when HTTPS is added later.

Avoid 2:00 and 3:00 am cron jobs!

A word to the wise: Do not set any cron jobs for 2:00 am or 3:00 am on Sunday morning! Or to be safe, on other mornings besides Sunday as well, since jobs originally set to run on some particular day may eventually be changed to run on another day, or every day.

Most of the time such cron jobs will run fine, but if they run every Sunday morning, then twice per year they will run at the exact time daylight saving time (aka summer time) begins or ends, sometimes with very strange results.

On Linux with vixie-cron we saw two cron jobs run something like once per second between 3:00 and 3:01 when the most recent daylight saving time began. Thus they ran about 60 times, stepping all over each other and making a noisy mess in email. No serious harm was done, but that's only because they were not tasks capable of causing serious harm.

Feel free to wish for or agitate for or fund or write a better open source job scheduler that everyone will use, one that will ensure no overlapping runs, allow specifying time limits, etc. Better tools exist, but until one of them achieves cron's level of ubiquity, we have to live with cron at least some places and sometimes.

Alternatively, where possible set the server timezone to UTC so that no daylight savings changes will happen at all.

Or most preferable: Governments of the world, stop the twice-yearly dance of daylight saving time altogether.

But in the meantime this particular problem can be entirely avoided by just not scheduling any cron jobs to run on Sunday morning at 2:00 or 3:00 server time.

Pounding Simplicity into Wiki

Day two of MountainWest Ruby Conference starts out with a bang! Notable developer and thought leader Ward Cunningham describes how he is going about developing his latest ideas behind the wiki. Along the way, Cunningham teaches concepts of innovation and creativity to help inspire ruby developers.

Promise

The promise is a basic statement of the desired outcome. Not in the way a mock-up shows the finished product, but in the way it will affect humanity. Wikipedia gives words depth and meaning that ordinary people can depend on every day. The promise of this new kind of wiki is to give numbers depth and meaning that ordinary people can depend on every day.

This means data visualization intermixed with context. For example, a weather map can show you numbers on a map to tell you temperatures. A meteorologist doesn't just see a number; he sees the actual weather: the hot and cold, the wind or the rain, etc. Data visualizations like a wind map excel at helping users see the wind in a region.

To accomplish this promise, Cunningham implemented a new kind of wiki. The main difference in this new wiki is that the data is federated among several different locations on the web and then assembled in the browser. You can think of it as a traditional mashup. The wiki content is both self-generated and programmatically generated from data on the web or attached to the web via some device.

Process

  • 0 Story: Pages with datasets, images and paragraphs with history (versions).
  • 1 Binding: Attaches the data to the different revisions of the page.
  • 2 Attribution: Source is dynamically generated so that it can be tracked back.
  • 3 Link Context: Links to other pages on other servers give hints to tell you where the data originates.
  • 4 Neighborhood: Click on a page that doesn't exist (a red link) and the server looks for a similar page on other wikis in the federated network.
  • 5 Search: Global search looks in all the wikis in the federated network.

Principle

The principle behind this project is one of discovery. As the development continues, the possibilities for it increase and new thoughts and ideas are discovered. This was talked about in a talk by Bret Victor called Inventing on Principle. If you were to compare this to agile, it might look like this:

Agile → Principle
velocity → smallest
customer → curiosity
confidence → wonder

Plugins

Widgets are bits of markup that allow interactivity. Widgets have access to the content on the wiki page, allowing it to be integrated into the markup. The widgets' source code lives on GitHub. Widgets allow you to explore your data by breaking up the visualizations and putting them in context with the explanation and documentation of the wiki.

Connecting Things to the Federated Wiki

The documentation page allows you to use a widget to talk to a connected computer or device. Cunningham demonstrated connecting the wiki to a small microcontroller that emitted sound and blinked an LED. From the wiki page he could see output from the device and send instructions to make the device do different things. All of the communication is handled over a websocket so the interaction is seamless and instant. The idea here is that different sensors could provide live data to the wiki to augment research and discovery.

Ward Cunningham has an amazing ability to bring small comprehensible things together into systems that show us the future of our interactions with the web. This sparks new ideas and explores realms of possibility that enhance our lives, just like that simple idea of throwing some markdown into a web server and displaying it for the world to see in a searchable and linkable way. That little idea sparked a revolution in information discovery. The wiki.

End Point Upgrades Liquid Galaxy at Ann Arbor Hands On Museum

End Point has assisted the Ann Arbor Hands On Museum in upgrading their Liquid Galaxy to the latest version of this interactive, immersive digital viewing experience. The museum has a mission to inspire children to discover the wonder of science, technology, math, art, and engineering, and to be the leader in imaginative and interactive learning experiences. The Liquid Galaxy plays a central part in that mission.

Googlers provided material support in the form of hardware and technical expertise to assemble a Liquid Galaxy based on the open source project information. Hands On Museum staffers coordinated the Exhibits, IT and Administration teams at the Museum to bring the project together in a few months. The result was an exhibit that beautifully blended Liquid Galaxy into the mission of the museum.

In December 2012 the Ann Arbor Hands On Museum announced the addition of a Liquid Galaxy to their roster of exhibits. At End Point, we congratulated them, and we also offered any assistance they might need. The Museum was running into some performance issues, and asked if any upgrades or recommendations were available from End Point.

Working closely with Google, and with the great cooperation of everyone at the museum, End Point replaced the computers and upgraded the system and application software for the Hands On Museum's Liquid Galaxy. End Point added a new head node and new display nodes for improved performance and reliability.

Additionally, the software upgrades provide a number of advantages:

  • Google Street View capability
  • constant remote monitoring of all systems
  • usage statistics via an online dashboard
  • remote system upgrades and troubleshooting from End Point
  • a new content management system for the museum administrators.

Mel Drumm, Director for the Hands On Museum, gave a strong review of the Liquid Galaxy and our work:

"I wish to thank everyone for this amazing upgrade. Few exhibits at the Museum have become so popular so fast. Everyone thoroughly enjoys the Liquid Galaxy exhibit. Additionally, the exhibit has become a very important signature exhibit as we work to expand our audience and introduce impressive technologies into our visitor experience. I have enjoyed the opportunity to showcase the exhibit to guests, donors, collaborators and visitors on multiple occasions since the upgrade. It is engaging to everyone and works absolutely flawlessly. We are so grateful for the support from our colleagues at Google and End Point. It is an impressive installation. Thank you!!!”

The Ann Arbor Hands On Museum joins a distinguished group of museums featuring a Google-sponsored Liquid Galaxy set up by End Point: the National Air and Space Museum at the Smithsonian, the Tech Museum in San Jose, California, the NOAA Monterey Bay National Marine Sanctuary Visitor Center in Santa Cruz, the Pavillon de l'Arsenal in Paris, and the Musée Océanographique in Monaco.

For more information on how your museum or organization can get a Liquid Galaxy, contact sales@endpoint.com or see details at LiquidGalaxy.EndPoint.com

Testing Anti-Patterns

Testing is always a popular subject at MountainWest RubyConf. Now that many developers are embracing test driven development, there is a big need for guidance. Aja Hammerly is talking about testing anti-patterns. Tests should be trustworthy. You should be able to depend on them. If they fail, your system is broken. If they pass, your system works. Tests should be simple: simple to read, simple to write, simple to run.

Here are some anti-patterns that Aja addressed:

Pointless Tests

  • No tests! Solution: Write tests.
  • Not Running Tests. Solution: Use Continuous Integration so you have to run your tests.
  • Ignored failing tests. Solution: Listen to your failing tests, and fix a team culture that ignores a red CI.
  • That test that fails sometimes. Fix it!

Wasted Time / Effort

  • Testing Other People's Code (OPC). Solution: Test only what you provide and use third-party code that has good coverage.
  • assert_nothing_raised: in other words, don’t assert only that a block runs without raising an exception. If an error is raised, the test fails anyway, so the assertion adds nothing.

False Positives and Negatives

  • Time sensitive tests. For example, using Time.now. Solution: stub Time.now (see the sketch after this list).
  • Hard-coded dates in tests. Use relative dates instead.
  • Order dependent tests. Make sure your tests clean up after themselves. Randomizing your tests helps detect these problems.
  • Tests that can’t fail. This can happen when we heavily stub or mock. You can also detect these errors by ensuring you always see your tests go red, then green.
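Here is a small sketch of the time-stubbing suggestion from the list above, using minitest's stub helper (Minitest 5 class names; the test class and the frozen time are made up for illustration):

require 'minitest/autorun'

class ClockTest < Minitest::Test
  def test_frozen_clock
    midnight = Time.local(2013, 3, 4)
    # Inside the block Time.now always returns the stubbed value,
    # so the assertion does not depend on when the test runs.
    Time.stub :now, midnight do
      assert_equal midnight, Time.now
    end
  end
end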

Inefficient Tests

  • Requiring external resources such as network requests or other IO not checked into your source control. Solution: Mock/stub external resources. You can use WebMock for this, but it’s better to write your own stub.
  • Complicated Setup: Solution is to refactor your implementation. See Working Effectively with Legacy Code by Michael Feathers.

Messy Tests

  • Repeated code
  • Copy-paste-tweak
  • Disorganized tests
  • Literals everywhere

Solution:

  • DRY your tests
  • Group by method under test
  • Use descriptive names
  • Put literals in variables

One of the main reasons we don’t have adequate tests in our apps is that they are slow. Applying the solutions for these anti-patterns can solve the majority of these issues.

Code Smells: Your Refactoring Cheat Codes

Code smells are heuristics for refactoring. Resistance from our code is a hint that refactoring is needed. John Pignata shares some great tips on how to actually go about refactoring a long method. Here are some of the highlights and steps that were covered.

First:

  • Wrap entire method in a class
  • Promote locals to instance variables

Second:

  • Move the work into private methods

Third:

  • Look for multiple responsibilities in the class
  • Create new classes and adjust interfaces so everything still works

Fourth:

  • Wrap your lower levels of abstraction (IO, Sockets).

Fifth:

  • Your class may know too much about your lower level abstractions. Find ways to remove that knowledge using design patterns such as Observer/Listener.

Sixth:

  • Look for case statements or other big conditionals
  • Replace conditionals with polymorphism
  • Move the conditional to a factory if applicable

Seventh:

  • Remove data clumps such as knowledge of indexes in arrays or arrays of arrays (data[1][2]).

Eighth:

  • Remove uncommunicative names such as “data” and “new”

Ninth:

  • Look for variables that have same name but different meaning such as local variables that match instance variables.

Tenth:

  • Look for nil checks. Look for indicators that nil actually means something and replace it with a NullObject.

These are all great suggestions for refactoring. If you want more information on this topic, I highly recommend Martin Fowler’s book “Refactoring”.

Batteries Included!

How many gems does it take to build an app? Many gems duplicate functionality that’s already in ruby core or in the standard library. Daniel Huckstep reviews some goodies that come with ruby that you could probably use to replace some of those gems.

Basics

  • Set: Like an array but only allows unique values. Set also optimizes for the include? method so if you are calling include? a lot on your arrays and you don’t require duplicates, Set will be a much better option for you.

  • Enumerable: Gives you map, each_with_index, and all the other goodness that comes with the Enumerable module. All you need to implement is each and you get everything else in Enumerable for free.

  • Enumerator: Allows you to build new enumerators on the fly and implement lazy loading.

  • SimpleDelegator: Inherit from SimpleDelegator and pass the underlying object to the constructor (or call __setobj__); any missing methods will be delegated to that object.

  • Forwardable: Forward selected methods to another object (a quick sketch of Set and Forwardable follows this list).
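Here's that quick sketch of Set and Forwardable (the Playlist class is just made up for illustration):

require 'set'
require 'forwardable'

# Set: no duplicates, and include? is fast.
states = Set.new(%w[NC CA UT])
states << "CA"               # still only one "CA"
states.include?("CA")        # => true

# Forwardable: delegate selected methods to a wrapped object.
class Playlist
  extend Forwardable
  def_delegators :@songs, :size, :each, :<<

  def initialize
    @songs = []
  end
end

list = Playlist.new
list << "Karma Police"
list.size                    # => 1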

Performance

  • Benchmark: Allows you to measure the performance of blocks of code. It also has many different outputs and reporting formats.

  • RubyVM::InstructionSequence: Set compile options to optimize things like tail calls in recursive methods.

  • Fiddle: Allows you to call C functions in external libraries.

  • WeakRef: (Ruby >= 2.0) Wrap whatever you pass to it to make it a candidate for garbage collection.

Beyond

  • SecureRandom: Provides random numbers, UUIDs, base64, hex, etc.

  • GServer: Gives you a threaded TCP server.

  • Kernel#spawn: Provides a way to “spawn” an outside process.

  • Shellwords: Easy splitting and escaping of shell command lines.

  • PStore: PStore is a simple data store that uses Marshal to store objects to disk. It’s thread safe and supports transactions (see the sketch below).

  • MiniTest: Full blown test framework built into ruby. Supports Test::Unit- and RSpec-like syntaxes and includes the ability to mock.

Other nice libraries: OptionParser, WEBrick, RSS, DRb, Find, Ripper, ThreadsWait, Queue / SizedQueue, MonitorMixin, Net::POP/FTP/HTTP/SMTP.
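And as one concrete example from the list above, a minimal PStore sketch (the file name is arbitrary):

require 'pstore'

store = PStore.new("counters.pstore")

# Writes happen inside a transaction and are flushed to disk atomically.
store.transaction do
  store[:visits] ||= 0
  store[:visits] += 1
end

# Passing true opens a read-only transaction.
store.transaction(true) { puts store[:visits] }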

We have a great ecosystem of gems, but a lot of what we use gems for is already in the Ruby core and standard library.

A DIY Ruby Profiler!

A simple profiler can be a nice way to detect how often different parts of our code are being run, using some statistical analysis and a few threading tricks. New Relic developer Jason Clark talks about how it’s more efficient to take samples than to use a profiler that traces every call, and then walks us through building your own.

This was a very insightful talk on how to analyze the backtrace of currently active threads. You can find the code for his DIY profiler on github.

Immutable Ruby by Michael Fairley

Michael Fairley is presenting on Immutable objects in ruby.

Immutability is a term used to describe data or objects that can’t be changed. This is a lesser known concept for ruby developers because almost everything in ruby is mutable. Despite this, much of the ruby code we write is effectively immutable; make that explicit. Even in your databases there are some records you don’t want to modify.

One technique to making your database records immutable is to use the readonly? method in ActiveRecord or to revoke the permission to modify at the database level.

Many objects can utilize the freeze method, which ensures that the object isn’t modified. For example, use freeze for configurations. Be aware, though, that freeze doesn’t freeze the objects inside an array or hash, so you’ll need a gem that provides the ability to deep freeze.
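A small sketch of that shallow-freeze gotcha (the configuration hash is made up; a gem such as ice_nine can do the recursive freeze for you):

CONFIG = { hosts: ["display1", "display2"] }.freeze

CONFIG.frozen?               # => true
CONFIG[:hosts].frozen?       # => false -- freeze is shallow
CONFIG[:hosts] << "display3" # allowed! the nested array can still change

# Freezing the nested values by hand:
CONFIG.each_value(&:freeze)
CONFIG[:hosts].frozen?       # => true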

The values gem provides a set of data structures that are frozen by default. These can be useful in cases where you might use Struct. You can create the object but it doesn’t allow you to change the attributes.

Use value objects in your ActiveRecord models with composed_of, which lets you combine several attributes of the record into a single value object. This gives you greater flexibility. The Address part of a user object is a good example.
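A hedged sketch of what that might look like inside a Rails app (the column names and the Address class are assumptions, not from the talk):

class Address
  attr_reader :street, :city

  def initialize(street, city)
    @street, @city = street, city
  end
end

class User < ActiveRecord::Base
  # Maps the assumed address_street and address_city columns onto a single
  # Address value object exposed as user.address.
  composed_of :address,
              class_name: 'Address',
              mapping: [%w[address_street street], %w[address_city city]]
end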

Another use case for immutable data is in the logging of events. Many events such as a ledger or a database log are helpful because they can be replayed to create an accurate derived state such as current_status or an account balance.

If you use immutable objects for cache keys you can solve many issues around cache invalidation. Since the keys are immutable, you can rely on them being there. The exception to this would be deletion which could be solved by having a callback or something similar that invalidates the cache manually when the object is deleted.

There are some downsides to using immutable objects, the key one being performance. Immutable objects tend to be copied a lot, using a lot of memory and processing to recreate them.

More information can be found on this topic by visiting Michael’s site.

MWRC Ruby 2.0 with Matz

Today’s first speaker at MWRC is the one and only Yukihiro Matsumoto, better known as Matz. Matz is the founder and chief architect of ruby. Matz has a wonderfully friendly personality. His talks are always filled with humor.

Probably not unexpectedly, Matz is talking about Ruby 2.0. Ruby 2.0 is the happiest ruby release ever. He outlined some of the new features in Ruby 2.0 as follows:

Features of 2.0

It’s faster than 1.9

100% compatible with 1.9

Keyword Arguments: Keyword arguments let you pass named arguments at the end of the argument list: log("hello", level: "DEBUG"). You can emulate keyword arguments in 1.9 with an options hash, but 2.0 makes them simpler to read, write, and implement.
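A tiny example of the 2.0 syntax, with a default value for the keyword:

def log(msg, level: "INFO")
  puts "[#{level}] #{msg}"
end

log("hello")                  # => [INFO] hello
log("hello", level: "DEBUG")  # => [DEBUG] hello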

Module#prepend: Module#prepend is kind of like alias_method_chain from Rails but it doesn’t use aliases.

Methods in a prepended module are looked up before the class's own methods (prepended modules sit in front of the class in the ancestor chain), so you can more easily extend existing methods, calling the original with super, and package the changes into a single module.
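A small sketch of the idea (the Timing module and Record class are made up):

module Timing
  def save
    start = Time.now
    result = super            # calls the class's original save
    puts "save took #{Time.now - start}s"
    result
  end
end

class Record
  prepend Timing

  def save
    # ... the original work ...
    true
  end
end

Record.new.save  # prints the timing, then returns true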

Refinements: Currently, monkey patching is a common practice in ruby. But monkey patching is difficult to scope because it is global. This often leads to namespacing problems and difficult debugging. Refinements offer a kind of scope for your modifications.

module R
  refine String do
    def foo
    end
  end
end

"".foo => error

using R do
  "".foo => ok
end

It’s only partially implemented in Ruby 2.0 due to sharp criticism from the developers on the implementation.

Enumerable#lazy: As the name suggests, it allows lazy evaluation of Enumerables such as Array. This is helpful for method chains such as map().select().first(5). Now you can call lazy.map.select.first and every subsequent call will be evaluated lazily.
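For example, only enough of this infinite chain is evaluated to produce the first five results:

(1..Float::INFINITY).lazy.map { |n| n * 2 }.select { |n| n % 3 == 0 }.first(5)
# => [6, 12, 18, 24, 30]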

UTF-8 source encoding by default.

DTrace / TracePoint have been improved to support better debugging.

Performance improvements: improved VM, improved GC, and faster require (improved invocation time).

Future of ruby

Matz was clear to state that he doesn’t know what’s coming in the future. He has some ideas, but they are still working on them. It sounded to me like the core group is moving to a more iterative approach. Matz stated that there will be more frequent releases in the future to try to decrease the number of patch levels (1.9.3 has over 300 patch levels).

It's Time Once Again for MountainWest RubyConf!

It’s one of my favorite times of the year. This will be my third trip to Salt Lake City to attend Mountain West Ruby Conference. One of the really great things about developing in ruby is the amazing community. Hanging out with fellow developers and enthusiasts is very invigorating and so much fun.

This year I will be blogging about all the talks and reporting on some of the great stuff being presented. You can check out the MWRC Schedule to see some of the great speakers in the lineup, and you can follow the #mwrc tag on Twitter to keep up with this amazing conference.

Converting RHEL 5.9 and 6.4 to CentOS

CentOS is, by design, an almost identical rebuild of Red Hat Enterprise Linux (RHEL). Any given version of each OS should behave the same as the other and packages and yum repositories built for one should work for the other unchanged. Any exception I would call a bug.

Because Red Hat is the source or origin of packages that ultimately end up in CentOS, there is an inherent delay between when Red Hat releases new packages and when they appear in CentOS. CentOS is financed by optional donations of work, hosting, and money, while Red Hat Enterprise Linux is financed by requiring customers to purchase entitlements to use the software and get various levels of support from Red Hat.

Thanks to this close similarity and the tradeoff between rapidity of updates vs. cost and entitlement tracking, we find reasons to use both RHEL and CentOS, depending on the situation.

Sometimes we want to convert RHEL to CentOS or vice versa, on a running machine, without the expense and destabilizing effect of having to reinstall the operating system. In the past I've written on this blog about converting from CentOS 6 to RHEL 6, and earlier about converting from RHEL 5 to CentOS 5.

I recently needed to migrate several servers from RHEL to CentOS, and found an update of the procedure was in order because some URLs and package versions had changed. Here are current instructions on how to migrate from RHEL 5.9 to CentOS 5.9, and RHEL 6.4 to CentOS 6.4.

These commands should of course be run as root, and observed carefully by a human eye to look for any errors or warnings and adapt accordingly.

RHEL 5.9 to CentOS 5.9 conversion, 64-bit (x86_64)

cd
mkdir centos
cd centos
wget http://mirror.centos.org/centos/5.9/os/x86_64/RPM-GPG-KEY-CentOS-5
wget http://mirror.centos.org/centos/5.9/os/x86_64/CentOS/centos-release-5-9.el5.centos.1.x86_64.rpm
wget http://mirror.centos.org/centos/5.9/os/x86_64/CentOS/centos-release-notes-5.9-0.x86_64.rpm
wget http://mirror.centos.org/centos/5.9/os/x86_64/CentOS/yum-3.2.22-40.el5.centos.noarch.rpm
wget http://mirror.centos.org/centos/5.9/os/x86_64/CentOS/yum-updatesd-0.9-5.el5.noarch.rpm
wget http://mirror.centos.org/centos/5.9/os/x86_64/CentOS/yum-fastestmirror-1.1.16-21.el5.centos.noarch.rpm
wget http://mirror.centos.org/centos/5.9/os/x86_64/CentOS/gamin-python-0.1.7-10.el5.x86_64.rpm
yum erase yum-rhn-plugin rhn-client-tools rhn-virtualization-common rhn-setup rhn-check rhnsd yum-updatesd
yum clean all
rpm --import RPM-GPG-KEY-CentOS-5
rpm -e --nodeps redhat-release
yum localinstall *.rpm
yum upgrade
shutdown -r now

RHEL 5.9 to CentOS 5.9 conversion, 32-bit (i386)

cd
mkdir centos
cd centos
wget http://mirror.centos.org/centos/5.9/os/i386/RPM-GPG-KEY-CentOS-5
wget http://mirror.centos.org/centos/5.9/os/i386/CentOS/centos-release-5-9.el5.centos.1.i386.rpm
wget http://mirror.centos.org/centos/5.9/os/i386/CentOS/centos-release-notes-5.9-0.i386.rpm
wget http://mirror.centos.org/centos/5.9/os/i386/CentOS/yum-3.2.22-40.el5.centos.noarch.rpm
wget http://mirror.centos.org/centos/5.9/os/i386/CentOS/yum-updatesd-0.9-5.el5.noarch.rpm
wget http://mirror.centos.org/centos/5.9/os/i386/CentOS/yum-fastestmirror-1.1.16-21.el5.centos.noarch.rpm
wget http://mirror.centos.org/centos/5.9/os/i386/CentOS/gamin-python-0.1.7-10.el5.i386.rpm
yum erase yum-rhn-plugin rhn-client-tools rhn-virtualization-common rhn-setup rhn-check rhnsd yum-updatesd
yum clean all
rpm --import RPM-GPG-KEY-CentOS-5
rpm -e --nodeps redhat-release
yum localinstall *.rpm
yum upgrade
shutdown -r now

RHEL 6.4 to CentOS 6.4 conversion, 64-bit (x86_64)

cd
mkdir centos
cd centos
wget http://mirror.centos.org/centos/6.4/os/x86_64/RPM-GPG-KEY-CentOS-6
wget http://mirror.centos.org/centos/6.4/os/x86_64/Packages/centos-release-6-4.el6.centos.10.x86_64.rpm
wget http://mirror.centos.org/centos/6.4/os/x86_64/Packages/yum-3.2.29-40.el6.centos.noarch.rpm
wget http://mirror.centos.org/centos/6.4/os/x86_64/Packages/yum-utils-1.1.30-14.el6.noarch.rpm
wget http://mirror.centos.org/centos/6.4/os/x86_64/Packages/yum-plugin-fastestmirror-1.1.30-14.el6.noarch.rpm
yum erase yum-rhn-plugin rhn-client-tools rhn-virtualization-common rhn-setup rhn-check rhnsd yum-updatesd subscription-manager
yum clean all
rpm --import RPM-GPG-KEY-CentOS-6
rpm -e --nodeps redhat-release-server
yum localinstall *.rpm
yum upgrade
shutdown -r now

We don't use 32-bit (i386) RHEL or CentOS 6, so you're on your own with that, but it should be very straightforward to adapt the x86_64 instructions.

If during the yum localinstall you get an error like this that references a URL containing %24releasever:

[Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not Found"
Error: Cannot retrieve repository metadata (repomd.xml) for repository

Then you need to temporarily disable that add-on yum repository until after the conversion is complete by editing /etc/yum.repos.d/name.repo to change enabled=1 to enabled=0. The problem here is caused by the repo configuration using the releasever yum variable which is undefined mid-conversion because we forcibly removed the redhat-release* package that defines it. We can't expect the OS to know what kind it is in the middle of its identity crisis and change!

If all goes well, nothing will look any different at all, except you'll now see:

# cat /etc/redhat-release 
CentOS release 5.9 (Final)

or:

# cat /etc/redhat-release 
CentOS release 6.4 (Final)

Deploying password files with Chef

Today I worked on a Chef recipe that needed to deploy an rsync password file from an encrypted data bag. Obtaining the password from the data bag in the recipe is well documented, but I knew that great care should be taken when writing the file. There are a plethora of ways to write strings to files in Chef, but many have potential vulnerabilities when dealing with secrets. Caveats:

  • The details of execute resources may be gleaned from globally visible areas of /proc.
  • The contents of a template may be echoed to the chef-client log or stored in cache, stacktrace, or backup areas.
  • Some chef resources which write to files can be made to dump the diff or contents to stdout when run with verbosity.

With tremendous help from Jay Feldblum in freenode#chef, we came up with a safe, optimized solution to deploy the password using a series of ruby_block resources:

pw_path = Pathname("/path/to/pwd/file")
pw_path_uid = 0
pw_path_gid = 0
pw = Chef::EncryptedDataBagItem.load("bag", "item")['password']

ruby_block "#{pw_path}-touch" do
  block   { FileUtils.touch pw_path } # so that we can chown & chmod it before writing the pw to it
  not_if  { pw_path.file? }
end

ruby_block "#{pw_path}-chown" do
  block   { FileUtils.chown pw_path_uid, pw_path_gid, pw_path }
  not_if  { s = pw_path.stat ; s.uid == pw_path_uid && s.gid == pw_path_gid }
end

ruby_block "#{pw_path}-chmod" do
  block   { FileUtils.chmod 0600, pw_path }
  not_if  { s = pw_path.stat ; "%o" % s.mode == "100600" }
end

ruby_block "#{pw_path}-content" do
  block   { pw_path.open("w") {|f| f.write pw} }
  not_if  { pw_path.read == pw } # NOTE: a secure compare method might make this even better
end
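Regarding the NOTE in that last block, a constant-time comparison (so the guard's timing doesn't leak how much of the password matches) might look something like this sketch; the not_if guard could then call secure_compare(pw_path.read, pw) instead of using ==:

def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  # XOR each byte pair and OR the results together, so the time taken
  # does not depend on where the first difference occurs.
  result = 0
  a.each_byte.zip(b.each_byte) { |x, y| result |= x ^ y }
  result.zero?
end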


Debugging Localization in Rails

Rails offers a very extensive library for handling localization using the rails-i18n gem. If you've done any localization using Rails, you know that it can be difficult to keep track of every string on your web application that needs translation. During a recent project, I was looking for an easy way to visually see translation issues while browsing through the UI in our application.

Missing Translations

I've known for some time that when I was missing a translation for a given text, it would appear in the HTML in a special <span> tag with the class .translation_missing. So in my global CSS I added the following:

.translation_missing { background-color: red; color: white !important; }

Now when I viewed any page where I had forgotten to add a translation to my en.yml file I could see the text marked in bright red. This served as a great reminder that I needed to add the translation.

I'm an avid watcher of my rails log while I develop. I almost always have a terminal window right next to my browser that is tailing out the log so I can catch any weirdnesses that may crop up. One thing that I thought would be nice is if the log showed any translation issues on the page it was loading. After some research, I found that the I18n gem used by Rails doesn't give you any obvious hooks for translation errors. There is, however, a section in the Rails I18n Guide for adding custom exception handlers for translation issues and this is where I started with something like this in an initializer:

module I18n
  def self.just_raise_that_exception(*args)
    Rails.logger.debug args.join("\n")
    raise args.first
  end
end

I18n.exception_handler = :just_raise_that_exception

This got things started for me. After some small modifications to the logger line, I had the output that I wanted and to make things really stick out, I added some color using the term-ansicolor gem and came up with this:

def self.just_raise_that_exception(*args)
  require 'term/ansicolor'
  include Term::ANSIColor
  Rails.logger.debug red("--- I18n Missing Translation: #{args.inspect} ---")
  raise args.first
end

I18n Fallbacks

After using this solution for a while I ran into another issue. When I would change the locale in my app to something like es (Spanish) and look for translation issues, I noticed some text that wasn't being translated but also wasn't being marked in red by my CSS. When I looked at the text further I noticed that it wasn't even being surrounded by the <span class="translation_missing"> tag. As it turned out, I did have a translation for that text in English, but not in the language set as my current locale. The reason for this is what I18n refers to as Fallbacks. Fallbacks tell I18n what alternate, or default, locale to "fall back" to if a translation isn't available for the current locale. By default, the fallback locale is en. Since I had a translation for that text in my en.yml there was no indicator that I had "fallen back", even though it was obvious that there was a translation issue.

Coming up with a way to be notified of fallbacks wasn't nearly as simple as my initial solution. After a lot of googling and reading in StackOverflow, I found that perhaps the best solution was to override the translate method in the I18n gem. This sounds really scary, but it actually wasn't too bad. Back in my initializer, I removed the solution that I had originally come up with and added the following:

module I18n::Backend
  module Fallbacks

    # To liven things up a bit
    require 'term/ansicolor'
    include Term::ANSIColor

    def translate(locale, key, options = {})
      # First things first, if there is an option sent, decorate it and send it back.
      return fallback_message(locale, key, super) if options[:fallback]

      default = extract_non_symbol_default!(options) if options[:default]

      options[:fallback] = true
      I18n.fallbacks[locale].each do |fallback|
        catch(:exception) do
          # Get the original translation
          translation = super(fallback, key, options)

          # return the translation if we didn't need to fall back
          return translation if fallback == locale

          # return the decorated fallback message if we did actually fall back.
          result = fallback_message(locale, key, translation)
          return result unless result.nil?
        end
      end
      options.delete(:fallback)

      return super(locale, nil, options.merge(:default => default)) if default

      # If we get here, then no translation was found.
      Rails.logger.debug red("--- I18n Missing Translation: #{locale}::#{key} ---")
      throw(:exception, I18n::MissingTranslation.new(locale, key, options))
    end

    # Added this method to log the fallback and decorate the text with a <span> tag.
    def fallback_message(locale, key, fallback_text)
      return nil if fallback_text.nil?
      keys = key.to_s.split(".")
      return fallback_text if keys.first == 'platform_ui'

      Rails.logger.debug yellow("--- I18n Fallback: #{locale}::#{key} ---")
      %(<span class="translation_fallback" title="translation missing: #{locale}, #{key}">#{fallback_text}</span>).html_safe
    end

  end
end

With this initializer in place, I added the following CSS to my global stylesheet:

.translation_fallback { background-color: green; color: white !important; }

Now all missing translations and fallbacks were noticeable while browsing the UI, and all translation issues were logged to my development log. Obviously I don't want these things showing up in production, so I wrapped the whole thing in an if Rails.env.development? conditional. I can also think of times, like while working on implementing a design, when I don't really care about translation issues, so I added a flag for turning it off in development as well.

You can see my entire solution in a gist over at GitHub. A colleague suggested that I turn this into a gem, but I didn't see the need if all you have to do is copy and paste this initializer. If a gem is something you'd like to see, please let me know and I'll consider packaging it up.