Puppet, Salt, and DevOps (a review of the MountainWest DevOps conference)

Last week I attended the MountainWest DevOps conference held in Salt Lake City, UT. This was a one-day conference with a good set of presenters and lightning talks. Several interesting topics were presented, but I’ll review only a few that I wanted to highlight.

I Serve No Master!

Aaron Gibson of Adaptive Computing discussed a very common problem with Puppet (and other configuration management systems): they work well in the scenario they were designed for, but what about when the situation isn’t typical? Aaron had a situation where developers and QA engineers could instantiate systems themselves via OpenStack; however, the process for installing their company’s software stack on those VMs was inconsistent, mostly manual, and took many hours. One of the pain points he shared, which I related to, was dealing with registering a puppet node with a puppet master: the sometimes painful back and forth of certificate issuing and signing.

His solution was to remove the puppet master completely from the process. Instead he created a bash wrapper script to execute a workflow around what was needed, still using puppet manifests on each system but running them locally. This wrapper tool, called “Builder”, relies on property files to customize the configuration and allow the script to handle different needs. The new script allowed them to keep using Puppet to manage these self-serve OpenStack servers, gaining the benefits of consistency, removing manual setup steps, and making it possible to automate installs with Jenkins or other tools. It also freed them from having to use a puppet master for nodes that were disposable, and helped reduce software install time from 12 hours down to 11 minutes.
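Builder itself is an internal bash script I haven't seen, but the masterless pattern it wraps is straightforward: read some per-node properties, then run the manifest locally with puppet apply so no master or certificate signing is involved. Here is a minimal Ruby sketch of that idea, with the property file format, keys, and paths invented for illustration:

# Hypothetical masterless-puppet wrapper: file names and property keys are assumptions.
require "yaml"

props    = YAML.load_file("builder.properties.yml")        # per-node customization
manifest = props.fetch("manifest", "manifests/site.pp")
modules  = props.fetch("modulepath", "/etc/puppet/modules")

# "puppet apply" compiles and applies the catalog locally -- no puppet master needed.
ok = system("puppet", "apply", "--modulepath", modules, manifest)
abort "puppet apply failed for #{manifest}" unless ok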

His Builder tool is still internal-only at his company, but he discussed some next steps he would like to add, including better reporting and auditing of executions. I pinged him on Twitter after the conference and mentioned that Rundeck might be a good fit to fill that gap. I used Rundeck for two years at my last job; it integrated nicely with other automation tools and provided reporting and auditing, as well as access control, for arbitrary jobs.

Automating cloud factories and the internet assembly line with SaltStack

Tom Hatch of SaltStack spoke about Salt as an automation and remote execution platform. I’ve done quite a bit of work with Salt recently for a client, so I was pretty familiar with it. One thing he mentioned that I didn’t know was that Salt was originally designed as a cloud management tool, not necessarily a configuration management tool. Over time, however, configuration management became a higher priority for the Salt dev team to focus on. Tom mentioned that recently they have been working on cloud management tools again, providing integration with Rackspace, AWS, Xen, and more. I’ll have to dig more into these tools and give them a try.

How I Learned to Stop Worrying and Love DevOps

Bridget Kromhout of 8thBridge spoke on the culture of DevOps and her journey from a corporate, strictly siloed environment to a small start-up that embraced DevOps. One of the first differences she brought up was each organization's focus and approach to goals. In an organization where Ops teams are strictly separate from Developers, they often butt heads and have a limited vision of priorities. Each focuses on the goals of its own team or department and has little understanding of the goals of the other departments. This leads to an adversarial relationship and a culture of not caring much about other teams or departments.

In contrast, the organization that embraces DevOps as a culture will see to it that Ops and Devs work together on whatever solution best reaches the goals of the whole organization. In doing so, barriers will have to be questioned. Any “special snowflake” servers/applications/etc. that only one person knows and can touch can’t exist in this culture. Instead, any unique customizations need to be minimized through automation, documentation (sharing knowledge), and reporting/monitoring. This doesn’t mean root access for all, but it does mean reducing barriers as much as possible. Good habits from the Ops world to keep include monitoring, robustness, security, scaling, and alerting.

The main pillars of DevOps are culture, automation, measurement, and sharing. Culture is especially important, since it can either support or undermine the other pillars. Without a culture in the organization that supports DevOps, it will fizzle back into siloed "us vs. them" enmity.

Thanks to all the presenters and those that put on the conference. It was a great experience and I am glad I attended.

2014 Mountain West Ruby Conference Day 2

This past Friday concluded my second Mountain West Ruby Conference, right here in my backyard of Salt Lake City, Utah. Just like the 2013 MWRC, this year’s conference was great. It was also nice to meet up with fellow remote co-worker Mike Farmer for both days. Here are a few of my personal favorites from day 2 of the conference, which I almost missed when I had the audacity to show up without a MacBook Air/Pro. (Kidding!)

Randy Coulman - Affordances in Programming Languages

Affordances ("A quality of an object or environment that allows someone to perform an action") are all around us. Randy opened with some examples of poor affordances such as a crosswalk button with two enticing-looking widgets that requires arrows drawn on the box to point to the one that actually is a button. Then, a counter-example showing the "walking" or "standing" footprints painted on an escalator and how they instantly and intuitively communicate how to best use the escalator.

Randy followed with a few examples of affordances in software. One of them was a simple Ruby implementation of an affordance called Deterministic Destructors. It was a method that acquired a resource, yielded to a block argument, then automatically released that resource when the block returned.
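I don't have Randy's exact code, but the shape of the pattern in Ruby is roughly the following minimal sketch, using Tempfile as a stand-in resource:

# A sketch of the affordance: acquire, yield to the caller, and always release,
# even if the block raises.
require "tempfile"

def with_scratch_file(basename)
  file = Tempfile.new(basename)   # acquire the resource
  yield file                      # hand it to the caller's block
ensure
  file.close! if file             # release it deterministically
end

# Usage: the resource cannot leak past the block.
with_scratch_file("demo") { |f| f.write("hello"); f.rewind; puts f.read }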

The last affordance Randy showed is most near and dear to my heart because it happens to be something I can use in my current project: Subclass iteration. Rather than use a registry pattern to explicitly register components that can handle specific file formats (for example), you can leverage Ruby’s built-in inherited(subklass) class method by overriding it in your parent class. That, along with a one-liner helper method, can be used to automagically register your subclasses. Implement a standard interface (can_handle_thing?) for your subclasses to advertise what type of data they can handle, then use your de facto subclass registry to delegate appropriately at runtime.
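Here is a minimal sketch of that registry idea; the class names and file-extension checks are my own invention, not Randy's code:

class FormatHandler
  # Ruby calls this hook whenever FormatHandler is subclassed,
  # so each subclass registers itself automatically.
  def self.inherited(subklass)
    super
    registry << subklass
  end

  def self.registry
    @registry ||= []
  end

  # Delegate to whichever subclass advertises it can handle the input.
  def self.for(filename)
    registry.find { |klass| klass.can_handle_thing?(filename) } or
      raise ArgumentError, "no handler for #{filename}"
  end
end

class CsvHandler < FormatHandler
  def self.can_handle_thing?(filename)
    filename.end_with?(".csv")
  end
end

class JsonHandler < FormatHandler
  def self.can_handle_thing?(filename)
    filename.end_with?(".json")
  end
end

FormatHandler.for("report.csv")   # => CsvHandler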

Ultimately, Randy’s list of takeaways summed it all up nicely:

  • "Languages afford certain designs and inhibit others," thereby shaping how we think about certain problems
  • "Learning new languages will increase your "solution space" and expose you to different approaches that may be applicable elsewhere.
Thanks for the talk, Randy! Here is his "Affordances in Programming Languages" Slideshare

John Athayde - The Timeless Way of Building

Patterns. Patterns in nature, human behavior, architecture. Patterns are only truly valuable when used appropriately, which, it is said, requires a certain amount of fluency in the language of patterns. Treating them as "a library of code templates" is a problem, says John. I would agree. Correctly applying design patterns in software development is more than finding a GoF design pattern and squeezing your problem through it like so much delicious-looking Play-Doh.

John discussed patterns using some interesting architectural examples and ran through a number of anti-patterns and their corresponding pattern "solutions" in a helpful before/after format. He touched on a number of such anti-patterns that one might see in the wild: code that walks the object graph (à la @account.customer.address.state), fat models in need of some module, concern, or gem treatment, and ActiveRecord callback soup.
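As a small, invented illustration of the first of those anti-patterns, the usual "after" for an object-graph walk is to delegate one hop at a time, so callers only ever talk to their immediate neighbor:

Address = Struct.new(:state)

class Customer
  attr_reader :address
  def initialize(address)
    @address = address
  end

  def state
    address.state            # Customer knows about its own address, nothing more
  end
end

class Account
  attr_reader :customer
  def initialize(customer)
    @customer = customer
  end

  def customer_state
    customer.state           # callers ask the account, not the whole chain
  end
end

account = Account.new(Customer.new(Address.new("UT")))
account.customer_state       # => "UT" -- no @account.customer.address.state walk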

John’s overall takeaway was that memorizing and applying patterns is neither the crux of the matter, nor is it even sufficient. True skill and the mark of proficiency is to be "fluent in the language of patterns." This reminds me of the difference between being able to speak a foreign language and being able to *think* in that language and seamlessly express those thoughts without a need for intermediate translation.

John’s recommended reading: The Timeless Way of Building, A Pattern Language, The Oregon Experiment, Rails Antipatterns, Design Patterns in Ruby

Ryan Davis - Nerd Party v3.1

Ryan’s talk covering the various "versions" of the Seattle.rb Ruby user group throughout the years gets an honorable mention because it convinced me to get involved with one of the Ruby user groups here in Salt Lake City.

ZNC: An IRC Bouncer

Kickin' it Old Skool

At End Point, we use IRC extensively for group chat and messaging. Prior to starting here I had been an occasional IRC user, asking questions about various open source projects on Freenode and helping others as well. When I began to use IRC daily, I ran into a few things that bugged me and thought I would write about what I have done to mitigate them. While it might not be as fancy as Campfire, HipChat, or Slack, I'm happy with my setup now.

What did I miss?

The first thing that annoyed me about IRC was the lack of persistence. If I wasn't logged on to the server, I missed out on the action. This was exacerbated by the fact that I live in the Pacific Time zone and lots of discussion takes place before my work day begins. Some people on our team solve this issue by running a terminal-based IRC client (like irssi or WeeChat) inside a tmux or GNU Screen session on a remote server. This approach works well until the server needs to be rebooted (e.g. for OS or kernel updates etc.). It also introduces the limitation of using a terminal-based client, which isn't for everyone.

Test Driving IRC Clients

The next challenge was finding a good IRC client. In my attempt to test-drive several clients side-by-side I quickly discovered that each client with a direct connection to the server requires its own unique nick (e.g. greg, greg_, greg_2 etc.). I also wanted to be able to use an IRC client on my phone without having to use multiple nicks. At this point I did a little research and determined that a bouncer might be helpful.

Enter the Bouncer

A bouncer connects to an IRC server on your behalf and maintains a persistent connection. Rather than connecting directly to the IRC server, you connect to the bouncer. This allows you to remain connected to IRC even while your own clients are offline.

Bouncers

ZNC is the bouncer software I have been using and it has been great. Using ZNC allows you to connect multiple clients at the same time, e.g. a phone and a laptop using a single nick. ZNC also maintains a buffer of the most recent conversations while your clients are not connected. The buffer can be configured to suit your needs (per-channel, per-server etc.). In my case it's buffering the past 100 messages; when I connect my IRC client they are automatically played back and the buffer is cleared. This has solved my issue with missing discussions and context when my IRC client isn't running.

IRC to go

Another benefit of this setup has been the ability to connect to IRC on my phone (with the same nick as my laptop). I have used Mobile Colloquy and Palaver on iOS and both work quite well. There are also ZNC modules on GitHub that enable ZNC to send push notifications for mentions and private messages. Mobile Colloquy does a good job of this; I have not tried it with Palaver yet.

Modules

ZNC comes with a number of built-in modules and allows users to develop their own as well (in Perl, C++, Python, or Tcl). The following are some of the ones I use:

  • chansaver: keeps you connected to all of the channels you've joined
  • simple_away: updates your away message when all of your clients have disconnected / detached
  • autoreply
  • log: logs all discussions
  • webadmin: web interface for viewing and editing your ZNC config

Running ZNC

ZNC is easy to compile and install, and there are also packages available for some Linux distros. After testing it out locally for a while I installed it on a small VPS and that has been working well. Digital Ocean has published an article on how to install ZNC on an Ubuntu VPS if you are interested in learning more about it. Also, the ZNC wiki has lots of helpful information on how to install and configure ZNC.

GIS Visualizations on the Liquid Galaxy

The Liquid Galaxy presents an incredible opportunity to view Google Earth and Google Street View, but did you know that the platform is also amaza-crazy good at visualizing GIS data?

Geographic Information Systems are concerned with collecting, storing, and manipulating data sets that include geographic coordinates, and with displaying this raw and analyzed data on maps. In the 2D world this usually involves colored pencils or highlighted polygons. As computing power has advanced, many GIS consultancies have extended these visualization methods to complex 3D visualizations on digital maps. The Liquid Galaxy takes the concept one step further: see your data across an immense landscape of pixels in a geometrically-adjusted immersive world.

The Liquid Galaxy has separate instances of Google Earth running on each screen. In a standard Liquid Galaxy this means that each 1080x1920 screen is getting a full resolution image. It also means that the viewing angle for each screen matches the physical angle of that screen. Together, those elements can show geographical information at an incredible scale and resolution.

End Point has developed skills and methods to take GIS data sets (commonly found as KML files, but any spreadsheet of data with latitude and longitude can be adapted), apply the geometric adjustments for multiple screens, and build incredible presentations for our clients. Sample data (in the video) includes: utility power usage, offshore oil & gas leases, population growth by state, school districts, and earthquakes along the San Andreas fault.
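None of our presentation tooling is shown here, but as a hedged sketch of how little it takes to adapt a plain spreadsheet, this short Ruby script (column names assumed) turns rows of latitude/longitude data into KML placemarks that Google Earth can load:

require "csv"

# Expects a CSV with "name", "latitude", and "longitude" columns (an assumption).
def csv_to_kml(csv_path)
  placemarks = CSV.foreach(csv_path, headers: true).map do |row|
    "<Placemark><name>#{row['name']}</name>" \
    "<Point><coordinates>#{row['longitude']},#{row['latitude']},0</coordinates></Point>" \
    "</Placemark>"
  end

  %(<?xml version="1.0" encoding="UTF-8"?>\n) +
    %(<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n) +
    placemarks.join("\n") +
    %(\n</Document></kml>\n)
end

File.write("earthquakes.kml", csv_to_kml("earthquakes.csv"))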

End Point welcomes the opportunity to work with GIS consultancies in bringing their data to the Liquid Galaxy and then presenting an immersive visual platform to their clients and customers. For more information, please email sales@endpoint.com or contact us here.

Mountain West Ruby Conference, Day 1

MWRC Notes

It’s that magical time of year that I always look forward to, March. Why March? Because that’s when Mike Moore organizes and puts on the famed Mountain West Ruby Conference in Salt Lake City, Utah. This conference is always a personal pleasure for me due to the number of incredible people I get to meet and associate with there. This year was no exception in that regard. It was simply fantastic to meet up with old friends and catch up on all their latest and greatest projects and ideas.

In writing a summary of day 1 here, I’d like to focus on just three talks that you will definitely want to go watch over on confreaks as soon as they are up. All the talks were great, but these three were exceptional and you won’t want to miss them.

A Magical Gathering

The opening keynote started off with a bang of entertainment and just plain geeking out with Aaron Patterson. Aaron holds the peculiar position of being on both the Ruby core and Rails core teams. Aaron’s code is probably used by more people than just about anyone else’s in the Ruby community. Everyone who knows and loves Ruby and Ruby on Rails is indebted to this genius and generous coder. But Aaron is more than just a coder. Other than Matz himself, I don’t think anyone has influenced the culture of Ruby more than Aaron. Matz wants all of us to be nice (MINASWAN: Matz is nice and so we are nice) and Aaron wants us to just love what we do and love the people that we work with (#fridayhug with @tenderlove).

Aaron gave an entertaining talk on how he built an app that reads Magic: The Gathering cards. It turns out Aaron was a big fan of the game several years ago and had accumulated quite a stash of cards (2600+). He recently gained an interest in the game again and wanted to see what he had as far as the quality of the cards. To accomplish this task, he built an app that uses a webcam to take a picture of a card and then compares it against a database that he scraped from the Gatherer website. He used a bunch of Ruby libraries and other tools to do the image comparison and “teach” his app how to match the cards. In the end, this talk was not only extremely entertaining but introduced me to a whole bunch of tools that I never knew existed. Here’s a list of tools and sites referenced in his talk:

  • OpenCV for cropping and straightening images
  • Magic the Gathering Gatherer for getting card data
  • MTG JSON for getting card data in JSON format (easier than scraping)
  • Phashion Ruby gem for comparing similar images and generating a hamming distance between the two

Unpacking Technical Decisions

The next talk that impressed me was Unpacking Technical Decisions by Sarah Mei. Sarah gracefully walked us through a very difficult technical decision that all web developers seem to face these days, “What JavaScript framework should I use?” Although the question was based on JavaScript and not Ruby, the principles that were discussed could be related to any project.

Sarah broke our decision-making questions into a quadrant system, aptly named The Mei System. The quadrants were identified as Accessibility, Interface, Popularity, and Activity. Then we looked at three of the more common JavaScript frameworks: Ember, Angular, and Backbone. Using the quadrants, Sarah pointed out how any one of the frameworks could work, based on the project we are working on and the people that will be working on it. The conclusion was that none of these frameworks is the “best” framework in general, but it is possible to make a good decision about which framework would be best for your project.

Don’t

The last talk I want to mention here was the final talk of the day by Ernie Miller, titled “Don’t”. The premise of the talk is that we all make mistakes; some are harder to handle than others. Ernie’s goal was to help us avoid some of the same mistakes that he made. This was a very candid look at Ernie’s past that was both entertaining and moving. Ernie presented a wide variety of mistakes, from programming mistakes to life mistakes. To conclude this blog post, I present to you the list of Ernie’s “Don’t” mistakes.

  • Don’t overestimate how much time you have. Do thank someone who’s had a positive impact on your life.
  • Don’t forget your choices have consequences.
  • Don’t fall in love with metaprogramming.
  • Don’t put your code in buckets (meaning god objects). Do find a real home for your code and give it a proper name.
  • Don’t hitch your cart to someone else’s cart (couple your code to gems).
  • Don’t think of your app as a “Rails App”.
  • Don’t assume too much about how others will use your code.
  • Don’t use ActiveRecord Callbacks. Do use super!
  • Don’t mistake the illusion of accomplishment for the real thing (addictive gaming).
  • Don’t accept counter-offers.
  • Don’t take a job for the money.
  • Don’t assume it’s too late.
  • Don’t get too comfortable. Do work at a job that pushes you and holds you accountable.
  • Don’t try to be someone you’re not.
  • Don’t be afraid.
  • Don’t be afraid to say no.
  • Don’t be afraid to share.
  • Don’t be afraid to speak.
  • Don’t be afraid to stretch.

Proxy Nginx ports using a regular expression

I'm working on a big Rails project for Phenoms Fantasy Sports that uses the ActiveMerchant gem to handle Dwolla payments. One of the developers, Patrick, ran into an issue where his code wasn't receiving the expected postback from the Dwolla gateway. His code looked right, the Dwolla account UI showed the sandbox transactions, but we never saw any evidence of the postback hitting our development server.

Patrick's theory was that Dwolla was stripping the port number off the postback URL he was sending with the request. We tested that theory by using the Requestb.in service for the postback URL, and it showed Dwolla making the postback successfully. Next, we needed to verify that Dwolla could hit our development server on port 80.

I started Nginx on port 80 of our dev server and Patrick fired his Dwolla transaction test again. The expected POST requests hit the Nginx logfile. Suspicions confirmed. It looked like we would just have to work around the Dwolla weirdness by proxying port 80 to the port that Patrick's development instance was running on. Then we'd need a way to make that work for the other developers' instances on the dev. server, as well.

Proxying a single port to another port with Nginx is easy, but that second requirement is a little more complicated. We did get a little lucky with this bit, however. We are using DevCamps.org "camps" for this project. DevCamps' naming convention for a given development instance (AKA a "camp") uses a two-digit camp number as the hostname and as part of the port number. For example, camp 42 would run on 42.camp.example.com:9042. (I bet you can already see where I'm going with this.)

I tweaked the Nginx config for the port 80 instance to use a regex that captures the camp number ("42" in this case) from the server_name portion of the HTTP request. It then appends that capture to the port prefix ("90") to form that camp's port number, and proxies to the full hostname/IP address on that port. That made the proxy work for everyone's camp. Finally, I updated the config to work with any URI under the /dwolla directory.

We now have the following tidbit in our Nginx config:

server {
    [SNIP unrelated config stuff]

    server_name ~^(?<portname>\d\d)\.camp\.;

    location /dwolla {
        proxy_pass        http://169.29.89.157:90${portname}$uri;
        proxy_set_header  X-Real-IP  $remote_addr;
    }
}

As root, I ran `service nginx reload` to pick up the new config changes. Now Nginx automagically proxies connections to specific ports based on the server's hostname.

Significant Whitespace in an Interchange UserTag

Here's a quick issue I ran into with an Interchange UserTag definition; I'd made some changes to a custom UserTag, but upon restarting the Interchange daemon, I ended up with a message like the following:

UserTag 'foo_user_tag' code is not a subroutine reference

This was odd, as I'd verified via perl -cw that the code in the UserTag itself was valid.  After hunting through the changes, I noticed that the UserTag definition file was itself in DOS line-ending mode, but the changes I'd made were in normal Unix line-ending mode.  This was apparently sufficient reason to confuse the UserTag parser.

Sure enough, changing all line endings in the file to match resulted in the successful use of the UserTag.  For what it's worth, it did not matter whether the line endings in the file were Unix or DOS, so long as it was consistent within the file itself.
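In other words, the cure is simply to make the file consistent. As one hedged example (the file name here is hypothetical), a couple of lines of Ruby will force everything to Unix line endings:

path = "code/UserTag/foo_user_tag.tag"                 # hypothetical UserTag definition file
File.write(path, File.read(path).gsub(/\r\n?/, "\n"))  # normalize every line ending to "\n"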

Scripting ssh master connections

Elephant Parade 005

At End Point, security is a top priority. We just phased out the last of the 1024-bit keys for all of our employees -- those of us in ops roles, with keys in lots of places, had done so a long while back. Similarly, since we'll tend to have several sessions open for a long while, a number of us use ssh-agent's -c (confirm) option. That forces a prompt for confirmation of each request the agent gets. It can get a little annoying (especially since it takes the focus over to one monitor, even if I'm working on the other) but it combats SSH socket hijacking when we have the agent forwarded to remote servers.

Working on server migrations is where it gets really annoying. I like to write little repeatable scripts that I can tweak and re-run as needed. They're usually simple little things, starting with a bunch of rsync's or pipe-over-ssh's for pg_dump or any other data we need to move across. With any more than a couple of those ssh connections in there, repeatedly hitting the confirm button gets irritating fast. And if a large transfer takes a while, I'll go off to do something else, later getting an unexpected confirmation box when I'm not thinking about the running script. Unexpected SSH auth confirmations, of course, get denied. So the script has to be re-run, and the vicious cycle repeats anew.

ssh has a neat ability to multiplex over a single connection. I have it set to do that locally in auto mode, and that made me wonder if it could be used in these scripts so I only have to authorize the connection once. Well, of course it can, and it turns out to be nothing special. But here's what I got to work:

ssh
    -o ControlPath=~/.ssh/user@server.domain.foo  # Set the socket location
    -M  # Defines master mode for the client
    -N  # Don't bother to do anything remotely yet
    -f  # Drop into the background so we can continue on
    user@server.domain.foo  # And the typical connection username/host

At the end of the script, remember to shut down the control socket by passing it an exit command, otherwise it'll leave a connection hanging around out there:

ssh -o ControlPath=~/.ssh/user@server.domain.foo -O exit user@server.domain.foo

In between, it's just a matter of using the ControlPath option. It can get a little repetitive, so a variable can be helpful. The latest iteration I have wraps the ssh command together with its -o ControlPath option, which can be executed directly or passed to rsync, as below.

As an example, here's a somewhat stripped-down version of the script we used to move bucardo.org to a new host, minus some error handling and such:

#!/bin/bash

MSSH="ssh -o ControlPath=~/.ssh/root@bucardo.org" 
$MSSH -MNf bucardo.org

# Tell rsync to use ssh pointed at the master socket
rsync -e "$MSSH" -aHAX --del bucardo.org:/var/lib/git/ /var/lib/git/

# Pipe pg_dump (or whatever you need) over ssh
$MSSH bucardo.org 'su - postgres -c "pg_dump -c wikidb"' | su - postgres -c "psql wikidb"

# sed -i commands or anything else that's needed to fix up the local configuration

$MSSH -O exit bucardo.org

Setup Rails Environment with PostgreSQL on Apple Mac OS X

Setting up Rails on Mac OS X so you can run a Rails application is a tedious process, and it's a kind of roadblock for newbies. Here I have listed the steps to get a Rails application running with a PostgreSQL database on Mac OS X.

1. Rails
Before installing Rails, we need a couple of things installed on Mac OS X.
- Ruby
Luckily, Mac OS X comes with Ruby preinstalled.
$ ruby -v
ruby 2.0.0p247 (2013-06-27 revision 41674) [universal.x86_64-darwin13]
- Xcode and Command Line Tools
Install Xcode from the Mac App Store. Xcode contains some system libraries which are required for Rails.
To install the Command Line Tools, open Xcode -> Xcode (menu bar) -> Preferences -> Downloads -> Install 'Command Line Tools'
- Homebrew
Homebrew helps here: gems are installed with 'gem', and their native dependencies with brew. Homebrew makes our lives easier by handling those dependencies for us during installation.
$ ruby -e "$(curl -fsSL https://raw.github.com/Homebrew/homebrew/go/install)"
Note: Xcode already comes bundled with gcc, but install gcc using Homebrew if you face any gcc problems while installing Rails.
$ brew tap homebrew/dupes
$ brew install apple-gcc42
$ sudo ln -s /usr/local/bin/gcc-4.2 /usr/bin/gcc-4.2
- RVM
RVM (Ruby Version Manager) is a must have tool to easily manage multiple Ruby environments. Let's install RVM:
$ curl -L https://get.rvm.io | bash -s stable --ruby
$ rvm -v
rvm 1.25.19 (stable) by Wayne E. Seguin <wayneeseguin@gmail.com>, Michal Papis <mpapis@gmail.com> [https://rvm.io/]
$ echo '[[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm"' >> ~/.bashrc
$ source ~/.bashrc
Gemsets are very helpful for managing multiple applications with different sets of gems. So let's create a gemset to work in:
$ rvm use ruby-2.1.1@endpoint --create
$ rvm gemset list
gemsets for ruby-2.1.1 (found in /Users/selva/.rvm/gems/ruby-2.1.1)
   (default)
=> endpoint
   global
(See also the similar rbenv, which some people prefer.)
- Rails
We are all set to install Rails now:
$ gem install rails
$ rails -v
Rails 4.0.3

2. Install PostgreSQL
Download and install the PostgreSQL database from Postgres.app, which provides PostgreSQL in a single package so it's easy to get started on Mac OS X. After the installation, open Postgres (located under Applications) to start the PostgreSQL server. Find the PostgreSQL bin path and append it to ~/.bashrc to make the commands accessible from the shell.
$ echo 'PATH="/Applications/Postgres.app/Contents/Versions/9.3/bin:$PATH"' >> ~/.bashrc
$ source ~/.bashrc
Next, create a PostgreSQL user for the Rails application.
$ createuser -P -d -e sampleuser
Enter password for new role:
Enter it again:
CREATE ROLE sampleuser PASSWORD 'md5afd8d364af0c8efa11183c3454f56c52' NOSUPERUSER CREATEDB NOCREATEROLE INHERIT LOGIN;

3. Create Application
The Rails environment is ready with a PostgreSQL database. Let's create a sample web application:
$ rails new SampleApp
Configure the database details under SampleApp/config/database.yml:
development:
  adapter: postgresql
  encoding: unicode
  database: sampledb
  username: sampleuser
  password: samplepassword
Start the Rails server and hit http://0.0.0.0:3000 in your browser to verify the Rails app is running on your computer.
$ cd SampleApp
$ ./bin/rails server

4. Version Control System
It is always good to develop an application with a version control system. Here I am using Git.
$ cd SampleApp
$ git init
$ git add . && git commit -m "Initial Commit"

5. Server on https (optional)
Sometimes the application needs to run over the https protocol for security reasons, or because third-party services require the application to be served over https. So we should set up https (SSL security). First, create a self-signed SSL certificate. To create a self-signed certificate we need an RSA key and a certificate request in place beforehand.
$ mkdir ~/.ssl && cd ~/.ssl

# creating 2048 bit rsa key
$ openssl genrsa -out server.key 2048 
Generating RSA private key, 2048 bit long modulus
..........++++++
.........++++++
e is 65537 (0x10001)

# creating certificate request
$ openssl req -new -key server.key -out server.csr 
........
# Common Name value should be FQDN without protocol
 Common Name (eg, your name or your server's hostname) []:mydomain.com 
........

# creating self-signed certificate
$ openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt 
The WEBrick server can be configured to use the SSL key and certificate by adding the lines of code below to bin/rails after '#!/usr/bin/env ruby'. Provide the locations of the RSA key and certificate in the Ruby code.
require 'rubygems'
require 'rails/commands/server'
require 'rack'
require 'webrick'
require 'webrick/https'

module Rails
    class Server < ::Rack::Server
        def default_options
            super.merge({
                :Port => 3000,
                :environment => (ENV['RAILS_ENV'] || "development").dup,
                :daemonize => false,
                :debugger => false,
                :pid => File.expand_path("tmp/pids/server.pid"),
                :config => File.expand_path("config.ru"),
                :SSLEnable => true,
                :SSLVerifyClient => OpenSSL::SSL::VERIFY_NONE,
                :SSLPrivateKey => OpenSSL::PKey::RSA.new(
                       File.open("/path/to/server.key").read),
                :SSLCertificate => OpenSSL::X509::Certificate.new(
                       File.open("/path/to/server.crt").read),
                :SSLCertName => [["CN", WEBrick::Utils::getservername]]
            })
        end
    end
end 
Next start the Rails server.
$ ./bin/rails server
It will use the SSL certificate and run the application over the https protocol as configured above. We can verify it at https://0.0.0.0:3000.

We are ready with a Rails development environment on our Mac to do magic.

Provisioning a Development Environment with Packer, Part 2

In my previous post on provisioning a development environment with Packer I walked through getting a server set up with an operating system installed. This post will focus on setting up Ansible so that I can configure my development environment just the way I like it. Packer supports many different methods for provisioning. After playing with some of them, I decided that Ansible was a good mix of simplicity and functionality.

A Packer provisioner is simply a configuration template that is added to the json configuration file. The "provisioners" section of the configuration file takes an array of json objects, which means that you aren't stuck with just one kind of provisioner. For example, you could run some shell scripts using the shell provisioner, then upload some files using the File Uploads provisioner, followed by your devops tool of choice (Puppet, Salt, Chef, or Ansible). You can even roll your own provisioner if desired. Here's an example provisioner setup for the shell provisioner:

{
  "variables": {...},
  "builders" : [...],
  "provisioners" [
    {
      "type": "shell",
      "inline": [ "echo foo" ]
    }
  ]
}

Sudo and User Considerations

Packer will log in to the server over ssh and run your provisioners. The big headache that always comes out of this is that some provisioners require sudo, or being logged in as root, to run their commands. Packer, however, will log in as the user that was created during the build stage (see the "builders" section in the code snippet in the previous post). There are a couple of ways to handle this. First, you can do everything in Packer as root. I don't love this approach because I like to simulate the way that I set up a machine by hand, and I never log in as root if I can help it. The second method is to grant your user sudo access. This gets a little tricky so I'll just show a code snippet and then explain it below.

{
  "type": "shell",
  "execute_command": "echo '{{user `ssh_pass`}}' | {{ .Vars }} sudo -E -S sh '{{ .Path }}'",
  "inline": [
    "echo '%sudo    ALL=(ALL)  NOPASSWD:ALL' >> /etc/sudoers"
  ]
}

With the shell provisioner, the execute_command option tells the provisioner how to run each command. The commands provided to the inline array are compiled into a single shell script which is injected as the .Path variable. To quote the Packer documentation:

The -S flag tells sudo to read the password from stdin, which in this case is being piped in with the value of [the user variable ssh_pass.] The -E flag tells sudo to preserve the environment, allowing our environmental variables to work within the script.

By setting the execute_command to this, your script(s) can run with root privileges without worrying about password prompts.

Taking advantage of this trick, a command can now be placed in the inline section that will add all members of the sudo group to the sudoers file, granting them permission to use sudo without a password. Now, I know this isn't secure, but for my purpose, which is to create a custom development environment on a virtual machine running only on my machine, this will be just fine. I also use this example to illustrate how to run commands as sudo.

The Ansible Provisioner

Once the shell provisioner is working, Ansible can be installed on the new machine and then executed using Packer's Ansible provisioner. The easiest way to do this that I found was to have the shell provisioner install Ansible on the virtual machine as well as upload an ssh public key so that the Ansible user could log in. My "provisioners" section looks like this:

"provisioners": [
  {
    "type": "shell",
    "inline": [
      "mkdir .ssh",
      "echo '{{user `public_key`}}' >> .ssh/authorized_keys"
    ]
  },
  {
    "type": "shell",
    "execute_command": "echo '{{user `ssh_pass`}}' | {{ .Vars }} sudo -E -S sh '{{ .Path }}'",
    "inline": [
      "add-apt-repository ppa:rquillo/ansible",
      "apt-get update",
      "apt-get install -y ansible",
      "echo '%sudo    ALL=(ALL)  NOPASSWD:ALL' >> /etc/sudoers"
    ]
  },
  {
    "type": "ansible-local",
    "playbook_file": "site.yml"
  }
]

This configuration uses a variable in which I placed an ssh public key (I could also have used the File Uploads provisioner for this), installs Ansible from an updated PPA, and grants the user sudo privileges via the sudo group as explained above. This also shows how you can execute more than one statement at a time using the "shell" provisioner.

Ansible Provisioning Tip

This tip could probably apply to any of the devops tools you'd like to use. If you are creating your Ansible yml files for the first time, you will likely run into the issue where you spend a lot of time waiting for the machine to build and provision, only to discover your Ansible script is wrong. Troubleshooting becomes a problem because if anything fails during provisioning, Packer will stop running and delete the virtual machine, leaving you with no option other than fixing your mistake and then waiting for the entire process to run again.

One way I found around this is to take the last section of the provisioners array out, build your machine, and then move the base machine into a new directory once it's been successfully built. From there you can start the machine manually and then run the ansible-playbook command from your local machine while you develop your playbook. Once you have a working playbook, add the ansible-local section back to the provisioners array and rebuild your machine with Packer. That should speed up your development and troubleshooting cycles.

A Hiccup with Ansible and Packer

Ansible allows you to create template files that can be used for configuration files and the like. According to the documentation, you can specify the location of the templates and other files using the playbook_paths option in the provisioner. I could not get this to work, and after a lot of troubleshooting and looking at the code for the provisioner I am convinced there is a bug copying the playbook_paths directories to the remote machine. I've posted on the Packer discussion group about this but haven't had any response yet. Once I get to the bottom of the issue I'll post an update here.

Conclusion

Packer has turned out to be a fabulous resource for me in quickly ramping up development environments. Ansible has also been a breath of fresh air for provisioning those machines. I've previously used Chef and other devops tools, which only led to a great deal of frustration. I'm happy to have some new tools in my belt that took very little time to learn and get working.

Restrict IMAP account access to one (or more) IP address

If you're in need of some extra layer of security on your mail server and know in advance who is going to access your IMAP account and from where (meaning which IP), then the following trick could be the perfect solution for you.

In order to use this feature you'll have to use Dovecot 2.x+ and then just add a comma separated list of addresses/subnets to the last field of your dovecot passwd auth file:

user:{plain}password::::::allow_nets=192.168.0.0/24,10.0.0.1,2001:abcd:abcd::0:0/80

After a quick reload Dovecot will start to enforce the specified new settings.

An additional neat aspect is that, from an attacker's perspective, the error returned is always the same as the one you'd get from a "wrong password" attempt, making it basically impossible to discover this additional layer of protection.

Stay safe out there!

Bucardo, and Coping with Unicode

Given the recent DBD::Pg 3.0.0 release, with its improved Unicode support, it seemed like a good time to work on a Bucardo bug we've wanted fixed for a while. Although Bucardo will replicate Unicode data without a problem, it runs into difficulties when table or column names in the database include non-ASCII characters. Teaching Bucardo to handle Unicode has been an interesting exercise.

Without information about its encoding, string data at its heart is meaningless. Programs that exchange string information without paying attention to the encoding end up with problems exactly like that described in the bug, with nonsense characters all over. Further, it's impossible even to compare two different strings reliably. So not only would Bucardo's logs and program output contain junk data, Bucardo would simply fail to find database objects that clearly existed, because it would end up querying for the wrong object name, or the keys of the hashes it uses internally would be meaningless. Even communication between different Bucardo processes needs to be decoded correctly. The recent DBD::Pg 3.0.0 release takes care of decoding strings sent from PostgreSQL, but other inputs, such as command-line arguments, must be treated individually. All output handles, such as STDOUT, STDERR, and the log file output, must be told to expect data in a particular encoding to ensure their output is handled correctly.
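None of the following is Bucardo code, but a tiny Ruby snippet illustrates the failure mode: the same bytes decoded with the wrong charset turn into mojibake, and string comparisons quietly stop matching.

name       = "test_büçárđo"
utf8_bytes = name.b                             # the raw UTF-8 bytes, as binary
misread    = utf8_bytes.force_encoding("ISO-8859-1")

puts misread.encode("UTF-8")                    # mojibake: each byte became its own character
puts misread == name                            # false -- lookups keyed on the name now fail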

The first step is to build a test case. Bucardo's test suite is quite comprehensive, and easy to use. For starters, I'll make a simple test that just creates a table, and tries to tell Bucardo about it. The test suite will already create databases and install Bucardo for me; I can talk to those databases with handles $dbhA and $dbhB. Note that in this case, although the table and primary key names contain non-ASCII characters, the relgroup and sync names do not. That will require further programming. The character in the primary key name, incidentally, is a staff of Aesculapius, which I don't recommend people include in the name of a typical primary key.

for my $dbh (($dbhA, $dbhB)) {
    $dbh->do(qq/CREATE TABLE test_büçárđo ( pkey_\x{2695} INTEGER PRIMARY KEY, data TEXT );/);
    $dbh->commit;
}

like $bct->ctl('bucardo add table test_büçárđo db=A relgroup=unicode'),
    qr/Added the following tables/, "Added table in db A";
like($bct->ctl("bucardo add sync test_unicode relgroup=unicode dbs=A:source,B:target"),
    qr/Added sync "test_unicode"/, "Create sync from A to B")
    or BAIL_OUT "Failed to add test_unicode sync";

Having created database objects and configured Bucardo, the next part of the test starts Bucardo, inserts some data into the master database "A", and tries to replicate it:

$dbhA->do("INSERT INTO test_büçárđo (pkey_\x{2695}, data) VALUES (1, 'Something')");
$dbhA->commit;

## Get Bucardo going
$bct->restart_bucardo($dbhX);

## Kick off the sync.
my $timer_regex = qr/\[0\s*s\]\s+(?:[\b]{6}\[\d+\s*s\]\s+)*/;
like $bct->ctl('kick sync test_unicode 0'),
    qr/^Kick\s+test_unicode:\s+${timer_regex}DONE!/,
    'Kick test_unicode' or die 'Sync failed, no point continuing';

my $res = $dbhB->selectall_arrayref('SELECT * FROM test_büçárđo');
ok($#$res == 0 && $res->[0][0] == 1 && $res->[0][1] eq 'Something', 'Replication worked');

Given that DBD::Pg handles the encodings for strings from the database, I only need to make a few other changes. I added a few lines to the preamble of some files, to deal with UTF8 elements in the code itself, and to tell input and output pipes to expect UTF8 data.

use utf8;
use open qw( :std :utf8 );

In some cases, I also had to add a couple more modules and explicitly decode incoming values. For instance, the test suite repeatedly runs shell commands to configure and manage test instances of Bucardo. There, too, the output needs to be decoded correctly:

    debug("Script: $ctl Connection options: $connopts Args: $args", 3);
-   $info = qx{$ctl $connopts $args 2>&1};
+   $info = decode( locale => qx{$ctl $connopts $args 2>&1} );
    debug("Exit value: $?", 3);

And with that, now Bucardo accepts non-ASCII table names.

[~/devel/bucardo]$ prove t/10-object-names.t 
t/10-object-names.t .. ok     
All tests successful.
Files=1, Tests=20, 24 wallclock secs ( 0.01 usr  0.01 sys +  2.01 cusr  0.22 csys =  2.25 CPU)
Result: PASS

Provisioning a Development Environment with Packer, Part 1

I recently needed to reconstruct an old development environment for a project I worked on over a year ago. The code base had aged a little and I needed old versions of just about everything from the OS and database to Ruby and Rails. My preferred method for creating a development environment is to setup a small Virtual Machine (VM) that mimics the production environment as closely as possible.

Introducing Packer

I have been hearing a lot of buzz lately about Packer and wanted to give it a shot for setting up my environment. Packer is a small command line tool written in the increasingly popular Go programming language. It serves three primary purposes:

  1. Building a machine based on a set of configuration parameters
  2. Running a provisioner to setup the machine with a desired set of software and settings
  3. Performing any post processing instructions on that machine

Packer is really simple to install and I would refer you to its great documentation to get it set up. Once set up, you will have the packer command at your disposal. To build a new machine, all you need to do is call

packer build my_machine.json

The file my_machine.json can be any json file and contains all the information Packer needs to set up your machine. The configuration json has three major sections: variables, builders, and provisioners. Variables are simply key-value pairs that you can reference later in the builders and provisioners sections.

The Builder Configuration

The builders section takes an array of json objects that specify different ways to build your machines. You can think of them as instructions on how to get your machine set up and running. For example, to get a machine up and running you need to create the machine, install an operating system (OS), and create a user so that you can log in to it. There are many different types of builders, but for the example here, I’ll just use the vmware-iso machine type. Here’s a working json configuration file:

{
  "variables": {
    "ssh_name": "mikefarmer",
    "ssh_pass": "mikefarmer",
    "hostname": "packer-test"
  },

  "builders": [
    {
      "type": "vmware-iso",
      "iso_url": "os/ubuntu-12.04.4-server-amd64.iso",
      "iso_checksum": "e83adb9af4ec0a039e6a5c6e145a34de",
      "iso_checksum_type": "md5",
      "ssh_username": "{{user `ssh_name`}}",
      "ssh_password": "{{user `ssh_pass`}}",
      "ssh_wait_timeout": "20m",
      "http_directory" : "preseeds",
      "http_port_min" : 9001,
      "http_port_max" : 9001,
      "shutdown_command": "echo {{user `ssh_pass`}} | sudo -S shutdown -P now",
      "boot_command": [
        "<esc><esc><enter><wait>",
        "/install/vmlinuz noapic ",
        "preseed/url=http://{{ .HTTPIP }}:{{ .HTTPPort }}/precise_preseed.cfg ",
        "debian-installer=en_US auto locale=en_US kbd-chooser/method=us ",
        "hostname={{user `hostname`}} ",
        "fb=false debconf/frontend=noninteractive ",
        "keyboard-configuration/modelcode=SKIP keyboard-configuration/layout=USA ",
        "keyboard-configuration/variant=USA console-setup/ask_detect=false ",
        "initrd=/install/initrd.gz -- <enter>"
      ]
    }
  ]
}

The documentation for these settings is really good but I want to point out a few things that weren’t immediately clear. Some of these pertain mostly to the vmware-iso builder type, but I believe they are worth pointing out because some of them apply to other builder types as well.

First, the iso_url setting can be either an absolute path, a relative path, or a fully qualified url. The relative path is relative to the directory where you run the packer command. So here, when I run packer, I need to make sure that I do so from a directory that has an os subdirectory with the ubuntu iso located therein.

Next, once the ISO is downloaded, Packer will automatically start up your VMware client and boot the virtual machine. Immediately after that, Packer will start up a VNC client and server along with a mini web server to provide information to your machine. The http_port_min and http_port_max settings specify which ports the mini web server may use; setting them to the same value allocates exactly that port. The http_directory setting provides the name of a local directory to use as the mini web server's document root. This is important for providing your VM with a preseed file; more about the preseed file is discussed below.

Since we are using Ubuntu on the machine being built, we will need to use sudo to send the shutdown command. The shutdown_command setting is used to gracefully shut down the machine at the conclusion of the run and provisioning of the machine.

Installing your OS

The boot_command is a series of keystrokes that you can send to the machine via VNC. If you have set up a Linux machine from scratch you know that you have to enter a bunch of information about how to set it up for the first time, such as time zone, keyboard layout, how to partition the hard drive, host name, etc. All the keystrokes needed to set up your machine can be used here. But if you think about it, that’s a ton of keystrokes and this command could get quite long. A better way to approach this is to use a preseed file. A preseed.cfg file contains the same information you enter when you set up a machine for the first time. This isn’t something provided by Packer; it is provided by the operating system to automatically provision machines. For Ubuntu, a preseed file is used like so:

  • When you boot from the startup media (in this case an iso), you can choose the location of the preseed file via a url
  • The preseed file is uploaded into memory and the configuration is read
  • The installation process begins using information from the preseed file to enter the values where the user would normally enter them.

So how do we get the preseed file up to the machine? Remember that little web server that Packer sets up? Well, its IP and port are made available to the virtual machine when it boots from the ISO. The following line tells the OS where to find the web server and the configuration file:

 "preseed/url=http://{{ .HTTPIP }}:{{ .HTTPPort }}/precise_preseed.cfg"

Strings in Packer can be interpolated using a simple template format similar to Mustache. The double curly braces tell Packer to insert a variable there instead of text. The HTTPIP and HTTPPort variables are made available to the template by Packer.

One more important note about the preseed file: you need to make sure that the settings for the username and password are the same as those listed in your variables section so that you can log in to the machine once it is built. Where do you get a preseed file? I found one on a blog post titled Packer in 10 Minutes by @kappataumu. I only had to modify a few settings that were specific to my setup.

Remember that http_directory mentioned above? Well, that directory needs to include your preseed file. I’ve named mine precise_preseed.cfg for Ubuntu 12.04 Precise Pangolin.

Next up is provisioning, but that is such a big topic by itself that I’ll move it into a separate blog post. The config file above will work as-is, and once run it should set up a basic Ubuntu server for you. Go ahead and give it a try and let me know in the comments how it worked out for you.

Super Powers

I said earlier that Packer has three primary purposes. Well, I lied. Packer’s super power is that it can perform those three purposes over any number of machines, whether virtual, hosted, or otherwise, in parallel. Supported machines are currently:

  • Amazon EC2 (AMI)
  • DigitalOcean
  • Docker
  • Google Compute Engine
  • OpenStack
  • QEMU
  • VirtualBox
  • VMware

Consider for a moment that you can now automatically setup and provision multiple machines with the same environment using a single command. Now you are seeing the power of Packer.

Interchange table hacking

Interchange has a powerful but terribly obscure table administration tool called the Table Editor. You can create, update, and delete rows, and even upload whole spreadsheets of data, but the Table Editor isn't the most flexible thing in the world, so sometimes it just flat-out refuses to do what you want.

So you trick it.

A client wanted to upload data to a table that had a single-column primary key (serial), but also had a unique three-column key that was only used in the upload process (because the uploaded data was intended to replace rows with identical three-column combinations). Example:

In the table:

code: 243
field1: AAA
field2: BBB
field3: CCC
data-fields: ...

In the spreadsheet:

field1  field2  field3  data-fields...
 AAA     BBB     CCC     ...

In the database definition for this table, I had to add a secondary key definition for Interchange's use:

Database  my_table  COMPOSITE_KEY  field1 field2 field3

in addition to the original key:

Database  my_table  KEY  code

Here's the problem this presents: when you add a COMPOSITE_KEY to a table, the table editor refuses to show per-row checkboxes that allow you to delete rows. I thought I might have to write a custom admin page to carry this off, but then I had an inspiration -- the REAL_NAME attribute of the Database directive:

Database  my_table_edit my_table_edit.txt __SQLDSN__
Database  my_table_edit REAL_NAME my_table
Database  my_table_edit KEY code
Database  my_table_edit  PREFER_NULL code
Database  my_table_edit  AUTO_SEQUENCE 

This chunk of config-code tells Interchange that a second table can be accessed in the database, which is in fact the same table as the first (not a view, not a copy, but the actual table), but it doesn't have the COMPOSITE_KEY. When Interchange is restarted with this new definition in place, the Table Editor will show this second table with the familiar per-row checkboxes, even as it refuses to show them for the original table.

Phew. I dodged a bullet with that, as I didn't want to have to write a special page just to mimic the Table Editor.

Implementing Background Fetch in iOS 7

With iOS 7 out and gaining market share, the great features it introduced are becoming available to more and more users.
One such feature is a new set of so-called "background modes".

States an application can be in on iOS

To explain this new set of modes, let me give you a really quick intro to what modes are.
In iOS, at a given point in time, an app can be in one of the following states:

Not running

There is no process for the app in the system.

Inactive

The app is running in the foreground but currently is not receiving any events. (It may be executing other code though.) An app usually stays in this state only briefly as it transitions to a different state.

Active

The application is running and is receiving user input. The main user interface is visible on the display.

Background

The application is running. It's not receiving user input. Its code is being executed but it will be switched to the suspended state very soon by the system.

Suspended

The app remains in memory, but it's not being executed. It remains dormant until the user chooses to activate it again or the system switches it back to the background state to allow it to process certain kinds of data.

Background modes

The last paragraph mentioned certain kinds of data an app may want to process even when it's not receiving the user's actions.
This makes sense: there are apps that use GPS or the audio system even when they aren't active. In fact, those scenarios, along with VoIP, were the ones iOS's creators designated as special in previous versions of the system.
Inactive applications were allowed to run certain parts of their code in response to GPS or VoIP events. They were also allowed to play or record audio content.
Those scenarios are what are called background modes. The term just refers to situations in which iOS brings an app from the suspended state to the background, to allow it to run its code.

Other background modes available in pre iOS 7 systems are:

  • Newsstand downloads
  • External accessory communication
  • Bluetooth networking
  • Bluetooth data sharing

New background modes in iOS 7

With the newest version of the system, Apple introduced two new modes:

  • Background fetch
  • Remote notifications

The first one allows apps to periodically access web servers to download data. The second one allows apps to respond to remote notifications telling them that new content is available to download.

A little bit about the use case I was working on

I was implementing the background fetch mode recently in the iOS application for the Locate Express platform. The platform allows its users to search in real time for service providers when they need, for example, to repair something at their homes.

One of the business rules in the system is that providers who are not actively using the application to make themselves available for searches are marked as pending, which effectively excludes them from the results. The problem with this logic was that the client app would stop talking to the Locate Express system as soon as it entered the suspended state. The remedy was to use the location updates background mode. The problem with that, in turn, was that iOS devices would quickly run out of battery. The location updates mode can be configured to report small changes, or only significant ones. Given that many service providers do not change their location significantly while they are at work, we needed a different solution.

The app itself receives information about so-called service requests from the web-based portal at www.locateexpress.com. The natural path, then, was to use the new background fetch mode to improve the user's experience by periodically synchronizing the list of requests. This allowed us to:

  • present an always up-to-date list of requests, even when a user did not open the app from the remote notification announcing a new request
  • avoid any waiting time for the list of requests to refresh
  • mark a provider as active frequently, without draining the battery

The background fetch mode setup

As with any other background mode, a few setup steps are needed before the app is able to use it.


  1. In Xcode, go to your project's settings by clicking on it and then opening the "Capabilities" pane
  2. Turn "Background Modes" on if it's not already
  3. Check the "Background fetch" checkbox
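
Checking that box simply adds an entry to the app's Info.plist for you; the equivalent raw setting (assuming the standard UIBackgroundModes key) looks like this:

<key>UIBackgroundModes</key>
<array>
    <string>fetch</string>
</array>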

How frequently will the fetch be executed?

You can control this frequency only to some extent. In fact, to be able to fetch any data at all, you still have to run the following code at some point while initializing your AppDelegate:

[[UIApplication sharedApplication] setMinimumBackgroundFetchInterval:UIApplicationBackgroundFetchIntervalMinimum];

This tells iOS the minimum interval at which you want it to run the background fetch code. There is no maximum hint, though: you can ask iOS not to run the fetch more frequently than some time span, but you cannot ask it to run no less often than some other value.

The frequency is managed by iOS. You give it a hint with the above snippet of code, and by reporting whether each fetch actually returned new data. Based in part on that, iOS computes the optimal (from its own standpoint) fetch frequency.
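
For example, if fetching more than once an hour is pointless for your app, you can pass an explicit number of seconds instead of the system constant (the value here is only an illustration):

// hint to iOS: at most one background fetch per hour
[[UIApplication sharedApplication] setMinimumBackgroundFetchInterval:3600.0];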

Implementing the fetching method

The last step is to implement the following method in your AppDelegate:

- (void)application:(UIApplication *)application performFetchWithCompletionHandler:(void (^)(UIBackgroundFetchResult))completionHandler
{
  NSLog(@"Background Fetch!");
  // fetch the data here ...
  if (failedFetch) { // use your own flag here
    completionHandler(UIBackgroundFetchResultFailed);
  }
  else if (newDataFetched) { // use your own flag here
    completionHandler(UIBackgroundFetchResultNewData);
  }
  else {
    completionHandler(UIBackgroundFetchResultNoData);
  }
}

md+lvm expansion from RAID 1 to RAID 5

VHS is on the way out, or so they tell me. A little while back I unearthed the family's collection of old tape recordings, and have been digitizing everything in an effort to preserve all the old youth sports games and embarrassing birthday parties. There's no way I'm going to let my brother forget those. It's a lot of video, and that takes up quite a bit of space. Between that, HD videos recorded more recently, Postgres test databases and source datasets, server backups, and so on, the 3TB of space on my local file server was quickly running out.

I know, right? My first hard drive was 40MB, in two 20MB partitions. And we were glad for that space!

Back in the present, it was now time to add another hard drive. This (otherwise) tiny rooted plug server contained two 3TB USB hard drives in a RAID-1 configuration through the Linux md module. On top of that is lvm. lvm itself could have been used for the RAID-1, but it has the disadvantage of not being able to optimize multiple reads, whereas md can let both disks in the mirror serve different reads at the same time.

I bought a third disk of the same model, so it could be added in to a RAID-5 configuration. These being big USB drives, operations that read and write the disks as a whole (such as RAID rebuild operations) take a while. A long while. I could have unmounted, disassembled the array, rebuilt it as RAID-5, and brought it back after, but keeping it offline for that amount of time wasn't too appealing. Let's attempt an online conversion.

First up, confirming the disks are all the same exact size:
root@plug01:/backup# blockdev --getsize64 /dev/sdd2
2999557554176
root@plug01:/backup# blockdev --getsize64 /dev/sdc2
2999557554176
root@plug01:/backup# blockdev --getsize64 /dev/sdb2
2999557554176
mdadm says the array is in good shape, but it won't be for long. We'll need to break the RAID-1 in order to recreate the RAID-5. Yes, it's as scary as it sounds. Backups were double checked. Backups were triple checked. To break the array set one of the devices as failed, then remove it:
Number Major Minor RaidDevice State
0 8 34 0 active sync /dev/sdc2
1 8 18 1 active sync /dev/sdb2
root@plug01:/backup# mdadm /dev/md0 -f /dev/sdc2 # Set as failed
mdadm: set /dev/sdc2 faulty in /dev/md0
root@plug01:/backup# mdadm /dev/md0 -r /dev/sdc2 # Remove from array
mdadm: hot removed /dev/sdc2 from /dev/md0
Now we have an ex-RAID-1 (sdb2) and two spare disks (sdc2 and sdd2). Those two spare partitions can then be put into a RAID-5 configuration.

Wait, what? RAID-5 with two disks? Sure, I could have created a 3-device RAID-5 with one marked as "missing" but I wanted to restore the redundancy as soon as possible, and so gave it a shot. Lo and behold...
root@plug01:/backup# mdadm --create /dev/md1 --level=5 --raid-devices=2 /dev/sdc2 /dev/sdd2
mdadm: /dev/sdc2 appears to be part of a raid array:
level=raid1 devices=2 ctime=Sat Jan 12 05:25:51 2013
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.
root@plug01:/backup# mdadm -D /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Fri Jan 3 20:13:05 2014
Raid Level : raid5
Array Size : 2929253888 (2793.55 GiB 2999.56 GB)
Used Dev Size : 2929253888 (2793.55 GiB 2999.56 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent

Update Time : Fri Jan 3 20:13:05 2014
State : clean, degraded, recovering
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 512K

Rebuild Status : 0% complete

Name : plug01:1 (local to host plug01)
UUID : 1d493c17:7a443a6d:e6c121b4:53e8b9a1
Events : 0

Number Major Minor RaidDevice State
0 8 34 0 active sync /dev/sdc2
2 8 50 1 spare rebuilding /dev/sdd2
Seems to have worked! The array build will do its thing in the background, but we can start using it immediately. Since we want the redundancy back sooner rather than later, let's start moving the data off the single disk it now resides on. Since we're using lvm, this is just a matter of having it move the volume group from the old pv to the new one. That process does take a long time. Set up the physical volume structure, add it to the volume group, and start the move process:
root@plug01:/backup# pvcreate /dev/md1
Found duplicate PV gejBFzirMdX0KSGMO6S1TYQSOBJTUqOw: using /dev/md0 not /dev/md1
get_pv_from_vg_by_id: vg_read_internal failed to read VG plug01
Physical volume "/dev/md1" successfully created
root@plug01:/backup# vgextend array1 /dev/md1
Volume group "array1" successfully extended
root@plug01:/backup# pvmove /dev/md0 /dev/md1
/dev/md0: Moved: 0.0%
/dev/md0: Moved: 0.0%
<snip>
/dev/md0: Moved: 99.9%
/dev/md0: Moved: 100.0%
/dev/md0: Moved: 100.0%
Clean up the now-abandoned disk, then add it to the new RAID-5...
root@plug01:/backup# vgreduce array1 /dev/md0
Removed "/dev/md0" from volume group "array1"
root@plug01:/backup# vgs
VG #PV #LV #SN Attr VSize VFree
array1 1 1 0 wz--n- 2.73t 28.75g
root@plug01:/backup# pvremove /dev/md0
Labels on physical volume "/dev/md0" successfully wiped
root@plug01:/backup# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@plug01:/backup# mdadm --add /dev/md1 /dev/sdb2
mdadm: added /dev/sdb2
root@plug01:/backup# mdadm --grow --raid-devices=3 --backup-file=/root/tmp/md1.bak /dev/md1
mdadm: /dev/md1 is performing resync/recovery and cannot be reshaped
Oh, our pvmove activity took precedence over the array build, which still hasn't finished, so we have to wait until it does before we can grow the RAID-5. While we're waiting, I'll note that the grow operation backs up some of the metadata to an external file at the very start of the procedure, just in case something happens early on in the process. It doesn't need it for long.

There, that's probably enough waiting...
root@plug01:/backup# mdadm --grow --raid-devices=3 --backup-file=/root/tmp/md1.bak /dev/md1
mdadm: Need to backup 1024K of critical section..
root@plug01:~# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md1 : active raid5 sdb2[3] sdd2[2] sdc2[0]
2929253888 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
[>....................] reshape = 0.5% (16832872/2929253888) finish=9666.3min speed=5021K/sec

unused devices:
More waiting. After that completes we're then able to resize the pv to take up that space, and then resize the lv. Note that the lv doesn't take up the entirety of the pv and has a little bit of space reserved for snapshots.
root@plug01:/backup# pvresize /dev/md1
Physical volume "/dev/md1" changed
1 physical volume(s) resized / 0 physical volume(s) not resized
root@plug01:/backup# lvextend -L 5558g /dev/mapper/array1-vol1
Extending logical volume vol1 to 5.43 TiB
Logical volume vol1 successfully resized
root@plug01:/backup# pvs
PV VG Fmt Attr PSize PFree
/dev/md1 array1 lvm2 a- 5.46t 29.11g
root@plug01:/backup# vgs
VG #PV #LV #SN Attr VSize VFree
array1 1 1 0 wz--n- 5.46t 29.11g
root@plug01:/backup# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
vol1 array1 -wi-ao 5.43t
And the last step, perform an online resize of the ext4 volume:
root@plug01:/backup# resize2fs /dev/mapper/array1-vol1
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/mapper/array1-vol1 is mounted on /mnt/disk01; on-line resizing required
old desc_blocks = 173, new_desc_blocks = 348
Performing an on-line resize of /dev/mapper/array1-vol1 to 1456996352 (4k) blocks.
The filesystem on /dev/mapper/array1-vol1 is now 1456996352 blocks long.

root@plug01:/backup# df -h /dev/mapper/array1-vol1
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/array1-vol1 5.4T 2.6T 2.6T 50% /mnt/disk01
There, a completely new array structure and a bunch more space without having to unmount the filesystem for a moment!

Amazon Payments - Caveat Developer

A client of ours needed me to install Amazon Payments for them. Now there are several shopping carts for which Amazon Payments can be installed as an option, and I assume they work just fine. This client was not so lucky, and I had to roll my own.

Amazon starts with a JavaScript widget that asks you to log in or create an Amazon account. The widget returns an order reference ID:

<div id="payWithAmazonDiv" style="padding-top: 1.2em;">
   <br />
   <img src="https://payments.amazon.com/gp/widgets/button?sellerId=[Amazon seller id]&size=large&color=orange" 
        style="cursor: pointer;"/>
</div>

<script type="text/javascript">
 var amazonOrderReferenceId;
 new OffAmazonPayments.Widgets.Button ({
   sellerId: '[Amazon seller id]',
   useAmazonAddressBook: true,
   onSignIn: function(orderReference) {
     amazonOrderReferenceId = orderReference.getAmazonOrderReferenceId();
      window.location = 'https://www.yoursite.com/nextpage.html?session=' +
                        amazonOrderReferenceId;
   },
   onError: function(error) {
     // your error handling code
     alert('Amazon error');
    }
 }).bind("payWithAmazonDiv");
</script>

The ID returned looks like “P##-#######-#######” and must be saved for future screens; it's known as the Amazon Order Reference ID. In my case, I simply passed it to the next page in the session variable of the query string.

Amazon next wants you to specify a shipping address, and that's when the fun begins: Amazon provides a way to specify a callback function that gets invoked when an address has been selected. You use this function to ask Amazon for the details of the order so that you can calculate a shipping cost. To do this you first need to provide the order reference, your Amazon access key ID, and your sellerId (both provided to you by Amazon). Then you must compute a signature using the SignatureMethod specified. Also be sure to format your timestamp the way Amazon requires it (%Y-%m-%dT%H:%M.000Z).

https://mws.amazonservices.com/OffAmazonPayments
?AWSAccessKeyId=[Amazon ACCESS KEY]
&Action=GetOrderReferenceDetails
&AmazonOrderReferenceId=[Amazon Order Reference]
&SellerId=[Amazon Seller ID]
&SignatureMethod=HmacSHA256
&SignatureVersion=2
&Timestamp=2014-03-01T17%3A49.000Z
&Version=2013-01-01
&Signature=[Computed Signature]'

For this I used a Perl module: use Digest::SHA qw( hmac_sha256_base64 ). This routine computes the HMAC digest and base64-encodes it, as Amazon requires. Another bit of fun comes from having to sort the parameters into case-sensitive alphabetical order before signing; only then is the signature generated properly. Another little gotcha is to make sure your timestamp is set in the future. I set it six hours ahead, and it seems to work properly.
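
To make that concrete, here is a rough sketch of the signing step in Perl. It is only an illustration based on the description above: the endpoint path, the placeholder values, and the base64 padding fix-up are my assumptions, not code from the project.

#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw( hmac_sha256_base64 );
use URI::Escape qw( uri_escape_utf8 );
use POSIX qw( strftime );

my $secret_key = '[Amazon SECRET KEY]';    # placeholder

my %params = (
    AWSAccessKeyId         => '[Amazon ACCESS KEY]',
    Action                 => 'GetOrderReferenceDetails',
    AmazonOrderReferenceId => '[Amazon Order Reference]',
    SellerId               => '[Amazon Seller ID]',
    SignatureMethod        => 'HmacSHA256',
    SignatureVersion       => '2',
    # timestamp format and six-hour offset as described above
    Timestamp              => strftime('%Y-%m-%dT%H:%M.000Z', gmtime(time + 6 * 3600)),
    Version                => '2013-01-01',
);

# RFC 3986 escaping: encode everything except unreserved characters
my $escape = sub { uri_escape_utf8($_[0], '^A-Za-z0-9\-_.~') };

# Sort parameter names case-sensitively and URL-encode each name=value pair
my $canonical_query = join '&',
    map { $escape->($_) . '=' . $escape->($params{$_}) } sort keys %params;

# The string to sign covers the HTTP verb, host, path, and canonical query string
my $string_to_sign = join "\n",
    'POST', 'mws.amazonservices.com', '/OffAmazonPayments', $canonical_query;

my $signature = hmac_sha256_base64($string_to_sign, $secret_key);
$signature .= '=' while length($signature) % 4;    # Digest::SHA omits base64 padding

print $canonical_query . '&Signature=' . $escape->($signature) . "\n";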

I followed these guidelines for the remainder of Amazon's steps, and at last ended up with an order where the charge had been properly approved or declined.

The thing is that, unlike PayPal and traditional credit card gateways, Amazon does not necessarily return an immediate yes or no answer as to whether the transaction is approved. The order is placed in a "Pending" state, and you need to poll Amazon from time to time to get the final approval status of such orders. They attribute this to the extra fraud protection they perform. In the worst case, they say it could take up to a full day to return a formal decision, though most transactions (95%+) will be resolved within an hour. This unexpected development caused me to handle Amazon orders differently from other orders placed on the site. In my case, Amazon orders got their own order acknowledgement page and email, and we changed the way the company determines which orders are approved and ready to be filled.

In short, Amazon Payments is not any sort of "drop-in" alternative to PayPal. Rather, it's a fairly complex system that may require you to change the way orders are processed.

Ansiblizing SSH Keys

It is occasionally the case that several users share a particular account on a few boxes, such as a scenario where a test server and a production server share a deployment account, and several developers work on both. In these situations the preference is to authenticate the users with their ssh keys through authorized_keys on the account they are sharing, which leads to the problem of keeping the keys synchronized as they are updated and changed. Add the additional wrinkle that any given box may have a few users of the account who aren't shared with the other boxes, while a core group of developers can access them all. Now extend this scenario across hundreds of machines, and maintenance becomes difficult or impossible whenever the core accounts need updating. Obviously this is a job for a remote management framework like Ansible.

Our Example Scenario

We have developers Alice, Bob, and Carla, who need access to every box, and additional developers Dwayne and Edward, who each only need access to one box. We have a collection of servers: dev1, dev2, staging and prod. All of the servers have an account called web_deployment.

The authorized_keys for web_deployment on each box contains:

  • dev1
    • alice
    • bob
    • carla
  • dev2
    • alice
    • bob
    • carla
    • dwayne
  • staging
    • alice
    • bob
    • carla
    • edward
  • prod
    • alice
    • bob
    • carla

Enter Ansible

Ansible is already set up on every box. The basic strategy for managing the keys is to copy a default authorized_keys file from the Ansible host containing Alice, Bob and Carla (since they are present on all of the destination machines) and assemble it with a collection of keys local to the host (Dwayne's key on dev2, and Edward's key on staging). To perform the assembly action we also want to provide a script so that the keys can be manually manipulated (local keys changed) without touching the Ansible box. The script is as follows:


#!/usr/bin/env bash
set -u -o errexit -o pipefail

target_ssh_dir="/home/web_deployment/.ssh"
base_authorized_key_file="authorized_keys"

local_authorized_keys="${target_ssh_dir}/${base_authorized_key_file}.local"
hosting_authorized_keys="${target_ssh_dir}/${base_authorized_key_file}.hosting"
target_authorized_keys="${target_ssh_dir}/${base_authorized_key_file}"

tmp_authorized_keys="${target_ssh_dir}/${base_authorized_key_file}.tmp"

authorized_keys_backup_dir="${target_ssh_dir}/history"

# BEGIN multiline configuration_management_disclaimer string variable
configuration_management_disclaimer="\n\
# ******************************************************************************\n\
# This file is automatically managed by End Point Configuration management
# system. In order to change it please apply your changes
# to $local_authorized_keys and run $0\n\
# so to assemble a new $target_authorized_keys\n\
# ******************************************************************************\n\
"
# END multiline configuration_management_disclaimer string variable

# BEGIN assembling tmp file
echo -e "$configuration_management_disclaimer" > $tmp_authorized_keys

echo -e "# BEGIN STANDARD HOSTING KEYS\n" >> $tmp_authorized_keys
cat $hosting_authorized_keys >> $tmp_authorized_keys
echo -e "# END STANDARD HOSTING KEYS\n" >> $tmp_authorized_keys

if [[ -r $local_authorized_keys ]]
then
  echo -e "# BEGIN LOCAL KEYS\n" >> $tmp_authorized_keys
  cat $local_authorized_keys >> $tmp_authorized_keys
  echo -e "# END LOCAL KEYS\n" >> $tmp_authorized_keys
fi

echo -e "$configuration_management_disclaimer" >> $tmp_authorized_keys
# END assembling tmp file

# BEGIN check (and do) backup of old file
if ! cmp $tmp_authorized_keys $target_authorized_keys &> /dev/null
then
  mkdir -p $authorized_keys_backup_dir

  backup_old_auth_keys="${authorized_keys_backup_dir}/${base_authorized_key_file}_$(date '+%Y%m%dT%H%M%z')"
  cat $target_authorized_keys > $backup_old_auth_keys
fi
# END check (and do) backup of old file

cat $tmp_authorized_keys > $target_authorized_keys

rm $tmp_authorized_keys

if [ -d $authorized_keys_backup_dir ]
then
  if [ -n "$(find $authorized_keys_backup_dir -maxdepth 0 -type d)" ]
  then
    chmod -R u=rwX,go= $authorized_keys_backup_dir
  fi
fi

if [ -f $local_authorized_keys ]
then
  if [ -n "$(find $local_authorized_keys -maxdepth 0 -type f)" ]
  then
    chmod u=rw $local_authorized_keys
  fi
fi

if [ -f $hosting_authorized_keys ]
then
  if [ -n "$(find $hosting_authorized_keys -maxdepth 0 -type f)" ]
  then
    chmod u=rw $hosting_authorized_keys
  fi
fi

We then use an Ansible task to distribute the files to the destination hosts:

# tasks/authorized_keys_deploy.yml
---
  - name: Create /home/web_deployment subdirectories
    file: path=/home/web_deployment/{{ item }}
          state=directory
          owner=web_deployment
          group=web_deployment
          mode=0700
    with_items:
      - .ssh
      - bin

  - name: Copy /home/web_deployment/.ssh/authorized_keys.universal
    template: src=all/home/web_deployment/.ssh/authorized_keys.universal.j2
          dest=/home/web_deployment/.ssh/authorized_keys.hosting
          owner=web_deployment
          group=web_deployment
          mode=0600
    notify:
      - Use shellscript to locally assemble authorized_keys

  - name: Copy /home/web_deployment/bin/assemble_authorized_keys.sh
    copy: src=files/all/home/web_deployment/bin/assemble_authorized_keys.sh
          dest=/home/web_deployment/bin/assemble_authorized_keys.sh
          owner=web_deployment
          group=web_deployment
          mode=0700
    notify:
      - Use shellscript to locally assemble authorized_keys

This task is invoked by an Ansible playbook:

# authorized_keys_deploy.yml
---
- name: authorized_keys file deployment/management
  hosts: authorized_keys_servers
  user: root

  handlers:
  - include: handlers/authorized_keys_deploy.yml

  tasks:
  - include: tasks/authorized_keys_deploy.yml

And finally the handler which invokes the assembly script:

# handlers/authorized_keys_deploy.yml
---
  - name: Use shellscript to locally assemble authorized_keys
    command: "/home/web_deployment/bin/assemble_authorized_keys.sh"

A note about this setup: authorized_keys.universal has the extension .j2 and is rendered as a Jinja2 template. This allows server-specific conditionals, among other things. It is useful, for example, when per-key shell features are used (such as restricting one particular key to invoking rsync for backups), or when the OS mix requires paths to differ between hosts.
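
As an illustration (the hostname, keys, and wrapper script below are made up), the universal template might restrict an extra backup key to a single host like this:

# authorized_keys.universal.j2 (illustrative content only)
ssh-rsa AAAA...alice alice@example.com
ssh-rsa AAAA...bob bob@example.com
ssh-rsa AAAA...carla carla@example.com
{% if inventory_hostname == 'staging' %}
# backup key limited to running a wrapper script, and only deployed on staging
command="/usr/local/bin/rsync_backup_wrapper.sh",no-pty ssh-rsa AAAA...backup backup@example.com
{% endif %}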

Conclusion

We hope this example is helpful. There are some clear directions for improvement and ways to make this suit other scenarios, such as merging the universal keys list with an additional keys list selected by host or by other information such as OS or abstract values associated with the host. One could also envision a system in which some arbitrary collection of key files is merged at the destination server, selected through the aforementioned means or others.

A shout out to Lele Calo for putting together the Ansible setup for this procedure.

Liquid Galaxy at the Economist World Ocean Summit

The Liquid Galaxy with image capture equipment
End Point had the distinct privilege of providing a Liquid Galaxy to the Economist World Ocean Summit held in Half Moon Bay, California last week. Originally conceived as a Google Earth and Google Street View platform, the Liquid Galaxy is rapidly becoming a premier display for ocean content as well. Google is partnering with NGOs and corporations that are actively working toward ocean sustainability. Together, these organizations capture and categorize a great deal of ocean content, with most of that content publicly available via Street View. Some of this equipment was on display next to the Liquid Galaxy: a Trekker camera for remote areas and an underwater camera provided by the Catlin Seaview Survey for capturing high-definition panoramas underwater.

This deployment used a custom configuration with six large HDTVs arrayed in a curved 3x2 panorama. The Liquid Galaxy is able to show this content as a single large surface with geometrically-adjusted angles. In short, it's like a bay window on a submarine. Viewers are able to navigate to different locations through a touchscreen interface, and then rotate and zoom in on detailed panoramic photos with a 6-axis controller.

End Point enjoys a close working relationship with Google, and is happy to provide a Liquid Galaxy for such events. Our engineer Neil assembled a custom frame to hold the 3x2 screen configuration (deployed for the first time here), tested all devices, adjusted viewing angles, and then deployed and supported the full rig onsite at the Ritz-Carlton Hotel in Half Moon Bay.




JavaScript Namespacing with the Rails Asset Pipeline

Since the release of Rails 3 [a while back], I've gotten a lot of use out of the asset pipeline. There can be minor headaches associated with it, but ultimately, the process of combining, minifying, and serving a single gzipped JavaScript and CSS file is a great gain in terms of reducing requests and speeding up your web application. It's a behavior I've wanted to emulate in other platforms that I've used (including Interchange).

One headache that might come up with the asset pipeline is that JavaScript functions from various parts of the application might share a name and override one another; the last defined function is the one that executes. I've come up with a common pattern to avoid this headache, described below:

Set the Body ID

First, I set the body tag in my application layout file to be related to the controller and action parameters. Here's what it looks like:

<body id="<%= "#{params[:controller].gsub(/\//, '_')}_#{params[:action]}" %>">
...
</body>

If I were less lazy, I could create a helper method to spit out the id, as sketched below.
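
For the record, such a helper might look something like this (the method name body_id is purely illustrative):

# app/helpers/application_helper.rb
module ApplicationHelper
  def body_id
    "#{params[:controller].gsub(/\//, '_')}_#{params[:action]}"
  end
end

The body tag then becomes <body id="<%= body_id %>">.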

Shared JS

I create a shared.js file, which contains JavaScript shared across the application. This has the namespace "shared", or "app_name_global", or something that indicates it's global and shared:

var shared = {
};

Namespace JavaScript

Next, I namespace the JavaScript that applies only to a particular controller action's page. I name the namespace to match the body ID, such as:

// for the users edit page
var users_edit = {
...
};
// for the products show page
var products_show = {
...
};

Add initialize method:

Next, I add an initialize method to each namespace, which contains the various listeners applicable to that page only:

// for the users edit page
var users_edit = {
    initialize: function() {
        //listeners, onclicks, etc.
    }, 
    ...
};
// for the products show page
var products_show = {
    initialize: function() {
        //listeners, onclicks, etc.
    }, 
    ...
};

Shared initialize method

Finally, in the shared namespace, I add a method that checks for an initialize method applicable to the current page and executes it:

var shared = {
    run_page_initialize: function() {
        var body_id = $('body').attr('id');
        if(eval(body_id + '.initialize') !== undefined) {
            eval(body_id + '.initialize()');
        }
    }  
};
$(function() {
    shared.run_page_initialize();
});
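
Since each namespace is a global variable, an equivalent lookup can be done through the window object instead of eval; this is just an alternative sketch, not what I actually run:

var shared = {
    run_page_initialize: function() {
        var ns = window[$('body').attr('id')];
        // only run the page's initialize method if the namespace defines one
        if (ns && typeof ns.initialize === 'function') {
            ns.initialize();
        }
    }
};
$(function() {
    shared.run_page_initialize();
});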

Dealing with shared code across multiple actions

In some cases, code might apply to multiple parts of the application, and no one wants to repeat that code! In that case, I set up a single namespace for one of the controller actions and then define another namespace (or variable) pointing to the first one, as shown below.

// for the users edit page
var users_edit = {
    initialize: function() {
        //listeners, onclicks, etc.
    }, 
    ...
};

// reuse the users_edit code for users_show
var users_show = users_edit;

Conclusion

It's pretty simple, but it's a nice little pattern that has helped me keep my organization and namespacing consistent, and it makes for less code repetition when executing the initialization methods for individual page types. Perhaps there are a few more techniques in the Rails space intended to accomplish a similar goal – I'd like to hear about them in the comments!