
A review of The Rails 4 Way

With a brand new project on the horizon (and a good candidate for Ruby on Rails 4 at that), I thought it would be a productive use of my time to bone up on some of the major differences introduced with this new version. My goal was to trade a little research up front for as many avoided that’s-not-how-we-do-it-anymore 500 errors as I could when fleshing out my first scaffold.

With this goal in mind, I purchased a copy of The Rails 4 Way by Obie Fernandez, Kevin Faustino, Vitaly Kushner, and Ari Lerner. Considering the free nature of everything else Rails-related, I will admit to a slight aversion to paying money for a Rails book. For those of you out there with similar proclivities, I felt compelled to share my experience.

The Rails 4 Way presents itself not as a tutorial of Ruby on Rails, but as “a day-to-day reference for the full-time Rails developer.” The tone of the book and the depth of information presented hints that the authors don’t want to teach you how to write Ruby on Rails applications. They want to share insight that will help you write better Ruby on Rails applications. Much of this is accomplished by fleshing out—utilizing a plentiful source of examples and snippets—the methods and features the framework has to offer.

Early on the authors concede the point, “There are definitely sections of the text that experienced Rails developers will gloss over. However, I believe that there is new knowledge and inspiration in every chapter, for all skill levels.” I would classify myself as an experienced Rails developer and I found this statement to be completely true. When settling down to read the book for the first time, I scrolled through the pages until a specific topic caught my eye. As a result, I started my read part-way through chapter 4. Upon completion of the last section, I wrapped back around to the beginning and dutifully read everything, including the foreword. I did this specifically because each of the chapters, if not sections, contained at least one little nugget of information that I found valuable. I felt compelled to go back and look for the extra tidbits that I’d passed over with the initial, casual flicks of my mouse.

I mentioned that the authors focus more on helping you produce better code, and to this end, they don’t dance around any issues. My favorite quote from the book illustrates this, “Frankly, there’s not much of a reason why that should be a problem unless you’ve made some pretty bad data-modeling decisions.” By all means, call it like it is.

The book is still a work in progress. Somewhere around 10% of the sections contain “TBD” rather than a body. I gather that the book is based on its Rails 3 counterpart and some of the sections have yet to be enhanced with Rails-4-specific content. The majority of the Rails 3 content is still applicable, so this shortcoming did not mar my experience at all, especially since I will be given access to the new content once it is added. At the time of my download, the last change was authored 12 days ago, so for that reason as well, I am unconcerned.

In conclusion, I took enough new information away, and enjoyed the process enough that I feel confident in recommending The Rails 4 Way to friends and coworkers. The bulk of this book reads like an API document, and in those sections, there’s nothing technical that you can’t find on the Rails API page itself. What the book does offer is back-story into what these methods were designed to do and what they pair well with. I would compare the Rails API and this book to a map in your hand and a local who knows the area. Most of the time you can figure out how to get where you need to go by carefully reading your unfolded map. But sometimes it’s faster to ask a local who can tell you, “See that red sign off in the distance? Head over to it and take a right.”

How to Enable MySQL Event Scheduler

You may think that you already know what the opposite of "DISABLED" is, but with the MySQL Event Scheduler you would be wrong.

In fact the MySQL Event Scheduler may be in one of three different states[1][2]:

DISABLED -  The Event Scheduler thread does not run [1]. In addition, the Event Scheduler state cannot be changed at runtime.
OFF (default) - The Event Scheduler thread does not run [1]. When the Event Scheduler is OFF it can be started by setting the value of event_scheduler to ON.
ON - The Event Scheduler is started; the event scheduler thread runs and executes all scheduled events.
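
A quick way to check the state, and to start the scheduler when it is merely OFF, is something like this (a minimal sketch; SET GLOBAL requires a privileged account):

mysql> SHOW VARIABLES LIKE 'event_scheduler';
mysql> SET GLOBAL event_scheduler = ON;

To keep it running across restarts, put event_scheduler = ON in the [mysqld] section of your option file; that is also the only way to leave the DISABLED state, since it cannot be changed at runtime.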

So if you find it in the DISABLED state and instinctively set it to ENABLED, you'll end up with a MySQL daemon that refuses to start.
Be warned and stay safe out there!


[1]: http://dev.mysql.com/doc/refman/5.5/en/events-configuration.html
[2]: When the Event Scheduler is not running, it does not appear in the output of SHOW PROCESSLIST

Testing Your Imagination

The usual blog post follows a particular format:

"I learned something new, as part of a task that I succeeded at. Here's what I did, here's why it worked so well, thank you for reading."

This one's a little different. I made a mistake, it seemed like a pretty simple thing, and then it got me thinking about why I (and, in general, we software types) fall into that mistake, and how hard it is to correct.

Here's the background: I was working on a very small bit of code that was supposed to check a ZIP code and do one of two things. The test was, in Perl,

$zip =~ /^94|95/
Screetch!

Perhaps you have already spotted the bug. Don't feel so smug just yet. The particulars here are vital to understanding my point, but the bug could have been something more complex, or even simpler, and I am willing to bet a cubic yard of virtual cash that you have made equally embarrassing errors. I'm just more willing to talk about mine, that's all.

Back to our tale. I wrote that mistaken bit of code (and worse, it's not the first time I've made that mistake in code), and then I proceeded to test it.

  • Set $zip to '12345', doesn't match, no false positive. Check!
  • Set $zip to '94001', does match. Check!
  • Set $zip to '95001', does match. Check!

(If you got this far and still haven't figured out the punch line: the regular expression matches "94" at the beginning of a string, which is good, but it matches "95" anywhere in the string, which is bad. So $zip containing "89501" would match, which is ... not good.)

It doesn't matter if you tested this in an external script, or went through the motions of operating the larger system (an e-commerce checkout) with the appropriate values of $zip, or wrote a full test-driven development exercise – the problem isn't the testing methodology, it's the imagination of the person designing the test. I "knew" (nope) what the regular expression was written to do, so I tested to make sure it did that.

The only ways to catch this particular bug would be (a) exhaustively testing all values from 00000–99999, or (b) imagining ways that the regular expression might be broken. And that's the challenge here, pertaining to my title. How do you rig your imagination to construct test cases that are "outside the box"?

Darned good question. If my brain continues to write

$zip =~ /^94|95/

instead of

$zip =~ /^(?:94|95)/

then that same brain will continue to construct test cases that are "close but not quite". And if your assumptions about your code can be flawed in a simple case, how much more so when you are dealing with 100 lines? 1,000? An arbitrarily large system, or one that had "special" challenges?
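
For what it's worth, a tiny Perl harness makes the difference obvious once you think to feed it a value like 89501 (a sketch written for this post, not the project's actual test code):

use strict;
use warnings;

# Compare the buggy and the fixed pattern against a few ZIP codes;
# 89501 is the kind of value my imagination failed to supply.
for my $zip (qw(12345 94001 95001 89501)) {
    my $buggy = $zip =~ /^94|95/     ? 'match' : 'no match';
    my $fixed = $zip =~ /^(?:94|95)/ ? 'match' : 'no match';
    print "$zip: buggy=$buggy fixed=$fixed\n";
}

The last line of output is the embarrassing one: 89501 matches the first pattern but not the second.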

I don't have an answer here, and I suspect no one does. Certainly it helps if you have a fellow engineer get involved and review your code (and your testing!), but not every project has a budget for that. Certainly it helps if your client gets involved, and can test with their (hopefully!) better understanding of business rules and conditions. (And when I say "test", I mean both "actually operate the system" and "actively contribute to test cases" such as "have you tried 89501? Have you tried A1A 1A1?")

I just know that we have to find a better way to construct test cases than relying on the imagination of the same brain that made the bug in the first place, before we start putting software in charge of important things like this or this. Thank you for raeding. Er, reading.

Using JavaScript in PostgreSQL

This time I will describe two things: installing a new extension using pgxn and using JavaScript for writing a PostgreSQL stored procedure.

The last time I was describing a couple of nice features of the incoming PostgreSQL 9.3, I wrote about merging JSONs in Postgres using a stored procedure written in Python. In one of the comments there was a suggestion that I should try using JavaScript for that, as JSON is much more native there.

So let's try JavaScript with PostgreSQL.

Installing PL/V8

PL/V8 is a PostgreSQL procedural language powered by the V8 JavaScript Engine. With it we get a JavaScript backend inside Postgres, something fun that could be used to build something like a NoSQL database, with JavaScript procedures and JSON storage.

To have this procedural language, you need to install it as a separate extension. This can be done with system packages, if your system provides them. For Ubuntu, which I use, there are packages ready; however, I use PostgreSQL compiled from source and keep it in my local directory, so I had to install it in a slightly different way.

I keep my PostgreSQL installation in the ~/postgres directory. The ~/postgres/bin directory is added to the $PATH environment variable. This is important, as the later steps use the pg_config program, which prints lots of information about the PostgreSQL installation.

The code for PL/V8 can be found on the project's PGXN page. You could of course download the source and install it manually, but there is a much simpler way: PGXN provides a tool for managing the extensions stored there.

To get the client for pgxn, it's enough to write:

$ pip install pgxnclient

To install PL/V8 you also need the developer library for the V8 engine:

$ sudo apt-get install libv8-dev

Now you can install the extension with a simple command:

$ pgxn install plv8

This should download, compile and copy all the files into a proper directory described by the pg_config program.

Use PL/V8

For each database where you want to use PL/V8, you need to create the extension:

plv8=# CREATE EXTENSION plv8;
CREATE EXTENSION

Write Procedure in JavaScript

This is an example I used in the previous post:

WITH j AS (
  SELECT
    '{"a":42, "b":"test"}'::JSON a,
    '{"a":1764, "x":"test x"}'::JSON b
)
SELECT a, b
FROM j;
          a           |            b             
----------------------+--------------------------
 {"a":42, "b":"test"} | {"a":1764, "x":"test x"}

The implementation of the merging function in JavaScript can look like this one:

CREATE OR REPLACE FUNCTION merge_json_v8(left JSON, right JSON)
RETURNS JSON AS $$
  for (var key in right) { left[key] = right[key]; }
  return left;
$$ LANGUAGE plv8;
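
A quick sanity check, assuming the extension and function above installed cleanly:

SELECT merge_json_v8('{"a":1}'::JSON, '{"b":2}'::JSON);
-- should return {"a":1,"b":2}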

You can use it exactly like the previous Python version:

WITH j AS (
  SELECT
    '{"a":42, "b":"test"}'::JSON a,
    '{"a":1764, "x":"test x"}'::JSON b
)
SELECT
  a,
  b,
  merge_json_v8(a, b)
FROM j;

Zero Downtime Deploys with Unicorn

I was recently deploying a new Ruby on Rails application that used NGINX and Unicorn for production. During the deploy, my common practice was to stop the Unicorn processes and then restart them. I would do this by finding the PID (process id) of the running process, stop it using the kill command and then run the unicorn_rails command from my application root directory. This worked well enough that I put together a simple unicorn_init shell script to handle running the commands for me.

After a couple of deploys using this init script, I found that there was a significant interruption caused to the site. This was due to the approximately 20 seconds it took for the Unicorn workers to launch. This was unacceptable, and I started a search for how to perform a zero downtime deploy with Unicorn.

My search led me to the Unicorn Signal Handling documentation. Unicorn makes use of POSIX signals for inter-process communication. You can send a signal to a process using the unfortunately named kill system command. Reading through the different signals and the messages they send to the Unicorn master and workers, I found a better approach to restarting my Unicorn processes that would result in no delay or interruption to the website.

On the Signal Handling page (linked above) is a section called "Procedure to replace a running unicorn executable". The key to my problem lay in this explanation:

You may replace a running instance of Unicorn with a new one without losing any incoming connections. Doing so will reload all of your application code, Unicorn config, Ruby executable, and all libraries.

This was exactly what I needed. I did a quick search online to see if anyone had put together an init script that used this method for restarting Unicorn and found one in a gist on GitHub. With a few modifications, I assembled my new init script:

#!/bin/sh
# based on https://gist.github.com/jaygooby/504875
set -e

sig () {
  test -s "$PID" && kill -$1 `cat "$PID"`
}

oldsig () {
  test -s "$OLD_PID" && kill -$1 `cat "$OLD_PID"`
}

cmd () {
  case $1 in
    start)
      sig 0 && echo >&2 "Already running" && exit 0
      echo "Starting in environment $RAILS_ENV for config $CONFIG"
      $CMD
      ;;
    stop)
      sig QUIT && echo "Stopping" && exit 0
      echo >&2 "Not running"
      ;;
    force-stop)
      sig TERM && echo "Forcing a stop" && exit 0
      echo >&2 "Not running"
      ;;
    restart|reload)
      echo "Restarting with wait 30"
      sig USR2 && sleep 30 && oldsig QUIT && echo "Killing old master" `cat $OLD_PID` && exit 0
      echo >&2 "Couldn't reload, starting '$CMD' instead"
      $CMD
      ;;
    upgrade)
      sig USR2 && echo Upgraded && exit 0
      echo >&2 "Couldn't upgrade, starting '$CMD' instead"
      $CMD
      ;;
    rotate)
      sig USR1 && echo "rotated logs OK" && exit 0
      echo >&2 "Couldn't rotate logs" && exit 1
      ;;
    *)
      echo >&2 "Usage: $0 <start|stop|restart|upgrade|rotate|force-stop>"
      exit 1
      ;;
    esac
}

setup () {
  cd /home/mfarmer/camp2/rails # put your own path here
  export PID=/home/mfarmer/camp2/var/run/unicorn.pid # put your own path to the pid file here
  export OLD_PID="$PID.oldbin"
  export CONFIG=/home/mfarmer/camp2/unicorn/unicorn.conf # put your own path to the unicorn config file here
  export RAILS_ENV=development # Change for use on production or staging

  CMD="bundle exec unicorn_rails -c $CONFIG -E $RAILS_ENV -D"
}

start_stop () {
  setup
  cmd $1
}

ARGS="$1"
start_stop $ARGS

The script is pretty self-explanatory and it supports the major signals outlined in the Unicorn documentation. With this new init script in hand, I now have a better way to restart Unicorn without negatively impacting the user experience.

There are a couple of caveats to this approach that you should be aware of before just slapping it into your system. First, in order to perform the restart, your application essentially needs to run twice until the old master is killed off. This means your hardware should be able to support running two instances of your application, in both CPU and RAM, at least temporarily. Second, you'll notice that the actual command being run looks something like bundle exec unicorn_rails -c /path/to/config/file -E development -D when interpolated by the script. This means that when you perform a rolling restart, those same parameters are used for the new application. So if anything changes in your config file, or if you want to switch environments, you will need to completely stop the Unicorn processes and start them again for those changes to take effect.

Another thing you should be aware of is that your old application will keep running for the 30 seconds it takes the new application to load, so if you perform any database migrations that could break the old version of your application, you may be better served by stopping the Unicorn process, running the migration, and then starting a new process. I'm sure there are ways to mitigate this, but I wanted to mention it here to help you be aware of the issue with this script in particular.
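
For reference, the Unicorn documentation suggests an alternative to the init script's fixed sleep-then-QUIT: let the new master retire the old one from before_fork in the Unicorn config itself. A minimal sketch, with the same placeholder paths used above:

# unicorn.conf -- paths are placeholders
pid "/home/mfarmer/camp2/var/run/unicorn.pid"
worker_processes 2
preload_app true

before_fork do |server, worker|
  # When the USR2-spawned master starts forking workers, ask the old
  # master to quit so the two copies of the app overlap as briefly as possible.
  old_pid = "#{server.config[:pid]}.oldbin"
  if File.exist?(old_pid) && server.pid != old_pid
    begin
      Process.kill(:QUIT, File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # old master already gone
    end
  end
end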

Piggybak Dependency & Demo Updates

Things have been quiet on the Piggybak front lately, but we recently upgraded the demo to Ruby 2.0.0 via rbenv, Rails 3.2.15, and Postgres 9.3. The Piggybak demo runs on Debian 7 with nginx and Unicorn. The upgrade went fairly smoothly, with the exception of jQuery related issues, described below.

As of jQuery 1.7, the live() method is deprecated, replaced with the on() method. As of jQuery 1.10.*, the live() method no longer exists. The previous version of Rails used on the demo, Rails 3.2.12, required a jquery-rails gem version which included an older version of jQuery. Upon upgrading to Rails 3.2.15, the bundled jquery-rails gem now includes jQuery 1.10.*, so the live() method no longer exists. As a result, several of the dependencies needed to be updated to accommodate this change (RailsAdmin, the Piggybak Coupon gem, the Piggybak Gift Cert gem, and jQuery Nivo Slider).
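
The fix in each case was the standard delegation rewrite; the selector below is a made-up example rather than code from any of those gems:

// jQuery < 1.9 style binding, gone as of 1.10:
$('.add-to-cart').live('click', function () { /* add to cart */ });

// The equivalent delegated binding with on():
$(document).on('click', '.add-to-cart', function () { /* add to cart */ });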

What's next for Piggybak? Our future plans include an upgrade to support Rails 4.0. Additional features described on our last Roadmap Update include advanced taxonomy, reviews & ratings, saved cart, wishlist functionality, and saved address support. Piggybak continues to be a great mountable ecommerce solution for Ruby on Rails, but End Point has a great deal of experience with other popular Ruby on Rails ecommerce platforms as well.

Mooving to the Mobile Web

With the rise of the myriad mobile phones, tablets, and other devices connected to the internet, the potential users of a given website have both increased in number and changed in what they need from a user experience. As anyone can attest who has attempted to use a website not designed for a mobile browser, frantically pinch-zooming to find a tiny page control sitting right next to four other controls that do something totally different in a menu, this is really not ideal.

Meanwhile, from the other perspective, web developers face a user base fragmented across everything from a desktop PC with a modern browser to an embedded PC built into a new LCD TV, all needing to view their pages gracefully, and that can be a scary prospect. The thought of maintaining independent versions of your web infrastructure to fit each of these major use cases would likely scare everyone in your company, especially the finance people cutting all the checks.

So what is a company facing this new reality of the modern web to do? One particular solution can help to alleviate one of the more troublesome issues with new devices browsing the Internet: mobile phone display. While the phones themselves are starting to come with resolutions comparable to a modern desktop or laptop PC, the screen size relative to the finger selecting inputs leaves a far less precise pointer than a mouse. This is precisely the problem described earlier, as those page control designs with multiple options in a menu come from the design era when a precise mouse click was the normal input method. Additionally, with the smaller screen size, it is necessary to highlight certain aspects of the page design more prominently, like images of featured products, to keep the same level of emphasis on the user who will view them.

One firm implementing a version of this solution, which I recently helped deploy for a client, is Moovweb. For web developers, this technology makes page controls more usable on mobile phones and makes featured elements of the page layout stand out more effectively in a mobile browser, without actually maintaining a separate, mobile-optimized version of the site. Moovweb makes a request to your site, optimizes the content for mobile behind the scenes, and then sends that optimized response back to the user's device. In this manner, the mobile pages are always updated automatically, based on what is present on your current live site. Which page elements are selected for display on the mobile page, and how they are displayed, are all configurable options within Moovweb's controls, and you can also use a number of pre-built templates based on common web software packages.

Technical Details

How does Moovweb accomplish this sleight of hand, tailoring the response based on the device requesting the information? The secret lies in both JavaScript and DNS. First, in order to set up your domain for Moovweb, you need to create a sub-domain that requests from mobile devices will be forwarded to, which actually points to Moovweb with a CNAME record. Here is an example from the setup documentation:

If your domain was example.com, and the mobile sub-domain you had selected was m.example.com you would create a CNAME record for:

m.example.com.     IN     CNAME     m.example.com.moovdns.net.

This points any request to m.example.com over to the Moovweb servers, which then send the response back to the user's mobile browser once the template has been applied to the page design and the mobile version of the site has been crafted.

For the JavaScript setup, a <script> tag must be added to each page's <head> in order to redirect requests from mobile browsers. This script is created for each customer by Moovweb and matches the User-Agent of each request against a list of known mobile browsers. With this in place on every page, whether the user is loading the main page or a deeper link such as a product or category page, every request from a mobile browser should be automatically redirected to the mobile domain we set up the CNAME record for.
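
Conceptually, the generated snippet behaves something like the following sketch; this is not Moovweb's actual code, and the pattern list and domain are placeholders:

<script>
  (function () {
    // Redirect recognized mobile browsers to the m. sub-domain,
    // preserving the requested path and query string.
    var isMobile = /iPhone|iPod|Android|BlackBerry|IEMobile/i.test(navigator.userAgent);
    if (isMobile && window.location.hostname === 'example.com') {
      window.location.replace('http://m.example.com' +
        window.location.pathname + window.location.search);
    }
  })();
</script>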

How it looks

When deploying Moovweb for a client, tigerdistrict.com, I was introduced to the technology for the first time, and I was impressed with the mobile site experience. The page controls were modified to make them easier to tap with a finger, and the page layout was made more portrait-oriented, which fits the mobile phone form factor. Here are some examples of the mobile and non-mobile site:
Desktop Version

Mobile Version
One of my favorite features was how Moovweb handles page navigation menus, like the one you see in the left margin of the page in the desktop version. On a mobile device, getting a precise enough tap to select only the correct one of those options, without mistakenly hitting others, would be painful to say the least. After the site has been converted to the mobile version, however, two new page elements are added to the bar at the top of the page: a cart icon representing the all-important shopping cart, and one of the legendary "hamburger button" controls that opens up the page navigation menu. Here is what it looks like in the mobile browser:
No Dialing Wand Needed
As you can see, it replicates the same menu tree, with the same ability to expand into any of the available categories, or use the text search, all within an interface that is easy to use within a mobile browser.

The Future of the Web - Devices, Devices, Devices

One thing is clear from the rise of mobile devices and tablets on the web over the past few years: these devices are here to stay, and if anything they will continue to grow and dominate the marketplace. Developers seeking to harness, or even just stay ahead of, this trend will need to address the problems of mobile browsers and the limitations of the devices themselves.

Creating websites that provide a desirable user experience on these many flavors of devices can be a daunting challenge, but in a way this shows us that the true issue is not the fragmentation itself. The real benefit of this type of mobile web development is that it takes advantage of your existing infrastructure and gives you a method to tailor it as best you can to fit each of these new devices' abilities. Duplicating effort is wasteful, and maintaining multiple versions of the same content with slightly different presentation is an example of this wasted effort. Reusing the web infrastructure you already have, and are already investing in maintaining, lets you present these multiple user experiences in a cost-effective way.

Copying Rows Between PostgreSQL Databases

A recurring question is: “How can I copy a couple of rows from one database to another?” People try to set up replication or dump the entire database, but the solution is pretty simple.

Example

For this blog post I will create two similar tables and copy data from one to the other. They are in the same database, but in fact that doesn’t matter; you can use this example to copy to another database as well, even on another server. It’s enough to change the arguments of the psql commands.

The tables are:

test=# CREATE TABLE original_table (i INTEGER, t TEXT);
CREATE TABLE
test=# CREATE TABLE copy_table (i INTEGER, t TEXT);
CREATE TABLE

Now I will insert two rows, which I will copy later to the “copy_table”.

test=# INSERT INTO original_table(i, t) VALUES
  (1, 'Lorem ipsum dolor sit amet'),
  (2, 'consectetur adipiscing elit');
INSERT 0 2

test=# SELECT * FROM original_table ;
 i |              t              
---+-----------------------------
 1 | Lorem ipsum dolor sit amet
 2 | consectetur adipiscing elit
(2 rows)

test=# SELECT * FROM copy_table;
 i | t 
---+---
(0 rows)

The Solution

Of course I could set up replication, but that is too much effort for copying two rows ad hoc. Of course I could dump the entire database, but imagine a database with millions of rows when you just want to copy those two.

Fortunately there is the COPY command. However, plain COPY saves the file on the server where PostgreSQL is running, and usually you don’t have access to that file system. There is another command, internal to psql, named “\copy”. It behaves exactly like COPY, but it writes files on the machine where you run psql.

Save To File

The first and simplest solution is to save those two rows into a file and load it later into the other database.

First, let’s find out how “\copy” works:

$ psql test -c \
"\copy (SELECT i, t FROM original_table ORDER BY i) TO STDOUT"

1 Lorem ipsum dolor sit amet
2 consectetur adipiscing elit

As you can see, the main part of this command is the SELECT query, which lets us choose the rows we want to export. You can add any WHERE clause you want there.

So now we can save it to a file:

$ psql test -c \
"\copy (SELECT i, t FROM original_table ORDER BY i) TO STDOUT" > /tmp/f.tsv

Loading now is also pretty easy with the same “\copy” command.

psql test -c "\copy copy_table (i, t) FROM STDIN" < /tmp/f.tsv

Don’t Save to File

Saving to a file has one drawback: if the amount of data is huge, the file will be huge as well, it will waste disk space, and it can be slower than loading the data through a pipe. You can use a pipe to join the output of one psql command with the input of another. This is as simple as:

psql test -c \
"\copy (SELECT i, t FROM original_table ORDER BY i) TO STDOUT" | \
psql test -c "\copy copy_table (i, t) FROM STDIN"

test=# SELECT * FROM copy_table ;
 i |              t              
---+-----------------------------
 1 | Lorem ipsum dolor sit amet
 2 | consectetur adipiscing elit
(2 rows)

As you can see, that’s much simpler than setting up replication or dumping whole database.

SELinux fix for sudo PAM audit_log_acct_message() failed

I was just reading my co-worker Lele's blog post about making SELinux dontaudit AVC denial messages visible and realized it was likely the solution to a mystery I ran into a few days ago.

As Lele explains, the SELinux dontaudit flag suppresses certain very common SELinux AVC denials to keep the audit logs from bloating beyond belief and being too hard to use. But sometimes a commonly harmless denial can be the cause of further errors. You can tell this is the case if temporarily disabling SELinux enforcing (setenforce 0) makes the problem go away, but /var/log/audit/audit.log still doesn't show any AVC denial actions being allowed through.

In my somewhat unusual case there is an Apache CGI shell script that calls sudo to invoke another program as a different user without using setuid or suEXEC. Everything works fine with SELinux enforcing, but there are some strange errors in the logs. In /var/log/secure:

sudo: PAM audit_log_acct_message() failed: Permission denied

And in the Apache error_log is the apparently strangely unbuffered output:

[error] sudo
[error] : 
[error] unable to send audit message
[error] : 
[error] Permission denied
[error]

To show the dontaudit AVC denials, I ran semodule -DB as Lele explained, and then I saw in /var/log/audit/audit.log:

type=AVC msg=audit(1384959223.974:4192): avc:  denied  { write } for  pid=14836 comm="sudo" scontext=system_u:system_r:httpd_sys_script_t:s0 tcontext=system_u:system_r:httpd_sys_script_t:s0 tclass=netlink_audit_socket
type=AVC msg=audit(1384959223.975:4194): avc:  denied  { read } for  pid=14836 comm="sudo" scontext=system_u:system_r:httpd_sys_script_t:s0 tcontext=system_u:system_r:httpd_sys_script_t:s0 tclass=netlink_audit_socket
type=AVC msg=audit(1384959223.999:4196): avc:  denied  { write } for  pid=14836 comm="sudo" scontext=system_u:system_r:httpd_sys_script_t:s0 tcontext=system_u:system_r:httpd_sys_script_t:s0 tclass=key

Somewhat as expected from the error messages we saw before, sudo is being denied permission to send a message into the kernel. Now that we have the AVC errors from the audit log it's easy to make a local SELinux policy module to allow this.
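
The usual route, and the one I would assume here, is audit2allow; the module name below is arbitrary:

grep sudo /var/log/audit/audit.log | audit2allow -M httpdscriptsudo
semodule -i httpdscriptsudo.pp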

The spurious error messages go away, and we can run semodule -B to re-suppress the dontaudit messages.

Thanks, Lele. :)

Asynchronous Page Switches with Django

Now that the newly rebuilt http://www.endpoint.com/ website is up and running, you may have noticed it does something fancy: internal links within the site are fetched in the background, and the page is replaced dynamically with a script. That eliminates the 'flicker' of normal website navigation, and removes the need for the browser to re-parse CSS and JavaScript, making it feel more responsive.

Recently I did some work on a Django project that uses jQuery for some AJAX calls to send information back to the database. It was a fairly simple $.post() call, but it got me thinking about Django's template inheritance and how it could be used to render parts of templates and update those client-side without having to render the whole thing. The idea being, if your base template is complex and has a number of built-in queries or calculations, for example if you have a dynamic navigation menu, why put extra load on Postgres, on the web server, or have the browser reload the CSS, JS, images, or other resources, to load in what could be otherwise static data into a content column?

The idea's a little half-baked, just the result of a little after-hours tinkering over a couple evenings. Certainly hasn't been fleshed out in a production environment, but seems to work okay in a test. I probably won't develop it much more, but maybe the concepts will help someone looking for something similar.

There are a few options out there, apparently, between django-ajax-blocks (which seems to do something similar to what I'm about to describe) and Tastypie, which lets you easily build REST-style APIs. Django's usually pretty good about that, having projects available that build functionality on top of it. But not having researched those at the time, I put together this basic technique for doing the same:

  1. Create a replacement template render function that detects AJAX-y requests.
  2. Update your page templates to use a variable in {% extends %}.*
  3. Create a simple XML base template with the blocks you use in your normal base template.
  4. Add a little JavaScript to perform the content switch.
* Yes, this works, which was a bit of a surprise to me. It's also the part I'm least happy with. More on that in a bit.

The details...

Template Render Function

This takes after the handy django.shortcuts.render function. In fact, it leans on it fairly heavily.

from django.shortcuts import render

def my_render(request, template, context={}, html_base='main.html', xml_base='main.xml'):
    if request.is_ajax():
        context['base_template'] = xml_base
        return render(request, template, context, content_type='application/xml')
    else:
        context['base_template'] = html_base
        return render(request, template, context)

Giving html_base and xml_base as parameters lets views override those. This then injects a new variable, base_template, into the context passed to it with the appropriate base template.
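
A hypothetical view using it looks no different from one using render; the template name and context here are made up:

# views.py, assuming my_render() from above is importable here
def about(request):
    # about.html starts with {% extends base_template %} and overrides
    # the title and main_content blocks.
    return my_render(request, 'about.html', {'title': 'About us'})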

Update Page Templates

In your page templates, where the top currently says {% extends 'main.html' %}, replace that with {% extends base_template %}. You shouldn't have to make any other changes to them.

But again, this is the bit I'm least happy about. It takes the selection out of the page template, and puts it in code. That takes away some of the decoupling an MVC environment like this is supposed to provide. Haven't come up with a way around it, though.

Create XML Base Template

In templates/main.xml (or whatever you want to call it above) create XML nodes for the blocks in your main.html file. Or at least the blocks your pages will replace:

<?xml version="1.0" encoding="UTF-8"?>
<content>
 <title><![CDATA[{% block title %}Django Site!{% endblock %}]]></title>
 <main_content><![CDATA[{% block main_content %}{% endblock %}]]></main_content>
</content>

Like your main.html, you can have defaults for the blocks here, such as a default title.

Why XML? I'd originally envisioned using JSON, but it has escaping rules, of course. Django, so far as I'm aware, will always drop the contents of a block into place verbatim, without an opportunity to escape it into a JSON string. That's where XML's CDATA construct came in handy, allowing a segment of HTML to be put right in place. Assuming, of course, "]]>" doesn't appear in your HTML.

JavaScript Page Switches

That takes care of the back-end Django bit. The last bit involves the front end JavaScript that takes care of the page switch/content replacement. This example leans on jQuery fairly heavily. In essence we'll: A) take over the link's click event, B) send off an AJAX-type request for the same href, C) set up a callback to do the actual content switch. Or, to put it another way:

$('a.ajax').click(function (e) {
  // Suppress the default navigate event
  e.preventDefault();
  // Instead, do the GET in the background
  $.get(this.href).done(function (response) {
    // The XML is automatically parsed and can be traversed in script
    var contentNodes = response.firstChild.childNodes;
    for (var i = 0; i < contentNodes.length; i++) {
      // Ignore any textNodes or other non-elements
      if (contentNodes[i].nodeType != 1) continue;

      // Handle each XML element appropriately:
      if (contentNodes[i].nodeName == 'title')
        document.title = contentNodes[i].firstChild.nodeValue;
      if (contentNodes[i].nodeName == 'main_content')
        $('#main_content').html(contentNodes[i].firstChild.nodeValue);
    }
  });
});

JavaScript, I'll admit, isn't a language I work in all that often. There's probably a better way of parsing and handling that, but that seemed to work okay in testing. And, of course, it's fairly bare bones as far as functionality. But it shouldn't be difficult to add in a spinner, error handling, etc.

Anyway, it was fun. Even the slightly frustrating update-JS/refresh/try-again cycle. Again it's still fairly rough, and quite untested at any scale. But maybe the idea will help someone out there.

Pagination days are over? Infinite scrolling technique

Love it or hate it, the infinite scrolling technique has become a big part of UX. Google Images uses it, Flickr uses it, Beyonce's official website uses it. Twitter, Tumblr, and the Facebook feed have it as well. The technique allows users to seamlessly scroll through content: when the user reaches the end of the page, new content automatically loads at the bottom.

In my opinion, it allows a much more natural and immersive experience while viewing images or articles, much better than pagination or the once-popular image slideshow galleries. In real life you don't click on pagination links to get through your day, right?

To create the infinite scrolling page with images we will use the jQuery Infinite Scroll plugin and Masonry to lay out the images. Here is the demo of what we are going to accomplish. The code is below.

First step is to include the necessary scripts:
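
In a Rails app that would be something like the line below; the file names are placeholders for wherever you vendored jQuery, Infinite Scroll, and Masonry in app/assets/javascripts:

<%= javascript_include_tag 'jquery', 'jquery.masonry.min', 'jquery.infinitescroll.min' %>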




Add the container element. The navigation element is a very important part: it triggers the loading of the subsequent page. After the second page, the page number is automatically incremented to fetch the subsequent pages:

<% @images.each do |i| %>
<% end %>
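
Fleshed out, that fragment might look like the following sketch; the image attribute and next-page URL are made up, while the IDs and classes match the selectors passed to the plugins below:

<div id="container">
  <% @images.each do |i| %>
    <div class="item">
      <%= image_tag i.url %>
    </div>
  <% end %>
</div>

<div id="page-nav">
  <a href="/images/list?p=2">Next page</a>
</div>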

Ready to init the infinite scrolling script:

$(function(){
    var container = $('#container');
    container.masonry({
       itemSelector: '.item'
     });

    container.infinitescroll({
      navSelector  : '#page-nav', 
      nextSelector : '#page-nav a',
      itemSelector : '.item',
      loading: {
          finishedMsg: 'No more pages to load.',
          img: 'http://i.imgur.com/6RMhx.gif'
        }
      },

      function( newElements ) {
        var newElems = $( newElements ).css({ opacity: 0 });
        newElems.animate({ opacity: 1 });
        container.masonry( 'appended', newElems, true );
      }
    );
  });

Once the user scrolls down to the bottom of the first page, the script will fetch the second page and filter out the new elements by the item selector. The controller action will look like this:

def list
  @images = Image.paginate(:page => params[:p] || 1, :per_page => 25)
end

Finally, we will add the styles to create a three-column layout and some nice animations to render the newly loaded items:

#container .item {
  width: 33%;
}
.transitions-enabled.masonry .masonry-brick {
  transition-property: left, right, top;
}
.transitions-enabled.masonry, .transitions-enabled.masonry .masonry-brick {
  transition-duration: 0.7s;
}

Post Login Action in Interchange

A while back, I sent a request to a few coworkers looking for a post login hook in Interchange, meaning that I'd like to execute some code after a user logs in that would not require modifying the core Interchange code. This was prompted by the need to transfer and create database records of uploaded images (uploaded while not logged in) to be tied to a specific user after they log in. Mark found a simple and elegant solution and I've described it below.

postlogin_action

The first step to adding a post login hook or method is to add the following to catalog.cfg:

UserDB  default  postlogin_action  transfer_user_images

The above code results in a call to the catalog or global sub transfer_user_images after a user logs in.

Defining the Global Sub

Next, the sub needs to be defined. In our code, this looks like:

# custom/GlobalSub/transfer_user_images.sub
GlobalSub transfer_user_images IC::GlobalSubs::transfer_user_images
# custom/lib/IC/GlobalSubs.pm
sub transfer_user_images {
  #code here
}

In the above example, the GlobalSub directive points transfer_user_images at a subroutine in a Perl module that contains all of our custom global subroutines. That GlobalSubs Perl module contains the code executed upon login.

Add code!

After the simple steps above, code can be added inside the GlobalSub transfer_user_images subroutine. For this particular method, the simplified pseudocode looks something like:

sub transfer_user_images {
  # Create database connection
  # Foreach image stored in the session
  #   Move image to user specific location
  #   Record image to database, tied to user
  # Delete session images variable
}
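
For the curious, a hypothetical fleshing-out of that pseudocode might look like the sketch below; the session key, table, DSN, and paths are placeholders rather than the real implementation:

use strict;
use warnings;
use DBI;
use File::Copy qw(move);
use File::Path qw(make_path);

sub transfer_user_images {
  my $username = $Vend::Session->{username};              # Interchange session data
  my $images   = $Vend::Session->{uploaded_images} || [];  # placeholder session key

  # Create database connection (placeholder DSN and credentials)
  my $dbh = DBI->connect('dbi:Pg:dbname=store', 'ic', 'secret',
                         { RaiseError => 1, AutoCommit => 1 });

  for my $image (@$images) {
    # Move image to a user-specific location
    my $dir = "/var/www/images/users/$username";
    make_path($dir);
    my $target = "$dir/$image->{filename}";
    move($image->{tmp_path}, $target) or die "move failed: $!";

    # Record image in the database, tied to the user
    $dbh->do('INSERT INTO user_images (username, path) VALUES (?, ?)',
             undef, $username, $target);
  }

  # Delete the session images variable
  delete $Vend::Session->{uploaded_images};
  return 1;
}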

Internal Tidbits: Links, Resources, Tools

Here at End Point, we have a broad range and depth of knowledge in many areas of web development, both server side and client side. This shows itself in the form of many internal emails. Whenever I get an internal email with some tidbit of information I'd like to read later, I file it in my Internal folder. That folder is overflowing now, and I wanted to take some time to clean it out and share the contents in blog form.

My Internal folder is now clean and I'm ready to hear about more great tips & tools from my coworkers.

Liquid Galaxy and its Very Own Street View App

Liquid Galaxy does Street View!

Peruse-a-Rue is the combination of a Node.js server with a Maps API browser client, all wrapped up in one neat bundle. The result of this marriage is a highly compelling immersive Street View experience.

Everything from a single screen kiosk to a cylindrical Liquid Galaxy to an endless display wall can be configured, with bezel offsets, portrait or landscape. A touchscreen control interface is optional, and a Space Navigator can drive the display.


Testing Peruse-a-Rue on the desktop


By leveraging the Connect framework for Node, the entire application is served on a single port. Any number of browser windows can be synchronized, thanks to the scalability of websockets. When integrated with the Squid caching of the Liquid Galaxy project, redundant downloading is eliminated; each screen shares retrieved tile data with its peers.


The Peruse-a-Rue touchscreen interface


Since NPM installs dependencies automatically, deployment is a breeze. Every Liquid Galaxy is a git checkout and an `npm install` away from running the server. Peruse-a-Rue supports any operating system that can run Node.js (as a server) or Google Chrome (as a client). I've even tested the server on a Raspberry Pi and BeagleBone Black, and it runs perfectly!


It runs on this thing, too (a Raspberry Pi)


Peruse-a-Rue is hosted at the Google Liquid Galaxy project site. If you're interested in the project or want to contribute, drop us a line.

Happy Perusing!

How to Dynamically Update A Spree Product's Price Based on Volume Pricing

I was recently working on a Spree Commerce site that utilizes Spree's Volume Pricing extension. For those who may not be familiar, the Spree Commerce Volume Pricing extension allows a user to offer a variety of 'price ranges'. These price ranges represent discounted prices per unit for larger quantity orders. For example (we will use this t-shirt pricing table for the remainder of the post) from the Spree Volume Pricing Github

   Variant                Name               Range        Amount         Position
   -------------------------------------------------------------------------------
   Rails T-Shirt          1-5                (1..5)       19.99          1
   Rails T-Shirt          6-9                (6...10)     18.99          2
   Rails T-Shirt          10 or more         (10+)        17.99          3

I would like to mention that these ranges, although they resemble traditional ranges, are expressed as Strings; this will become important later. Again, from the Spree Volume Pricing project page at GitHub:

"All ranges need to be expressed as Strings and must include parentheses. '(1..10)' is considered to be a valid range. '1..10' is not considered to be a valid range (missing the parentheses.)"

Now that the intent of Volume Pricing has been discussed, I would like to bring your attention to what is likely a very common use case. Often on an e-commerce website, when placing an order for an item, the price and quantity are displayed like so,

which is generated from relatively typical Spree models and functions in ERB and the HTML5 number input:

#price per shirt
  <%= display_price(@product) %> per shirt

  #number field
  <%= number_field_tag (@product.variants_and_option_values.any? ? :quantity : "variants[#{@product.master.id}]"),
    1, :class => 'title', :min => 1 %>
  <%= button_tag :class => 'large primary', :id => 'add-to-cart-button', :type => :submit do %>
    <%= Spree.t(:add_to_cart) %>
  <% end %>

Without any additional coding, when a customer increases their order quantity into the next range, the price per unit (shirt) should be decremented as noted in the table above. However, as we can see here, rather than the price being lowered to 18.99 per shirt, it continues to indicate 19.99 even though volume pricing has taken effect and the shirts are actually 18.99 each.

So, how would one accomplish this? JavaScript is the first thing that comes to most people's minds. I spent some time looking around the Spree docs thinking that certainly there must be something quick to drop in, but there was not. I did a little Googling and found the same thing to be true: not much info out there on how best to proceed with this task. I was very surprised to find no notes from anyone doing what I would think is a very common thing. So, here we are, and I hope that anyone who reads this finds it helpful.

Step 1a: Create an array of the prices

This is the most challenging part of the task. After discussing the issue with some colleagues I believed the easiest method was to create an array of all the possible volume prices. Then, this price could be referenced by just taking the selected quantity of an order, subtracting 1 (to account for the zero-indexing of arrays) and getting the value of the complete volume price array via that index. In this example, using the data from the table above, the array would look like this:

[19.99, 19.99, 19.99, 19.99, 19.99, 18.99, 18.99, 18.99, 18.99]

In case that isn't clear from above: volume price (1..5) is 19.99, so the first 5 items in the array are 19.99. The 18.99 volume price is in effect for the (6...10) range, so the 6th through 9th items in the array are 18.99. If a user were to indicate a quantity of 5, 5 - 1 = 4, and index 4 of the array is 19.99, the correct price for five shirts. Note that for now I've left off the (10+) range and its associated pricing; the reason will be clear in a few moments.

Alright, so now on to how to create this array. Those of you who are familiar with Spree know that we use a Spree model decorator, in this case the Product decorator, which should be created in app/models/product_decorator.rb:

Spree::Product.class_eval do

  def all_prices
    price_ranges = Spree::Variant.where(product_id: self.id).first.volume_prices[0...-1].map(&:range)
    volume_prices = Spree::Variant.where(product_id: self.id).first.volume_prices[0...-1].map(&:amount).map(&:to_f)
    price_ranges.map(&:to_range).map{|v| v.map{|i| volume_prices[price_ranges.map(&:to_range).index(v)]}}.flatten
  end

end

Step 1b: Create to_range function for Strings & create a function to return lowest possible price per unit

Now, you may note the to_range call above. As mentioned in this post and in the Volume Pricing docs, Spree expresses these ranges as Strings and not true Ranges, so I used this to_range method in lib/string.rb to convert the String ranges into true Ranges, which I found in the "Master Ruby/Rails Programming" post at the athikunte blog. I would also like to draw your attention to the fact that I am taking all but the last item of the volume prices array ([0...-1]). Why? Because '10+' will not be converted into a range, and any quantity of 10 or greater can just get the lowest volume price. Perhaps most importantly, if one product's last range is 10+ while another's is, say, 25+, this method of obtaining the lowest discounted price avoids any problems related to that variance. In lib/string.rb,

class String
  def to_range
    case self.count('.')
      when 2
        elements = self.split('..')
        return Range.new(elements[0].to_i, elements[1].to_i)
      when 3
        elements = self.split('...')
        return Range.new(elements[0].to_i, elements[1].to_i-1)
      else
        raise ArgumentError.new("Couldn't convert to Range: #{self}")
    end
  end
end

app/models/product_decorator.rb

def lowest_discounted_volume_price
  Spree::Variant.where(product_id: self.id).first.volume_prices[-1].amount.to_f
end

Step 2: Load Your New Volume Pricing Array and Lowest Possible Price

I did this by creating some script tags on the product show page (or wherever you wish to show the price per unit) to make the data from the backend available to a JavaScript file that will update the price dynamically as a user adds to or subtracts from the desired quantity. I just called the functions I created in the product decorator and stored the results in variables for the JavaScript file, in app/views/product_show_page.html.erb:

<script>
  var all_prices = <%= @product.all_prices %>;
  var lowest_discounted_volume_price = <%= @product.lowest_discounted_volume_price %>;
</script>

Step 3: Write JavaScript Code to Handle Quantity Changes

In your Spree app, just follow typical Rails protocol and create a new JavaScript file in app/assets/javascripts/volume_pricing.js and, of course, require it in your manifest file. Here, just plug your variables in and update your view in the change event handler (I also added keyup so the price changes if/when a user types in a new quantity):

$(function() {
  $('.title').on('keyup change', function(e){
    var qty = parseInt( $(this).val());
    var prices_array = all_prices;
    var per_shirt = ' per shirt'
    if (qty <= prices_array.length)
      {
        $('span.price.selling').text('$'+prices_array[qty -1] + per_shirt);
      }
    else
      {
        $('span.price.selling').text('$'+lowest_discounted_volume_price + per_shirt);
      }
  });
});

And now you have a dynamically updating price based on the selected quantity! I hope you have found this informative and useful; thank you for reading.

Install Pentaho BI Server 5 Enterprise Edition with PostgreSQL repository

Pentaho provides different ways to install the Pentaho BI server, each with its own flexibility:
1. Installer mode - Easy to install the BA & DI server & tools in one flow with a default PostgreSQL repository & default Tomcat server. (A new Postgres instance is installed on port 5432.)
2. Archive mode - BA server installed with your own BA repository & the default Tomcat server. Necessary tools need to be installed manually.
3. Manual mode - BA server installed with your own BA repository & your own application server (Tomcat or JBoss). Necessary tools need to be installed manually.

We have a Postgres instance running on our server and are happy with Tomcat as the application server, so Archive mode installation is suitable for us. Pentaho requires two things to be installed before starting the installation:
  • Java 7
  • PostgreSQL
Archive mode installation files are accessible only to users who have purchased a license. Download biserver-ee-5.x-dist.zip from the Pentaho customer portal with your account credentials here: https://support.pentaho.com/forums/20413716-Downloads

Unzip the archive file and you can see the installation files inside the extracted directory:
$ unzip biserver-ee-5.x-dist.zip
$ cd biserver-ee-5.x; ls
install.bat  installer.jar  install.sh  license.txt  README.txt

On remote servers Pentaho can be installed in console mode with '-console'. Accept the license and provide the installation path to install the Pentaho BI server.
$ java -jar installer.jar -console

Find the biserver-ee directory under the installation path and make the .sh files executable:
$ cd biserver-ee;
$ find . -type f -iname '*.sh' -exec chmod a+x {} \;
Let's create the repository databases by running the SQL files located at biserver-ee/data/postgresql.
The quartz, hibernate, and jackrabbit databases will be created by executing these SQL files. Database names, usernames, and passwords can be changed by modifying the SQL files if required.
$ cd biserver-ee/data/postgresql
$ psql -U postgres -p 5432 -a -f create_jcr_postgresql.sql
$ psql -U postgres -p 5432 -a -f create_quartz_postgresql.sql
$ psql -U postgres -p 5432 -a -f create_repository_postgresql.sql
$ psql -U postgres -p 5432 -a -f pentaho_mart_postgresql.sql
Pentaho uses PostgreSQL as its default database and the files are already configured for PostgreSQL, so just verify that the database name, username, and password match the installed PostgreSQL instance and the databases just created, in the following files:

- biserver-ee/pentaho-solutions/system/quartz/quartz.properties
org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
- biserver-ee/pentaho-solutions/system/hibernate/hibernate-settings.xml
    <config-file>system/hibernate/postgresql.hibernate.cfg.xml</config-file>
- biserver-ee/pentaho-solutions/system/hibernate/postgresql.hibernate.cfg.xml
    <property name="connection.driver_class">org.postgresql.Driver</property>
    <property name="connection.url">jdbc:postgresql://localhost:5432/hibernate</property>
    <property name="dialect">org.hibernate.dialect.PostgreSQLDialect</property>
    <property name="connection.username">pentaho_user</property>
    <property name="connection.password">password</property>
There are more occurrences in this file; carefully make the necessary changes in all the places.
- biserver-ee/pentaho-solutions/system/jackrabbit/repository.xml
- biserver-ee/pentaho-solutions/system/jackrabbit/repository/workspaces/default/workspace.xml

- biserver-ee/tomcat/webapps/pentaho/META-INF/context.xml
<Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource"
            factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
            maxWait="10000" username="pentaho_user" password="password"
            driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/hibernate"
            validationQuery="select 1"/>

Download the postgresql and h2 drivers, then place them under biserver-ee/tomcat/lib:
postgresql-9.x.jdbc4.jar
h2-1.2.x.jar

Tomcat's port can be changed in two files to run Pentaho on a different port (see below).
Specify the Pentaho solutions path, server URL, and port in the Tomcat webapp's web.xml:

biserver-ee/tomcat/webapps/pentaho/WEB-INF/web.xml
        <context-param>
                <param-name>solution-path</param-name>
                <param-value>$INSTALLATION_PATH/biserver-ee/pentaho-solutions</param-value>
        </context-param>

        <context-param>
                <param-name>fully-qualified-server-url</param-name>
                <param-value>http://localhost:8080/pentaho/</param-value>
        </context-param>

Pentaho can be configured to run on a different port by changing the port in Tomcat's server.xml and in web.xml:
- biserver-ee/tomcat/conf/server.xml
    <Connector URIEncoding="UTF-8" port="9090" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" />
- biserver-ee/tomcat/webapps/pentaho/WEB-INF/web.xml
         <context-param>
                <param-name>fully-qualified-server-url</param-name>
                <param-value>http://localhost:9090/pentaho/</param-value>
        </context-param>

Install license

A license needs to be installed to use Pentaho. Navigate to the license-installer directory in the installation path and feed the license file(s) to install_license.sh, separating multiple license files with spaces:
$ ./install_license.sh install ../license/Pentaho\ BI\ Platform\ Enterprise\ Edition.lic
Install plugins

Archive mode installs only the BI Server; the necessary plugins have to be installed manually. Here we install the plugins for reporting, analyzer, and dashboards. The plugins are available in the same place where you downloaded the BI Server. Download these three files and place them anywhere on the server:
Reporting - pir-plugin-ee-5.x-dist.zip
Analyzer - pdd-plugin-ee-5.0.0.1-dist.zip
Dashboard - paz-plugin-ee-5.0.0.1-dist.zip

All the plugins are installed through the same procedure:
- Unzip the plugin and navigate to the extracted directory
- Run the installer in console mode, accept the license, and provide $INSTALLATION_PATH/biserver-ee/pentaho-solutions/system as the location to install the plugin
$ java -jar installer.jar -console
Let's start the BI server:
biserver-ee$ ./start-pentaho.sh
Install the licenses for the plugins by logging in as the admin user (default - Admin:password), or install them through the command line:
Administration -> Licenses -> install licenses for the plugins by clicking +

Troubleshooting:
biserver-ee$ tail -f tomcat/logs/catalina.out
biserver-ee$ tail -f tomcat/logs/pentaho.log
If pentaho.xml is present in the tomcat/conf/Catalina directory, delete it. It will be generated again when you start the BA Server.

Start and stop the BI server:
biserver-ee$ ./start-pentaho.sh
biserver-ee$ ./stop-pentaho.sh

Install Pentaho BI Server 4.8 Community Edition with PostgreSQL Repository

The Pentaho BI Server Community Edition can be installed from an archive file available on SourceForge.
Prerequisites
  • Java 6
  • PostgreSQL
Download the Pentaho BI Server installation file (biserver-ce-4.8.0-stable.zip) from SourceForge:
http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/4.8.0-stable

Unzip the archive file, then navigate into biserver-ce and set the .sh files to executable mode:
$ unzip biserver-ce-4.8.0-stable.zip
$ cd biserver-ce
$ find . -type f -iname '*.sh' -exec chmod a+x {} \;
Pentaho Community Edition uses the HSQL database by default, so two databases need to be created in Postgres for Pentaho. The SQL files to create them are under biserver-ce/data/postgresql; the database names, user names, and passwords are configurable through these SQL files. Two errors have to be fixed before creating the databases: comment out the offending lines in the files below, as shown in the commented lines.
- create_quartz_postgresql.sql
ALTER TABLE qrtz_fired_triggers
    ALTER TRIGGER_NAME  TYPE VARCHAR(200),
    ALTER TRIGGER_GROUP TYPE VARCHAR(200),
    ALTER INSTANCE_NAME TYPE VARCHAR(200),
    ALTER JOB_NAME      TYPE VARCHAR(200),
    ALTER JOB_GROUP     TYPE VARCHAR(200),
    ADD COLUMN PRIORITY INTEGER NULL;    
--    ADD COLUMN PRIORITY INTEGER NOT NULL;
- migrate_quartz_postgresql.sql
ALTER TABLE qrtz_fired_triggers
    ALTER TRIGGER_NAME  TYPE VARCHAR(200),
    ALTER TRIGGER_GROUP TYPE VARCHAR(200),
    ALTER INSTANCE_NAME TYPE VARCHAR(200),
    ALTER JOB_NAME      TYPE VARCHAR(200),
    ALTER JOB_GROUP     TYPE VARCHAR(200);
--    ADD COLUMN PRIORITY INTEGER NOT NULL;
--    ALTER COLUMN PRIORITY SET NULL;
Modify the database names, usernames, and passwords if necessary, then create the databases for configuration and scheduling by running the commands below:
$ psql -U postgres -a -f create_quartz_postgresql.sql
$ psql -U postgres -a -f create_repository_postgresql.sql
$ psql -U postgres -a -f create_sample_datasource_postgresql.sql
$ psql -U postgres -a -f migrate_quartz_postgresql.sql
$ psql -U postgres -a -f migration.sql
Verify that the quartz (scheduling) and hibernate (configuration) databases and their tables were created.
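A quick way to verify this is from psql itself, for example (adjust the user if you changed it in the SQL files):
$ psql -U postgres -l                      # quartz and hibernate should appear in the list
$ psql -U postgres -d hibernate -c '\dt'   # configuration tables
$ psql -U postgres -d quartz -c '\dt'      # scheduling tables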

Now the database name, username, password, and driver should be configured in the following files. By default the HSQL driver and settings are enabled in the config files, so comment out the HSQL configuration and enable the Postgres settings.

- biserver-ce/tomcat/webapps/pentaho/META-INF/context.xml
<?xml version="1.0" encoding="UTF-8"?>
<Context path="/pentaho" docbase="webapps/pentaho/">
        <Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
                factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
                maxWait="10000" username="hibuser" password="password"
                driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/hibernate"
                validationQuery="select 1" />
               
        <Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
                factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
                maxWait="10000" username="pentaho_user" password="password"
                driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/quartz"
                validationQuery="select 1"/>
</Context>
- biserver-ce/pentaho-solutions/system/applicationContext-spring-security-jdbc.xml
        <bean id="dataSource"
                class="org.springframework.jdbc.datasource.DriverManagerDataSource">
                <property name="driverClassName" value="org.postgresql.Driver" />
                <property name="url" value="jdbc:postgresql//localhost:5432/hibernate" />
                <property name="username" value="hibuser" />
                <property name="password" value="password" />
        </bean>

- biserver-ce/pentaho-solutions/system/applicationContext-spring-security-hibernate.properties
jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://localhost:5432/hibernate
jdbc.username=hibuser
jdbc.password=password

hibernate.dialect=org.hibernate.dialect.PostgreSQLDialect
- biserver-ce/pentaho-solutions/system/hibernate/hibernate-settings.xml
    <config-file>system/hibernate/postgresql.hibernate.cfg.xml</config-file>
- biserver-ce/pentaho-solutions/system/hibernate/postgresql.hibernate.cfg.xml
    <property name="connection.driver_class">org.postgresql.Driver</property>
    <property name="connection.url">jdbc:postgresql://localhost:5432/hibernate</property>
    <property name="dialect">org.hibernate.dialect.PostgreSQLDialect</property>
    <property name="connection.username">hibuser</property>
    <property name="connection.password">password</property>
- biserver-ce/pentaho-solutions/system/simple-jndi/jdbc.properties
Hibernate/type=javax.sql.DataSource
Hibernate/driver=org.postgresql.Driver
Hibernate/url=jdbc:postgresql://localhost:5432/hibernate
Hibernate/user=hibuser
Hibernate/password=password
Quartz/type=javax.sql.DataSource
Quartz/driver=org.postgresql.Driver
Quartz/url=jdbc:postgresql://localhost:5432/quartz
Quartz/user=pentaho_user
Quartz/password=password
Specify the Pentaho solutions path, server URL, and port in the web.xml of the Tomcat webapp:
- biserver-ce/tomcat/webapps/pentaho/WEB-INF/web.xml
        <context-param>
                <param-name>solution-path</param-name>
                <param-value>$INSTALLATION_PATH/biserver-ce/pentaho-solutions</param-value>
        </context-param>


        <context-param>
                <param-name>fully-qualified-server-url</param-name>
                <param-value>http://localhost:8080/pentaho/</param-value>
        </context-param>

Pentaho can be run on custom Tomcat ports by modifying the ports in server.xml and web.xml:
- biserver-ce/tomcat/conf/server.xml
    <Connector URIEncoding="UTF-8" port="9090" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" />

- biserver-ce/tomcat/webapps/pentaho/WEB-INF/web.xml 
         <context-param>
                <param-name>fully-qualified-server-url</param-name>
                <param-value>http://localhost:9090/pentaho/</param-value>
        </context-param>
Let's start the Pentaho BI server and try out its great features. Commands to start and stop the BI server:
biserver-ce$ ./start-pentaho.sh 
biserver-ce$ ./stop-pentaho.sh 
Troubleshooting:
biserver-ce$ tail -f tomcat/logs/catalina.out
biserver-ce$ tail -f tomcat/logs/pentaho.log

Specify versions for your dependencies in your Gemfiles

How often have you been too lazy to put a version spec for gems you depended on in your projects? Do you fear updating the gems your app uses in production?

Here is an elusive-obvious tip for you: Always specify version numbers for your dependencies in your app's Gemfile.

Version specs should:

  • be strict numbers for very fragile gems like Rails

  • use the pessimistic operator for others (with ~>)

Updating apps with versionless Gemfiles is painful

Newer gem versions often break compatibility. That makes updating a disaster if you don't have any restrictions in place for your dependencies.

We should coin a new term in the field of psychology: Update Anxiety.

That's precisely the state the vast majority of us are in when proceeding to update dependencies in our projects.

In Rails, having a versionless Gemfile makes clean updates impossible.

Fearing the update makes your app susceptible to bugs

Newer versions of gems exist not only to deliver new features. The history of changes between versions mostly shows bug fixes. If you see a gem that mostly delivers new features without fixing bugs - stay away from it!

If you do not update the gem set out of fear - you could be missing out on available security updates and bug fixes.

Fragile gems influence the whole stack

There are some gems that you should update very carefully. These updates require planning and consideration before they are applied.

Every Rails application is a stack of components, each built upon others. Rails operates on top of Rack and some middleware. Active Admin operates on top of many other gems, like Active Record or meta_search.

Updating components that "live" at the bottom of the stack can influence every component above them.

This is the reason you should treat such dependencies with care and specify a static version number, like:

gem 'rails', '3.2.14'

In "semantic versioning" we trust

Ruby gem authors are strongly advised to follow the "semantic versioning" policy. A versioning policy is just a scheme for incrementing version numbers for a project.

You can find a link to more info about this in the links at the bottom of this post.

As the RubyGems guides explain, semantic versioning breaks down like this:

  • PATCH (0.0.x) level changes for implementation-level details, such as small bug fixes
  • MINOR (0.x.0) level changes for any backwards-compatible API changes, such as new functionality/features
  • MAJOR (x.0.0) level changes for backwards-incompatible API changes, such as changes that will break existing users' code if they update

RubyGems does not enforce this for gems that are pushed, but most gems (if not almost all) follow it quite well.

The 'pessimistic operator'

Ever seen the '~>' operator in Gemfiles? It's called pessimistic because it:

  • allows updates that increase only the last digit specified

  • disallows anything higher

In tandem with semantic versioning, this gives us a way to say: "Allow any bug fix for version 1.2.0, but reject minor changes."

You could specify it by placing:

gem 'music-beers-and-unicorns', '~> 1.2.0'

This allows you to run

bundle update

without that much fear and hassle.

More to read

Interested in reading more? Here are some links for you:

http://guides.rubygems.org/patterns/#semantic_versioning
http://semver.org
http://robots.thoughtbot.com/post/2508037841/rubys-pessimistic-operator
http://robots.thoughtbot.com/post/35717411108/a-healthy-bundle
https://github.com/thoughtbot/guides/tree/master/best-practices#bundler

IE7 "Enhances" href Attributes of Links Added via innerHTML

I ran into this issue the other day while testing a new feature for a client site. The code worked well in Chrome, Firefox, Safari and IE (8-11) but it blew up in IE7. The page was fairly straightforward — I was using jQuery and the excellent doT.js templating library to build up some HTML and add it to the page after the DOM had loaded. This content included several links like so:

    <a class="my-links" href="#panel1">More Info</a>
    <a class="my-links" href="#panel2">More Info</a>
    <a class="my-links" href="#panel3">More Info</a>

Each of the links pointed to their corresponding counterparts which had also been added to the page. The JavaScript code in question responded to clicks on the "More Info" links and used their href attribute as a jQuery selector:

$('.my-links').on('click', function(e) {
  e.preventDefault();
  var sel = $(this).attr('href');
  ...
});
  

Links "Enhanced" By IE7

As I debugged in IE7, I determined that it was adding the fully qualified domain name to the links. Instead of "#panel2" the href attributes were set to "http://example.com/#panel2" which broke things — especially my jQuery selectors. Fixing the issue was straightforward at this point:

// fix hrefs in IE7 and 6
if ( /^http/.test(sel) ) {
  sel = sel.substr(sel.indexOf('#'));
}
  

When the href attribute begins with http, discard everything before the hash (#).

Digging Deeper

Although the problem had been solved, I was still curious as to why this was happening. While the .href property (e.g. myAnchor.href) of a link will return the entire domain in all browsers, getAttribute('href') will return only the text of the attribute. I believe the $.attr() method from jQuery is using getAttribute() behind the scenes. For modern browsers, calling $.attr() or getAttribute() worked as I was expecting.

However IE 6 and 7 had different ideas about this:

[Screenshot: IE7 demo]

Notice that the fully qualified domain name is prepended to the href attribute in the first example.

When links are added to the page via innerHTML, IE includes/prepends the full URL. However, when links are added to the page via the createElement function they remain unscathed.

The following screenshot demonstrates how Chrome and other modern browsers handle the same code:

[Screenshot: Chrome demo]

These browsers happily leave the href attributes alone :)

You learn something new every day!

Slony Migration Experience: Version 1.2 to Version 2.2

We recently had a client who upgraded from Slony 1.2 to Slony 2.2, and I want to take this opportunity to report on our experiences with the migration process.

A little background: This client has a fairly large database (300GB) replicated across 5 database nodes, with the slon daemons running on a single shared box. We wanted to upgrade to the latest Slony 2 version for stability and to resolve some known and unresolvable issues with the 1.2 branch of Slony when replication lags a specific amount.

As is usual for any large migration projects such as this, we had to deal with a set of tradeoffs:

- Firstly, minimize any necessary downtime.
- Secondly, allow for rollback due to any discovered issues.
- Thirdly, ensure that we have sufficient redundancy/capacity whichever final state we were to end up in (i.e., successful migration to Slony 2 or reversion to Slony 1.2).

We of course tested the process several times in order to ensure that the procedures we came up with would work and that we would be well-equipped to handle issues were they to arise.

The upgrade from Slony 1.2 to Slony 2.2 necessarily requires that we uninstall the Slony 1.2 system and triggers from the indicated databases, which means that in order to ensure that the data stayed the same on all nodes in the cluster we needed to disallow any queries which modify the tables or sequences in the replication sets. We chose to disable connectivity to the database entirely for the duration of the migration.

One of the features of Slony 2 is the ability to use the OMIT COPY option in the SUBSCRIBE SET command, which will tell Slony to trust that the data in the tables already reflects the current state of the origin for that set. In previous versions of Slony, a SUBSCRIBE SET would TRUNCATE and COPY the tables in question in order to guarantee to itself that the data in the tables matched the state of the origin set at the time of the SUBSCRIBE event.
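For illustration, a subscription that skips the initial copy looks roughly like the slonik snippet below; the cluster name, conninfo strings, and node/set IDs here are placeholders rather than our client's actual values.
slonik <<'EOF'
cluster name = replication;
node 1 admin conninfo = 'dbname=mydb host=node1 user=slony';
node 3 admin conninfo = 'dbname=mydb host=node3 user=slony';
subscribe set (id = 1, provider = 1, receiver = 3, forward = yes, omit copy = true);
EOF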

For the purposes of our consideration, this was a major factor in ensuring we could hit the downtime targets; with Slony 1.2, subscribing a new node to the cluster would take around 8-10 hours based on the size of the replication set. Since this was an unavoidable cost, setting a new cluster up from scratch with Slony 1.2 would leave a large window of time where there was not a complete replica node. With Slony 2, we were able to ensure that the cluster was set up identically while being able to take advantage of the fact we knew the data had not changed, and thus be able to bootstrap replication from this process.

Because there was always the possibility that something would go wrong and prevent us from deploying the Slony 2-based cluster, we needed to consider how best to manage a rollback. We also wanted to ensure that if we *did* have to roll back, we would not end up in a situation without a backup replica.

While this database cluster had 5 nodes, we were fortunate in that due to natural seasonal traffic levels, not all nodes were needed to handle the site traffic; in fact, just 2 servers would be sufficient to handle the traffic for a period of time. This meant that if we needed to roll back, we could have redundancy for the old cluster while allowing us to resubscribe the dropped Slony 1.2 nodes at our leisure if necessary.

Because the expected state of the final cluster was to have the Slony 2 cluster be the new production cluster, we chose to keep 2 nodes from the existing Slony 1.2 cluster (those with the lowest-powered servers) as a rollback point for the project while converting the 3 most powerful nodes to Slony 2. Fortunately, one of the lower-powered 1.2 nodes was the existing origin node for the Slony 1.2 cluster, so we did not need to make any further topology modifications (although it would have been easy enough to use MOVE SET if we needed).

We dropped the nodes that were targeted for Slony 2 from the Slony 1.2 cluster and cleaned up their schemas to ensure the databases were free of Slony 1.2 artifacts while still populated with the latest data from the origin. We then shut down the remaining Slony 1.2 postgresql and slon processes.

For the nodes targeted as Slony 2 nodes, we installed the new Slony libraries in the Postgres lib directory. As of Slony 2.2, the library objects are named uniquely per major version, so we were able to install them alongside the Slony 1.2 support libraries in case we needed to roll back.

We were able to reuse the slon_tools.conf definitions from the initial cluster, as we had ensured that this configuration stayed consistent with any database modifications. We had of course verified that all tables that were in the Slony 1.2 subscription sets existed in the slon_tools.conf definitions.

After installing Slony 2 and initializing the base cluster, we utilized Slony 2's SUBSCRIBE SET (OMIT COPY = true) to set up the subscriptions for the individual sets to use the existing data. The altperl scripts do not provide a way to set this by default, so we used sed to modify the output of the slonik_subscribe_set script.
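The tweak was along these lines; treat it as a sketch, since the exact text emitted by slonik_subscribe_set (and therefore the sed pattern) depends on the Slony and altperl versions in use.
slonik_subscribe_set set1 node3 \
    | sed 's/forward = yes/forward = yes, omit copy = true/' \
    | slonik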

We also took advantage of this time to reorganize where we were running the individual Slony daemons. The 1.2 cluster had the slony daemons running on a single machine with the cluster config mirrored across all other servers in the cluster so we could bring the daemons up on another node in the cluster if we needed. This was of course sub-optimal, being a single point of failure; our revised architecture included each slony daemon running on the same server as the postgresql server.

In short, we were able to reshape the cluster and verify and test everything with very little downtime: the site itself was down for roughly 30 minutes of an allocated 4-hour maintenance window for the upgrade, testing, and so on, and just a few minutes were needed for the Slony upgrade proper. We brought things back up after a fraction of the scheduled maintenance window, and the client has overall been very happy with the upgrade.

Installing CentOS 5 on a 3 TB Drive

In collaboration with Spencer Christensen and Lele Calo.

The everyday problem: Set up a remotely-hosted machine with a 3 TB drive.
The bigger problem: It must be CentOS 5.

While this would be a trivial task with a newer OS, CentOS 5 only supports MBR-style partitioning, which itself only supports drives less than 2 TB in size. Well, let us be clear: the installer and the GRUB shipped with the installation disk normally only support MBR, while the kernel does support the GPT format. GPT is a newer partition format, introduced by the EFI standard, which supports booting from large devices. From various documents and postings on the internet it seemed possible to still use MBR with more than 2 TB, but in practice this turned out to be completely unsuccessful. So we moved on with a plan to use GPT.

Since the CentOS 5 installer cannot work with GPT partition tables, we needed to use something else to create the partitions we wanted. We did this using a rescue CD, such as SystemRescueCd from http://www.sysresccd.org/Download.

  • Boot into the rescue CD
  • Use gdisk to first delete the old partition table to make sure you start cleanly.
    • Run gdisk on the drive, then x (for extended commands), then z to zap the old partition table.
  • Then go through the process to create the partitions as desired. We wanted:
    • /boot 500M
    • 1.5TB LVM physical disk
    • remaining space LVM physical disk
  • Save the partition table and quit gdisk
  • Create LVM group and volumes as desired. Here's what we did:
    • pvcreate /dev/sda2, then pvcreate /dev/sda3
    • vgcreate vg0 /dev/sda2 /dev/sda3
    • lvcreate -L 32G -n swap vg0
    • lvcreate -L 100G -n root vg0
  • Then make the file systems for those volumes.
    • mkfs.ext3 /dev/sda1 (the /boot partition)
    • mkswap /dev/vg0/swap
    • mkfs.ext4 /dev/vg0/root

Once the partitioning is set up as required, we boot the CentOS 5 rescue CD for the installation process. The installation disc also incidentally contains a rescue mode, which can be used by typing linux rescue at the boot prompt. Follow the instructions to get to the rescue prompt. If the network install CD image is used, follow the menus until a choice as to how to load the rescue image is given, then select the appropriate method. We used the HTTP method and specified ftp.fau.de as the server name and /centos/5.10/os/x86_64 as the path (use your favorite mirror as needed). This step involves loading the rescue image, which may take some time; at the end you will be prompted to find your existing OS. Since there is none, select Skip, and you will be dropped to the rescue prompt. Once at the rescue prompt, we can proceed to the OS installation step.

The first installation step at this stage is to modify anaconda so that it doesn't produce an unskippable error due to there being an "unsupported" GPT. First create a ramdisk to contain a copy of the anaconda libraries:

mkdir /mnt/anacondalib
mount -t tmpfs none /mnt/anacondalib
cp -R /mnt/runtime/usr/lib/anaconda/* /mnt/anacondalib/

Now edit the Python file at /mnt/anacondalib/partitions.py, and on line 1082 (vi and nano are present in the rescue image for editing) change the word "errors" to the word "warnings". This little change allows anaconda to install despite the fact that we've set up the partitions using GPT for the /boot partition, which is what would normally cause the install to fail.
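If you would rather not hunt for the line in an editor, the same one-word change can be made with sed; the line number is specific to the anaconda build we used, so verify it before substituting blindly.
sed -i '1082s/errors/warnings/' /mnt/anacondalib/partitions.py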

Now we mount the changed library directory over the read-only version from the installation media:

mount -o bind /mnt/anacondalib/ /mnt/runtime/usr/lib/anaconda/

Now we have to move /sbin out of the way, otherwise anaconda will fail, complaining that /sbin already exists:

export PATH=$PATH:/sbin.bak
mv /sbin /sbin.bak
mkdir /sbin

Now we can start anaconda:

centos_mirror="http://ftp.fau.de/centos/5.10/os/x86_64/"
anaconda --dmraid --selinux -T -m $centos_mirror

You may of course replace $centos_mirror with your preferred mirror.

You may then walk through the Anaconda installation menus, proceeding until you get to the "Partitioning Type" step at which point the Create custom layout should be selected. This will take you to a Partitioning screen showing the partition scheme created during the GPT partition creation steps above. After setting your large main logical volume to mount as / (root) and your boot partition to mount as /boot, you should visually confirm the layout and proceed. After accepting the warning about using unsupported GPT partitioning, you will be prompted for several screens about grub options, all of which should be correct so may be accepted at their defaults. After this, the installation should be able to proceed as normal.

Once the OS installation is complete, you will be prompted to eject any media and reboot the machine. You can go ahead and try (we did), but you will run into an error similar to "No bootable media found." and the system will be unable to boot. This is because the version of grub that was installed doesn't know how to deal with GPT partition tables. So the next step is to install a newer version of grub. We found some instructions at the bottom of this page: http://www.sysresccd.org/Sysresccd-Partitioning-EN-The-new-GPT-disk-layout. We didn't follow those exactly, so here is what we did:

  • Download SystemRescue CD from here: http://www.sysresccd.org/Download
  • Boot the SystemRescue CD

Then:

mkdir /mnt/boot
mount /dev/sda1 /mnt/boot   # mount your /boot partition to /mnt/boot
cp /lib/grub/x86_64/* /mnt/boot/grub/
umount /mnt/boot
grub  # no arguments, enters the grub shell
root (hd0,0)
setup (hd0,0)
^D # exit grub shell

Now reboot the machine (without the SystemRescue CD)

At this point the machine successfully booted for us. Yay! Problem solved.

References: