
OpenSSL CSR with Alternative Names one-line

2017-02-16 - Edit - I changed this post to use a different method than the one in the original version, because the X509v3 extensions were not created or recognized correctly by many certificate providers.

I find it hard to remember a period in my whole life in which I issued, reissued, renewed and revoked so many certificates.

And while that's usually fun and interesting, there's one thing I often needed and never figured out until a few days ago: how to generate CSRs (Certificate Signing Requests) with Alternative Names (e.g. including the www and non-www domain in the same cert) with a one-line command.

This need comes from the fact that some certificate providers (like GeoTrust) don't cover the parent domain when requesting a new certificate (e.g. a CSR for www.endpoint.com won't cover endpoint.com), unless you specifically request it.

Luckily that's not the case with other Certificate products (like RapidSSL) which already offer this feature built-in.

This scenario is becoming a problem more often, since we're seeing a growing number of customers serving their sites over HTTPS on both the www and non-www hostnames.

Luckily the solution is pretty simple and straightforward: the only requirement is that you supply the CSR subject directly on the command line, rather than through the interactive question mechanism.

If you understand how an SSL certificate works this shouldn't be a problem; anyway, as a recap, here is the meaning of the common Subject entries you'll need:

  • C => Country
  • ST => State
  • L => City
  • O => Organization
  • OU => Organization Unit
  • CN => Common Name (eg: the main domain the certificate should cover)
  • emailAddress => main administrative point of contact for the certificate

Using the common command-line syntax for the OpenSSL subject, you need to specify all of the above (the OU is optional) and add another section called subjectAltName.

By adding DNS.n entries (where n is a sequential number) under the "subjectAltName" field, you'll be able to add as many additional "alternate names" as you want, even ones not related to the main domain.

Obviously the first-level parent domain will be covered by most SSL products, unless specified differently.

So here's an example that generates a CSR covering both www.your-new-domain.com and your-new-domain.com, all in one command:

openssl req -new -sha256 -nodes -out \*.your-new-domain.com.csr -newkey rsa:2048 -keyout \*.your-new-domain.com.key -config <(
cat <<-EOF
[req]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn

[ dn ]
C=US
ST=New York
L=Rochester
O=End Point
OU=Testing Domain
emailAddress=your-administrative-address@your-awesome-existing-domain.com
CN = www.your-new-domain.com

[ req_ext ]
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = your-new-domain.com
DNS.2 = www.your-new-domain.com
EOF
)

To be honest, that's a sub-optimal solution for a few reasons, the main one being that it's not convenient to fix if you make a typo or similar mistake.

That's why I prefer creating a dedicated file (which you can also reuse in the future) and then feeding it to openssl.

Of course you can use your text editor of choice; I used a heredoc mostly because it displays better in blog posts, in my opinion.

cat > csr_details.txt <<-EOF
[req]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn

[ dn ]
C=US
ST=New York
L=Rochester
O=End Point
OU=Testing Domain
emailAddress=your-administrative-address@your-awesome-existing-domain.com
CN = www.your-new-domain.com

[ req_ext ]
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = your-new-domain.com
DNS.2 = www.your-new-domain.com
EOF

# Let's call openssl now by piping the newly created file in
openssl req -new -sha256 -nodes -out \*.your-new-domain.com.csr -newkey rsa:2048 -keyout \*.your-new-domain.com.key -config <( cat csr_details.txt )

Now with that I'm able to generate proper multi-domain CSRs effectively.

Please note the use of the -sha256 option to enable SHA-256 signing instead of the old (and now definitely deprecated) SHA-1.
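To double-check the result before submitting it to your certificate provider, you can print the CSR and look for the "X509v3 Subject Alternative Name" and "Signature Algorithm" lines. A quick sanity check, assuming the file name used above:

openssl req -in \*.your-new-domain.com.csr -noout -text
openssl req -in \*.your-new-domain.com.csr -noout -verify

The second command simply verifies the signature on the request itself.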

Thanks to all our readers for all the hints, ideas and suggestions they gave me to improve this post, which apparently is still very useful to a lot of system administrators out there.

Prevent MediaWiki showing PHP version with new extension: ControlSpecialVersion


[Image: Sok Kwu Wan]

I recently created a new MediaWiki extension named ControlSpecialVersion whose purpose is to allow some control over what is shown on MediaWiki's "special" page Special:Version. The latest version of this extension can be downloaded from Mediawiki.org. You can see it in action on the Special:Version page for bucardo.org. The primary purpose of the module is to prevent showing the PHP and database versions to the public.

As with most MediaWiki extensions, installation is easy: download the tarball, unzip it into your extensions directory, and add this line to your LocalSettings.php file:


require_once( "$IP/extensions/ControlSpecialVersion/ControlSpecialVersion.php" );

By default, the extension removes the PHP version information from the page. It also changes the PostgreSQL reported version from its revision to simply the major version, and changes the name from the terrible-but-official "PostgreSQL" to the widely-accepted "Postgres". Here is what the Software section of bucardo.org looks like before and after the extension is used:


Note that we are also eliding the git revision information (sha and date). You can also do things such as hide the revision information from the extension list, remove the versions entirely, or even remove an extension from showing up at all. All the configuration parameters can be found on the extension's page on mediawiki.org.

It should be noted that there are typically two other places in which your PHP version may be exposed, both in the HTTP headers. If you are running Apache, it may show the version as part of the Server header. To turn this off, edit your httpd.conf file and change the ServerTokens directive to ProductOnly. The other header is known as X-Powered-By and is added by PHP to any pages it serves (e.g. MediaWiki pages). To disable this header, edit your php.ini file and make sure expose_php is set to Off.
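For reference, the relevant settings look something like this (a sketch; file locations vary by distribution):

# httpd.conf: report only "Apache" in the Server header
ServerTokens ProductOnly

; php.ini: stop PHP from adding the X-Powered-By header
expose_php = Off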

While these methods may or may not make your server safer, there really is no reason to expose certain information to the world. With this extension, you at least have the choice now.

Another Round of Tidbits: Browser Tools, Performance, UI

It's been a while since my last blog article sharing End Point tidbits: bits of information passed around the End Point team that don't necessarily merit a single blog post, but are worth mentioning and archiving. Here are some notes I've been collecting since that last post:

  • Skeuocard and creditcard.js are intuitive user interface (JS, CSS) plugins for credit card form inputs (card number, security code, billing name).

    Skeuocard Screenshot
  • StackExchange UX is a Stack Overflow resource for user interface Q&A.
  • wpgrep is an available tool for grepping through WordPress databases.
  • Here is a nifty little tool that analyzes GitHub commits to report on language convention, e.g. space vs. tab indentation & spacing in argument definitions.

    Example comparison of single vs. double quote convention in JavaScript.
  • Ag (The Silver Searcher) is a document searching tool similar to ack, with improved speed. There's also an Ag plugin for Vim.
  • GitHub released Atom earlier this year. Atom is a desktop application text editor; features include Node.js support, modular design, and a full feature list to compete with existing text editors.
  • SpeedCurve is a web performance tool built on WebPagetest data. It focuses on providing a beautiful user interface and minimizing data storage.

    Example screenshot from SpeedCurve
  • Here is an interesting article by Smashing Magazine discussing mobile strategy for web design. It covers a wide range of challenges that come up in mobile web development.
  • Reveal.js, deck.js, Impress.js, Shower, and showoff are a few open source tools available for in-browser presentation support.
  • Have you seen Firefox's 3D view? It's a 3D representation of the DOM hierarchy. I'm a little skeptical of its value, but the documentation outlines a few use cases such as identifying broken HTML and finding stray elements.

    Example screenshot of Firefox 3D view
  • Here is an interesting article discussing how to approach sales by presenting a specific solution and alternative solutions to clients, rather than the generic "Let me know how I can help." approach.
  • A coworker asked about web-based SMS providers for sending text messages to customer cellphones. Recommended services included txtwire, Twilio, The Callr, and Clickatell.

Updating Firefox and the Black Screen

If you are updating your Firefox installation for Windows and you get a puzzling black screen of doom, here's a handy tip: disable graphics acceleration.

The symptoms here are that after you upgrade Firefox to version 33, the browser will launch into a black screen, possibly with a black dialog box (it's asking if you want to choose Firefox to be your default browser). Close this as you won't be able to do much with it.

Launch Firefox by holding down the SHIFT key and clicking on the Firefox icon. It will ask if you want to reset Firefox (Nope!) or launch in Safe mode (Yes).

Once you get to that point, click the "Open menu" icon (three horizontal bars, probably at the far right of your toolbar). Choose "Preferences", "Advanced", and uncheck "Use hardware acceleration when available".

Close Firefox, relaunch as normal, and you should be AOK. You can try re-enabling graphics acceleration if and when your graphics driver is updated.

Reference: here.

Postgres copy schema with pg_dump


[Image: Manny Calavera (animated by Lua!), by Kitt Walker]

Someone on the #postgresql IRC channel was asking how to make a copy of a schema; presented here are a few solutions and some wrinkles I found along the way. The goal is to create a new schema based on an existing one, in which everything is an exact copy. For all of the examples, 'alpha' is the existing, data-filled schema, and 'beta' is the newly created one. It should be noted that creating a copy of an entire database (with all of its schemas) is very easy: CREATE DATABASE betadb TEMPLATE alphadb;

The first approach for copying a schema is the "clone_schema" plpgsql function written by Emanuel Calvo. Go check it out, it's short. Basically, it gets a list of tables from the information_schema and then runs CREATE TABLE statements of the format CREATE TABLE beta.foo (LIKE alpha.foo INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING DEFAULTS). This is a pretty good approach, but it does leave out many types of objects, such as functions, domains, FDWs, etc. as well as having a minor sequence problem. It's also slow to copy the data, as it creates all of the indexes before populating the table via INSERT.
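Per table, the heart of that approach looks roughly like this (a sketch of what the function generates; the real function loops over the information_schema and also handles sequences):

CREATE TABLE beta.foo (LIKE alpha.foo INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING DEFAULTS);
INSERT INTO beta.foo SELECT * FROM alpha.foo;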

My preferred approach for things like this is to use the venerable pg_dump program, as it is in the PostgreSQL 'core' and its purpose in life is to smartly interrogate the system catalogs to produce DDL commands. Yes, parsing the output of pg_dump can get a little hairy, but that's always preferred to trying to create DDL yourself by parsing system catalogs. My quick solution follows.

pg_dump -n alpha | sed '1,/with_oids/ {s/ alpha/ beta/}' | psql

Sure, it's a bit of a hack in that it expects a specific string ("with_oids") to exist at the top of the dump file, but it is quick to write and fast to run; pg_dump creates the tables, copies the data over, and then adds in indexes, triggers, and constraints. (For an explanation of the sed portion, visit this post). So this solution works very well. Or does it? When playing with this, I found that there is one place in which this breaks down: assignment of ownership to certain database objects, especially functions. It turns out pg_dump will *always* schema-qualify the ownership commands for functions, even though the function definition right above it has no schema, but sensibly relies on the search_path. So you see this weirdness in pg_dump output:

--
-- Name: myfunc(); Type: FUNCTION; Schema: alpha; Owner: greg
--
CREATE FUNCTION myfunc() RETURNS text
    LANGUAGE plpgsql
    AS $$ begin return 'quick test'; end$$;

ALTER FUNCTION alpha.myfunc() OWNER TO greg;

Note the fully qualified "alpha.myfunc". This is a problem, and the sed trick above will not replace this "alpha" with "beta", nor is there a simple way to do so, without descending into a dangerous web of regular expressions and slippery assumptions about the file contents. Compare this with the ownership assignments for almost every other object, such as tables:

--
-- Name: mytab; Type: TABLE; Schema: alpha; Owner: greg
--
CREATE TABLE mytab (
    id integer
);

ALTER TABLE mytab OWNER TO greg;

No mention of the "alpha" schema at all, except inside the comment! Before going into why pg_dump is acting like that, I'll present my current favorite solution for making a copy of a schema: using pg_dump and some creative renaming:

$ pg_dump -n alpha -f alpha.schema
$ psql -c 'ALTER SCHEMA alpha RENAME TO alpha_old'
$ psql -f alpha.schema
$ psql -c 'ALTER SCHEMA alpha RENAME TO beta'
$ psql -c 'ALTER SCHEMA alpha_old RENAME TO alpha'

This works very well, with the obvious caveat that for a period of time, you don't have your schema available to your applications. Still, a small price to pay for what is most likely a relatively rare event. The sed trick above is also an excellent solution if you don't have to worry about setting ownerships.

Getting back to pg_dump, why is it schema-qualifying some ownerships, despite a search_path being used? The answer seems to lie in src/bin/pg_dump/pg_backup_archiver.c:

    /*
     * These object types require additional decoration.  Fortunately, the
     * information needed is exactly what's in the DROP command.
     */
    if (strcmp(type, "AGGREGATE") == 0 ||
        strcmp(type, "FUNCTION") == 0 ||
        strcmp(type, "OPERATOR") == 0 ||
        strcmp(type, "OPERATOR CLASS") == 0 ||
        strcmp(type, "OPERATOR FAMILY") == 0)
    {
        /* Chop "DROP " off the front and make a modifiable copy */
        char       *first = pg_strdup(te->dropStmt + 5);
Well, that's an ugly but elegant hack, and it explains why the schema name keeps popping up for functions, aggregates, and operators: because their names can be tricky, pg_dump hacks apart the already existing DROP statement built for the object, which unfortunately is schema-qualified. Thus, we get the redundant (and sed-busting) schema qualification!

Even with all of that, it is still always recommended to use pg_dump when trying to create DDL. Someday Postgres will have a DDL API to allow such things, and/or commands like MySQL's SHOW CREATE TABLE, but until then, use pg_dump, even if it means a few other contortions.

Liquid Galaxy at the Ryder Cup 2014



End Point was proud to present the Liquid Galaxy for the French Golf Federation at this year’s Ryder Cup in Gleneagles, Scotland. The French Golf Federation will be hosting the cup in 2018 at Le Golf National, which is just outside of Paris and is also the current venue of the French Open.

Throughout the event, thousands of people came in and tried out the Liquid Galaxy. The platform displayed one of its many hidden talents and allowed golf fans from around the world to find and show off their home courses. One of the most interesting things to witness was watching golf course designers accurately guess the date of the satellite imagery based on which course changes were present.


This deployment presented special challenges: a remote location (the bustling tented village adjacent to the course) with a combination of available hardware from our European partners and a shipment from our Tennessee office. Despite these challenges, we assembled the system, negotiated the required network connectivity, deployed the custom interface, and delivered a great display for our sponsoring partners. The event was a great success and all enjoyed the unseasonably mild Scottish weather.




Rails Recursive Sorting for Multilevel Nested Array of Objects

Whenever you display data as a list of records, it's a good idea to sort them in a particular order. Most of the time, Rails gives you data as an array, an array of objects, or a nested array of objects (a tree). We would like a general sorting mechanism to display the records in ascending or descending order and provide a decent view to end users. Luckily, Ruby comes with a sorting method called 'sort_by', which helps to sort an array of objects by specific object values.

Simple Array:

Trivially, an array can be sorted using the "sort" method:
my_array = [ 'Bob', 'Charlie', 'Alice']

my_array = my_array.sort;  # (or just my_array.sort!)
This is the most basic way to sort elements in an array and is part of Ruby’s built-in API.

Array of Objects:

Usually, a Rails result set will be an array of objects that should be sorted based on specific attributes of each object. Here is a sample array of objects which needs to be sorted by date and fullname.
s_array =
[  
    {
        "date"=> "2014-05-07",
        "children"=> [],
        "fullname"=> "Steve Yoman"
    },
    {
        "date"=> "2014-05-06",
        "children"=> [],
        "fullname"=> "Josh Tolley"
    }
]

Solution:

1) Simple sorting

We can use the sort_by method to sort the array of objects by date and fullname, in that order:
s_array = s_array.sort_by{|item| [ item['date'], item['fullname'] ]}
sort_by is passed a block which is called on each item and returns the value to be used as that item's sort key (an anonymous array in this case). Because Ruby compares arrays element by element, arrays work as sort keys as long as the elements they contain are themselves comparable. Since we are returning string attributes here, we get that for free.
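With the sample data above, the entries come back ordered by date, so a quick check of the result looks like this:

s_array.map { |item| item['fullname'] }
#=> ["Josh Tolley", "Steve Yoman"]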

2) Handling case on strings

Sometimes sorting directly on the object attribute will produce undesirable results, for instance if the data has inconsistent case. We can normalize the case of the string used as the sort key so that records sort in the expected order:
s_array = s_array.sort_by{|item| [ item['date'], item['fullname'].downcase ]}
Here again we return an array to be used as the sort key, but with a normalized version of the input data.

Multilevel Nested Array of Objects

Sometimes the objects in an array will themselves contain arrays of objects, nested to multiple levels. Sorting this kind of structure requires recursion, so that every level of the array of objects is sorted by the same specific attributes. The following array alternates between nested arrays and objects:

m_array =
[
    {
        "name"=> "Company",
        "children"=> [
            {
                "name"=> "Sales",
                "children"=> [
                    {
                        "date"=> "2014-05-07",
                        "children"=> [],
                        "fullname"=> "Steve Yoman"
                    },
                    {
                        "date"=> "2014-05-06",
                        "children"=> [],
                        "fullname"=> "Josh Tolley"
                    }
                ]
            },
            {
                "name"=> "Change Requests",
                "children"=> [
                    {
                        "name"=> "Upgrade Software",
                        "children"=> [
                            {
                                "date"=> "2014-05-01",
                                "children"=> [],
                                "fullname"=> "Selvakumar Arumugam"
                            },
                            {
                                "date"=> "2014-05-02",
                                "children"=> [],
                                "fullname"=> "Marina Lohova"
                            }
                        ]
                    },
                    {
                        "name"=> "Install Software",
                        "children"=> [
                            {
                                "date"=> "2014-05-01",
                                "children"=> [],
                                "fullname"=> "Selvakumar Arumugam"
                            },
                            {
                                "date"=> "2014-05-01",
                                "children"=> [],
                                "fullname"=> "Josh Williams"
                            }
                        ]
                    }
                ]
            }
        ]
    }
]

Solution:

In order to tackle this issue, we will want to sort all of the sub-levels of the nested objects in the same way. We will define a recursive function in order to handle this. We will also want to add additional error-handling.

In this specific example, we know each level of the data contains a “children” attribute, which contains an array of associated objects. We write our sort_multi_array function to recursively sort any such arrays it finds, which will in turn sort all children by name, date and case-insensitive fullname:
def sort_multi_array(items)
  items = items.sort_by{|item| [ item['name'], item['date'], item['fullname'].to_s.downcase ]}
  items.each{ |item| item['children'] = sort_multi_array(item['children']) if (item['children'].nil? ? [] : item['children']).size > 0 }
  items
end

m_array = sort_multi_array(m_array);

You can see that we first sort the passed-in array according to the object-specific attributes, then we check to see if there is an attribute ‘children’ which exists, and then we sort this array using the same function. This will support any number of levels of recursion on this data structure.

Notes about this implementation:

1. Case-insensitive sorting

The best practice when sorting strings is to convert them to a single case (i.e. upper or lower) for the comparison. This ensures that records show up in the order the user would expect, not the order the computer's case-sensitive comparison would produce:
item['fullname'].downcase

2. Handling null values in case conversion

Nil values in the attributes need to be handled during string manipulation to avoid unexpected errors. Here we convert to a string before applying the case conversion:
item['fullname'].to_s.downcase

3. Handling null values in array size check

Nil values in the array attributes likewise need to be handled during sorting to avoid unexpected errors. Here we guard against the possibility of item['children'] being nil, and if it is, we use an empty array instead:
(item['children'].nil? ? [] : item['children']).size
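Putting those guards together, a slightly more defensive version of the function (a sketch only; it extends the same nil handling to the name and date attributes, which the original relies on being consistently present at each level) might look like this:

def sort_multi_array(items)
  # Sort this level; to_s turns any missing attribute into "" so comparisons never hit nil
  items = items.sort_by do |item|
    [item['name'].to_s, item['date'].to_s, item['fullname'].to_s.downcase]
  end
  # Recurse into any non-empty children arrays
  items.each do |item|
    children = item['children'] || []
    item['children'] = sort_multi_array(children) unless children.empty?
  end
  items
end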

Parsing Email Addresses in Rails with Mail::Address

I've recently discovered the Mail::Address class and have started using it for storing and working with email addresses in Rails applications. Working with an email address as an Address object rather than a String makes it easy to retrieve different parts of the address and I recommend trying it out if you're dealing with email addresses in your application.

Mail is a Ruby library that handles email generation, parsing, and sending. Rails' own ActionMailer module depends on the Mail gem, so you'll find that Mail is already included as part of your Rails application installation and is ready for use without any additional installation or configuration.

The Mail::Address class within the library can be used in Rails applications to provide convenient, object-oriented ways of working with email addresses.

The class documentation provides some of the highlights:

a = Address.new('Patrick Lewis (My email address) <patrick@example.endpoint.com>')
a.format       #=> 'Patrick Lewis <patrick@example.endpoint.com> (My email address)'
a.address      #=> 'patrick@example.endpoint.com'
a.display_name #=> 'Patrick Lewis'
a.local        #=> 'patrick'
a.domain       #=> 'example.endpoint.com'
a.comments     #=> ['My email address']
a.to_s         #=> 'Patrick Lewis <patrick@example.endpoint.com> (My email address)'

Mail::Address makes it trivial to extract the username, domain name, or just about any other component part of an email address string. Also, its #format and #to_s methods allow you to easily return the full address as needed without having to recombine things yourself.

You can also build a Mail::Address object by assigning email and display name strings:

a = Address.new
a.address = "patrick@example.endpoint.com"
a.display_name = "Patrick Lewis"
a #=> #<Mail::Address:69846669408060 Address: |Patrick Lewis <patrick@example.endpoint.com>| >
a.display_name = "Patrick J. Lewis"
a #=> #<Mail::Address:69846669408060 Address: |"Patrick J. Lewis" <patrick@example.endpoint.com>| >
a.domain #=> "example.endpoint.com"

This provides an easy, reliable way to generate Mail::Address objects and catches input errors if the supplied address or display name is not parseable.

I encourage anyone who's manipulating email addresses in their Rails applications to try using this class. I've found it especially useful for defining application-wide constants for the 'From' addresses in my mailers; by creating them as Mail::Address objects I can access their full strings with display names and addresses in my mailers, but also grab just the email addresses themselves for obfuscation or other display purposes in my views.
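As a minimal sketch of that pattern (the constant name, address, and mailer usage here are hypothetical):

# e.g. in an initializer (hypothetical)
SUPPORT_ADDRESS = Mail::Address.new('End Point Support <support@example.endpoint.com>')

# In a mailer: the full formatted address, display name included
default from: SUPPORT_ADDRESS.format

# In a view or helper: just the bare address
SUPPORT_ADDRESS.address  #=> "support@example.endpoint.com"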

Liquid Galaxy at UNC Chapel Hill


End Point has brought another academic Liquid Galaxy online! This new display platform is now on the storied campus of the University of North Carolina - Chapel Hill. With a strong background in programming and technology, UNC wanted a premier interactive platform to showcase the GIS data and other presentations the school's researchers are putting together.

Neil Elliott, our hardware manager for Liquid Galaxy, first assembled, preconfigured, and tested the computer stack at our facility in Tennessee, bringing together the head node, display nodes, power control units, switches, and cases to build a “Liquid Galaxy Express”: an entirely self-contained Liquid Galaxy unit that fits in just under one cubic meter. From there, Neil drove the computers and custom-built frame directly to Chapel Hill. He described the trip: “It’s a great drive from our office in Tennessee over the mountains to Chapel Hill. When I arrived, the UNC staff was on-hand to help assemble things, lay out the space, and it all went very quickly. We were live by 4pm that same day.”
Overall, the installation took just over 6 hours, including assembly and final configuration. The University’s library and IT staff were on hand to assist with the frame assembly and mounting the dazzling 55” Samsung commercial displays. That evening, students were exploring and the staff was tweeting out Vines of the new display:



"From the moment the installer closed his tool box, students have been lining up non-stop to try the screens," said Amanda Henley, the Library's Geographic Information Systems Librarian. Located on the 2nd floor of the Davis Library, the Liquid Galaxy is open to students and staff for research and exploration. End Point is looking forward to working closely with the staff at UNC, as well as with their CompSci teams on developing great new functionality for the Liquid Galaxy. If you would like to get a Liquid Galaxy at your school, call us!


MediaWiki minor upgrade with patches

One of the more mundane (but important!) tasks for those running MediaWiki is keeping it updated with the latest version of the software. This is usually a fairly easy process. While the official upgrade instructions for MediaWiki are good, they are missing some important items. I will lay out in detail what we do to upgrade MediaWiki installations.

Note that this is for "minor" upgrades to MediaWiki, where minor is defined as not moving more than a couple of actual versions, and not requiring anything other than patching some files. I will cover major upgrades in a future post. For this article, I assume you have full shell access, and not simply FTP, to the server that MediaWiki is running on.

The first step in upgrading is knowing when to upgrade - in other words, making sure you know about new releases. The best way to do this is to subscribe to the low-volume mediawiki-announce mailing list. The MediaWiki maintainers have a wonderful new policy of sending out "pre-announcement" emails stating the exact time that the new version will be released. Once we see that announcement, or when the version is actually released, we open a support ticket, which serves the dual purpose of making sure the upgrade does not get forgotten about, and of keeping an official record of the upgrade.

The official announcement should mention the location of a patch tarball, for example http://releases.wikimedia.org/mediawiki/1.23/mediawiki-1.23.5.patch.gz. If not, you can find the patches in the directory at http://releases.wikimedia.org/mediawiki/: look for your version, and the relevant patch. Download the patch, and grab the signature file as well, which will be the same file with "dot sig" appended to it. In the example above, the sig file would be http://releases.wikimedia.org/mediawiki/1.23/mediawiki-1.23.5.patch.gz.sig.

It is important to know that these patch files *only* cover patching from the previous version. If you are running version 1.23.2, for example, you would need to download and apply the patches for versions 1.23.3 and 1.23.4, before tackling version 1.23.5. You can also create your own patch file by checking out the MediaWiki git repository and using the version tags. In the previous example, you could run "git diff 1.23.2 1.23.5".
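As a rough sketch of that git approach (the GitHub mirror is just one convenient way to get the repository):

$ git clone https://github.com/wikimedia/mediawiki.git
$ cd mediawiki
$ git diff 1.23.2 1.23.5 > ~/mediawiki-1.23.2-to-1.23.5.patch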

Once the patch is downloaded, I like to give it three sanity checks before installing it. First, is the PGP signature valid? Second, does this patch look sane? Third, does the patch match what is in the official git repository for MediaWiki?

To check the PGP signature, you use the sig file, which is a small external signature that one of the MediaWiki maintainers has generated for the patch itself. Since you may not have the public PGP key already, you should both verify the file and ask gpg to download the needed public key in one step. Here's what it looks like when you do:

$ gpg --keyserver pgp.mit.edu --keyserver-options auto-key-retrieve --verify mediawiki-1.23.5.patch.gz.sig
gpg: Signature made Wed 01 Oct 2014 06:21:47 PM EDT using RSA key ID 5DC00AA7
gpg: requesting key 5DC00AA7 from hkp server pgp.mit.edu
gpg: key 5DC00AA7: public key "Markus Glaser " imported
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   5  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 5u
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)
gpg: Good signature from "Markus Glaser "
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 280D B784 5A1D CAC9 2BB5  A00A 946B 0256 5DC0 0AA7

The important line here is the one saying "Good signature". The usage of gpg and PGP is beyond the scope of this article, but feel free to ask questions in the comments. Once verified, the next step is to make sure the patch looks sane. In other words, read through it and see exactly what it does! It helps to read the release notes right before you do this. Then:

$ gunzip -c mediawiki-1.23.5.patch.gz | more

While reading through, make note of any files that have been locally patched - you will need to check on them later. If you are not used to reading diff output, this may be a little confusing, but give it a shot anyway, so you know what you are patching. Most MediaWiki version upgrades are very small patches, and only alter a few items across a few files. Once that is done, the final sanity check is to make sure this patch matches what is in the canonical MediaWiki git repository.

This is actually a fairly tricky task, as it turns out the patch files are generated by a custom script and are not just the output of "git diff old_version new_version". Feel free to skip ahead; this is just one method I found for making sure the patch file and the git diff match up. By "git diff", I mean the output of "git diff 1.23.4 1.23.5", for example. The biggest problem is that the files are ordered differently, so even if you remove all but the actual diff portions, you cannot easily compare them.

Here, "patchfile" is the downloaded and gunzipped patch file, e.g. mediawiki-1.23.5.patch, and "gitfile" is the output of git diff across the two versions, e.g. the output of "git diff 1.23.4 1.23.5". First, we ensure that they both cover the same group of files. Then we walk through each file in the order given by the patchfile and generate a cross-tag diff. This is saved to a file and then compared to the original patchfile. They will not be identical, but should match up for the actual diff portions of the file.

## The -f42 may change from version to version
$ diff -s <(grep diff patchfile | cut -d' ' -f42 | cut -d/ -f2- | sort) <( grep diff gitfile | cut -d' ' -f4 | cut -d/ -f2- | sort)
Files /dev/fd/63 and /dev/fd/62 are identical
$ grep diff patchfile | cut -d' ' -f24 | cut -d/ -f2- | grep -v RELEASE | xargs -L1 git diff 1.23.4 1.23.5 > gitfile2
$ diff -b patchfile gitfile2

Okay, we have verified that the patch looks sane. The next step is to make sure your MediaWiki has a clean git status. If you don't have your MediaWiki in git, now is the time to do so. It's as simple as:

$ cd /your/wiki/directory
$ echo -ne "images/\ncache/\n" > .gitignore
$ git init
$ git add .
$ git commit -a -q -m "Initial import of our MediaWiki directory"

Run "git status" and make sure you don't have any changed but uncommitted files. Once that is done, you are ready to apply the patch. Gunzip the patch file first, run the actual patch command in dryrun mode first, then do the final patch:

$ gunzip ~/mediawiki-1.23.5.patch.gz
$ patch -p1 --dry-run -i ~/mediawiki-1.23.5.patch
$ patch -p1 -i ~/mediawiki-1.23.5.patch

You may not have the "tests" directory installed, in which case it is safe to skip any missing file errors related to that directory. Just answer "Y" when asked if it is okay to skip that file. Here is an example of an actual patch from MediaWiki 1.23.3 to version 1.23.4:

$ patch -p1 -i ~/mediawiki-1.23.4.patch
patching file includes/config/GlobalVarConfig.php
patching file includes/db/DatabaseMysqli.php
patching file includes/DefaultSettings.php
patching file includes/libs/XmlTypeCheck.php
patching file includes/Sanitizer.php
patching file includes/upload/UploadBase.php
patching file RELEASE-NOTES-1.23
can't find file to patch at input line 387
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff -Nruw -x messages -x '*.png' -x '*.jpg' -x '*.xcf' -x '*.gif' -x '*.svg' -x '*.tiff' -x '*.zip' -x '*.xmp' -x '.git*' mediawiki-1.23.3/tests/phpunit/includes/upload/UploadBaseTest.php mediawiki-1.23.4/tests/phpunit/includes/upload/UploadBaseTest.php
|--- mediawiki-1.23.3/tests/phpunit/includes/upload/UploadBaseTest.php 2014-09-24 19:58:10.961599096 +0000
|+++ mediawiki-1.23.4/tests/phpunit/includes/upload/UploadBaseTest.php 2014-09-24 19:55:15.538575503 +0000
--------------------------
File to patch: 
Skip this patch? [y] y
Skipping patch.
2 out of 2 hunks ignored

The jump from 1.23.4 to 1.23.5 was much cleaner:

$ patch -p1 -i ~/mediawiki-1.23.5.patch
patching file includes/DefaultSettings.php
patching file includes/OutputPage.php
patching file RELEASE-NOTES-1.23

Once the patch is applied, immediately check everything into git. This keeps the patch separate from other changes in your git history, and allows us to roll back the patch easily if needed. State the version in your commit message:

$ git commit -a -m "Applied mediawiki-1.23.5.patch to move from version 1.23.4 to 1.23.5"

The next step is to run the update script. This almost always does nothing for minor releases, but it's a good practice to get into. Running it is simple:

$ php maintenance/update.php --quiet --quick

The "quick" option prevents the usual five-second warning. The "quiet" option is supposed to turn off any non-error output, but if you are using Semantic MediaWiki, you will still receive a screen-full of unwanted output. I need to submit a patch to fix that someday. :)

Now that the new version is installed, make sure the wiki is still working! First, visit the Special:Version page and confirm that the new version number appears. Then make sure you can view a random page, that you can edit a page, and that you can upload an image. Finally, load your extension testing page.

You don't have an extension testing page? To make one, create a new page named "Extension_testing". On this page, include as many working examples of your extensions as possible, especially non-standard or heavily-used ones. For each extension, put the name of the extension in a header, describe what the output should be, and then have the extension do something interesting in such a way that a non-working extension will be noticed very quickly when viewing the page!
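As a rough sketch of what such a page might contain (this assumes the ParserFunctions and SyntaxHighlight extensions are installed; substitute whatever extensions your wiki actually uses):

== ParserFunctions ==
The next line should render as the number 4:
{{#expr: 2 + 2}}

== SyntaxHighlight ==
The block below should appear with Perl syntax highlighting:
<syntaxhighlight lang="perl">
my $answer = 6 * 7;
print "The answer is $answer\n";
</syntaxhighlight>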

If you have any locally patched files (we almost always do, especially UserMailer.php!), now is the time to check that the upgrade did not mess up your local changes. If it did, make adjustments as needed, then make sure to git commit everything.

At this point, your wiki should be up and running the latest version of MediaWiki. Notify the users of the wiki as needed, then close out the support ticket, noting any problems you encountered. Upgrading via patch is a very straightforward procedure, but major upgrades are not! Watch for a future post on that.