
pg_healer: repairing Postgres problems automatically

Sometimes, the elephant gets hurt - inducing database errors! Data corruption is a fact of life in working with computers, and Postgres is not immune. With the addition of the "data checksums" feature, detecting such corruption is now much easier. But detection is not enough - what happens after the corruption is detected? What if Postgres could fix the problem all by itself - what if we could give the elephant a mutant healing power?!?

Now we can. I wrote an extension named pg_healer that does just that - detects corruption issues, and automatically repairs them. Let's see how it works with a demonstration. For this, we will be purposefully corrupting the "pgbench_branches" table, part of the venerable pgbench utility.

For the initial setup, we will create a new Postgres cluster and install the pgbench schema. The all-important checksum feature needs to be enabled when we initdb, and we will use a non-standard port for testing:

$ initdb --data-checksums dojo
The files belonging to this database system will be owned by user "greg".
...
Data page checksums are enabled.

creating directory dojo ... ok
creating subdirectories ... ok
...
$ echo port=9999 >> dojo/postgresql.conf
$ pg_ctl start -D dojo -l log.dojo.txt
server starting
$ createdb -p 9999 $USER
$ pgbench -p 9999 -i
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.35 s, remaining 0.00 s)
vacuum...
set primary keys...
done.

Next, we install the pg_healer extension. As it needs to access some low-level hooks, we need to load it on startup, by adding a line to the postgresql.conf file:

$ git clone git://github.com/turnstep/pg_healer.git
Cloning into 'pg_healer'...
$ cd pg_healer
$ make install
gcc -Wall -Wmissing-prototypes ... -c -o pg_healer.o pg_healer.c
gcc -Wall -Wmissing-prototypes ... -shared -o pg_healer.so pg_healer.o
...
$ echo "shared_preload_libraries = 'pg_healer'" >> dojo/postgresql.conf
$ pg_ctl restart -D dojo -l log.dojo.txt
waiting for server to shut down.... done
server stopped
server starting
## Make sure the extension has loaded cleanly.
## If it did not, the log file would complain
$ tail -2 log.dojo.txt
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

Now for the fun part. We want to purposefully corrupt the file containing the data for the pgbench_branches table, simulating a failing hard drive or other really serious problem - the type of problem that normally causes the DBA to get paged in the middle of the night. Before we do that, we want to take a peek at the contents of that table, and then find out which actual disk files contain the table:

$ psql -p 9999 -c "select * from pgbench_branches"
 bid | bbalance | filler 
-----+----------+--------
   1 |        0 | 
(1 row)

$ psql -p 9999 -Atc "select format('%s/%s',
  current_setting('data_directory'),
  pg_relation_filepath('pgbench_branches'))"
/home/greg/pg_healer/dojo/base/16384/198461

## That file is too cumbersome to keep typing out, so:
$ ln -s /home/greg/pg_healer/dojo/base/16384/198461 myrelfile

Let's throw a deadly shuriken right into the middle of it!

## Here is what the file looks like in its original uncorrupted form
## (the two-byte page checksum sits at offset 0x08; here 74 66, i.e. 26228):
$ xxd -a -g1 -u myrelfile
00000000: 00 00 00 00 30 69 BC 37 74 66 04 00 1C 00 E0 1F  ....0i.7tf......
00000010: 00 20 04 20 00 00 00 00 E0 9F 40 00 00 00 00 00  . . ......@.....
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00001fe0: F7 0B 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00001ff0: 01 00 03 00 01 09 18 03 01 00 00 00 00 00 00 00  ................

## Good ol' dd is the right tool for the job here:
$ echo -n "Shuriken!" | dd conv=notrunc oflag=seek_bytes seek=4000 bs=9 count=1 of=myrelfile
1+0 records in
1+0 records out
9 bytes (9 B) copied, 0.000156565 s, 57.5 kB/s

## Take a peek inside the file to make sure the shuriken got embedded deeply:
$ xxd -a -g1 -u myrelfile
00000000: 00 00 00 00 30 69 BC 37 74 66 04 00 1C 00 E0 1F  ....0i.7tf......
00000010: 00 20 04 20 00 00 00 00 E0 9F 40 00 00 00 00 00  . . ......@.....
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00000fa0: 53 68 75 72 69 6B 65 6E 21 00 00 00 00 00 00 00  Shuriken!.......
00000fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00001fe0: F7 0B 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00001ff0: 01 00 03 00 01 09 18 03 01 00 00 00 00 00 00 00  ................

(Photo by kahvikisu: these shurikens are not so deadly, but quite yummy!)

Now that we've messed up the file, watch closely what happens when we try to read from it. We are going to do this three times. The first time, the table will still be in the shared buffer cache, and thus will show no error. The second time, the table will be read from the disk and throw an error. At this point, pg_healer will see the error and repair it. The final read will pull from the completely healed table:


$ psql -p 9999 -c "select * from pgbench_branches"
 bid | bbalance | filler 
-----+----------+--------
   1 |        0 | 
(1 row)

## This will force the table out of shared_buffers, so that the next
## time it is accessed, Postgres must read from the disk:
$ psql -p 9999 -qtc "select pg_healer_remove_from_buffer('pgbench_branches')"

$ psql -p 9999 -c "select * from pgbench_branches"
WARNING:  page verification failed, calculated checksum 9478 but expected 26228
INFO:  File has been healed: base/16384/198461 (intrinsic healing)
ERROR:  invalid page in block 0 of relation base/16384/198461

## Mutant healing power was activated. Observe:
$ psql -p 9999 -c "select * from pgbench_accounts"
 bid | bbalance | filler 
-----+----------+--------
   1 |        0 | 
(1 row)

The corruption we created before changed the "free space" portion of the Postgres "page" structure. There are multiple ways pg_healer can fix things: this demonstrates one of the "intrinsic" fixes, which need no outside information to repair the page. Corruption can occur anywhere on the page, of course, including inside your data (as opposed to the meta-data or free space). One method of fixing that is for pg_healer to use another copy of the table to repair the original.
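
To make the page layout more concrete, here is a rough Python sketch (not pg_healer's actual C code) that reads one 8 kB page from the relation file, unpacks the 24-byte page header (the field order follows Postgres's PageHeaderData struct), and checks whether the free space between pd_lower and pd_upper contains anything other than zeroes. In the freshly loaded table from this demo that region should be all zeroes, so anything found there is the corruption we injected; the actual repair (zeroing the region and recomputing the page checksum) is left out of the sketch.

import struct

PAGE_SIZE = 8192
# PageHeaderData: pd_lsn (two uint32s), then pd_checksum, pd_flags, pd_lower,
# pd_upper, pd_special, pd_pagesize_version (all uint16), pd_prune_xid (uint32)
HEADER_FMT = "<IIHHHHHHI"
HEADER_SIZE = struct.calcsize(HEADER_FMT)   # 24 bytes

def inspect_page(path, block=0):
    with open(path, "rb") as f:
        f.seek(block * PAGE_SIZE)
        page = f.read(PAGE_SIZE)

    (lsn_hi, lsn_lo, checksum, flags, lower, upper,
     special, size_version, prune_xid) = struct.unpack(HEADER_FMT, page[:HEADER_SIZE])

    print("expected checksum: %d" % checksum)
    print("free space: byte offsets %d to %d" % (lower, upper))

    # Any non-zero byte between pd_lower and pd_upper is damage that could be
    # "intrinsically" healed by zeroing the region and recomputing the checksum
    dirty = any(b != 0 for b in bytearray(page[lower:upper]))
    print("free space is %s" % ("corrupted" if dirty else "clean"))

inspect_page("myrelfile")

Run against myrelfile after the shuriken throw above, this should report the expected checksum of 26228 and flag the free space as corrupted.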

While eventually pg_healer will be able to reach out to replicas for a copy of the (non-corrupted) table data it needs, a simpler method is to keep a good copy inside the data directory. There is a helper function that does just that, copying the important files to a new directory. Details on how this copy is kept refreshed will be covered later; for now, let's see it in action and observe how it can help Postgres heal itself from more serious corruption problems:

$ psql -p 9999 -c 'create extension pg_healer'
CREATE EXTENSION
$ psql -p 9999 -qc 'checkpoint'
$ psql -p 9999 -c 'select pg_healer_cauldron()'

Rather than the free space, let's corrupt something a little more important: the line pointers, which indicate where each tuple (aka table row) is located inside the page. Extremely critical information, and it is about to get blown away with another deadly shuriken!

$ echo -n "Shuriken!" | dd conv=notrunc oflag=seek_bytes seek=20 bs=9 count=1 of=myrelfile
1+0 records in
1+0 records out
9 bytes (9 B) copied, 9.3577e-05 s, 96.2 kB/s
$ xxd -a -g1 -u myrelfile
00000000: 00 00 00 00 30 69 BC 37 74 66 04 00 1C 00 E0 1F  ....0i.7tf......
00000010: 00 20 04 20 53 68 75 72 69 6B 65 6E 21 00 00 00  . . Shuriken!...
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00001fe0: F7 0B 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00001ff0: 01 00 03 00 01 09 18 03 01 00 00 00 00 00 00 00  ................

$ psql -p 9999 -qtc "select pg_healer_remove_from_buffer('pgbench_branches')"

$ psql -p 9999 -c "select * from pgbench_branches"
WARNING:  page verification failed, calculated checksum 8393 but expected 26228
INFO:  File has been healed: base/16384/198461 (external checksum match)
ERROR:  invalid page in block 0 of relation base/16384/198461

$ psql -p 9999 -c "select * from pgbench_branches"
 bid | bbalance | filler 
-----+----------+--------
   1 |        0 | 
(1 row)

## Has the shuriken really been removed?
$ xxd -a -g1 -u myrelfile
00000000: 00 00 00 00 30 69 BC 37 74 66 04 00 1C 00 E0 1F  ....0i.7tf......
00000010: 00 20 04 20 00 00 00 00 E0 9F 40 00 00 00 00 00  . . ......@.....
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00001fe0: F7 0B 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00001ff0: 01 00 03 00 01 09 18 03 01 00 00 00 00 00 00 00  ................

Once again, pg_healer has repaired the file. This time, however, it reached out to a version of the file outside the data directory, copied the good page over the corrupted one, and then used the checksum to confirm that the changes were correct. This method only works, however, if the original file and the copy have the same checksum - which means that no changes have been made since the copy was created via pg_healer_cauldron(). As this is not always possible, there is a third method pg_healer can use, which is to examine things row by row and try to repair the damage.
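
As a rough illustration of that "external checksum match" approach, here is a simplified Python sketch (not the extension's C implementation): read the same block from the good copy made by pg_healer_cauldron(), and only overwrite the damaged page if both pages claim the same checksum in their headers. A real implementation would also confirm that the copy's page actually passes checksum verification before trusting it.

import struct

PAGE_SIZE = 8192

def header_checksum(page):
    # pd_checksum is the uint16 at byte offset 8 of the page header
    return struct.unpack_from("<H", page, 8)[0]

def heal_from_copy(live_path, copy_path, block):
    # Read the damaged page and the same block from the known-good copy
    with open(live_path, "r+b") as live, open(copy_path, "rb") as copy:
        live.seek(block * PAGE_SIZE)
        damaged = live.read(PAGE_SIZE)
        copy.seek(block * PAGE_SIZE)
        good = copy.read(PAGE_SIZE)

        # Only trust the copy if both pages claim the same checksum, i.e.
        # nothing has changed since pg_healer_cauldron() made the copy
        if header_checksum(damaged) != header_checksum(good):
            return False

        # "External checksum match": replace the bad page with the good one
        live.seek(block * PAGE_SIZE)
        live.write(good)
        return True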

For this final demo, we are going to change the table by adding a new row, which ensures that the checksums against the copy will no longer match. After that, we are going to add some corruption to one of the table rows (aka tuples), and see if pg_healer is able to repair the table:

$ psql -p 9999 -qtc 'insert into pgbench_branches values (2,12345)'
$ psql -p 9999 -qc 'checkpoint'

## Throw a shuriken right into an active row!
$ echo -n "Shuriken!" | dd conv=notrunc oflag=seek_bytes seek=8180 bs=9 count=1 of=myrelfile
1+0 records in
1+0 records out
9 bytes (9 B) copied, 0.000110317 s, 81.6 kB/s

## If you look closely, you will notice the checksum (the two bytes at offset 0x08) has also changed:
$ xxd -a -g1 -u myrelfile
00000000: 00 00 00 00 70 B0 8E 38 A4 8E 00 00 20 00 C0 1F  ....p..8.... ...
00000010: 00 20 04 20 00 00 00 00 E0 9F 40 00 C0 9F 40 00  . . ......@...@.
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
*
00001fc0: 05 0C 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00001fd0: 02 00 03 00 01 08 18 03 02 00 00 00 39 30 00 00  ............90..
00001fe0: F7 0B 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00001ff0: 01 00 03 00 53 68 75 72 69 6B 65 6E 21 00 00 00  ....Shuriken!...

$ psql -p 9999 -qtc "select pg_healer_remove_from_buffer('pgbench_branches')"

$ psql -p 9999 -c "select * from pgbench_branches"
WARNING:  page verification failed, calculated checksum 56115 but expected 36516
INFO:  File has been healed: base/16384/198461 (external tuple healing)
ERROR:  invalid page in block 0 of relation base/16384/198461

$ psql -p 9999 -c "select * from pgbench_branches"
 bid | bbalance | filler 
-----+----------+--------
   1 |        0 | 
   2 |    12345 | 
(2 rows)

There are still some rough edges, but for a proof of concept it works quite nicely. While reacting to corruption errors as they appear is nice, in the future I would like it to be more proactive, running as a background process that scans the database for any problems and fixes them. Ideally, it should be able to handle a wider class of table corruption problems, as well as problems in indexes, free space maps, system catalogs, etc. Please jump in and lend a hand - the project is on GitHub as pg_healer.

Data corruption is a fact of life DBAs must confront, be it from failing hard drives, cosmic rays, or other causes. While the detection of such errors was greatly improved in Postgres 9.3 with the --data-checksums argument to initdb (which ought to be on by default!), it's time to not just detect, but heal!

Executing Custom SQL in Django Migrations

Since version 1.7, Django has natively supported database migrations, similar to Rails migrations. The most fundamental difference between the two is the way the migrations are created: Rails migrations are written by hand, specifying the changes you want made to the database, while Django migrations are usually generated automatically to mirror the database schema in its current state.

Usually, Django’s automatic schema detection works quite nicely, but occasionally you will have to write some custom migration that Django can’t properly generate, such as a partial or functional index in PostgreSQL.

Creating an empty migration

To create a custom migration, it’s easiest to start by generating an empty migration. In this example, it’ll be for an application called blog:

$ ./manage.py makemigrations blog --empty -n create_custom_index
Migrations for 'blog':
  0002_create_custom_index.py:

This generates a file at blog/migrations/0002_create_custom_index.py that will look something like this:

# -*- coding: utf-8 -*-
# Generated by Django 1.9.4 on 2016-09-17 17:35
from __future__ import unicode_literals

from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ('blog', '0001_initial'),
    ]

    operations = [
    ]

Adding Custom SQL to a Migration

The best way to run custom SQL in a migration is through the migrations.RunSQL operation. RunSQL allows you to write code for migrating forwards and backwards - that is, for applying migrations and unapplying them. In this example, the first string passed to RunSQL is the forward SQL, the second is the reverse SQL.

# -*- coding: utf-8 -*-
# Generated by Django 1.9.4 on 2016-09-17 17:35
from __future__ import unicode_literals

from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ('blog', '0001_initial'),
    ]

    operations = [
        migrations.RunSQL(
            "CREATE INDEX i_active_posts ON posts(id) WHERE active",
            "DROP INDEX i_active_posts"
        )
    ]

Unless you’re using Postgres for your database, you’ll need to install the sqlparse library, which allows Django to break the SQL strings into individual statements.
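
For example, an operation like the hypothetical one below packs two statements into each string; on any backend other than PostgreSQL, Django hands those strings to sqlparse to split them into individual statements before executing them (the posts columns and index names here are made up for illustration):

from django.db import migrations

# This goes in a migration's operations list, just like the single-statement
# example above
multi_statement_indexes = migrations.RunSQL(
    # forward SQL: two statements in one string
    "CREATE INDEX i_posts_created ON posts(created); "
    "CREATE INDEX i_posts_author ON posts(author_id);",
    # reverse SQL
    "DROP INDEX i_posts_created; DROP INDEX i_posts_author;"
)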

Running the Migrations

Running your migrations is easy:

$ ./manage.py migrate
Operations to perform:
  Apply all migrations: blog, sessions, auth, contenttypes, admin
Running migrations:
  Rendering model states... DONE
  Applying blog.0002_create_custom_index... OK

Unapplying migrations is also simple. Just provide the name of the app to migrate and the id of the migration you want to go to, or “zero” to reverse all migrations on that app:

$ ./manage.py migrate blog 0001
Operations to perform:
  Target specific migration: 0001_initial, from blog
Running migrations:
  Rendering model states... DONE
  Unapplying blog.0002_create_custom_index... OK

Hand-written migrations can be used for many other operations, including data migrations. Full documentation for migrations can be found in the Django documentation.
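
For instance, a data migration can use migrations.RunPython rather than RunSQL. The sketch below assumes a hypothetical blog.Post model with a slug field that needs backfilling, so treat it as a template rather than something that will run against the schema from this post:

# blog/migrations/0003_backfill_slugs.py (hypothetical data migration)
from __future__ import unicode_literals

from django.db import migrations


def backfill_slugs(apps, schema_editor):
    # Use the historical model from the app registry, not a direct import
    Post = apps.get_model('blog', 'Post')
    for post in Post.objects.filter(slug=''):
        post.slug = 'post-{}'.format(post.pk)
        post.save(update_fields=['slug'])


class Migration(migrations.Migration):

    dependencies = [
        ('blog', '0002_create_custom_index'),
    ]

    operations = [
        # Forward: run the Python function; reverse: do nothing
        migrations.RunPython(backfill_slugs, migrations.RunPython.noop),
    ]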


(This post originally covered South migrations and was updated by Phin Jensen to illustrate the now-native Django migrations.)

Josh and Lele Interviewed as Experts in Leading Cloud Technologies



End Point's Josh Williams and Lele Calò were recently interviewed for a piece that is now published on Clutch, a site offering B2B research and reviews. Josh and Lele were interviewed as experts in leading cloud technologies. The interview can be found here.

Josh and Lele began the interview by giving some background on End Point and its various offerings. These offerings include software development, hosting, consulting, and the Liquid Galaxy division.

Josh and Lele then answered questions regarding cloud technologies. These questions covered the technologies they work with, the challenges that cloud technologies help companies overcome, and recommendations for companies moving toward cloud services. Recently, End Point has had customers using Amazon Web Services [AWS], Rackspace, OVH, Google Cloud, Linode, and DigitalOcean. Josh and Lele agreed that cloud services are quite beneficial in dealing with varying usage profiles and load inconsistencies. They also agreed that when it comes to companies moving into cloud services, migrating components in a modular fashion rather than all at once is a key to success.

Clutch asked Josh and Lele about the costs of utilizing cloud technologies. Lele explained that the cost will vary based on what the company needs to do with its cloud infrastructure. Josh added that companies need to properly evaluate their options in order to determine what is best, and that if one is in a position to architect their infrastructure to take advantage of the cloud, they should do so.

Next, Josh and Lele were asked about how they go about choosing the right platform for projects. They made sure to stress that needs vary for every client, and that the right platform for one client may not be the right platform for another. The two agreed that AWS has at times been an appealing option due to its many features. For projects where fewer features are necessary and cost is tight, Rackspace has been a good option. Josh and Lele also touched on the support that the various cloud technologies offer, and their strengths and weaknesses. Finally, Josh and Lele were asked to rate different platforms in terms of functionality, ease of use, response time, whether they would recommend the service, and what their satisfaction level has been.

If you're interested in learning more about our cloud services, please visit endpoint.com or contact us here for more information.

Surviving the Framework Hype Cycle (MWRC 2016)

Back in March, I attended MountainWest RubyConf with my co-workers Phunk and Phin. Shortly thereafter, Phin wrote about a couple of his favorite talks: Writing a Test Framework from Scratch and How to Build a Skyscraper.

I've found that another talk from the conference has stuck with me and I've referred to it several times in conversations. It is "Surviving the Framework Hype Cycle" by Brandon Hays. It is funny, engaging, insightful, and especially cathartic given the rapid pace of change in frameworks these days.

It can be tough to set aside the time to focus on a conference talk video, but I think this one is well worth it for programmers!

The 5 phases of the framework hype cycle that he identifies and elaborates on are:

  1. Technology Trigger
  2. Peak of Inflated Expectations
  3. Trough of Disillusionment
  4. Slope of Enlightenment
  5. Plateau of Productivity

And he shows how different groups of people with different needs can benefit at different stages, akin to settling new land:

  • Pioneers
  • Settlers
  • Town Planners

The ebb and flow of technology fads can be a lot easier to navigate when we realize there isn't an absolute right and wrong to technology choices, and we must consider the project and its maturity level, the people involved, and the current state of the technologies under consideration.

See also the Surviving the Framework Hype Cycle video page and Brandon's @tehviking Twitter account.

Ruby Fight Club

(Photo by Peter Gordon)

First Rule: Do Not Talk About Ruby Fight Club

This post may get me kicked out for talking about the club, but…I was asked to share a few thoughts about something we tried out earlier this year at End Point. We ran an internal meetup dubbed Ruby Fight Club to read and discuss Practical Object-Oriented Design in Ruby (POODR) by Sandi Metz.

In the past we've met together as a company and in smaller team- or project-focused groups, but this was the first time we used a book as the guiding, recurring topic with a smaller, committed group who made time for it. Attendance was optional, and we did have some folks who opted out, even though they loved the subject matter and the idea, because they knew they couldn't make it most of the time. I think this made the group tighter-knit.

Our group of six remote engineers met weekly for one hour to discuss one chapter from the book. We worked through each chapter together which often led to Q & A sessions and deeper discussions about Ruby and CS theory in general. Each week, one member of the team would lead the discussion. It required some preparation in advance and gave us all a chance to work on our presenting and moderating skills. We used Google Hangouts to meet and the presenter would share their screen to show slides and example code.

Honing Our Craft

As consultants we have to balance technical perfection with other considerations — most often budget and timelines/deadlines. This reality does not lend itself well to developing and learning to apply new technical concepts. It's also challenging to apply these new concepts in large, established projects where our role is to maintain and extend existing code. The conventions and architecture are well established and it's hard to push against projects with inertia like this. Setting some time aside each week not tied to client projects gave us the opportunity to focus on our skills and develop them together. Although several in the group had read the POODR book before, they reported that working through each chapter and discussing it in a group setting greatly increased their understanding of the material.

Face Time

Our company has a large number of full-time remote engineers. At the time of this writing, close to 90% of us work in a location other than our head office in NYC. Remote work allows us the freedom to live where we want (I'm writing this from my home office in a small town in the interior of British Columbia, Canada) but requires extra effort to build team camaraderie and social connections. Ruby Fight Club was great for this. Although Hangouts, Skype, and other web-conferencing apps have their challenges, everyone let us know they appreciated being able to spend time with the group each week. We'd often chat with each other for a few minutes before and after our book discussions.

Discussing the book (somewhat) face-to-face over video also increased the fidelity and efficiency of our discussions. When you can see people's faces, expressions and gestures it's much easier to glean information and understand them. It feels silly and obvious to say this after the fact but it's novel when the majority of your workday communication is done over IRC/Slack, email and phone/audio.

Got Meetups?

During one of our meetings I asked the group if anyone went to meetups local to them. I was surprised to learn that none in the group did so. For every person the response was similar: “I would but there aren't any groups meeting nearby”. I had assumed that I was the odd one out in this way because I live in a small town. The leader of our group lives in Chicago and I thought for sure there would be a meetup for him to attend but there wasn't. The closest group to him required a long trip across town on public transit. Occasionally I do travel 45-60 mins (each way!) from home to attend meetups in larger, nearby towns. People are often surprised I travel so far but it's worth it to me.

What Worked Well

We stuck to the time limit (mostly) which was a good thing. We found we usually had enough time to socialize a bit, discuss the book, have a Q & A time and then get back to our client work. Sharing the presenting / discussion leading role was also nice. Leading a session took a little time to prepare but because we all took turns doing this we shared the load. I'd like to personally thank Phunk (our fearless Ruby Fight Club leader) for having the idea and executing so well on it. We all really appreciate it and look forward to running more meetups like it in future.

What We Might Do Differently Next Time

I think it would be good to allow a little more time for socializing and chat. This happened naturally in Ruby Fight Club but it would be cool to devote a block of time specifically to it. It would also be nice to allow time for people to do a show and tell / lightning talk about something they are interested in. That said, having a book to work through was really great. Every week there is something to discuss and it was often a starting point to further discussions. What are you doing at your companies to stay sharp and connected? Let us know, we'd love to hear from you!

Reach customers and drive sales with MailChimp

It's a good idea for ecommerce stores to regularly contact their customers. This not only reminds customers that your business exists, but also allows the sharing of new products and resources that can enrich the lives of your customers and clients. One of the easiest ways to stay in touch is by using an email newsletter service, such as MailChimp.

MailChimp offers the regular suite of email newsletter services: lists, campaigns, and reports. In addition, it allows an ecommerce store to integrate sales data back into MailChimp. When you have detailed shopping statistics for each subscriber, it opens up new possibilities for customized marketing campaigns.

Endless possibilities

For example, imagine you have an email mailing list with 1,000 recipients. Instead of mailing the same generic newsletter to each subscriber, what if you could segment the list to identify your 100 best customers, and email them a special campaign?

Additional ideas could include:

  • Reach out to inactive subscribers, offering a coupon
  • Invite your best customers to a secret sale
  • Re-engage customers who placed items in their cart, but left without purchasing
  • Offer complementary products to purchasers of Product X

Automatic marketing

Once your store has sales data for subscribers, and you've decided on the campaigns you want to run with this data, the next step is to automate the process. This is where MailChimp's Automation feature comes in. Spend some time up-front to craft the automated campaigns, then sit back and let MailChimp run them for you; day in, day out.

Steps to implement

There are several off-the-shelf integrations for ecommerce stores, including Magento and BigCommerce.

Users of Perl and Interchange can use our newly released tools: the Mail::Chimp3 CPAN module and the integration for Interchange 5.

Contact us today for expert help integrating one of these solutions with your ecommerce store.

Go beyond the simple newsletter

Most businesses already have an email newsletter. Hopefully, you are sending regular email campaigns with it. This is a great first step. The next step is to go beyond this: segment your email list and reach out to each segment with information relevant to it. Not only can this increase your sales, but it also respects your clients' and customers' time and preferences. It's a win-win for all.

Additional resource: MailChimp for Online Sellers

Create categories in Virtuemart 3 with Joomla 2.5 and 3.5 programmatically

Introduction

Code that I'm going to show you is ready to download here: https://github.com/peter-hank/com_morevirtuemart

Virtuemart is an open-source e-commerce application written in PHP. It's pretty popular with a 4% share of the whole e-commerce market (https://blog.aheadworks.com/2015/05/ecommerce-platforms-popularity-may-2015-two-platforms-take-half/). Today I will show you how to extend it and use its functionality from an external Joomla (free and open-source content management system) component.

Creating a component

We are going to create a new component to show the code in a nice and clear form. I'm using a component generator from here: https://www.component-creator.com. We don't need any tables, models, or views for the purpose of this blog post; a plain and simple component is all we need. After creating the component, download it and install it through the Joomla administration interface.

Component overview

The component structure looks like this:

# tree components/com_morevirtuemart
components/com_morevirtuemart
├── controller.php
├── controllers
│   └── index.html
├── helpers
│   ├── index.html
│   └── morevirtuemart.php
├── index.html
├── models
│   ├── fields
│   │   ├── createdby.php
│   │   ├── filemultiple.php
│   │   ├── foreignkey.php
│   │   ├── index.html
│   │   ├── modifiedby.php
│   │   ├── submit.php
│   │   ├── timecreated.php
│   │   └── timeupdated.php
│   ├── forms
│   │   └── index.html
│   └── index.html
├── morevirtuemart.php
├── router.php
└── views
    └── index.html

We don't need anything more than the generic controller in the main directory. Its content is not very exciting for now:

<?php

// No direct access
defined('_JEXEC') or die;

jimport('joomla.application.component.controller');

/**
 * Class MorevirtuemartController
 *
 * @since  1.6
 */
class MorevirtuemartController extends JControllerLegacy
{
 /**
  * Method to display a view.
  *
  * @param   boolean $cachable  If true, the view output will be cached
  * @param   mixed   $urlparams An array of safe url parameters and their variable types, for valid values see {@link JFilterInput::clean()}.
  *
  * @return  JController   This object to support chaining.
  *
  * @since    1.5
  */
 public function display($cachable = false, $urlparams = false)
 {
  $view = JFactory::getApplication()->input->getCmd('view', '');
  JFactory::getApplication()->input->set('view', $view);

  parent::display($cachable, $urlparams);

  return $this;
 }
}

We can remove the display function and replace it with a function called createCategory, so the MorevirtuemartController class will look like this:

class MorevirtuemartController extends JControllerLegacy
{
 public function createCategory ()
 {
  echo 'I\'m alive!';

  die();
 }
}

To test that our component is working, open Joomla with this URL: index.php?option=com_morevirtuemart&task=createCategory. Of course, you need to prepend it with your domain name to make it work. The result should be an empty page containing just our text: "I'm alive!"

Now you know how to call a controller task within a browser.

Creating a category

To use Virtuemart classes we need to initialize all of its logic and configuration. Look here:

<?php

// No direct access
defined('_JEXEC') or die;

jimport('joomla.application.component.controller');

// loading Virtuemart classes and configuration
if (!class_exists( 'VmConfig' )) require(JPATH_ADMINISTRATOR . DS . 'components' . DS . 'com_virtuemart'.DS.'helpers'.DS.'config.php');
$config= VmConfig::loadConfig();
if (!class_exists( 'VirtueMartModelVendor' )) require(JPATH_VM_ADMINISTRATOR.DS.'models'.DS.'vendor.php');
if(!class_exists('TableMedias')) require(JPATH_VM_ADMINISTRATOR.DS.'tables'.DS.'medias.php');
if(!class_exists('TableCategories')) require(JPATH_VM_ADMINISTRATOR.DS.'tables'.DS.'categories.php');
if (!class_exists( 'VirtueMartModelCategory' )) require(JPATH_VM_ADMINISTRATOR.DS.'models'.DS.'category.php');

/**
 * Class MorevirtuemartController
 *
 * @since  1.6
 */
class MorevirtuemartController extends JControllerLegacy
{
 public function createCategory ()
 {
  echo 'I\'m alive!';
  die();
 }
}

What we are doing here is loading the needed Virtuemart classes and initializing the Virtuemart configuration. We are all set now and ready for the implementation.

The most difficult thing here is dealing with Virtuemart's internal authorization system. Before every CRUD action there is a check for authorization access. We need to find a workaround for this; there are multiple ways of doing so, and I will show you how I did it. Create a new file in components/com_morevirtuemart/models and call it category.php. Inside this file put this piece of code:

<?php

// No direct access
defined('_JEXEC') or die;

class VirtueMartModelCategoryLocal extends VirtueMartModelCategory {
 public function __construct() {
  parent::__construct();
 }
}

I hope it's pretty clear what we are doing here: extending the Virtuemart category model. What I'm going to do now is copy the store function from the VirtueMartModelCategory class, put it in our local category model class, and remove the authorization check from it (you will need to secure it yourself, using a different method). The result of this is:

<?php

// No direct access
defined('_JEXEC') or die;

class VirtueMartModelCategoryLocal extends VirtueMartModelCategory {
  public function __construct() 
  {
    parent::__construct();
  }

  public function store(&$data) 
  {
    $table = $this->getTable('categories');

    if ( !array_key_exists ('category_template' , $data ) ){
      $data['category_template'] = $data['category_layout'] = $data['category_product_layout'] = 0 ;
    }
    if(VmConfig::get('categorytemplate') == $data['category_template'] ){
      $data['category_template'] = 0;
    }

    if(VmConfig::get('categorylayout') == $data['category_layout']){
      $data['category_layout'] = 0;
    }

    if(VmConfig::get('productlayout') == $data['category_product_layout']){
      $data['category_product_layout'] = 0;
    }

    $table->bindChecknStore($data);

    if(!empty($data['virtuemart_category_id'])){
      $xdata['category_child_id'] = (int)$data['virtuemart_category_id'];
      $xdata['category_parent_id'] = empty($data['category_parent_id'])? 0:(int)$data['category_parent_id'];
      $xdata['ordering'] = empty($data['ordering'])? 0: (int)$data['ordering'];

        $table = $this->getTable('category_categories');

      $table->bindChecknStore($xdata);

    }

    $this->clearCategoryRelatedCaches();

    return $data['virtuemart_category_id'] ;
  }
}

In addition, I removed the call to media creation. If you need to create media along with the categories, you will need to extend that as well, the same way we did with the category model. We are now all set to create a new category from the component controller; we just need to include the extended model, instantiate it in the controller's constructor, and call its store function. Our controller should now look like this:

<?php

// No direct access
defined('_JEXEC') or die;

jimport('joomla.application.component.controller');

// (keep the Virtuemart requires from the earlier listing here)
// load our extended category model
if (!class_exists('VirtueMartModelCategoryLocal')) require(JPATH_COMPONENT . DS . 'models' . DS . 'category.php');

class MorevirtuemartController extends JControllerLegacy
{
 public function __construct()
 {
  parent::__construct();

  $this->categoryModel = new VirtueMartModelCategoryLocal();
 }

 public function createCategory ()
 {
  $catData = [
   'category_name' => 'Brand new Virtuemart category',
   'category_parent_id' => 0,
   'published' => 0
  ];

  $catId = $this->categoryModel->store($catData);

  echo 'Created a new category';

  die();
 }
}

Wow! The new category should be there. There are many more attributes available to be set; we've used the needed minimum: the category name, its parent, and its published status.

The end

There are a few things I should mention:

  • be sure to add authentication to your code; you can use the Joomla authentication system,
  • be aware that Virtuemart code can change over time, so be careful with updates,
  • I recommend using a CLI script for this kind of operation.

Thanks for reading.