Welcome to End Point’s blog

Ongoing observations by End Point people

Parallel Inventory Access using PostgreSQL

Inventory management has a number of challenges. One of the more vexing issues with which I've dealt is that of forced serial access. We have a product with X items in inventory. We also have multiple concurrent transactions vying for that inventory. Under any normal circumstance, whether the count is a simple scalar, or is comprised of any number of records up to one record/quantity, the concurrent transactions are all going to hone in on the same record, or set of records. In doing so, all transactions must wait and get their inventory serially, even if doing so isn't of interest.

If inventory is a scalar value, we don't have much hope of circumventing the problem. And, in fact, we wouldn't want to under that scenario because each transaction must reflect the part of the whole it consumed so that the next transaction knows how much is left to work with.

However, if we have inventory represented with one record = one quantity, we aren't forced to serialize in the same way. If we have multiple concurrent transactions vying for inventory, and the sum of the need is less than that available, why must the transactions wait at all? They would normally line up serially because, no matter what ordering you apply to the selection (short of random), it'll be the same ordering for each transaction (and even an increasing probability of conflict with random as concurrency increases). Thus, to all of them, the same inventory record looks the "most interesting" and, so, each waits for the lock from the transaction before it to resolve before moving on.

What we really want is for those transactions to attack the inventory like an easter-egg hunt. They may all make a dash for the "most interesting" egg first, but only one of them will get it. And, instead of the other transaction standing there, coveting the taken egg, we want them to scurry on unabated and look for the next "most interesting" egg to throw in their baskets.

We can leverage some PostgreSQL features to accomplish this goal. The key for establishing parallel access into the inventory is to use the row lock on the inventory records as an indicator of a "soft lock" on the inventory. That is, we assume any row-locked inventory will ultimately be consumed, but recognize that it might not be. That allows us to pass over locked inventory, looking for other inventory to fill the need; but if we find we don't have enough inventory for our need, those locked records indicate that we should take another pass and try again. Eventually, we either get all the inventory we need, or we have consumed all the inventory there is, meaning less than we asked for but with no locked inventory present.

We write a pl/pgsql function to do all the dirty work for us. The function has the following args:

  • Name of table on which we want to apply parallel access
  • Query that retrieves all pertinent records, and in the desired order
  • Integer number of records we ultimately want locked for this transaction.
  • The function returns a setof ctid. Using the ctid has the advantage of the function needing to know nothing about the composition of the table and providing exceedingly fast access back to the records of interest. Thus, the function can be applied to any table if desired and doesn't depend on properly indexed fields in the case of larger tables.

    CREATE OR REPLACE FUNCTION getlockedrows (
           tname TEXT,
           query TEXT,
           desired INT
    AS $EOR$
       total   INT NOT NULL := 0;
       locked  BOOL NOT NULL := FALSE;
       myst    TEXT;
       myrec   RECORD;
       mytid   TEXT;
       found   TID[];
       loops   INT NOT NULL := 1;
       -- Variables: tablename, full query of interest returning ctids of tablename rows, and # of rows desired.
       RAISE DEBUG 'Desired rows: %', desired;
       May want a sanity limit here, based on loops:
       IF loops > 10 THEN
           RAISE EXCEPTION 'Giving up. Try again later.';
       END IF;
               total := 0;
               FOR myrec IN EXECUTE query
                   RAISE DEBUG 'Checking lock on id %',myrec.ctid;
                   mytid := myrec.ctid;
                   myst := 'SELECT 1 FROM '
                       || quote_ident(tname)
                       || ' WHERE ctid = $$'
                       || mytid
                       || '$$ FOR UPDATE NOWAIT';
                       EXECUTE myst;
                       -- If it worked:
                       total := total + 1;
                       found[total] := myrec.ctid;
                       -- quit as soon as we have all requested
                       EXIT outermost WHEN total >= desired;
                   -- It did not work
                       WHEN LOCK_NOT_AVAILABLE THEN
                           -- indicate we have at least one candidate locked
                           locked := TRUE;
               END LOOP; -- end each row in the table
               IF NOT locked THEN
                   -- We have as many in found[] as we can get.
                   RAISE DEBUG 'Found % of the requested % rows.',
                   EXIT outermost;
               END IF;
               -- We did not find as many rows as we wanted!
               -- But, some are currently locked, so keep trying.
               RAISE DEBUG 'Did not find enough rows!';
               RAISE EXCEPTION 'Roll it back!';
                   PERFORM pg_sleep(RANDOM()*0.1+0.45);
                   locked := FALSE;
                   loops := loops + 1;
       END LOOP outermost;
       FOR x IN 1 .. total LOOP
           RETURN NEXT found[x];
       END LOOP;

    The function makes a pass through all the records, attempting to row lock each one as it can. If we happen to lock as many as requested, we exit <<outermost>> immediately and start returning ctids. If we pass through all records without hitting any locks, we return the set even though it's less than requested. The calling code can decide how to react if there aren't as many as requested.

    To avoid artificial deadlocks, with each failed pass of <<outermost>>, we raise exception of the encompassing block. That is, with each failed pass, we start over completely instead of holding on to those records we've already locked. Once a run has finished, it's all or nothing.

    We also mix up the sleep times just a bit so any two transactions that happen to be locked into a dance precisely because of their timing will (likely) break the cycle after the first loop.

    Example of using our new function from within a pl/pgsql function:

       text_query := $EOQ$
    SELECT ctid
    FROM inventory
       AND status = 'AVAILABLE'
    ORDER BY age, location
       OPEN curs_inv FOR
           SELECT inventory_id
           FROM inventory
           WHERE ctid IN (
                   SELECT *
                   FROM getlockedrows(
           FETCH curs_inv INTO int_invid;
           UPDATE inventory
           SET status = 'SOLD'
           WHERE inventory_id = int_invid;
       END LOOP;

    The risk we run with this approach is that our ordering will not be strictly enforced. In the above example, if it's absolutely critical that the sort on age and location never be violated, then we cannot run our access to the inventory in parallel. The risk comes if T1 grabs the first record, T2 only needs one and grabs the second, but T1 aborts for some other reason and never consumes the record it originally locked.


    Uqbar said...

    Maybe I missed the actual thing.
    But in my implementation the inventory is nothing but a list of "inventory updates".
    To make an update I simply INSERT a new row for each variation.
    To un-make it I make another INSERT with reversed sign.
    To get the inventory level I "SELECT sum(...)".
    Then every while I consolidate the updates by replacing the rows with a single line. And possibly back the old lines up for history.
    It's a mirror of the "double entry accounting", also becasue inventory updates is actually inventory accounting.
    On an oldish 1.4 MHz centrino laptop (4500 RPMs disk, 512 MB RAM) I get the stock level in about 500 ms over 2M+ rows.
    The updates show 5- msec for insertions.
    I come with this implementation exactly to avoid synchronisation problems.
    And it works ... for me.

    Jon Jensen said...

    That's a reasonable way to do it too, though you'll likely need to implement an archive table at some point as your table grows.

    You also still would need to "claim" inventory as Mark describes, to deal with simultaneous batch inventory allocations, wouldn't you?

    One system we work with that is like what Mark referred to has an inventory table with over 10 million rows, and that's without storing the complete change history, which would easily balloon the size when considering updates to move inventory from one location to another, changing status, etc.

    Getting the stock level needs to take a lot less than 500 ms for this operation, even under heavy multi-user concurrency.

    But it's nice that you avoid any need to VACUUM! And if it works for you on old hardware, you can of course just upgrade for the foreseeable future if needed. :)