Inventory management has a number of challenges. One of the more vexing issues with which I've dealt is that of forced serial access. We have a product with X items in inventory. We also have multiple concurrent transactions vying for that inventory. Under any normal circumstance, whether the count is a simple scalar, or is comprised of any number of records up to one record/quantity, the concurrent transactions are all going to hone in on the same record, or set of records. In doing so, all transactions must wait and get their inventory serially, even if doing so isn't of interest.
If inventory is a scalar value, we don't have much hope of circumventing the problem. And, in fact, we wouldn't want to under that scenario because each transaction must reflect the part of the whole it consumed so that the next transaction knows how much is left to work with.
However, if we have inventory represented with one record = one quantity, we aren't forced to serialize in the same way. If we have multiple concurrent transactions vying for inventory, and the sum of the need is less than that available, why must the transactions wait at all? They would normally line up serially because, no matter what ordering you apply to the selection (short of random), it'll be the same ordering for each transaction (and even an increasing probability of conflict with random as concurrency increases). Thus, to all of them, the same inventory record looks the "most interesting" and, so, each waits for the lock from the transaction before it to resolve before moving on.
What we really want is for those transactions to attack the inventory like an easter-egg hunt. They may all make a dash for the "most interesting" egg first, but only one of them will get it. And, instead of the other transaction standing there, coveting the taken egg, we want them to scurry on unabated and look for the next "most interesting" egg to throw in their baskets.
We can leverage some PostgreSQL features to accomplish this goal. The key for establishing parallel access into the inventory is to use the row lock on the inventory records as an indicator of a "soft lock" on the inventory. That is, we assume any row-locked inventory will ultimately be consumed, but recognize that it might not be. That allows us to pass over locked inventory, looking for other inventory to fill the need; but if we find we don't have enough inventory for our need, those locked records indicate that we should take another pass and try again. Eventually, we either get all the inventory we need, or we have consumed all the inventory there is, meaning less than we asked for but with no locked inventory present.
We write a pl/pgsql function to do all the dirty work for us. The function has the following args:
The function returns a setof ctid. Using the ctid has the advantage of the function needing to know nothing about the composition of the table and providing exceedingly fast access back to the records of interest. Thus, the function can be applied to any table if desired and doesn't depend on properly indexed fields in the case of larger tables.
CREATE OR REPLACE FUNCTION getlockedrows (
tname TEXT,
query TEXT,
desired INT
)
RETURNS SETOF TID
STRICT
VOLATILE
LANGUAGE PLPGSQL
AS $EOR$
DECLARE
total INT NOT NULL := 0;
locked BOOL NOT NULL := FALSE;
myst TEXT;
myrec RECORD;
mytid TEXT;
found TID[];
loops INT NOT NULL := 1;
BEGIN
-- Variables: tablename, full query of interest returning ctids of tablename rows, and # of rows desired.
RAISE DEBUG 'Desired rows: %', desired;
<<outermost>>
LOOP
/*
May want a sanity limit here, based on loops:
IF loops > 10 THEN
RAISE EXCEPTION 'Giving up. Try again later.';
END IF;
*/
BEGIN
total := 0;
FOR myrec IN EXECUTE query
LOOP
RAISE DEBUG 'Checking lock on id %',myrec.ctid;
mytid := myrec.ctid;
myst := 'SELECT 1 FROM '
|| quote_ident(tname)
|| ' WHERE ctid = $$'
|| mytid
|| '$$ FOR UPDATE NOWAIT';
BEGIN
EXECUTE myst;
-- If it worked:
total := total + 1;
found[total] := myrec.ctid;
-- quit as soon as we have all requested
EXIT outermost WHEN total >= desired;
-- It did not work
EXCEPTION
WHEN LOCK_NOT_AVAILABLE THEN
-- indicate we have at least one candidate locked
locked := TRUE;
END;
END LOOP; -- end each row in the table
IF NOT locked THEN
-- We have as many in found[] as we can get.
RAISE DEBUG 'Found % of the requested % rows.',
total,
desired;
EXIT outermost;
END IF;
-- We did not find as many rows as we wanted!
-- But, some are currently locked, so keep trying.
RAISE DEBUG 'Did not find enough rows!';
RAISE EXCEPTION 'Roll it back!';
EXCEPTION
WHEN RAISE_EXCEPTION THEN
PERFORM pg_sleep(RANDOM()*0.1+0.45);
locked := FALSE;
loops := loops + 1;
END;
END LOOP outermost;
FOR x IN 1 .. total LOOP
RETURN NEXT found[x];
END LOOP;
RETURN;
END;
$EOR$
;
The function makes a pass through all the records, attempting to row lock each one as it can. If we happen to lock as many as requested, we exit <<outermost>> immediately and start returning ctids. If we pass through all records without hitting any locks, we return the set even though it's less than requested. The calling code can decide how to react if there aren't as many as requested.
To avoid artificial deadlocks, with each failed pass of <<outermost>>, we raise exception of the encompassing block. That is, with each failed pass, we start over completely instead of holding on to those records we've already locked. Once a run has finished, it's all or nothing.
We also mix up the sleep times just a bit so any two transactions that happen to be locked into a dance precisely because of their timing will (likely) break the cycle after the first loop.
Example of using our new function from within a pl/pgsql function:
...
text_query := $EOQ$
SELECT ctid
FROM inventory
WHERE sku = 'COOLSHOES'
AND status = 'AVAILABLE'
ORDER BY age, location
$EOQ$
;
OPEN curs_inv FOR
SELECT inventory_id
FROM inventory
WHERE ctid IN (
SELECT *
FROM getlockedrows(
'inventory',
text_query,
3
)
);
LOOP
FETCH curs_inv INTO int_invid;
EXIT WHEN NOT FOUND;
UPDATE inventory
SET status = 'SOLD'
WHERE inventory_id = int_invid;
END LOOP;
...
The risk we run with this approach is that our ordering will not be strictly enforced. In the above example, if it's absolutely critical that the sort on age and location never be violated, then we cannot run our access to the inventory in parallel. The risk comes if T1 grabs the first record, T2 only needs one and grabs the second, but T1 aborts for some other reason and never consumes the record it originally locked.
2 comments:
Maybe I missed the actual thing.
But in my implementation the inventory is nothing but a list of "inventory updates".
To make an update I simply INSERT a new row for each variation.
To un-make it I make another INSERT with reversed sign.
To get the inventory level I "SELECT sum(...)".
Then every while I consolidate the updates by replacing the rows with a single line. And possibly back the old lines up for history.
It's a mirror of the "double entry accounting", also becasue inventory updates is actually inventory accounting.
On an oldish 1.4 MHz centrino laptop (4500 RPMs disk, 512 MB RAM) I get the stock level in about 500 ms over 2M+ rows.
The updates show 5- msec for insertions.
I come with this implementation exactly to avoid synchronisation problems.
And it works ... for me.
That's a reasonable way to do it too, though you'll likely need to implement an archive table at some point as your table grows.
You also still would need to "claim" inventory as Mark describes, to deal with simultaneous batch inventory allocations, wouldn't you?
One system we work with that is like what Mark referred to has an inventory table with over 10 million rows, and that's without storing the complete change history, which would easily balloon the size when considering updates to move inventory from one location to another, changing status, etc.
Getting the stock level needs to take a lot less than 500 ms for this operation, even under heavy multi-user concurrency.
But it's nice that you avoid any need to VACUUM! And if it works for you on old hardware, you can of course just upgrade for the foreseeable future if needed. :)
Post a Comment