Even more so than Rails 3, NoSQL was a popular technical topic at RailsConf this year. I haven't had much exposure to NoSQL except for reading a few articles written by Ethan (Quick Thoughts on NoSQL Live Boston Conference, NoSQL Live: The Dynamo Derivatives (Cassandra, Voldemort, Riak), and Cassandra, Thrift, and Fibers in EventMachine), so I attended a few sessions to learn more.
First, it was reinforced several times that if you can read JSON, you should have no problem comprehending NoSQL. So, it shouldn't be too hard to jump into code examples! Next, I found it helpful when one of the speakers presented a high-level categorization of NoSQL, whether or not the categories meant much to me at the time:
- Key-Value Stores: Advantages include that this is the simplest possible data model. Disadvantages include that range queries are not straightforward and modeling can get complicated. Examples include Redis, Riak, Voldemort, Tokyo Cabinet, MemcacheDB.
- Document stores: Advantages include that the value associated with a key is a document that exposes a structure that allows some database operations to be performed on it. Examples include CouchDB, MongoDB, Riak, FleetDB.
- Column-based stores: Examples include Cassandra, HBase.
- Graph stores: Advantages include that this allows for deep relationships. Examples include Neo4j, HypergraphDB, InfoGrid.
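To make the first two categories a little more concrete, here is a minimal Ruby sketch. It uses plain in-memory hashes and arrays rather than any real datastore, and the product data, the constant names, and the `cheaper_than` helper are all made up for illustration.

```ruby
require 'json'

# Key-value style: the store maps a key to an opaque value (here, a JSON string).
# It can fetch by key, but it knows nothing about what is inside the value.
KEY_VALUE_STORE = {
  "product:1" => { "name" => "Tent",  "price" => 199.0 }.to_json,
  "product:2" => { "name" => "Stove", "price" => 49.0 }.to_json
}

# Document style: the store understands each document's structure,
# so it can answer questions about fields inside the document.
DOCUMENTS = [
  { "_id" => 1, "name" => "Tent",  "price" => 199.0 },
  { "_id" => 2, "name" => "Stove", "price" => 49.0 }
]

# Hypothetical helper: the kind of field query a document store can run directly.
def cheaper_than(documents, limit)
  documents.select { |doc| doc["price"] < limit }
end

# Key-value lookup: fetch by key, then decode in the application.
tent = JSON.parse(KEY_VALUE_STORE["product:1"])
puts tent["name"]

# Document query: filter on a field inside the documents.
puts cheaper_than(DOCUMENTS, 100).map { |doc| doc["name"] }.inspect
```

Answering the same price question against the key-value version would mean fetching and decoding every value in application code, which is one reason range queries are awkward there.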
In one NoSQL talk, Flip Sasser presented an example to demonstrate how an ecommerce application might be migrated to use NoSQL, which was the most efficient (and very familiar) way for me to gain an understanding of NoSQL use in a Rails application. Flip introduced the models and relationships shown here:
In the transition to NoSQL, the transaction model stays as is. As a purchase is created, the Notification.create method is called.
class Purchase < ActiveRecord::Base
  after_create :create_notification
  # model relationships
  # model validations
  def total
    quantity * product.price
  end
  protected
  # Calls Notification.create directly, so the same code keeps working
  # whether Notification is an ActiveRecord model or a key-value wrapper.
  def create_notification
    Notification.create(
      :action => "purchased #{quantity == 1 ? 'a' : quantity} #{quantity == 1 ? product.name : product.name.pluralize}",
      :description => "Spent a total of #{total}",
      :item => self,
      :user => user
    )
  end
end
Flip moves the Product class to a document store because it needs a lot of flexibility to handle the diverse product metadata. The structure of a product is defined in the Product class and nowhere else.
Before
class Product < ActiveRecord::Base
  serialize :info, Hash
end
After
class Product
  include MongoMapper::Document
  key :name, String
  key :image_path, String
  key :info, Hash
  timestamps!
end
The Notification class is moved to a Key-Value store. After a user completes a purchase, the create method is called to store a notification against the user that is to receive the notification.
Before
class Notification < ActiveRecord::Base
  # model relationships
  # model validations
end
After
require 'ostruct'
class Notification < OpenStruct
  class << self
    # Pushes one JSON-encoded notification onto each follower's list.
    # Red is presumably the application's Redis client, configured elsewhere.
    def create(attributes)
      message = "#{attributes[:user].name} #{attributes[:action]}"
      attributes[:user].follower_ids.each do |follower_id|
        Red.lpush("user:#{follower_id}:notifications", {
          :message => message,
          :description => attributes[:description],
          :timestamp => Time.now
        }.to_json)
      end
    end
  end
end
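The fan-out in Notification.create can be tried without a Redis server. The sketch below is only a stand-in under stated assumptions: `FakeRedis` and `fan_out` are hypothetical names, not part of the talk or of any Redis library, and `FakeRedis` imitates just the list behavior of the LPUSH and LRANGE commands.

```ruby
require 'json'
require 'ostruct'

# Hypothetical in-memory stand-in for the Redis client (`Red` in the post).
class FakeRedis
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # Like LPUSH: prepend the value to the list at key and return the new length.
  def lpush(key, value)
    @lists[key].unshift(value)
    @lists[key].length
  end

  # Like LRANGE: inclusive indexes, where -1 means the last element.
  # A missing key yields an empty list.
  def lrange(key, start, stop)
    @lists.fetch(key, [])[start..stop] || []
  end
end

# Hypothetical helper mirroring Notification.create: write one copy of the
# notification to every follower's list (fan-out on write).
def fan_out(redis, user, action, description)
  payload = {
    :message => "#{user.name} #{action}",
    :description => description
  }.to_json
  user.follower_ids.each do |follower_id|
    redis.lpush("user:#{follower_id}:notifications", payload)
  end
  payload
end

redis = FakeRedis.new
buyer = OpenStruct.new(:name => "Alice", :follower_ids => [2, 3])
fan_out(redis, buyer, "purchased a Tent", "Spent a total of 199.0")
puts redis.lrange("user:2:notifications", 0, -1).inspect
```

The trade-off in this design is that each purchase costs one write per follower, while reading a user's notifications becomes a single list lookup instead of a join.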
The user model remains an ActiveRecord model and uses the devise gem for user authentication, but is modified to retrieve the notifications, now an OpenStruct. The result is that whenever a user's friend makes a purchase, the user is notified of the purchase. In this simple example, a purchase contains one product only.
Before
class User < ActiveRecord::Base
  # user authentication here
  # model relationships
  def notifications
    Notification.where("friend_relationships.friend_id = notifications.user_id OR notifications.user_id = #{id}").
      joins("LEFT JOIN friend_relationships ON friend_relationships.user_id = #{id}")
  end
end
After
class User < ActiveRecord::Base
  # user authentication here
  # model relationships
  def followers
    User.where('users.id IN (friend_relationships.user_id)').
      joins("JOIN friend_relationships ON friend_relationships.friend_id = #{id}")
  end
  def follower_ids
    followers.map(&:id)
  end
  def notifications
    (Red.lrange("user:#{id}:notifications", 0, -1) || []).map do |notification|
      Notification.new(ActiveSupport::JSON.decode(notification))
    end
  end
end
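The read side can be sketched the same way. In this rough example a plain array stands in for the Redis list, with `unshift` playing the role of LPUSH, and the standard library's `JSON.parse` is swapped in for `ActiveSupport::JSON.decode` so it runs outside Rails. The `decode_notifications` helper and the sample messages are assumptions for illustration only.

```ruby
require 'json'
require 'ostruct'

# Hypothetical helper: turn raw JSON strings (as LRANGE would return them)
# into OpenStruct objects, treating a nil result as an empty list.
def decode_notifications(raw_entries)
  (raw_entries || []).map { |raw| OpenStruct.new(JSON.parse(raw)) }
end

# A plain array stands in for the Redis list; unshift mimics LPUSH,
# so the most recently pushed notification sits at index 0.
list = []
list.unshift({ "message" => "Alice purchased a Tent" }.to_json)
list.unshift({ "message" => "Bob purchased 2 Stoves" }.to_json)

# Reading the whole list therefore yields newest-first ordering.
decode_notifications(list).each { |notification| puts notification.message }
```

Because entries are pushed onto the head of the list, reading the full range returns the newest notification first with no sorting step.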
The disadvantages of the NoSQL and RDBMS hybrid are that data portability is limited and ActiveRecord plugins can no longer be used. But the general idea is that performance justifies the move to NoSQL for some data. In several sessions I attended, the speakers reiterated that you will likely never be in a situation where you'll only use NoSQL, but that it's another tool available to suit performance-related business needs. I later spoke with a few Spree developers and we concluded that the NoSQL approach may work well in some applications for product and variant data for improved performance with flexibility, but we didn't come to an agreement on where else this approach may be applied.

4 comments:
Steph, thanks for your usual informative posting. :)
It's interesting the degree to which MongoDB is of interest to the Rails community. I think people find the idea of working with deeply nested objects inside of a single "row" or "document" appealing. But what problem is it really solving? It makes sharding marginally less hideous than with a traditional RDBMS, because you cram the related data into a single logical entity and thus don't have to worry about crafting the data partitioning strategy to be foreign-key-sensitive. But sharding is still fundamentally hideous, rather like the query syntax MongoDB expects people to use. I can't help but think that MongoDB's apparent mass appeal owes largely to the ease with which one can work with it from the application layer. It's rather like MySQL in that respect, and indeed the project itself feels like the intellectual offspring of MySQL. Who cares if your data is badly organized and tends towards meaninglessness if it's easy right now?
Cassandra gives some good modeling flexibility that, with some finessing, can feel rather like the nested document approach Mongo promotes, while at the same time actually giving something of meaningful value. Control over consistency, true high availability, write scaling, etc. Of course, how many systems actually need these things?
It should be noted that a lot of NoSQL solutions use Thrift as their interface, and the Ruby Thrift client can leave one underwhelmed. And Ruby itself is just not the best for high-volume data processing. Don't expect to instantly process huge volumes of data in your Rails app just because you started using some distributed NoSQL database; you'll have a variety of application-level bottlenecks and may well be better off implementing a data management service in one of the JVM-based languages. I've been trying to pick my favorite Cassandra client in Scala for exactly that reason.
Thanks again.
- Ethan
Thanks for the supplementary information and comments, Ethan.
FWIW, I attended Michael Koziarski's talk on an Introduction to Cassandra and CassandraObject at the conference, where infrastructure was discussed, but I didn't think it fit into this article even though it was interesting.
A talk on CassandraObject? That's cool. I tried working with it a couple months back. A lot of good work has gone into it. However, I ultimately opted against using it. Specifically, it disregards (or did at the time; this may have changed) columns that you haven't told it about. I opted instead to build an attribute-mapping class that lets you perform simple column-to-attribute mappings, or more complex ones where you can map column names to collection attributes via pattern matching. This seemed to me like a more effective way to address the situation in which you may have any number of some structure within a single row, which is not an uncommon situation in these loosely-structured schemas.
Anyway, CassandraObject seems like it has some good stuff to offer; it just didn't meet my needs at the time.
- Ethan
Holy crap, a vanity Google search turned up a blog post about my presentation! Wow! I'm so happy my presentation actually HELPED someone - I thought I totally bombed!