ActiveRecord count and sum always operate on the database

I've always used ActiveRecord's count method in tests and occasionally saw what I considered strange behaviour in it's result. Today, I saw a validation failing because of my understanding of count, so I dug into why. Based on what I know now, I'm going to start using length and size in place of count unless my intention is truly the SQL notion of COUNT(*).

The difference in these three methods is:

  • count will always issue a SQL query in the form SELECT COUNT(*) FROM relation...

  • size will return a size of the association collection if it's already loaded. This is most like Array#size in that it operates on the in-memory collection. Otherwise, if the association collection is not loaded, it will call count which issues a query.

  • length will load the association collection and then call size for the return value. Because size will operate on the collection that was just loaded, length never results in a count operation. Also, if the collection is already loaded, length and size are equivalent.

So, the recommendation in the Rails source is that if you're going to need the records anyway, use length because it results in one less query. If you don't want to load all the records and really just want the size, use size.

So now to my original problem. I had a validation that was failing on create.

class Order < ActiveRecord::Base
  has_many :order_lines

  validate :maximum_number_of_items_exceeded

  def maximum_number_of_items_exceeded
    errors.add(:order_lines, 'must total 20 items or less') if order_lines.sum(:quantity) > 20

The problem turned out to be an assumption that the sum method was operating on the in-memory association collection that was in the process of being saved along with it's parent record. Instead, it was issuing a SQL query:

SELECT SUM("order_lines"."quantity") AS sum_id FROM "order_lines" WHERE "order_lines"."order_id" = $1 [["order_id", 59547231]]

What made this tricky to discover was that the validation was working correctly on update. Also confusing was that getting a size on this collection on create returned 1 so I knew there were order_lines to be summed. Finally, calling count on the collection always returned zero. That led me to discovering the difference between the two talked about above. The count was, of course, issuing:

SELECT COUNT(*) FROM "order_lines" WHERE "order_lines"."order_id" = $1 [["order_id", 925119186]]

So, the condition just needed to be changed to operate on the in-memory collection as was originally intended. So the validation method becomes:

   def maximum_number_of_items_exceeded
     total_quantity = { |total, quantity| total + quantity }
     errors.add(:order_lines, 'must total 20 items or less') if total_quantity > 20

Most of my understanding came from reading the Rails source - especially the comments which are pretty great. The relevant files are:

rails/activerecord/lib/active_record/associations/collection_association.rb rails/activerecord/lib/active_record/associations/collection_proxy.rb

as of Rails 4.1. These links were helpful too:

ActiveRecord: size vs count

count vs length vs size

Hopefully this will help you if you've ever wondered about the difference between these methods.