How Ruby on Rails ActiveRecord chaining works

Josh Brody

Here’s something that confused me for longer than I’d like to admit: when you chain where calls in Rails, you’re not executing multiple queries. You’re not even executing one query. You’re building a description of a query that hasn’t happened yet.

class PostsController < ApplicationController
  def index
    @posts = Post.where(active: true).where(user_id: params[:user_id])
  end
end

That @posts variable? It’s not a collection of posts. It’s a Relation object—a query waiting to happen. The actual SQL doesn’t fire until something forces it to.

The lazy evaluation trick

ActiveRecord is lazy. Not in a bad way—in a clever way.

When you call where, you don’t get results. You get back the same type of object you started with, but with your conditions accumulated. Call where again, same thing. The object keeps collecting constraints without ever touching the database.

Here’s a minimal demonstration. We’ll build a fake Relation class that tracks whether it’s been loaded:

class FakeRelation
  attr_reader :conditions

  def initialize
    @conditions = []
    @loaded = false
  end

  def where(condition)
    @conditions << condition
    self
  end

  def loaded?
    @loaded
  end

  def load
    @loaded = true
    self
  end
end

relation = FakeRelation.new
relation.where(active: true).where(admin: false)

puts "loaded before forcing: #{relation.loaded?}"
puts "conditions accumulated: #{relation.conditions.inspect}"

relation.load
puts "loaded after forcing: #{relation.loaded?}"

Output:

loaded before forcing: false
conditions accumulated: [{active: true}, {admin: false}]
loaded after forcing: true

The query only fires when something actually needs the data. each, map, to_a, rendering in a view—these trigger the load. Until then, it’s just a blueprint.

This is why you can do things like:

scope = Post.where(active: true)
scope = scope.where(user_id: 5) if filter_by_user?
scope = scope.order(:created_at) if sort_chronologically?
scope

You’re composing a query piece by piece. No database round-trips until you’re done building.

How chaining actually works

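The mechanism is simple once you see it: every query method returns an object that responds to the same query methods.

When where runs, it records your conditions, then returns a Relation. The simulations in this post return self to keep things short; real ActiveRecord returns a new Relation that carries the accumulated conditions and leaves the original untouched. Either way, that’s the whole trick. Because the return value still responds to where, order, and limit, you can immediately call another method on the result.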

Let’s trace exactly what happens with each method call:

class TracingRelation
  def initialize
    @conditions = []
    @order_by = nil
    @limit_value = nil
  end

  def where(condition)
    puts "where called with: #{condition.inspect}"
    @conditions << condition
    puts "  returning self (object_id: #{object_id})"
    self
  end

  def order(field)
    puts "order called with: #{field.inspect}"
    @order_by = field
    puts "  returning self (object_id: #{object_id})"
    self
  end

  def limit(n)
    puts "limit called with: #{n}"
    @limit_value = n
    puts "  returning self (object_id: #{object_id})"
    self
  end

  def to_sql
    parts = ["SELECT * FROM users"]
    if @conditions.any?
      # Flatten every condition hash into "column = value" clauses.
      clauses = @conditions.flat_map { |c| c.map { |k, v| "#{k} = #{v.inspect}" } }
      parts << "WHERE #{clauses.join(" AND ")}"
    end
    parts << "ORDER BY #{@order_by}" if @order_by
    parts << "LIMIT #{@limit_value}" if @limit_value
    parts.join(" ")
  end
end

puts "building query..."
puts ""

relation = TracingRelation.new
  .where(active: true)
  .where(role: "admin")
  .order(:created_at)
  .limit(10)

puts ""
puts "final SQL: #{relation.to_sql}"

Output:

building query...

where called with: {active: true}
  returning self (object_id: 672)
where called with: {role: "admin"}
  returning self (object_id: 672)
order called with: :created_at
  returning self (object_id: 672)
limit called with: 10
  returning self (object_id: 672)

final SQL: SELECT * FROM users WHERE active = true AND role = "admin" ORDER BY created_at LIMIT 10

Notice every method returns the same object. That’s what enables the chaining.
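
Returning the very same object is the simplest way to get chaining, but it is not the only one. ActiveRecord’s where returns a new Relation and leaves the receiver untouched. Here is a minimal sketch of that variant, using a hypothetical ImmutableRelation class rather than ActiveRecord’s actual implementation:

```ruby
# Hypothetical class for illustration; not ActiveRecord's real internals.
class ImmutableRelation
  attr_reader :conditions

  def initialize(conditions = [])
    @conditions = conditions.freeze
  end

  # Builds a new relation carrying the extra condition.
  # The receiver keeps the conditions it already had.
  def where(condition)
    self.class.new(conditions + [condition])
  end
end

base = ImmutableRelation.new
active = base.where(active: true)
admins = active.where(role: "admin")

puts base.conditions.length    # 0
puts active.conditions.length  # 1
puts admins.conditions.length  # 2
puts base.equal?(active)       # false
```

Chaining still works because each return value responds to where. The difference is that base stays reusable as a starting point for other queries.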

Why returning self matters

If where returned something other than self—say, nil or the conditions array—the chain would break:

class BrokenRelation
  def initialize
    @conditions = []
  end

  def where(condition)
    @conditions << condition
    @conditions  # returning the array instead of self
  end

  def limit(n)
    puts "limit called"
  end
end

relation = BrokenRelation.new

begin
  relation.where(active: true).limit(10)
rescue NoMethodError => e
  puts "chain broke: #{e.message}"
end

puts ""
puts "what where actually returned: #{relation.where(foo: true).class}"

Output:

chain broke: undefined method 'limit' for an instance of Array

what where actually returned: Array

The chain breaks because where returns an Array, and Array doesn’t have a limit method. Every chainable method has to return an object that responds to the rest of the chain: self in these simulations, a new Relation in ActiveRecord.

Building a more complete simulation

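Here’s a fuller simulation that shows how calling each triggers loading. To stay short it keeps its state on the class itself, so conditions would pile up across separate queries; treat it as a sketch only: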

class User
  def self.all_args
    @all_args ||= []
  end

  def self.where(*args)
    all_args << args
    self
  end

  def self.load
    puts "executing SQL with: #{all_args.join(", ")}"
    @results = [1, 2, 3]
  end

  def self.loaded?
    !@results.nil?
  end

  def self.each
    load unless loaded?
    @results.each { |r| yield r }
  end
end

users = User.where(active: true).where(admin: false)
puts "loaded before iteration: #{users.loaded?}"

puts ""
puts "now iterating..."
users.each do |user|
  puts "  got user: #{user}"
end

puts ""
puts "loaded after iteration: #{users.loaded?}"

Output:

loaded before iteration: false

now iterating...
executing SQL with: {active: true}, {admin: false}
  got user: 1
  got user: 2
  got user: 3

loaded after iteration: true

Each where call appends to the conditions list and returns self. The SQL generation happens later—when each calls load.
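
The class-level state above is a shortcut. In ActiveRecord, class-level query methods such as Post.where are delegated to a Relation built fresh for each call, so separate queries do not share conditions. Here is a rough sketch of that shape, using hypothetical SketchUser and SketchRelation classes (not ActiveRecord’s real internals):

```ruby
# Hypothetical stand-ins; real ActiveRecord is far more involved.
class SketchRelation
  attr_reader :model, :conditions

  def initialize(model, conditions = [])
    @model = model
    @conditions = conditions
  end

  # Returns a new relation instead of mutating this one.
  def where(condition)
    SketchRelation.new(model, conditions + [condition])
  end
end

class SketchUser
  # Every call builds a fresh relation, so queries never share state.
  def self.all
    SketchRelation.new(self)
  end

  # The class method only forwards to a new relation.
  def self.where(condition)
    all.where(condition)
  end
end

first = SketchUser.where(active: true)
second = SketchUser.where(admin: false)

puts first.conditions.length   # 1
puts second.conditions.length  # 1
```

Each query starts from an empty relation, so the second call does not inherit the first call’s filter.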

Conditional query building

One of the big wins of lazy evaluation is building queries dynamically:

class QueryBuilder
  def initialize
    @conditions = []
    @order_by = nil
  end

  def where(condition)
    @conditions << condition
    self
  end

  def order(field)
    @order_by = field
    self
  end

  def to_sql
    sql = "SELECT * FROM posts"
    if @conditions.any?
      clauses = @conditions.flat_map { |c| c.map { |k,v| "#{k} = #{v.inspect}" } }
      sql += " WHERE #{clauses.join(' AND ')}"
    end
    sql += " ORDER BY #{@order_by}" if @order_by
    sql
  end
end

def filtered_posts(params)
  scope = QueryBuilder.new
  scope = scope.where(category: params[:category]) if params[:category]
  scope = scope.where(author_id: params[:author_id]) if params[:author_id]
  scope = scope.where(published: true) if params[:published_only]
  scope = scope.order(params[:order_by]) if params[:order_by]
  scope
end

puts "no filters:"
puts filtered_posts({}).to_sql
puts ""

puts "category filter only:"
puts filtered_posts({category: "ruby"}).to_sql
puts ""

puts "multiple filters:"
puts filtered_posts({
  category: "ruby",
  author_id: 42,
  published_only: true,
  order_by: :created_at
}).to_sql

Output:

no filters:
SELECT * FROM posts

category filter only:
SELECT * FROM posts WHERE category = "ruby"

multiple filters:
SELECT * FROM posts WHERE category = "ruby" AND author_id = 42 AND published = true ORDER BY created_at

No database round-trips happen until you actually need the data. You’re just composing a query description in memory.

Why this matters in practice

Understanding lazy evaluation helps you avoid some common mistakes.

Debugging queries: If you’re trying to figure out what SQL Rails is generating, calling to_sql on a Relation shows you the query without executing it. Useful when your results look wrong and you need to see what’s actually being asked.

Post.where(active: true).where(user_id: 5).to_sql
# => "SELECT \"posts\".* FROM \"posts\" WHERE \"posts\".\"active\" = TRUE AND \"posts\".\"user_id\" = 5"

Console behavior: If you’ve ever wondered why typing a query in rails console immediately shows results, it’s because the console calls inspect on the return value, and inspect has to run the query (capped to a handful of records for display) to have something to print. The console is forcing evaluation that wouldn’t happen in your actual code.

# In console:
User.where(active: true) # Immediately shows results

# In your code:
@users = User.where(active: true) # Nothing happens yet
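
The console effect can be imitated in plain Ruby. This sketch assumes a hypothetical ConsoleRelation whose inspect has to fetch records before it can print them; fetch is a stand-in for a database round-trip, and the cap of three records is arbitrary:

```ruby
# Hypothetical class for illustration only.
class ConsoleRelation
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  # Stand-in for a database round-trip; counts how often it runs.
  def fetch(limit)
    @query_count += 1
    (1..limit).map { |i| "user#{i}" }
  end

  # Printing the object needs records, so it forces a query.
  def inspect
    "#<ConsoleRelation [#{fetch(3).join(', ')}]>"
  end
end

relation = ConsoleRelation.new
puts relation.query_count  # 0: assignment alone runs nothing

puts relation.inspect      # what a REPL does with each return value
puts relation.query_count  # 1: inspect forced the query
```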

The Relation object

What you’re actually working with is an ActiveRecord::Relation. It’s not an array. It’s not a model. It’s a query builder that happens to act like a collection when you need it to.

Relation includes Enumerable, which is why you can call map, select, find, and friends on it. But those methods trigger loading first. The Relation hands off to the loaded results.

Here’s a simulation of how Enumerable integration works:

class EnumerableRelation
  include Enumerable

  def initialize
    @conditions = []
    @loaded = false
    @results = nil
  end

  def where(condition)
    @conditions << condition
    self
  end

  def load
    return self if @loaded
    puts "loading results from database..."
    @results = ["user1", "user2", "user3"]
    @loaded = true
    self
  end

  def loaded?
    @loaded
  end

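  def each
    # Without a block, hand back an Enumerator instead of raising on yield.
    return to_enum(:each) unless block_given?
    load
    @results.each { |r| yield r }
  end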

  def reload
    puts "clearing cached results..."
    @loaded = false
    @results = nil
    self
  end
end

relation = EnumerableRelation.new.where(active: true)

puts "using map (triggers load):"
puts relation.map(&:upcase).inspect
puts ""

puts "using select (uses cached results):"
puts relation.select { |u| u.include?("1") }.inspect
puts ""

puts "after reload, next access triggers load again:"
relation.reload
puts relation.first

Output:

using map (triggers load):
loading results from database...
["USER1", "USER2", "USER3"]

using select (uses cached results):
["user1"]

after reload, next access triggers load again:
clearing cached results...
loading results from database...
user1

Once a Relation has loaded, it remembers the results. Subsequent iterations don’t re-query. If the underlying data might have changed, you need reload to force a fresh query.

The pattern in general

This “return self” pattern—sometimes called a fluent interface or method chaining—isn’t unique to Rails’ implementation of ActiveRecord. You see it in jQuery, in builder patterns, in query builders across languages.

A naive HTML builder shows the same idea:

class HtmlBuilder
  def initialize
    @elements = []
  end

  def div(content = nil, &block)
    if block
      @elements << "<div>"
      instance_eval(&block)
      @elements << "</div>"
    else
      @elements << "<div>#{content}</div>"
    end
    self
  end

  def p(content)
    @elements << "<p>#{content}</p>"
    self
  end

  def span(content)
    @elements << "<span>#{content}</span>"
    self
  end

  def to_html
    @elements.join("\n")
  end
end

html = HtmlBuilder.new
  .div("hello")
  .div { p("nested paragraph") }
  .span("footer")
  .to_html

puts html

Output:

<div>hello</div>
<div>
<p>nested paragraph</p>
</div>
<span>footer</span>

The core idea: instead of returning a result, return the object itself so the caller can keep calling methods. Accumulate state internally. Only produce a final result when explicitly asked.

It’s a good pattern when you have multi-step configuration. It’s a bad pattern when the intermediate states are confusing or when order of operations matters in non-obvious ways.
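
One concrete way the intermediate states get confusing: when a chainable method mutates and returns self, every variable holding the builder sees every later change. Here is a small sketch with a hypothetical MutableBuilder class:

```ruby
# Hypothetical class for illustration only.
class MutableBuilder
  attr_reader :conditions

  def initialize
    @conditions = []
  end

  # Mutates the receiver and returns it, like the simulations in this post.
  def where(condition)
    @conditions << condition
    self
  end
end

base = MutableBuilder.new.where(active: true)
admins = base.where(role: "admin")  # intended as a narrower query

puts base.equal?(admins)     # true: both names point at one object
puts base.conditions.length  # 2: base picked up the admin filter too
```

Returning a fresh object from each call avoids this aliasing, at the cost of an allocation per step.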

ActiveRecord gets away with it because the mental model is clear: you’re building a SQL query. Each method adds a clause. The query runs when you need data.

Summary

  • where and friends return a Relation, not results;
  • The query doesn’t execute until something forces enumeration;
  • Chaining works because each method returns a chainable object: self in the simulations here, a new Relation in ActiveRecord;
  • loaded? tells you if the query has run;
  • to_sql shows the query without running it;
  • The console’s inspect runs the query; your code won’t behave the same way.

Once you see the Relation as a query-in-progress rather than a result set, the behavior stops being surprising.
