Understanding Active Record With Active Record, Jr.
If you’re like me and decided to jump into learning Ruby on Rails head first, you’re probably vaguely familiar with a lot of terms, but don’t know what they mean or have any idea what they do. You’ve probably heard about something called Active Record, and you know that it somehow magically allows you to write a few lines of code to create tables in your database and models in your app that you can work with. You have no idea how this works – you just put your faith in this thing, and your app is functioning – awesome.
But knowing what’s going on under the hood is pretty interesting. Let’s get into it.
The Rails Guides describe it this way: Active Record is the M in MVC – the model – which is the layer of the system responsible for representing business data and logic. Active Record facilitates the creation and use of business objects whose data requires persistent storage to a database. It is an implementation of the Active Record pattern, which itself is a description of an Object Relational Mapping system.
And if you’re curious about Object Relational Mapping (ORM), the guides have a nice definition for that too:
Object-Relational Mapping … is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.
Active Record allows us to interact with our database, storing and accessing records without writing a single line of SQL.
In the context of Ruby on Rails, this library lets you set up your models and tables with very little effort. With a single command, rails generate model Post title:string body:text, you get a Post model that maps to a posts table, plus a migration file that will create that table. Rails generates this migration for you along with the model. It typically defines a change method that calls create_table :posts with a string column named title, a text column named body, and t.timestamps.
When you run rake db:migrate, that migration executes the SQL that creates a posts table in your database with a string column called “title”, a text column called “body”, an id for each record, and created_at and updated_at timestamps for each record (generated by t.timestamps).
From here, you could get into your rails console and begin creating post records with tons of new methods at your disposal, such as:
    p = Post.new
    => #<Post id: nil, title: nil, body: nil, created_at: nil, updated_at: nil>
    p.assign_attributes(title: "new post", body: "this is the first post")
    => nil
    p.save!
       (0.2ms)  begin transaction
      SQL (36.5ms)  INSERT INTO "posts" ("body", "created_at", "title", "updated_at") VALUES (?, ?, ?, ?)  [["body", "this is the first post"], ["created_at", Sat, 19 Apr 2014 19:23:59 UTC +00:00], ["title", "new post"], ["updated_at", Sat, 19 Apr 2014 19:23:59 UTC +00:00]]
       (1.7ms)  commit transaction
    => true
Where do these methods come from? And how does your app know these can be executed on your new Post class? This is where ActiveRecord::Base comes in.
When you look at the “post.rb” file in the models directory, you see a rather unimpressive file that looks like this:
    class Post < ActiveRecord::Base
    end
It doesn’t look like much is going on. But notice the < ActiveRecord::Base bit: if you understand a little about inheritance in Ruby, you’ll see that it makes your Post model inherit all the functionality provided by ActiveRecord::Base. Methods like .new, .assign_attributes, and .save! are available on any model you create because that model inherits them from ActiveRecord::Base.
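To make the inheritance point concrete, here is a minimal plain-Ruby sketch. TinyBase and Article are hypothetical names I made up for illustration. They are not Rails classes, and the real ActiveRecord::Base does far more than this.

```ruby
# TinyBase is a hypothetical stand-in for ActiveRecord::Base (not a Rails class).
class TinyBase
  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes
  end

  # Merge new values into the record's attributes.
  def assign_attributes(new_attributes)
    @attributes = @attributes.merge(new_attributes)
    nil
  end
end

# The subclass body is empty, yet it responds to everything TinyBase defines.
class Article < TinyBase
end

article = Article.new
article.assign_attributes(title: "new post")
puts article.attributes[:title]
puts Article.superclass
```

Article never defines new or assign_attributes itself; Ruby finds them by walking up to the superclass, which is the same mechanism Post relies on.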
I didn’t fully understand how this worked until I worked with Active Record, Jr., a Dev Bootcamp creation that models basic concepts of Active Record in a skeleton application.
Our challenge involved an application modeling the relationship between Dev Bootcamp cohorts and students. A student belongs to a cohort and a cohort could have many students.
We were given three separate models – a Student model, a Cohort model, and a Database model that bridged the app and our SQLite database. Here’s the student model:
    class Student < Database::Model
      def self.all
        Database::Model.execute("SELECT * FROM students").map do |row|
          Student.new(row)
        end
      end

      def self.create(attributes)
        record = self.new(attributes)
        record.save

        record
      end

      def self.where(query, *args)
        Database::Model.execute("SELECT * FROM students WHERE #{query}", *args).map do |row|
          Student.new(row)
        end
      end

      def self.find(pk)
        self.where('id = ?', pk).first
      end

      self.attribute_names = [:id, :cohort_id, :first_name, :last_name, :email,
                              :gender, :birthdate, :created_at, :updated_at]

      attr_reader :attributes, :old_attributes

      # e.g., Student.new(:id => 1, :first_name => 'Steve', :last_name => 'Rogers', ...)
      def initialize(attributes = {})
        attributes.symbolize_keys!
        raise_error_if_invalid_attribute!(attributes.keys)

        # This defines the value even if it's not present in attributes
        @attributes = {}

        Student.attribute_names.each do |name|
          @attributes[name] = attributes[name]
        end

        @old_attributes = @attributes.dup
      end

      def save
        if new_record?
          results = insert!
        else
          results = update!
        end

        # When we save, remove changes between new and old attributes
        @old_attributes = @attributes.dup

        results
      end

      # We say a record is "new" if it doesn't have a defined primary key in its
      # attributes
      def new_record?
        self[:id].nil?
      end

      # e.g., student['first_name'] #=> 'Steve'
      def [](attribute)
        raise_error_if_invalid_attribute!(attribute)

        @attributes[attribute]
      end

      # e.g., student['first_name'] = 'Steve'
      def []=(attribute, value)
        raise_error_if_invalid_attribute!(attribute)

        @attributes[attribute] = value
      end

      def cohort
        Cohort.where('id = ?', self[:cohort_id]).first
      end

      def cohort=(cohort)
        self[:cohort_id] = cohort[:id]
        self.save
        cohort
      end

      private

      def insert!
        self[:created_at] = DateTime.now
        self[:updated_at] = DateTime.now

        fields = self.attributes.keys
        values = self.attributes.values
        marks  = Array.new(fields.length) { '?' }.join(',')

        insert_sql = "INSERT INTO students (#{fields.join(',')}) VALUES (#{marks})"

        results = Database::Model.execute(insert_sql, *values)

        # This fetches the new primary key and updates this instance
        self[:id] = Database::Model.last_insert_row_id
        results
      end

      def update!
        self[:updated_at] = DateTime.now

        fields = self.attributes.keys
        values = self.attributes.values

        update_clause = fields.map { |field| "#{field} = ?" }.join(',')
        update_sql = "UPDATE students SET #{update_clause} WHERE id = ?"

        # We have to use the (potentially) old ID attribute in case the user has re-set it.
        Database::Model.execute(update_sql, *values, self.old_attributes[:id])
      end
    end
This is a much larger file than the Post model I showed you earlier. Inside it, you’ll notice all sorts of Active Record-like methods for performing CRUD (create, read, update, delete) operations on your students table, including Student.all, Student.create, Student.where, Student.find, and save.
Also, note the @attribute_names variable, set through self.attribute_names=, which stores the names of the model’s attributes (in this case, [:id, :cohort_id, :first_name, :last_name, :email, :gender, :birthdate, :created_at, :updated_at]). It lives on the Student class itself rather than on each student object. Note too that these names are hard-coded into the Student model – we’ll look at changing that later on.
While this works, the Cohort model file contains almost exactly the same code, just specialized for querying the cohorts table. For example, the self.all method in the Cohort model reads exactly like the one in the Student model, the only difference being that “students” and “Student” are swapped for “cohorts” and “Cohort”:
    def self.all
      Database::Model.execute("SELECT * FROM cohorts").map do |row|
        Cohort.new(row)
      end
    end
You’ll also notice that each of these methods utilizes the Model class defined in the Database module, which interfaces with our database.
    require 'sqlite3'

    module Database
      class InvalidAttributeError < StandardError; end
      class NotConnectedError < StandardError; end

      class Model
        def self.inherited(klass)
        end

        def self.connection
          @connection
        end

        def self.filename
          @filename
        end

        def self.database=(filename)
          @filename = filename.to_s
          @connection = SQLite3::Database.new(@filename)

          # Return the results as a Hash of field/value pairs
          # instead of an Array of values
          @connection.results_as_hash = true

          # Automatically translate data from database into
          # reasonably appropriate Ruby objects
          @connection.type_translation = true
        end

        def self.attribute_names
          @attribute_names
        end

        def self.attribute_names=(attribute_names)
          @attribute_names = attribute_names
        end

        # Input looks like, e.g.,
        #   execute("SELECT * FROM students WHERE id = ?", 1)
        # Returns an Array of Hashes (key/value pairs)
        def self.execute(query, *args)
          raise NotConnectedError, "You are not connected to a database." unless connected?

          prepared_args = args.map { |arg| prepare_value(arg) }
          Database::Model.connection.execute(query, *prepared_args)
        end

        def self.last_insert_row_id
          Database::Model.connection.last_insert_row_id
        end

        def self.connected?
          !self.connection.nil?
        end

        def raise_error_if_invalid_attribute!(attributes)
          # This guarantees that attributes is an array, so we can call both:
          #   raise_error_if_invalid_attribute!("id")
          # and
          #   raise_error_if_invalid_attribute!(["id", "name"])
          Array(attributes).each do |attribute|
            unless valid_attribute?(attribute)
              raise InvalidAttributeError, "Invalid attribute for #{self.class}: #{attribute}"
            end
          end
        end

        def to_s
          attribute_str = self.attributes.map { |key, val| "#{key}: #{val.inspect}" }.join(', ')
          "#<#{self.class}#{attribute_str}>"
        end

        def valid_attribute?(attribute)
          self.class.attribute_names.include? attribute
        end

        private

        def self.prepare_value(value)
          case value
          when Time, DateTime, Date
            value.to_s
          else
            value
          end
        end
      end
    end
Wouldn’t it be great if we could take all the duplicate code in our Student and Cohort models and put it inside the Model class in such a way that it knows how to query any model we want? Let’s try it out.
I’ll start with the initialize() method that allows us to instantiate new instances of students and cohorts.
    def initialize(attributes = {})
      attributes.symbolize_keys!
      raise_error_if_invalid_attribute!(attributes.keys)

      # This defines the value even if it's not present in attributes
      @attributes = {}

      Student.attribute_names.each do |name|
        @attributes[name] = attributes[name]
      end

      @old_attributes = @attributes.dup
    end
How can we put this method inside the Model class so that it’s class-agnostic? It’s actually pretty simple. We can just replace Student.attribute_names... with self.class.attribute_names... and place this method inside of the Model class.
Why self.class? When you call Student.new and pass in values to all of its attributes, the initialize method needs to iterate over the hard-coded attribute names for the Student model. self in this context refers to the newly instantiated student object. We want to be able to call the Student class method, so we call .class on self to return the class of the object:
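Here is a small runnable sketch of that idea. The Database::Model below is a stripped-down stand-in I wrote for illustration (no validation, no database), and the Cohort attribute names are assumed, since the exercise's cohorts table isn't shown here.

```ruby
module Database
  # Stripped-down stand-in for the exercise's Database::Model: just enough to
  # show how self.class resolves to whichever subclass is being instantiated.
  class Model
    class << self
      attr_accessor :attribute_names
    end

    attr_reader :attributes

    def initialize(attributes = {})
      @attributes = {}
      # Inside an instance method, self is the new object and self.class is its
      # class, so this reads Student.attribute_names or Cohort.attribute_names.
      self.class.attribute_names.each do |name|
        @attributes[name] = attributes[name]
      end
    end
  end
end

class Student < Database::Model
  self.attribute_names = [:id, :first_name]
end

class Cohort < Database::Model
  self.attribute_names = [:id, :name] # assumed columns, for illustration
end

student = Student.new(first_name: "Steve")
puts student.class
puts student.attributes.keys.inspect
puts Cohort.new.attributes.keys.inspect
```

The same initialize body serves both models because nothing in it names Student or Cohort directly.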
Now, because Student inherits from Database::Model, we can still call Student.new when instantiating a new student object.
We can do the same thing for the .all, .create, .where, .find, new_record?, [], []=, insert!, and update! methods: move them into Database::Model and use self (or self.class, inside instance methods) in place of the explicit model name. Take note of the extra method I created, pluralize_name.
    def self.all
      Database::Model.execute("SELECT * FROM #{self.pluralize_name}").map do |row|
        self.new(row)
      end
    end

    def self.create(attributes)
      record = self.new(attributes)
      record.save

      record
    end

    def self.where(query, *args)
      Database::Model.execute("SELECT * FROM #{self.pluralize_name} WHERE #{query}", *args).map do |row|
        self.new(row)
      end
    end

    def self.find(pk)
      self.where('id = ?', pk).first
    end

    def self.pluralize_name
      self.name.downcase + 's'
    end

    def new_record?
      self[:id].nil?
    end

    def [](attribute)
      raise_error_if_invalid_attribute!(attribute)

      @attributes[attribute]
    end

    def []=(attribute, value)
      raise_error_if_invalid_attribute!(attribute)

      @attributes[attribute] = value
    end

    # Original methods would go here...

    private

    def insert!
      self[:created_at] = DateTime.now
      self[:updated_at] = DateTime.now

      fields = self.attributes.keys
      values = self.attributes.values
      marks  = Array.new(fields.length) { '?' }.join(',')

      insert_sql = "INSERT INTO #{self.class.pluralize_name} (#{fields.join(',')}) VALUES (#{marks})"

      results = Database::Model.execute(insert_sql, *values)

      # This fetches the new primary key and updates this instance
      self[:id] = Database::Model.last_insert_row_id
      results
    end

    def update!
      self[:updated_at] = DateTime.now

      fields = self.attributes.keys
      values = self.attributes.values

      update_clause = fields.map { |field| "#{field} = ?" }.join(',')
      update_sql = "UPDATE #{self.class.pluralize_name} SET #{update_clause} WHERE id = ?"

      # We have to use the (potentially) old ID attribute in case the user has re-set it.
      Database::Model.execute(update_sql, *values, self.old_attributes[:id])
    end
I created the pluralize_name method to convert the singular model name to its plural form, so that “Student” becomes “students”, which is necessary for querying the correct table in my database. Obviously I’d need a much smarter method to account for words that can’t be pluralized simply by adding an “s” to the end, but this method works well enough for the tables used in this exercise. In Rails, ActiveSupport::Inflector takes care of pluralizing more complicated words.
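To see where the naive approach holds up and where it falls short, here is a quick sketch. Person and BlogPost are hypothetical models I added purely to show the limits; they aren't part of the exercise.

```ruby
module Database
  class Model
    # Naive pluralization as in the article: downcase the class name, add "s".
    def self.pluralize_name
      self.name.downcase + 's'
    end
  end
end

class Student < Database::Model; end
class Person < Database::Model; end   # hypothetical model, illustration only
class BlogPost < Database::Model; end # hypothetical model, illustration only

puts Student.pluralize_name   # students
puts Person.pluralize_name    # persons   (the Rails convention would be people)
puts BlogPost.pluralize_name  # blogposts (the Rails convention would be blog_posts)
```

Irregular plurals and multi-word class names are exactly the cases an inflector exists to handle.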
Also, note that insert! and update! are instance methods, so inside them self is a record, not a class. We have to go through self.class to reach the class-level pluralize_name method.
With these methods factored out, take a look at how much slimmer our Student model is now:
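Below is a sketch of roughly what the slimmed-down model might look like. So that it runs on its own, I've paired it with an in-memory stand-in for Database::Model that keeps records in an array instead of talking to SQLite; that stand-in, the Cohort column names, and the use of find in place of where('id = ?', ...).first are all assumptions made for illustration.

```ruby
# In-memory stand-in for Database::Model so the sketch runs without SQLite.
# It stores records in an array instead of issuing SQL (illustration only).
module Database
  class Model
    class << self
      attr_accessor :attribute_names

      def records
        @records ||= []
      end

      def create(attributes)
        record = new(attributes)
        record.save
        record
      end

      def find(pk)
        records.find { |record| record[:id] == pk }
      end
    end

    def initialize(attributes = {})
      @attributes = {}
      self.class.attribute_names.each { |name| @attributes[name] = attributes[name] }
    end

    def [](attribute)
      @attributes[attribute]
    end

    def []=(attribute, value)
      @attributes[attribute] = value
    end

    # Assign an id on first save and remember the record.
    def save
      records = self.class.records
      self[:id] ||= records.length + 1
      records << self unless records.include?(self)
      true
    end
  end
end

# What is left in the model: its attribute names and its relationship methods.
class Student < Database::Model
  self.attribute_names = [:id, :cohort_id, :first_name, :last_name, :email,
                          :gender, :birthdate, :created_at, :updated_at]

  def cohort
    Cohort.find(self[:cohort_id])
  end

  def cohort=(cohort)
    self[:cohort_id] = cohort[:id]
    save
    cohort
  end
end

class Cohort < Database::Model
  self.attribute_names = [:id, :name, :created_at, :updated_at] # assumed columns
end

cohort  = Cohort.create(name: "Sea Lions")
student = Student.create(first_name: "Steve", last_name: "Rogers")
student.cohort = cohort
puts student.cohort[:name]
```

Everything generic (all, create, where, find, save, and friends) now lives in the parent class, so the model only describes what is specific to students.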
We can do even better by getting rid of those hard-coded attribute names. It’d be annoying if we had to tell every one of our models about the attributes we defined in our database – they should just know.
Fortunately, SQLite gives us an easy way to look up a table’s columns: the PRAGMA table_info() statement. It returns one row of metadata per column, including the column’s name and type.
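As a rough sketch of how that metadata turns into attribute names: each row from PRAGMA table_info has cid, name, type, notnull, dflt_value, and pk fields. The rows below are sample data I typed in by hand (not real query output), and attribute_names_from is a hypothetical helper name.

```ruby
# Turn PRAGMA table_info rows (as hashes) into a list of attribute symbols.
# attribute_names_from is a hypothetical helper, not part of the exercise.
def attribute_names_from(rows)
  rows.map { |row| row["name"].to_sym }
end

# Hand-written sample shaped like the result of
#   PRAGMA table_info(students)
# when the connection returns rows as hashes.
sample_rows = [
  { "cid" => 0, "name" => "id",         "type" => "INTEGER", "notnull" => 0, "dflt_value" => nil, "pk" => 1 },
  { "cid" => 1, "name" => "cohort_id",  "type" => "INTEGER", "notnull" => 0, "dflt_value" => nil, "pk" => 0 },
  { "cid" => 2, "name" => "first_name", "type" => "VARCHAR", "notnull" => 0, "dflt_value" => nil, "pk" => 0 }
]

puts attribute_names_from(sample_rows).inspect
```

In the exercise, the rows would come from something like Database::Model.execute("PRAGMA table_info(students)") rather than a literal array.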
Then we can use the Class#inherited method (per the Ruby docs, a callback invoked whenever a subclass of the current class is created) to set the attribute names for each model as soon as its class is defined.
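Here is a minimal sketch of that hook in action. To keep it runnable without a database, a hard-coded SCHEMA hash stands in for the PRAGMA table_info lookup; SCHEMA and its contents are assumptions for illustration only.

```ruby
module Database
  class Model
    # Hypothetical stand-in for the database: table name => column names.
    # The real version would ask SQLite via PRAGMA table_info instead.
    SCHEMA = {
      "students" => %w[id cohort_id first_name],
      "cohorts"  => %w[id name]
    }.freeze

    class << self
      attr_accessor :attribute_names
    end

    def self.pluralize_name
      name.downcase + 's'
    end

    # Ruby calls this hook each time a class inherits from Database::Model.
    # It relies on klass.name, so it assumes named (not anonymous) subclasses.
    def self.inherited(klass)
      super
      klass.attribute_names = SCHEMA.fetch(klass.pluralize_name).map(&:to_sym)
    end
  end
end

class Student < Database::Model; end
class Cohort < Database::Model; end

puts Student.attribute_names.inspect
puts Cohort.attribute_names.inspect
```

Neither subclass mentions its columns, yet each ends up with its own attribute_names the moment it is defined.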
Now as soon as we create a subclass of Database::Model, the inherited method will be called, grabbing the column names and assigning them as attributes of the specific model. With that, we can get rid of the hard-coded attribute names. We’ll put the attr_reader :attributes, :old_attributes inside of Database::Model as well.
Much better. We have all the functionality we had before, but, much like with Active Record, the code is DRY and more flexible. The only methods left in each model handle the relationships specific to that model. Active Record handles those for us as well, which you can read about in the Active Record Associations docs.
While there’s a lot more going on inside of Active Record, implementing Active Record, Jr. was a great exercise in understanding the ‘magic’ involved in such a library.
For more on the specific methods inside of Active Record, I’d recommend ‘Demystifying ActiveRecord’, a video presentation by Anthony Lewis, a software engineer at PeopleAdmin.