Category: Ruby on Rails

Common Table Expressions within Rails

We use ActiveAdmin heavily at work.

And I have a love-hate relationship with it. While it’s an easy way to quickly spin up an admin dashboard in Rails, it is so opinionated that any time you want to do anything slightly out of the ordinary, it quickly becomes a nightmare for me. I guess people may say the same thing about Rails, but I digress…

I recently had to modify a CSV export within ActiveAdmin so that it includes two CSV rows for each row that’s in the database. For the sake of example, let’s say that we have a payments table with three columns.
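For the examples below, let’s assume a schema roughly like this (amount and processing_fee come from the requirement; the description column is purely illustrative):

```ruby
create_table :payments do |t|
  t.integer :amount
  t.integer :processing_fee
  t.string  :description
end
```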

In ActiveAdmin, you define how you want the CSV export to look in a csv block like this.
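A minimal sketch, assuming the payments columns above:

```ruby
# app/admin/payments.rb
ActiveAdmin.register Payment do
  csv do
    column :id
    column :amount
    column :description
  end
end
```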

The above block of code will produce a CSV file with one row per database row in the payments table. However, the requirement that I was given called for two CSV rows per database row, with the processing_fee being displayed as amount. Since we can’t force ActiveAdmin’s CSV processor to produce two rows, we need to feed the CSV processor a collection that has already been preprocessed into two records per database record. At first, I decided to attempt this with Common Table Expressions.

What’s a Common Table Expression (CTE)?

I think of a Common Table Expression as a temporary table that you construct within a query and can then query from. I think of it as a more temporary, lite version of a database view. Let’s say that you need to produce a result set with data coming from multiple tables. You could perform complex joins and then query from the joined result set, or you could create a CTE and simplify your queries. In my specific case, I needed to create a temporary table that contained two rows per one row in the database, with the second row displaying the processing_fee as amount. How do we achieve this with a Common Table Expression?

The query that I came up with was
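A sketch of that query, using the illustrative columns from above; the UNION ALL is what produces the second row per payment, with processing_fee presented as amount:

```sql
WITH cte_payments AS (
  SELECT id, amount, description
  FROM payments

  UNION ALL

  SELECT id, processing_fee AS amount, description
  FROM payments
)
SELECT *
FROM cte_payments
ORDER BY id;
```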

The above query will construct a temporary table called cte_payments with the result set from the query contained inside the parentheses. Then you can query the cte_payments table as you normally would. With the above query, we can do the following within ActiveAdmin to create a custom scope to feed into ActiveAdmin’s CSV processor.
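A sketch of what that could look like, overriding scoped_collection in the ActiveAdmin controller block and using request.format.csv? as the format check:

```ruby
ActiveAdmin.register Payment do
  controller do
    def scoped_collection
      if request.format.csv?
        # feed the CSV processor a collection that already has two rows per payment
        Payment.find_by_sql(<<~SQL)
          WITH cte_payments AS (
            SELECT id, amount, description FROM payments
            UNION ALL
            SELECT id, processing_fee AS amount, description FROM payments
          )
          SELECT * FROM cte_payments ORDER BY id
        SQL
      else
        super
      end
    end
  end
end
```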

The above, inside the CSV format check, works in creating the collection of result sets that you want. However, find_by_sql returns an Array rather than an ActiveRecord::Relation, and unfortunately ActiveAdmin requires that you feed an ActiveRecord::Relation into its CSV processor. If you are working outside of ActiveAdmin, or in a situation where an Array works fine for you, you can stop reading here. If you need your custom query to return an ActiveRecord::Relation, then read on.

If you must have your query return an ActiveRecord::Relation and you would like to stick with using CTEs, you can try the postgres_ext gem that was created by the folks at DockYard. Unfortunately, the gem is no longer maintained. You can also dig into Arel and play in that world; there’s an excellent blog post on how to use Arel to utilize Common Table Expressions here. I personally think that working directly with SQL is easier than working with a gem that’s no longer maintained or trying to figure out Arel, so I opted for rewriting my CTE query as a regular subquery that I can use with ActiveRecord’s from method. If I rewrite the CTE query above as a subquery and use ActiveRecord to return an ActiveRecord::Relation, it looks like the following.
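One way this could look, keeping the same illustrative columns and aliasing the subquery as payments so ActiveRecord can map the attributes back onto the model:

```ruby
class Payment < ApplicationRecord
  def self.with_fee_rows
    subquery = <<~SQL
      (
        SELECT id, amount, description FROM payments
        UNION ALL
        SELECT id, processing_fee AS amount, description FROM payments
      ) AS payments
    SQL

    # from with a string subquery keeps us in ActiveRecord::Relation land
    from(subquery).order(:id)
  end
end
```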

The above method will return an ActiveRecord::Relation instead of the Array that we were getting from find_by_sql with the Common Table Expression.

Rails and PostgreSQL Arrays

This is something that I recently learned while working on Building Localize Ninja.

I have a Project model where I need to store two sets of data: the source_locale and the target_locales. Basically, a Project represents a localization project that translators/collaborators will be working on to translate the app. The source_locale will only contain one language, which will probably be English for many people, and target_locales will contain one to many languages.

The default way I would have approached this is creating a locales table where I insert all of the languages that will be available, and then creating a join table between the projects and locales tables. However, I wanted to see if there was a way to associate projects with multiple target_locales without joining tables.

At first, I thought about storing comma-separated values as a string directly in a column called target_locales in the projects table. After querying, I could split the string by commas and determine which languages the project is targeting. This is a bit archaic, but I’ve seen it done this way in other projects I have worked on and it seems to work for them. However, it felt a bit too hackity hack hack for me, so I looked into it further.

After a few minutes of research, I learned that PostgreSQL supports storing arrays directly in a column. With this, rather than storing a list of languages in a separate table and then joining tables every time I want to figure out which target locales the project has, I could define the list of languages in Ruby as a constant and store the list of target languages in the array column, avoiding having an extra table just for the sake of keeping a list of languages.

To add an array column that will contain strings, the migration in Rails will look something like this.
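A minimal sketch (the migration class name and Rails version are illustrative):

```ruby
class AddTargetLocalesToProjects < ActiveRecord::Migration[6.0]
  def change
    # array: true turns the column into a PostgreSQL array of text values
    add_column :projects, :target_locales, :text, array: true, default: []
  end
end
```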

Note that you have to define what data type the array will hold. In my case, it will be strings, thus I used the text data type. If I wanted to store numbers, I would have used the integer data type instead.

Validations

Let’s say that we want to allow projects to have the same name as long as they have different values for the target_locales column. In Ruby, arrays are not considered equal if they have the same values in a different order. For example,
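A quick illustration in plain Ruby:

```ruby
[1, 2, 3] == [1, 2, 3] # => true
[1, 2, 3] == [1, 3, 2] # => false, same values but different order
```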

Let’s say that we want our program to recognize [1, 2, 3] and [1, 3, 2] as equal, since they technically contain the exact same values, albeit in a different order. Within the world of Rails, we could add a before_save callback that sorts our array before saving it into the database, like this.
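A sketch of such a callback; it also drops nils and empty strings while it’s at it:

```ruby
class Project < ApplicationRecord
  before_save :normalize_target_locales

  private

  # Remove nils/empty strings and sort, so two projects with the same
  # locales in any order end up with identical arrays in the database
  def normalize_target_locales
    self.target_locales = Array(target_locales).reject(&:blank?).sort
  end
end
```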

Something like the above will normalize the list of target_locales so that all of the locales saved in the database are free of nils and empty strings and are sorted, letting us easily compare the target_locales list of one project to another.

If you wanted to make the above more performant, you could do it in PostgreSQL land by sorting the array in a custom function and creating a custom rule that executes on insert. I actually prefer doing things like this in Ruby land, because there have been times when I was debugging something, couldn’t figure out why something unexpected was happening, and realized a few hours later that a custom database rule I didn’t know existed was firing. When I’m coding Rails apps, I tend to be stuck in Ruby land and concentrate on the code in front of me. Thus, I generally like to have as much logic written in Ruby as possible unless I have a good reason to move operations out into the database, and usually when I consider such options, it’s due to performance bottlenecks.

Since I’m currently at the beginning stages of the project, I am sticking with doing this in Ruby land. If the project ever gets some users and I see performance bottlenecks, I will consider moving the above logic out into the database.

Custom spec types with RSpec

We use RSpec as our testing framework at Modern Message. And I learned something new about RSpec today.

You know those types that you can pass into RSpec specs? For example:
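A controller spec, for instance (the class name here is hypothetical):

```ruby
RSpec.describe PaymentsController, type: :controller do
  # controller examples go here
end
```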

Or
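A model spec (again, with a hypothetical class):

```ruby
RSpec.describe Payment, type: :model do
  # model examples go here
end
```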

Starting with RSpec 3, we have to explicitly define these spec types. It turns out that these spec types provide RSpec-specific helper methods, like post in controller specs, that you use to write your tests. It also turns out that you can define your own custom spec types.

Why am I writing about custom spec types?

At Modern Message, I’m currently working on a feature where I have to have common before and after hooks for multiple specs. To be specific, I’m currently using the stripe-ruby-mock library to stub out external requests made to Stripe in my tests.

For each test that utilizes the stripe-ruby-mock gem, we need a before and after hook that starts and stops the StripeMock class. You can read through the docs on the GitHub page for the library, but essentially it looks something like this.
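A minimal sketch, following the stripe-ruby-mock README:

```ruby
RSpec.describe "a test that talks to Stripe" do
  before { StripeMock.start }
  after  { StripeMock.stop }

  it "creates a customer against the mocked Stripe API" do
    # test code that would normally hit Stripe goes here
  end
end
```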

In my specific circumstance, I am writing tests for webhook events that Stripe sends out. For example, Stripe has a webhook event called invoice.payment_succeeded. To create a corresponding model in my Rails app, I created a folder called stripe_service in my models folder and a corresponding ruby class file app/models/stripe_service/invoice_payment_succeeded.rb.

Do note that Stripe has multiple webhook events, and in my specific case (and probably in most Rails applications), you would have multiple classes and files in the stripe_service folder that represent different webhook events. That means more spec files in the spec/models/stripe_service/ folder with more repetitive before and after hooks starting and stopping the StripeMock class.

While I’m one of those people who believe that there’s nothing wrong with a little bit of duplication (especially in my case where we only have three webhook event classes), we can remove this before and after hook duplication with a custom spec type. Thus, if we create a custom spec type, our StripeService::InvoicePaymentSucceeded spec will look like this.
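Roughly like this, with the StripeMock hooks gone from the spec itself:

```ruby
# spec/models/stripe_service/invoice_payment_succeeded_spec.rb
RSpec.describe StripeService::InvoicePaymentSucceeded, type: :stripe_service do
  it "handles the invoice.payment_succeeded event" do
    # no StripeMock.start / StripeMock.stop here; the custom
    # spec type's hooks take care of that
  end
end
```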

To create this custom stripe_service type, we need to go into our spec/spec_helper.rb file and define this new type. Here’s an example.
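A sketch of that configuration, using RSpec’s define_derived_metadata to tag everything under spec/models/stripe_service:

```ruby
# spec/spec_helper.rb
RSpec.configure do |config|
  # Give every spec in spec/models/stripe_service the stripe_service type
  config.define_derived_metadata(file_path: %r{/spec/models/stripe_service/}) do |metadata|
    metadata[:type] = :stripe_service
  end

  # Start and stop StripeMock around every spec with that type
  config.before(:each, type: :stripe_service) { StripeMock.start }
  config.after(:each, type: :stripe_service)  { StripeMock.stop }
end
```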

Here, the first thing we do is specify that all of the specs in the spec/models/stripe_service folder have the type stripe_service. Then, in the following lines of the configuration, using our new stripe_service metadata type, we specify before and after hooks that start and stop StripeMock. With this configuration, each time we give our specs the type stripe_service, the before and after hooks run automatically, reducing duplication in our specs.

after_create vs after_save vs after_commit

Callbacks are useful tools to use in Rails applications.

The ones I tend to use most are the ones that run before or after certain actions are taken upon an ActiveRecord model. Some of the common ones you may see in a typical Rails codebase are before_save, before_create, after_create, and after_save. In this blog post, I’ll be writing specifically about the difference between after_create and after_save, and then throw after_commit into the mix since it can be preferable to use after_commit in specific situations.

First, let’s take a look at the difference between after_create, after_save, and after_commit.


after_create

This is called after ActiveRecord::Base.save is called on new objects that haven’t been saved yet (meaning that no record exists in the database).

after_save

This is the same thing as after_create but is called whether the object is a new or an existing record in the database.

after_commit

This is called after the database transaction has completed.


Now, the part about “after the database transaction has completed” for after_commit is important because it determines whether you will prefer after_commit over after_create or after_save. Basically, what this means is that after_create and after_save run before the changes are actually committed to the database, meaning the database can still raise an error if the data you want to commit violates your database constraints (like a uniqueness constraint violation). Thus, in situations where you want to work with data that you are sure has been written to the database, after_commit is preferable to after_create or after_save.
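A common illustration of this (the model and job names here are hypothetical): enqueue a background job only once the row has actually been committed, so the worker can’t wake up and look for a record that was rolled back.

```ruby
class Payment < ApplicationRecord
  # after_create would run inside the transaction, so the job could fire
  # even though the INSERT later gets rolled back
  # after_create :enqueue_receipt_job

  # after_commit runs only once the transaction has committed, so the
  # record is guaranteed to be visible to the background worker
  after_commit :enqueue_receipt_job, on: :create

  private

  def enqueue_receipt_job
    ReceiptJob.perform_later(id)
  end
end
```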

This is one of those “gotchas” about the various callback options that isn’t blatantly obvious. Choosing the right callback can save you a lot of headaches from accidental bugs.


Adding columns with default values to large tables in Postgres

I recently had to add a new column with a default value to an existing table in a Rails application.

The Rails application was backed by PostgreSQL and the table that we were adding this new column to had over 2 million rows of data. Now, adding a new column with a default value in Rails is a simple task that involves writing a migration that looks like
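Something like this (the class, table, and column names here are illustrative):

```ruby
class AddAdminToUsers < ActiveRecord::Migration[5.2]
  def change
    add_column :users, :admin, :boolean, default: false
  end
end
```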


In most cases, the above will work. However, if you have a really big table with large amounts of data, which in this specific case was over 2 million rows, the above migration will take an eternity to run. This is because adding a default value for a column makes Postgres go over every row and update that column (PostgreSQL 11 and later avoid this full rewrite for constant defaults, but older versions do not). If you have a large table, this operation can take some time. What’s worse, because the above migration runs in a transaction, the table in question becomes locked during the migration and your application will grind to a halt since the table won’t be functional.

So… how do we solve this? Let’s first list our problems in the situation where we’re adding a new column with a default value to a really large table.

Problems

  • Adding a column with a default value (or simply adding a default value to an existing column) causes all rows in that table to be updated at the same time, which can be time consuming on really large tables
  • Updates are slow in Postgres since it has to guarantee consistency
  • During the update, the table that’s in operation is locked, causing the application to grind to a halt until the update finishes

To solve this problem, we need to work around the above facts so that we can add the default column and allow all of the rows in the large table to be updated while keeping the application running. How do we do this? Well… we can do the following.

Solution

  • Adding a column to a table without default values in Postgres is fast. Thus, we should add the new column with a default value of null first to get the creation of the new column out of the way.
  • Rather than updating every row in the table all at once, we can split up the number of records and update them in batches. This makes the work more manageable for the database.
  • If we make the updates of the rows in the table with default values more manageable, the table won’t be locked during the operations and the application can keep running during the migration.

So, let’s do this. To do this in a Rails migration, we need to recognize a few more things about how Rails handles migrations. Rails migrations are run in a transaction, meaning anything in that def change is run in a transaction to allow for rollbacks. If we do any data updates there, all of those updates will be wrapped in a transaction, locking the entire table. This means we’ll need a way to disable this auto-transaction mechanism of Rails migrations and handle our transactions manually, picking and choosing where we want to wrap our database operations in a transaction and where we don’t. Thankfully, Rails comes with a method called disable_ddl_transaction! that we can use in our migration files to disable the default transaction-wrapping behavior. Also, we want to replace the def change method with def up and def down methods so that we can manually define our rollback strategy in case we ever want to roll back the migration. Below is my attempt at utilizing this feature of Rails and handling the transactions on my own.

For the sake of demonstration, let’s say that we’re adding an integer column named number_of_retries with a default value of 0 to a table called users.
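Here’s a sketch of what that migration could look like; the class name and the exact backfill query are illustrative:

```ruby
class AddNumberOfRetriesToUsers < ActiveRecord::Migration[5.2]
  # Turn off the automatic wrapping transaction so we can manage
  # transactions (and batching) ourselves
  disable_ddl_transaction!

  def up
    # The schema changes are cheap, so wrap them in their own transaction
    ActiveRecord::Base.transaction do
      add_column :users, :number_of_retries, :integer
      change_column_default :users, :number_of_retries, 0
    end

    # Backfill existing rows in batches, outside of one big transaction,
    # so the table isn't locked for the duration of the migration
    User.reset_column_information
    User.where(number_of_retries: nil).find_in_batches do |batch|
      User.where(id: batch.map(&:id)).update_all(number_of_retries: 0)
    end
  end

  def down
    ActiveRecord::Base.transaction do
      remove_column :users, :number_of_retries
    end
  end
end
```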

You can see that I wrap the schema operations (adding the column and altering its default value) in transactions, while I do not wrap the part where I backfill the users’ default values. find_in_batches has a default batch_size of 1,000, but you can always increase that for faster migration speed if you want. I just found 1,000 to be a good number for stable migrations; I’ve had situations where the migration timed out when I set the batch_size to a higher number like 10,000. Also, as you can see, having separate def up and def down methods in the migration gives us a way to safely roll back the migration if needed.

The above migration will work. Granted, it won’t be an instant migration; you’ll still have to wait for it to run. However, it’ll prevent locking the table, which grinds the application to a halt and results in a bad experience for your users. Unfortunately, as of now, there are no good, simple ways to quickly add new columns with default values to an existing large table. The above is a workaround that utilizes a few principles to do it safely on really large tables.

I hope that this post is helpful for those looking to solve similar problems on their production Rails applications.

ActiveRecord Optimistic Locking

I briefly covered distributed locks and pessimistic locking here. In this specific post, I’ll cover optimistic locking.

Optimistic locking is just an alternative to pessimistic locking except that it sort of “builds in” locking mechanisms to an entire table and its corresponding ActiveRecord model. In pessimistic locking, you have to manually call with_lock on an ActiveRecord model while in optimistic locking, you introduce a column called lock_version in your database table, which automatically enables optimistic locking.

The lock_version column will be incremented every time a change is committed to the record in question. Thus, if there are two processes accessing the same record, and one process makes an update to the record, the second process won’t be able to modify the record unless it re-retrieves the newly updated data. This can prevent problems that can come with concurrent access to the same data.

This is pretty much straight from the Rails docs but I can’t think of any better way to explain some example usages of optimistic locking, so here goes.

Example 1: Optimistic locking preventing two instances of the same record overriding each other.
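Roughly the example from the Rails docs, using a hypothetical Person model with optimistic locking enabled:

```ruby
p1 = Person.find(1)
p2 = Person.find(1) # same row, loaded into a second instance

p1.first_name = "Michael"
p1.save # succeeds, lock_version is incremented

p2.first_name = "should fail"
p2.save # raises ActiveRecord::StaleObjectError, p2's lock_version is stale
```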

Example 2: Optimistic locking preventing deletion of the same record when lock_version is out of date.
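Again roughly following the Rails docs:

```ruby
p1 = Person.find(1)
p2 = Person.find(1)

p1.first_name = "Michael"
p1.save # increments lock_version, so p2 is now stale

p2.destroy # raises ActiveRecord::StaleObjectError
```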

To add optimistic locking to an ActiveRecord model, add the column lock_version to your database table with a datatype of integer. Something like this.
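A sketch of that migration (the table name is just an example):

```ruby
class AddLockVersionToPayments < ActiveRecord::Migration[5.2]
  def change
    # Rails detects this column automatically and turns on optimistic locking
    add_column :payments, :lock_version, :integer, default: 0, null: false
  end
end
```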

lock_version is the default column name that Rails will look for, but you can also customize the column name by overriding an attribute called locking_column within the ActiveRecord model like this.
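For example (the custom column name here is hypothetical):

```ruby
class Payment < ApplicationRecord
  # use a custom column instead of the default lock_version
  self.locking_column = :row_version
end
```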

Personally, I prefer pessimistic locking since I can choose the specific instances where I want to lock records. To me, optimistic locking seems akin to the evil (in my mind) default_scope, which overrides default query behavior in ActiveRecord.

I would say optimistic locking is a tool that one can reach for, but only when there’s a really good reason to do so, as it overrides out-of-the-box ActiveRecord behavior.

Database transactions and distributed locks

Race conditions are something I rarely think about when I’m developing web applications.

This is primarily because when I think of race conditions, my mind automatically jumps to managing threads, which does come up in Android development at times, but even there rarely, as long as you utilize the libraries provided by Google.

However, race conditions can happen in web development as well, especially when multiple users are trying to access the same record in the database and modifying it at the same time. This means that you have to ensure that your code can cope with concurrent data access. The standard tools that we use in web development to deal with concurrent data access are database transactions and distributed locks.

In this post, I’ll cover what transactions and distributed locks are, and when to use either of them.

The sample code is written in Rails, but the concept can be applied to any database-backed application.

Transactions

Let’s take a look at a typical create action in ActiveRecord.
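Something along these lines (current_user and the model and attribute names are hypothetical):

```ruby
class ProductsController < ApplicationController
  def create
    # create a single product for the current user
    @product = current_user.products.create!(product_params)
    redirect_to @product
  end

  private

  def product_params
    params.require(:product).permit(:name, :price)
  end
end
```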

The above code creates a new product for a single user. This code is fine, since it’ll only affect one user if it fails. If it fails, you’ll probably get notified by your bug tracking tool and address it individually. Now, let’s look at some more code that works with a user and the products associated with that user.
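A sketch of the kind of code in question: copying one user’s products over to another user (the lookups and attribute names are illustrative):

```ruby
user  = User.find(params[:user_id])
user2 = User.find(params[:other_user_id])

user.products.each do |product|
  # copy each of user's products over to user2
  user2.products.create!(name: product.name, price: product.price)
end
```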

At first glance, the above code looks completely fine. But, it has issues.

  1. What if one of user‘s products fails to be added to user2‘s list of products in the middle of the loop?
  2. What if some other process comes in and runs the code for adding the list of products to user2‘s list of products while the first process is running?

If either of the two happens, the code that we’ve written won’t have its intended effect. In fact, the dataset that we end up with for user2 will make our heads spin as we debug what happened to user2‘s records.

This is where we want to use transactions. Transactions are a general database concept that we can utilize easily in ActiveRecord. All ActiveRecord classes have the transaction method that you can wrap your database calls in. Let’s wrap our loop inside a transaction like this.
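Continuing the sketch from above:

```ruby
ActiveRecord::Base.transaction do
  user.products.each do |product|
    user2.products.create!(name: product.name, price: product.price)
  end
end
```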

Wrapping our code in a transaction executes it as a single atomic operation. If any part of the code inside the transaction fails, the entire operation is rolled back. In our specific example, this has the benefit of ensuring that either all of user‘s products get copied over to user2‘s list of products, or none of them do.

The gist of this whole example is, if you’re ever in a situation where you’re affecting more than one row in a database at a time, group your code together inside a transaction.

When transactions can go wrong

Transactions sound great, right? They are, until you have two or more processes running the same transaction at the same time, accessing the same dataset. What happens in this situation is that the processes running the same transaction will commence at the same time and make their database commits regardless of the actions committed by the other processes. If we’re working with transactions that update existing records, this can easily create race conditions that produce unpredictable results in our dataset.

Distributed locks

We can remedy the problem of concurrent data access with distributed locks. What locks do is ensure that the locked code only runs in a single thread at a time. This means that no two concurrent processes can run the same piece of code at the same time, preventing race conditions. In Rails, we have a convenient method that we can call on ActiveRecord objects to lock them.

In Rails, this is referred to as Pessimistic locking.

Let’s see the difference between calling transaction vs with_lock on an ActiveRecord object.
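A rough comparison (the balance column is hypothetical):

```ruby
user = User.find(1)

# transaction: simply wraps the statements in BEGIN ... COMMIT
user.transaction do
  user.update!(balance: user.balance - 100)
end

# with_lock: opens a transaction AND re-fetches the row with
# SELECT ... FOR UPDATE, so no other connection can load the record
# for modification until the block finishes
user.with_lock do
  user.update!(balance: user.balance - 100)
end
```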

If you look, the transaction method simply opens a BEGIN / COMMIT transaction, while the with_lock method reloads the record with FOR UPDATE added to the SQL, which prevents any other database connection from loading the record for modification.

So what do I do with this information?

If you’ve ever run into an issue where you suspected a race condition was the culprit, I would begin by inspecting the suspect parts of your codebase where you’re making database calls and either wrapping them properly in a transaction or locking records properly.

If you are working on a large production application, it probably doesn’t make sense to go through all of your code that makes database calls and determine whether each line of code is the culprit. Instead, anytime you get a bug report where the user is asking a question that goes something like “why did this happen to my data?”, see if your code is written in a way where concurrent data access could be the culprit.

Form objects in Rails

Most applications have forms that take in a myriad of inputs from the user and save the information that the user typed into the database.

Rails provides various helpers to build forms that interact with ActiveRecord to easily create and persist user input data into the database. Rails specifically provides form_for, form_tag, and more recently in Rails 5, form_with.

Just as an FYI, while I heavily refer to Rails in this post, this pattern can be used with any web application framework. It’s not specific to Rails and can be applied in many different application development environments.

These helpers are very easy to use when you’re working with one data model. For example, let’s say that we’re building an application to manage rooms in hotels. In this application, we need to build a form to create rooms for the hotel so that we can book users to these rooms. Now, because we want to advertise these rooms in an eye-appealing way, we want to be able to attach multiple photos for each room. Here’s an example of the Room model that we’ll be working with.
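A sketch of the Room model (the room_number attribute is an assumption for this example):

```ruby
class Room < ApplicationRecord
  belongs_to :hotel
  has_many :pictures

  validates :room_number, presence: true
end
```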

And our Picture and Hotel models are as follows.
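Sketches, with a url attribute assumed on Picture:

```ruby
class Picture < ApplicationRecord
  belongs_to :room

  validates :url, presence: true
end

class Hotel < ApplicationRecord
  has_many :rooms
end
```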

Our objective in this blog post will be to build a form that creates a room and one picture at the same time.

accepts_nested_attributes_for – Rails’s answer to nested associations in forms

As stated above, building a form to interact with one model is very easy. We just use one of Rails’s built-in form builders and map it to one instance of the model. However, when we want to associate multiple models within the same form, things get a little complicated. Rails provides a built-in API called accepts_nested_attributes_for to help us build nested forms that can handle multiple models that have associations with each other. To use accepts_nested_attributes_for to build our Room form, we’ll need to add it to our Room model.
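Something like this (validates_associated appears here because it gets removed again in the refactoring later):

```ruby
class Room < ApplicationRecord
  belongs_to :hotel
  has_many :pictures

  accepts_nested_attributes_for :pictures
  validates_associated :pictures

  validates :room_number, presence: true
end
```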

Our form for the Room model will now have nesting to accommodate the pictures association that we’ll be building in. It’ll utilize the fields_for helper within the form.
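A sketch of the nested form:

```erb
<%= form_for @room do |f| %>
  <%= f.label :room_number %>
  <%= f.text_field :room_number %>

  <%= f.fields_for :pictures do |picture_form| %>
    <%= picture_form.label :url %>
    <%= picture_form.text_field :url %>
  <% end %>

  <%= f.submit %>
<% end %>
```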

And our controller to interact with this form will look like
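Roughly like this:

```ruby
class RoomsController < ApplicationController
  def new
    @room = Room.new
    # build an empty picture so fields_for has something to render
    @room.pictures.build
  end

  def create
    @room = Room.new(room_params)

    if @room.save
      redirect_to @room
    else
      render :new
    end
  end

  private

  def room_params
    params.require(:room).permit(:hotel_id, :room_number, pictures_attributes: [:url])
  end
end
```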

This is the standard Rails way of handling associations within forms. It works, but I’m not a fan of this method for a few reasons. I’ll go over the pros and cons of this standard Rails approach.

Pros
  1. It’s built into Rails
    • Yes, this is a pro. This feature is built into Rails, which means it’s ready for us to use. The Rails team maintains the code for this feature, improving it and fixing bugs, and we’re relying on their work to build our nested forms. Having other people maintain features that help us build our applications is a pro in my opinion.
  2. Rails handles validations for nested attributes for us and wraps up the operation in a transaction
    • Because this is a feature built into Rails, validations that are defined at the model level will be handled by Rails for us. This reduces the work that we need to do, as we’re handing this responsibility over to Rails.
    • When using accepts_nested_attributes_for, Rails will wrap the database operation in a database transaction so that associated records all get saved at once. In our specific example, this means that the new Room won’t be saved unless the Picture data is also correctly entered by the user.
Cons
  1. It increases nesting
    • If you take a look at the form with the f.fields_for, along with the nested pictures_attributes in the room_params, you can see that the accepts_nested_attributes_for helper in Rails increases nesting. I personally prefer avoiding nesting when possible because flatter code is easier for me to reason about.
    • In general, flat code is easier to reason about vs nested code. This is akin to having a lot of nested if statements making code more difficult to reason about.
  2. accepts_nested_attributes_for is notoriously difficult to utilize (for me at least)
    • I have no idea why, but every time I reach for accepts_nested_attributes_for, I start getting stressed before I even get started. To get it to work, I have to build the form, test it and wonder why it’s not working, see which params are coming into the controller, rework the form, controller, and model to see which one is at fault, and so on.
    • The error messages from Rails for accepts_nested_attributes_for are not very helpful. You’ll just see a bunch of error messages that don’t make much sense, along with random nils and empty params showing up, making it difficult to debug and to get your form working.
  3. As your application gets more complicated, making changes to your nested forms becomes more difficult.
    • In this specific example, we are working with only two models: Room and Picture. Applications inevitably grow in complexity, so it’s entirely possible that our Room model will acquire more complicated associations and we will have to increase the complexity of our nested form, the nested params in our controller, the validations in our models, and so on. This makes changes much more difficult to make.

Form objects to the rescue

Form objects, while not a perfect end-all solution, can alleviate some of the cons listed above that come with using the accepts_nested_attributes_for helper from Rails. Within the context of Rails, form objects are essentially plain old Ruby objects that handle the logic required to save your records. You don’t even really need forms to utilize form objects; you can also use them as an intermediary between your incoming parameters and your models in strictly API applications as well. Let’s refactor our above code using form objects.

First, remove the accepts_nested_attributes_for and the validates_associated helpers in the Room model.
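The Room model goes back to something like this:

```ruby
class Room < ApplicationRecord
  belongs_to :hotel
  has_many :pictures

  validates :room_number, presence: true
end
```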

And create a new Ruby class called RoomForm that looks like this.
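A sketch of the form object; the three attribute names are assumptions for this example:

```ruby
class RoomForm
  include ActiveModel::Model

  # the three attributes needed to build a room and its first picture
  attr_accessor :hotel_id, :room_number, :picture_url

  validates :hotel_id, :room_number, :picture_url, presence: true

  def save
    return false unless valid?

    # wrap both create! calls so we never end up with a room without a picture
    ActiveRecord::Base.transaction do
      room = Room.create!(hotel_id: hotel_id, room_number: room_number)
      room.pictures.create!(url: picture_url)
    end

    true
  end
end
```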

If you look at the RoomForm class, you’ll see that it’s just a plain old Ruby object with the ActiveModel::Model module from Rails thrown in there so that we can utilize its various validation helpers. We give the form object three attributes that are needed to create a new room and a picture via attr_accessor, and then validate the form object with Rails’s built-in validates method. We then give the form object a save method that will create the new room and the picture if the form object passes the validations that we defined. It’s also important to note that we wrap the create! operations in a database transaction so that we don’t accidentally create a new room without a picture.

Now we need to modify our RoomsController to use our new RoomForm instead of our Room model.
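Roughly like this:

```ruby
class RoomsController < ApplicationController
  def new
    @room_form = RoomForm.new
  end

  def create
    @room_form = RoomForm.new(room_form_params)

    if @room_form.save
      redirect_to rooms_path
    else
      render :new
    end
  end

  private

  def room_form_params
    params.require(:room_form).permit(:hotel_id, :room_number, :picture_url)
  end
end
```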

And finally our form itself
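Note how flat it is now; there’s no fields_for in sight:

```erb
<%= form_for @room_form, url: rooms_path do |f| %>
  <%= f.label :hotel_id %>
  <%= f.text_field :hotel_id %>

  <%= f.label :room_number %>
  <%= f.text_field :room_number %>

  <%= f.label :picture_url %>
  <%= f.text_field :picture_url %>

  <%= f.submit %>
<% end %>
```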

If you actually load up the application and submit the form, everything should work. This refactoring with form objects has several pros compared to Rails’s traditional accepts_nested_attributes_for approach. Unfortunately, it has cons as well (every pattern has a trade-off in software development).

Pros
  1. Eliminates nesting and promotes flat hierarchy
    • If you look at our new form that utilizes the RoomForm and the refactored RoomsController you can see that nesting is completely gone. This new flat hierarchy makes the code easier to reason about and to extend in the future.
  2. Skinnier controller with business logic contained within the form object
    • What usually happens with nested forms that interact with models that have complex associations is that the logic for saving these associations eventually gets long and complicated within the controllers. Moving the logic for saving these associations out of the controller and into form objects keeps the controllers tidy and neat.
    • Because we’re moving the business logic of handling saving of data into the form objects, we can reuse these form objects throughout the codebase, keeping our code DRY.
Cons
  1. Duplicate validations in both form object and the model.
    • I mentioned how form objects help our codebase become more DRY, but that isn’t the complete truth. If you look at the validations in our form object, you’ll notice that they’re basically a repetition of the validations in the Room and Picture models. Our example is simple, so it’s not THAT bad, but as our application grows in complexity, this duplication could take its toll.

As you can see, form objects eliminate the cons of using the default accepts_nested_attributes_for helper while introducing new ones. Unfortunately, using this pattern means that you’ll have to be willing to accept the trade-off of a slight increase in duplicated validation code.

At the end of the day…

I personally think form objects are worth integrating into your everyday development whenever you start noticing nesting in your forms. I think the increase in validation duplication is worth the trade-off for the flat hierarchy that you end up with in your forms and controllers.

This pattern is considered useful enough that there are gems out there that implement form objects for you. Personally, I find implementing form objects to be simple enough to roll my own, and I like the flexibility that I gain from self-implementation. For those of you who are interested in using a gem instead, I see that https://github.com/trailblazer/reform has a lot of stars on GitHub and seems to have a lot of features built in that you can use out of the box. Also, the fact that it’s associated with the Trailblazer framework should give it some credibility.

Different styles of organizing ActiveRecord models

This is something I recently noted while working in my new employer’s codebase.

The platform that I work on at my current full-time job is your standard Rails application with some Backbone sprinkled in there for organizing JavaScript code. The first thing I noticed when I started working on this new codebase was that the way the code in the ActiveRecord models was written was very different from the codebases I’ve come across before and from the way I like to write my Rails models.

I think a simple example will help explain the differences I’ve noticed a lot better. Normally, when I write a typical model in a Rails app, I like to group different types of code together. For example, I like to put all of my validations in one place, all of my class methods in another, all of my instance methods in another, and so on. Let’s say that we have your average User model.
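A sketch in that grouped style (attribute names and validations are illustrative; the parts referenced below are the ROLES constant, the validates :role validation, and the set_slug and set_full_name callbacks):

```ruby
class User < ApplicationRecord
  # constants
  ROLES = %w[admin member guest].freeze

  # callbacks
  before_save :set_slug
  before_save :set_full_name

  # validations
  validates :email, presence: true, uniqueness: true
  validates :role, inclusion: { in: ROLES }

  # instance methods
  def admin?
    role == "admin"
  end

  def set_slug
    self.slug = email.to_s.parameterize
  end

  def set_full_name
    self.full_name = [first_name, last_name].compact.join(" ")
  end
end
```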

This could be your typical Rails model that represents a User. However, if I rewrite this User model following the style guide of the codebase that I work full time on, it’ll look like this.
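The same model, reorganized so each declaration sits next to the thing it relates to:

```ruby
class User < ApplicationRecord
  validates :email, presence: true, uniqueness: true

  ROLES = %w[admin member guest].freeze
  validates :role, inclusion: { in: ROLES }

  before_save :set_slug
  def set_slug
    self.slug = email.to_s.parameterize
  end

  before_save :set_full_name
  def set_full_name
    self.full_name = [first_name, last_name].compact.join(" ")
  end

  def admin?
    role == "admin"
  end
end
```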

See how the ROLES constant was moved above the validates :role validation, and the set_slug and set_full_name before_save callbacks were moved right above the methods that they call.

If I’m being 100% honest, I find this coding style to be kind of ugly. However, I’ve noticed that organizing my Rails models this way provides a huge advantage in that I find it easier to dissect the immediate impact that each method, variable, and constant has on my model. The above User model is a very simple example, but in real production applications, you’ll find models that span hundreds or even thousands of lines. In these large classes, it’s a huge advantage to look at a method and immediately know that it’s being called in an ActiveRecord callback, whether it touches any values that are required for the model to be valid, and so on.

I still don’t know for sure which style I prefer. The style in the second example is new to me, and I still think it kind of looks ugly, but I have to admit that it gives me a better insight into what the code is actually doing at first glance rather than having to skip around a large Ruby file trying to figure out how the code actually works.

Dynamically create methods and scopes in Rails

ActiveRecord models with status columns generally indicate that there are different states that the model can be in.

And usually, when there is a status column on a model, it’s common to query for specific statuses and to check the status of the model. For example, let’s say that we have an Invoice model that has three different statuses: pending, paid, and overdue. Most people would write code for querying these statuses and checking the status of the model like this.
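Something along these lines (the controller and its actions are hypothetical):

```ruby
class InvoicesController < ApplicationController
  def index
    # querying for invoices with a specific status
    @pending_invoices = Invoice.where(status: "pending")
    @paid_invoices    = Invoice.where(status: "paid")
    @overdue_invoices = Invoice.where(status: "overdue")
  end

  def show
    @invoice = Invoice.find(params[:id])

    # checking the status of a single invoice
    if @invoice.status == "overdue"
      flash.now[:alert] = "This invoice is overdue"
    end
  end
end
```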

This works. Some developers who are aware of scopes may create scopes for the different statuses instead. They may also add methods to the Invoice model to determine the status of an invoice (because, you know, refactoring).
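A sketch of that version; the controller above can then call Invoice.pending and @invoice.overdue? instead of comparing strings:

```ruby
class Invoice < ApplicationRecord
  scope :pending, -> { where(status: "pending") }
  scope :paid,    -> { where(status: "paid") }
  scope :overdue, -> { where(status: "overdue") }

  def pending?
    status == "pending"
  end

  def paid?
    status == "paid"
  end

  def overdue?
    status == "overdue"
  end
end
```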

The above is definitely better, and I (and a lot of people) would probably go, “Good enough”, and get on with our lives. But our Invoice model can be made more succinct and more flexible if we utilize a couple of underutilized Ruby features.

Let’s say that the Invoice model can have a lot more statuses, say 10 of them: paid, pending, overdue, overpaid, underpaid, read, unread, clicked, missing, and scammed (note: some of these status names don’t make sense, I just wanted to quickly come up with 10). Well, that’s a lot of custom scopes and methods you have to write. Thankfully, there’s a little trick to creating these with a little bit of Ruby magic at runtime.

For the scopes, we can store the pre-defined statuses in a constant, loop over them, and create scopes during runtime. For the methods that check for the Invoice’s current status, Ruby has this thing called define_method that allows you to define methods during run time.

https://ruby-doc.org/core-2.2.0/Module.html#method-i-define_method

At first glance, one may go, “Why would I ever use this define_method thingy instead of actually writing the method myself?” Well, in the context of creating methods that check for predefined statuses on the fly, it can be pretty useful in that it creates all of these methods for us rather than us having to write them by hand. To utilize these two techniques so that we don’t have to manually write 10 different scopes and methods, we need to do the following.

  1. Create a constant that defines what statuses the Invoice model is allowed to have.
  2. Loop through the statuses, and for each status, create a scope and then define a custom method using Ruby’s define_method.

Below is how you do it.
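A sketch of that, with the status names taken from the list above:

```ruby
class Invoice < ApplicationRecord
  STATUS = %w[
    paid pending overdue overpaid underpaid
    read unread clicked missing scammed
  ].freeze

  STATUS.each do |status_name|
    # Invoice.paid, Invoice.pending, and so on
    scope status_name.to_sym, -> { where(status: status_name) }

    # invoice.paid?, invoice.pending?, and so on
    define_method("#{status_name}?") do
      status == status_name
    end
  end
end
```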

Try the refactoring of the Invoice model above and you’ll see that the controller still works as it should. And the Invoice model in this form is much more concise and flexible, since if we want more scopes and methods that pertain to statuses, all we have to do is add the new statuses to the Invoice::STATUS constant.