acts_as_list refactoring part 3

Fri, 02 Jun 2017 01:07:13 +0000

I refactor acts_as_list Ruby gem again: watch as I choose better names, strip unnecessary variables, work with some ActiveRecord internals and make code intent clearer. In this refactoring adventure I'm going to focus on just one 11-line method, and surprisingly, there's a lot of things that can be improved in just one method.

You don't need to read part 2 and part 1 to understand this article.

acts_as_list is a Rails gem. It allows you to treat Rails model records as part of an ordered list and offers methods like #move_to_bottom and #move_higher.

Step 1: a hairy method using #send

.update_all_with_touch method caught my attention as it's a somewhat long (11 lines) and hairy method. This method executes passed SQL, as Rails' #update_all does, but also updates standard timestamps like updated_at.

define_singleton_method :update_all_with_touch do |updates|                
  record = new                                                             
  attrs = record.send(:timestamp_attributes_for_update_in_model)           
  now = record.send(:current_time_from_proper_timezone)                    

  attrs.each do |attr|                                                     
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end                                                                      

  update_all(updates)                                                      
end

Let's have a look. On the line 2 (see above ↑) it creates a new model instance (acts_as_list is supposed to extend ActiveRecord models, so naturally new would create one). Then it sends two messages to the created model instance record via #send. The reason it uses #send is because both those methods are private, so it can't just say record.timestamp_attributes_for_update_in_model. Now, this is some cognitive load, because every time I read these lines I can't help think of why #send has to be used. But I'll get to it later, let's look at the rest of the method now.

One the lines 6-10 (see above ↑), SQL is built and appended to updates argument, modifying it. Each of the timestamp_attributes_for_update_in_model is updated with current time. And after the SQL was built, it's executed with Rails' standard #update_all.

So, this method does two things - build SQL and execute it. And most of the method is taken up by building SQL.

Is anything wrong with this method? For my taste, it's too hairy, and it goes into too much detail about details of building SQL. So, the first thing I want to do is to go to a higher level of abstraction on building SQL:

define_singleton_method :update_all_with_touch do |updates|
  update_all(updates << touch_record_sql)
end

private

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  updates = ""
  attrs.each do |attr|
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  updates
end

The result on the line 2 (see above ↑) allows us to grasp what's going on much faster. updates is being appended with SQL of touch_record_sql. It took me more than two pomodoro to figure out a decent name for the method. I've tried many, including update_standard_timestamps_to_current_time_sql (if only it wasn't that long). I prefer touch_record_sql because it uses a well known term touch, which is inherited from Unix touch(1) command and Rails' #touch. Touch means update appropriate timestamps.

Step 1.1: a misnomer

I've copied the code from above for easier reference:

define_singleton_method :update_all_with_touch do |updates|
  update_all(updates << touch_record_sql)
end

private

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  updates = ""
  attrs.each do |attr|
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  updates
end

On the lines 12-17 (see above ↑), updates variable we inherited from .update_all_with_touch method, doesn't explain what's going on well enough. It may mean updates we want to do to db records, but it's far from being obvious. It's not a bad name, but I prefer sql, to be in tune with the method's name touch_record_sql:

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  sql = ""
  attrs.each do |attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  sql
end

Step 1.2: #each that collects data into a variable

One the lines 6-11 (see above ↑) #each loops over timestamp attributes and collects SQL fragments into sql variable. It's a typical misuse of #each and it could be replaced with #map(...).join(", ") if we didn't need the leading ,. In this case, #each can be replaced with #inject:

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  attrs.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

Step 1.3: using #send to execute private methods

As I mentioned previously, #send is used here to run private methods on a model instance (see the lines 3-4 above ↑). And, it incurs cognitive load, because you have to wonder why #send is used here. So, I chose to move this code to an instance method:

define_singleton_method :touch_record_sql do
  new.touch_record_sql
end

...

define_method :touch_record_sql do
  connection = self.class.connection
  attrs = timestamp_attributes_for_update_in_model
  now = current_time_from_proper_timezone

  attrs.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

The line 2 (see above ↑) raises a question though: Why do we need to create model instance to build SQL?. Oh well, it's not perfect.

On the line 8 (see above ↑) I had to add connection variable, so even though record = new is no longer needed, we've not reduced the number of lines. But #send is gone! (see the lines 9-10 above ↑).

Step 1.4: a redundant variable

On the line 9 (see above ↑) there's a variable attrs that is used on the line 12 only. And, since we no longer have #send, i.e. the right side of the variable assignment isn't hairy, we can just do away with it. After all, what new does attrs tells us that timestamp_attributes_for_update_in_model does not?

define_method :touch_record_sql do
  connection = self.class.connection
  now = current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

Step 1.5: another redundant variable

On the line 3 (see above ↑), there's now variable that is only used in one place, at the line 6. It could be said that there's a performance benefit to keeping current_time_from_proper_timezone call out of the #inject loop, but it also could be said that it's premature optimisation.

It looks like a stalemate, but thankfully, there's another angle we can use here - readability. From readability perspective, having now out of the loop clarifies that the now value doesn't depend on the loop.

But on the line 6 there's also some code that doesn't depend on the loop - #{connection.quote(connection.quoted_date(now))}, and it's hard to reason about two parts of the value that doesn't depend on the loop. So, I'm still going to inline now, and see how it goes:

define_method :touch_record_sql do
  connection = self.class.connection

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(current_time_from_proper_timezone))}"
  end
end

Step 1.6: expression that doesn't depend on loop

Have to say, I like less lines here, but the right value in the SQL assignment doesn't depend on loop values, so it should be moved out of the loop for clarity:

define_method :touch_record_sql do
  connection = self.class.connection
  quoted_now = connection.quote(connection.quoted_date(
    current_time_from_proper_timezone))

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{quoted_now}"
  end
end

Step 1.7: a hairy assignment

At this point, when looking at the lines 3-4 (see above ↑) I'm asking myself why not extract quoted_now into a method. I did just that:

define_method :touch_record_sql do
  connection = self.class.connection
  quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{quoted_now}"
  end
end

private

def quoted_current_time_from_proper_timezone
  self.class.connection.quote(self.class.connection.quoted_date(
    current_time_from_proper_timezone))
end

Step 1.8: an unclear moment

On the lines 3-7 (see above ↑) we can see that quoted_now is used only on the line 6, and a question arises Why not inline it and live happily ever after?. We already discussed that quoted_now must be the same for all its uses within SQL, but I've failed to encode this knowledge into words. So, I'm going to use a variable name that clearly explains that - cached_quoted_now:

define_method :touch_record_sql do
  connection = self.class.connection
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end
end

I quite like the result. Have to say, I first got the idea of using a different name for quoted_now when I imagined that the extracted method would be called #quoted_now, so I had to invent a new name for the variable as quoted_now = self.quoted_now would suck.

Step 1.9: another redundant variable

On the line 2 (see above ↑) you can see a remnant of the beginning of this refactoring, connection variable. Now that it's used only once on the line 6, it can be inlined. So, typing self.class.connection sucks, so why not have #connection method? I'd say that it goes agains the Single Responsibility Principle, but convenience trumps it in this case!

define_method :touch_record_sql do
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end
end

private

delegate :connection, to: self

Not that bad.

Step 1.10: #inject isn't the best way

I got a suggestion from reader Alex Piechowski that #map and #join can be used here, instead of #inject:

define_method :touch_record_sql do
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.map do |attr|
    ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end.join
end

Thank you, Alex!

Afterword

I planned on doing more stuff in this refactoring, but it's quite a lot as it is. And most impressive to me, it all came from refactoring a single method. Just how much can you draw from a single method? Turns out, quite a lot.

Happy hacking!

P.S. my PR was accepted by acts_as_list project!

The post acts_as_list refactoring part 3 first appeared on Ruby clarity.

]]>

extract variable - Ruby clarity