better naming - Ruby clarity

acts_as_list refactoring part 3

Dmitry — Fri, 02 Jun 2017 01:07:13 +0000

I refactor the acts_as_list Ruby gem again: watch as I choose better names, strip unnecessary variables, work with some ActiveRecord internals and make the code's intent clearer. In this refactoring adventure I'm going to focus on just one 11-line method, and surprisingly, there are a lot of things that can be improved in just one method.

You don't need to read part 2 and part 1 to understand this article.

acts_as_list is a Rails gem. It allows you to treat Rails model records as part of an ordered list and offers methods like #move_to_bottom and #move_higher.

Step 1: a hairy method using #send

The .update_all_with_touch method caught my attention because it's somewhat long (11 lines) and hairy. It executes the SQL passed to it, as Rails' #update_all does, but also updates standard timestamps like updated_at.

define_singleton_method :update_all_with_touch do |updates|
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  attrs.each do |attr|
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  update_all(updates)
end

Let's have a look. On line 2 (see above ↑) it creates a new model instance (acts_as_list is supposed to extend ActiveRecord models, so naturally new would create one). Then it sends two messages to the created model instance, record, via #send. It uses #send because both of those methods are private, so it can't just say record.timestamp_attributes_for_update_in_model. Now, this is some cognitive load, because every time I read these lines I can't help wondering why #send has to be used. But I'll get to that later; let's look at the rest of the method now.
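To make the #send point concrete, here's a minimal sketch in plain Ruby. The Record class and its current_time_label method are hypothetical stand-ins, not ActiveRecord code; the sketch only shows why a direct call fails and #send does not.

```ruby
# Hypothetical stand-in for a model; this is not ActiveRecord.
class Record
  private

  def current_time_label
    "now"
  end
end

record = Record.new

# Calling a private method with an explicit receiver raises NoMethodError...
begin
  record.current_time_label
  direct_call_worked = true
rescue NoMethodError
  direct_call_worked = false
end

# ...while #send skips the visibility check.
via_send = record.send(:current_time_label)
```

So the two #send calls in the method above are there only to get past the private visibility, which is exactly the detail a reader has to stop and work out.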

On lines 6-10 (see above ↑), SQL is built and appended to the updates argument, modifying it. Each of the attributes returned by timestamp_attributes_for_update_in_model is set to the current time. And once the SQL is built, it's executed with Rails' standard #update_all.
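The "modifying it" part is easy to miss, so here's a small sketch of what String#<< does to an argument. The helper names append_touch and with_touch and the SQL fragments are made up for illustration; only the behaviour of << and + is the point.

```ruby
# Hypothetical helper that appends to its argument the way the method above does.
def append_touch(updates)
  updates << ", updated_at = NOW()"
end

caller_sql = +"position = 1"   # unary plus returns an unfrozen String
append_touch(caller_sql)       # the caller's own string has now changed

# A non-mutating variant builds a new string and leaves the argument alone.
def with_touch(updates)
  updates + ", updated_at = NOW()"
end

original = "position = 1"
combined = with_touch(original)
```

After this runs, caller_sql carries the appended fragment, while original is untouched and only combined has it.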

So, this method does two things - build SQL and execute it. And most of the method is taken up by building SQL.

Is anything wrong with this method? For my taste, it's too hairy, and it goes into too much detail about building SQL. So, the first thing I want to do is move to a higher level of abstraction for building the SQL:

define_singleton_method :update_all_with_touch do |updates|
  update_all(updates << touch_record_sql)
end

private

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  updates = ""
  attrs.each do |attr|
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  updates
end

The result on line 2 (see above ↑) allows us to grasp what's going on much faster: the SQL from touch_record_sql is appended to updates. It took me more than two pomodoros to figure out a decent name for the method. I tried many, including update_standard_timestamps_to_current_time_sql (if only it weren't that long). I prefer touch_record_sql because it uses the well-known term touch, which is inherited from the Unix touch(1) command and Rails' #touch. To touch means to update the appropriate timestamps.

Step 1.1: a misnomer

I've copied the code from above for easier reference:

define_singleton_method :update_all_with_touch do |updates|
  update_all(updates << touch_record_sql)
end

private

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  updates = ""
  attrs.each do |attr|
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  updates
end

On lines 12-17 (see above ↑), the updates variable, which we inherited from the .update_all_with_touch method, doesn't explain what's going on well enough. It may mean the updates we want to make to db records, but that's far from obvious. It's not a bad name, but I prefer sql, to be in tune with the method's name, touch_record_sql:

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  sql = ""
  attrs.each do |attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  sql
end

Step 1.2: #each that collects data into a variable

On lines 6-11 (see above ↑) #each loops over the timestamp attributes and collects SQL fragments into the sql variable. It's a typical misuse of #each, and it could be replaced with #map(...).join(", ") if we didn't need the leading comma. In this case, #each can be replaced with #inject:

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  attrs.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

Step 1.3: using #send to execute private methods

As I mentioned previously, #send is used here to run private methods on a model instance (see the lines 3-4 above ↑). And, it incurs cognitive load, because you have to wonder why #send is used here. So, I chose to move this code to an instance method:

define_singleton_method :touch_record_sql do
  new.touch_record_sql
end

...

define_method :touch_record_sql do
  connection = self.class.connection
  attrs = timestamp_attributes_for_update_in_model
  now = current_time_from_proper_timezone

  attrs.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

Line 2 (see above ↑) raises a question though: why do we need to create a model instance to build SQL? Oh well, it's not perfect.

On line 8 (see above ↑) I had to add a connection variable, so even though record = new is no longer needed, we've not reduced the number of lines. But #send is gone! (See lines 9-10 above ↑.)

Step 1.4: a redundant variable

On line 9 (see above ↑) there's a variable attrs that is used only on line 12. And, since we no longer have #send, i.e. the right side of the variable assignment isn't hairy, we can just do away with it. After all, what does attrs tell us that timestamp_attributes_for_update_in_model does not?

define_method :touch_record_sql do
  connection = self.class.connection
  now = current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

Step 1.5: another redundant variable

On line 3 (see above ↑), there's the now variable, which is used in only one place, on line 6. It could be said that there's a performance benefit to keeping the current_time_from_proper_timezone call out of the #inject loop, but it could also be said that it's premature optimisation.

It looks like a stalemate, but thankfully, there's another angle we can use here: readability. From a readability perspective, having now outside the loop makes it clear that its value doesn't depend on the loop.

But on line 6 there's also some other code that doesn't depend on the loop, #{connection.quote(connection.quoted_date(now))}, and it's hard to reason about two separate parts of a value that don't depend on the loop. So, I'm still going to inline now, and see how it goes:

define_method :touch_record_sql do
  connection = self.class.connection

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(current_time_from_proper_timezone))}"
  end
end

Step 1.6: expression that doesn't depend on loop

Have to say, I like having fewer lines here, but the right-hand value in the SQL assignment doesn't depend on loop values, so it should be moved out of the loop for clarity:

define_method :touch_record_sql do
  connection = self.class.connection
  quoted_now = connection.quote(connection.quoted_date(
    current_time_from_proper_timezone))

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{quoted_now}"
  end
end
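Besides readability, there's a behavioural reason to read the time once: a clock can return a different value on every call. Here's a sketch with a hypothetical TickingClock standing in for current_time_from_proper_timezone (the class and column names are assumptions for illustration only).

```ruby
# Hypothetical clock whose reading advances on every call.
class TickingClock
  def initialize
    @ticks = 0
  end

  def now
    @ticks += 1
  end
end

columns = %w[updated_at updated_on]

# Reading the clock inside the block: each column can get a different value.
clock = TickingClock.new
inside = columns.map { |column| "#{column} = #{clock.now}" }

# Reading it once before the block: every column gets the same value.
clock = TickingClock.new
now = clock.now
hoisted = columns.map { |column| "#{column} = #{now}" }
```

With the read inside the block the two columns end up with 1 and 2; hoisted, both get 1. A real clock would usually differ by far less, but the variable outside the loop guarantees the timestamps match.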

Step 1.7: a hairy assignment

At this point, when looking at the lines 3-4 (see above ↑) I'm asking myself why not extract quoted_now into a method. I did just that:

define_method :touch_record_sql do
  connection = self.class.connection
  quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{quoted_now}"
  end
end

private

def quoted_current_time_from_proper_timezone
  self.class.connection.quote(self.class.connection.quoted_date(
    current_time_from_proper_timezone))
end

Step 1.8: an unclear moment

On lines 3-7 (see above ↑) we can see that quoted_now is used only on line 6, and a question arises: why not inline it and live happily ever after? We already discussed that quoted_now must be the same for all its uses within the SQL, but so far I've failed to put this knowledge into words in the code. So, I'm going to use a variable name that clearly explains that, cached_quoted_now:

define_method :touch_record_sql do
  connection = self.class.connection
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end
end

I quite like the result. Have to say, I first got the idea of using a different name for quoted_now when I imagined that the extracted method would be called #quoted_now, so I had to invent a new name for the variable as quoted_now = self.quoted_now would suck.

Step 1.9: another redundant variable

On line 2 (see above ↑) you can see a remnant of the beginning of this refactoring, the connection variable. Now that it's used only once, on line 6, it can be inlined. But typing self.class.connection sucks, so why not have a #connection method? I'd say that it goes against the Single Responsibility Principle, but convenience trumps it in this case!

define_method :touch_record_sql do
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end
end

private

delegate :connection, to: self

Not that bad.
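For readers who don't use ActiveSupport's delegate every day, here's roughly what the delegation amounts to, written by hand in plain Ruby. FakeConnection, Model and the quoted method are hypothetical stand-ins, and this is a sketch of the idea rather than what delegate generates internally.

```ruby
# Hypothetical stand-in for a database connection.
class FakeConnection
  def quote_column_name(name)
    %("#{name}")
  end
end

class Model
  def self.connection
    @connection ||= FakeConnection.new
  end

  # Instance code can now say connection instead of self.class.connection.
  def quoted(name)
    connection.quote_column_name(name)
  end

  private

  # Hand-written forwarding of #connection to the class.
  def connection
    self.class.connection
  end
end

quoted = Model.new.quoted("updated_at")
```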

Step 1.10: #inject isn't the best way

I got a suggestion from reader Alex Piechowski that #map and #join can be used here, instead of #inject:

define_method :touch_record_sql do
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.map do |attr|
    ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end.join
end

Thank you, Alex!
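To convince myself that the three loop styles used in this post are interchangeable, here's a plain-Ruby comparison. The column names and the pre-quoted timestamp are made-up values; no database connection is involved.

```ruby
columns = %w[updated_at updated_on]    # hypothetical column names
quoted_now = "'2017-06-02 01:07:13'"   # hypothetical, already-quoted timestamp

# 1. #each appending to an accumulator variable
with_each = +""
columns.each { |column| with_each << ", #{column} = #{quoted_now}" }

# 2. #inject threading the accumulator through the block
with_inject = columns.inject(+"") { |sql, column| sql << ", #{column} = #{quoted_now}" }

# 3. #map building the fragments, #join concatenating them
with_map = columns.map { |column| ", #{column} = #{quoted_now}" }.join
```

All three produce the same string, leading comma included. The #map version is the only one that doesn't mutate anything along the way.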

Afterword

I planned on doing more stuff in this refactoring, but it's quite a lot as it is. And most impressive to me, it all came from refactoring a single method. Just how much can you draw from a single method? Turns out, quite a lot.

Happy hacking!

P.S. my PR was accepted by acts_as_list project!

The post acts_as_list refactoring part 3 first appeared on Ruby clarity.

acts_as_list refactoring part 2

Dmitry — Fri, 27 Jan 2017 15:57:48 +0000

In this post I'm continuing the refactoring of the acts_as_list gem that I started in part 1.

As you might remember, I've split .acts_as_list method into several modules, each module dedicated to an option passed to the method. E.g. ColumnMethodDefiner module defines methods related to the column option (the option defines column name for storing record's list position).

This post is dedicated to refactoring of the ColumnMethodDefiner module.

Improving ColumnMethodDefiner module

So, I've extracted code related to column option of .acts_as_list to ColumnMethodDefiner. Here's an excerpt:

Step 1: what is "column"?

Line 7 (see above ↑) references column, but what column is that? Line 6 hints that we're talking about position column, i.e. column means "name of the column that holds record's position in the list". I.e. position_column_name. Unfortunately, it's too hard to read, so I opted for position_column, which is easier to read:

I like that the method defined on line 6 (see above ↑) has the same name as the #position_column method. Earlier, we had to reason about why the column argument and the #position_column method contained the same data but were named differently. But no more! One concept less!

Step 2: inconsistent module name

At this point, ColumnMethodDefiner module's mission is to define methods related to position_column, but the module is named as if it works with just Column. It is inconsistent, so I'm going to rename it to PositionColumnMethodDefiner:

On the line 4 (see above ↑), we still use column argument though, but from the module name, we can infer that we talk about position column.

I would have liked to deprecate the column argument and introduce position_column to replace it, but that would be changing functionality, and refactoring is all about restructuring code and keeping functionality intact.

Step 3: a method that's too long

PositionColumnMethodDefiner.call is 46 lines long and starts with defining some instance methods:

Since the method is too long, I'm going to extract #define_instance_methods:

Because in part 1 I've chosen to extract stuff related to position column to a separate module, I can now extract methods from .call method and not be afraid to pollute the namespace (as opposed to a single module for all .acts_as_list options).

A sidenote on what not to do

An interesting thing to note is that line 3 (see above ↑) doesn't need to be inside the .class_eval block that starts on line 5. At first, I made the mistake of putting the .define_instance_methods call inside the block, and it led to a problem: inside the .class_eval block, self doesn't point to the PositionColumnMethodDefiner module, so I had to do a hack to call .define_instance_methods. It was ugly! Feast your eyes on this:

Yuck!

Step 3.1: extract class method definitions

Starting at the line 12 (see below ↓), there are several class methods defined via #define_singleton_method:

I'm going to extract those class method definitions into a method:

Sidenote about Object#define_singleton_method

It was my first time encountering #define_singleton_method, and the docs didn't explain it well: "Defines a singleton method in the receiver". WTF is a singleton method? I know the singleton pattern, but that doesn't make any sense here.

It turns out, a singleton method is a method defined on an object instance. A class, for example, Object class, is an instance of class Class, so a class method foo on Object (Object.foo) is a singleton method too. As well as a method defined on a string:

s = "abc"
s.define_singleton_method(:foo) { "bar" }  # a block is required for the method body
s.foo  # => "bar"

So, in Ruby def self.foo method is a class method, and at the same time, a singleton method.
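A quick way to check this claim is to ask the objects themselves. In this sketch, Widget is a hypothetical class; singleton_methods and define_singleton_method are standard Ruby.

```ruby
class Widget
  def self.build   # a class method...
    new
  end
end

# ...is listed among the singleton methods of the Widget object.
class_level = Widget.singleton_methods

# The same mechanism works on one particular instance.
first = Widget.new
second = Widget.new
first.define_singleton_method(:label) { "only on first" }

first_label = first.label
second_has_label = second.respond_to?(:label)
```

Here class_level contains :build, and label exists on first but not on second, which is what "defined on an object instance" means in practice.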

If you feel like diving into this a bit more, there's a great article Ways to Define Singleton Methods in Ruby.

Step 4: mass assignment protection

After extracting the class and instance method definitions, we're left with adding position_column as an accessible attribute on line 10 (see below ↓). attr_accessible lets you specify a whitelist of model attributes that can be set via mass assignment.

Step 4.1: redundant interpolation

At the line 10 (see above ↑), position_column is interpolated and then converted to a Symbol. We can do away with the interpolation here (see the line 10 below ↓):
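As a standalone illustration of the idea (not the gem's actual line), here's the interpolated form next to the direct one. The position_column value is an assumed example.

```ruby
position_column = "position"   # hypothetical option value

interpolated = "#{position_column}".to_sym   # builds a throwaway string first
direct = position_column.to_sym              # same result, no detour

# Also fine when the option arrives as a Symbol already.
from_symbol = :position.to_sym
```

The one behavioural difference is that interpolation calls #to_s first, so it would also accept values that don't respond to #to_sym. For a String or Symbol option, both forms give the same symbol.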

Step 4.2: comments

One of the worst things you can find in code is comments, and I hate them with a passion. Sometimes you can't help but have comments, sometimes they're a necessary evil, but not in this case. On lines 7-8 (see above ↑) the comments explain that we only protect position_column from mass assignment if the user already uses mass-assignment protection. Can we say the same thing without comments? Absolutely!

So, instead of a long conditional, we have a method call .mass_assignment_protection_was_used_by_user?, that is much easier to understand and is at the right level of abstraction.

However, lines 7-9 (see above ↑) are still at the wrong level of abstraction, so I'm going to extract them into a method:

So, I've extracted protecting position_column attribute into .protect_attributes_from_mass_assignment method (see line 7 above ↑).

I feel it reads much better without any comments now.

Step 4.3: .mass_assignment_protection_was_used_by_user?

Let's see whether the code that I've extracted can be improved:

On the line 3 (see above ↑) we check whether accessible_attributes is defined. But what is accessible_attributes? It seems that it's an undocumented Rails method.

In Rails 2.3.8, accessible_attributes referenced the attr_accessible attribute (used to store the attributes that allow mass assignment). In Rails 4, attr_accessible was removed in favour of strong parameters, and thus accessible_attributes would no longer be defined.

This explains why accessible_attributes may not be defined, and I will not dive deeper into undocumented Rails stuff.

Step 4.3.1: gratuitous use of defined?

defined?(accessible_attributes) returns a truthy value if .accessible_attributes is defined. However, it would also return a truthy value if a variable named accessible_attributes was defined. It's not very likely that such a variable would be defined, but for somebody reading the code thoroughly, it makes it harder to understand: "Did the author really mean that an accessible_attributes variable counts as mass-assignment protection being defined?" Thus, it's better to replace defined? with #respond_to?:

In this way, it's clear that we're looking for a method .accessible_attributes, and there are no further questions.
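Here's a small sketch of the difference, using hypothetical LegacyModel and ModernModel classes in place of real Rails models.

```ruby
class LegacyModel
  # Hypothetical stand-in for the old Rails class method.
  def self.accessible_attributes
    [:title]
  end
end

class ModernModel; end

# #respond_to? only ever asks about methods.
legacy_responds = LegacyModel.respond_to?(:accessible_attributes)
modern_responds = ModernModel.respond_to?(:accessible_attributes)

# defined? also answers for local variables, even ones holding nil.
accessible_attributes = nil
defined_says = defined?(accessible_attributes)
```

Here defined? reports "local-variable", which is truthy, although no such method exists at the top level. #respond_to? can't be fooled that way.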

Step 4.3.2: gratuitous negation

But we're not done with the .mass_assignment_protection_was_used_by_user? method yet. On the line 3 (see above ↑) we check whether accessible_attributes is not #blank?. It's probably always better to avoid using negation. In this case, we can use #present?:

Now I'm happy with the method.

Step 5: too much of passing caller_class around

To remind you what the state of .call method is:

We are passing caller_class to each method call here. We could define a class instance variable and reference it in class methods later:

Voila! Reads much better!

Step 6: but it's not thread safe!

But alas, using a class instance variable is not thread safe :(

I have two choices here:

Use a service object.
Use a thread variable.

Step 6.1: using a service object

Long story short, I've refactored to this:

And, I can't stand it. The cure is worse than the disease. In the #call method (see lines 12-17 above ↑) I'm passing an instance variable, @position_column, as a method argument. It's awful, but it's either that or I have to say something like position_column = @position_column for the variable to be picked up by a #class_eval block. Neither option is good. So, it's a no-go.

Step 6.2: using a thread variable

So, I've refactored to use a thread variable:

Much better than the service object, but the cognitive load is still there. It's just far from standard to say self.caller_class = caller_class. And a thread variable instead of just another method argument? That takes much more thinking: "Why a thread variable?", "What does the self.caller_class = caller_class assignment mean?" It's a no-go too.
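For reference, this is the general shape a thread-variable accessor can take. It's a sketch under my own assumptions: the Definer module and the :definer_caller_class key are illustrative names, not the gem's code.

```ruby
# Hypothetical module; the names are illustrative only.
module Definer
  def self.caller_class=(klass)
    Thread.current[:definer_caller_class] = klass
  end

  def self.caller_class
    Thread.current[:definer_caller_class]
  end
end

Definer.caller_class = String

# Another thread starts with its own empty slot and cannot clobber ours.
seen_in_other_thread = Thread.new do
  before = Definer.caller_class
  Definer.caller_class = Integer
  [before, Definer.caller_class]
end.value

seen_in_main_thread = Definer.caller_class
```

The other thread sees nil first and then its own Integer, while the main thread still sees String. One caveat: Thread.current[] is fiber-local in Ruby; Thread#thread_variable_set and #thread_variable_get are the truly thread-local variants.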

Step 7: back to the functional solution

So, in the end I was unable to improve on this:

Can you think of a way to improve it?

What to expect from part 3?

In part 3 I'll dive into methods defined with #define_singleton_method in .define_class_methods. Some of them use class instance variables, so they may not be thread safe. I'm looking forward to finding out.

That's all for today, and, happy hacking!

P.S. my PR was accepted by acts_as_list project!

The post acts_as_list refactoring part 2 first appeared on Ruby clarity.

Be lazy and don’t keep context in your head

Dmitry — Thu, 07 Apr 2016 16:39:07 +0000

Sometimes the code we read makes us remember context. Consider the following code (it sends an invite email to a user):

Lines 2-8 (see above ↑) deal with the case of existing user, and the rest deal with the case of new user.

When reading lines 3-5 you have to remember that you're dealing with an existing user; same with lines 12-22, only there you're dealing with a new user. After all, both the existing-user and new-user cases use the same variable name: user.

But there's a better way than keeping it in your head. Keeping it in the code:

Reading lines 3-5 (see above ↑), you clearly see that you're dealing with an existing user. And same for the lines dealing with a new user.
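Here's a minimal sketch of the same idea in plain Ruby. It is not the code discussed above (the line numbers there don't apply to it): the User struct, the invite method and the in-memory users array are all hypothetical, and no email is actually sent.

```ruby
User = Struct.new(:email, :invited)

# Hypothetical in-memory lookup standing in for a database query.
def invite(email, users)
  if (existing_user = users.find { |user| user.email == email })
    # Every line in this branch says which case it handles.
    existing_user.invited = true
    "Invited existing user #{existing_user.email}"
  else
    new_user = User.new(email, true)
    users << new_user
    "Created and invited new user #{new_user.email}"
  end
end

users = [User.new("ann@example.com", false)]
existing_message = invite("ann@example.com", users)
new_message = invite("bob@example.com", users)
```

Had both branches used a plain user variable, the reader would need to scroll up to the condition to know which case a given line belongs to.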

Happy hacking!

The post Be lazy and don’t keep context in your head first appeared on Ruby clarity.
