extract method - Ruby clarity http://rubyclarity.com/ Refactorings of Ruby/Rails projects Sat, 25 Sep 2021 11:30:29 +0000 en-US hourly 1 https://wordpress.org/?v=5.4.7 Is it always a good idea to split long methods into smaller ones? An experiment. https://rubyclarity.com/2017/07/is-it-always-a-good-idea-to-split-long-methods-into-smaller-ones-an-experiment/?utm_source=rss&utm_medium=rss&utm_campaign=is-it-always-a-good-idea-to-split-long-methods-into-smaller-ones-an-experiment https://rubyclarity.com/2017/07/is-it-always-a-good-idea-to-split-long-methods-into-smaller-ones-an-experiment/#comments Fri, 07 Jul 2017 20:40:33 +0000 https://rubyclarity.com/?p=412 There's a long discussion on Reddit named Is it OK to split long functions and methods into smaller ones even though they won't be called by anything else?. Some people in that discussion hold an opinion that it's not always the best idea to split a long method into smaller methods. Well, of course, you can refactor code in many different ways, including ways that read worse than the original long method. But I think that in most cases it

The post Is it always a good idea to split long methods into smaller ones? An experiment. first appeared on Ruby clarity.

]]>
There's a long discussion on Reddit named Is it OK to split long functions and methods into smaller ones even though they won't be called by anything else?. Some people in that discussion hold an opinion that it's not always the best idea to split a long method into smaller methods. Well, of course, you can refactor code in many different ways, including ways that read worse than the original long method. But I think that in most cases it makes for better readability to split long methods into smaller ones.

So, I want to do an experiment and try to split a long method that's difficult to split into smaller methods. I know that it's easy to extract methods when the code uses several levels of abstraction or a low level of abstraction. In such a case it's easy to achieve better readability by extracting some methods. So, the hardest long method I can think of would use a high level of abstraction, from which it wouldn't be easy to go to a level higher.

Such a method can be found in Rails' ActiveRecord::Persistence module. It's the #touch method, and it deals with timestamp attributes that need to be updated with current time, scopes, primary keys, locks and SQL UPDATE. In other words, the level of abstraction is high enough and it's the level that deals with database-level concepts. I expect it won't be that easy to go to a higher level of abstraction here.

Introducing the #touch method

def touch(*names, time: nil)
  unless persisted?
    raise ActiveRecordError, <<-MSG.squish
      cannot touch on a new or destroyed record object. Consider using
      persisted?, new_record?, or destroyed? before touching
    MSG
  end

  time ||= current_time_from_proper_timezone
  attributes = timestamp_attributes_for_update_in_model
  attributes.concat(names)

  unless attributes.empty?
    changes = {}

    attributes.each do |column|
      column = column.to_s
      changes[column] = write_attribute(column, time)
    end

    primary_key = self.class.primary_key
    scope = self.class.unscoped.where(primary_key => _read_attribute(primary_key))

    if locking_enabled?
      locking_column = self.class.locking_column
      scope = scope.where(locking_column => _read_attribute(locking_column))
      changes[locking_column] = increment_lock
    end

    clear_attribute_changes(changes.keys)
    result = scope.update_all(changes) == 1

    if !result && locking_enabled?
      raise ActiveRecord::StaleObjectError.new(self, "touch")
    end

    @_trigger_update_callback = result
    result
  else
    true
  end
end

At 42 lines, #touch is quite long and requires some explaining. Arguments #touch accepts are names (meaning timestamp attributes to update) and time (the time to set timestamps to). Then we decline to work on non-persisted records (lines 2-7). Then we set up the default time value to current time and merge standard timestamp attributes and passed timestamp attributes (names) into attributes array (lines 9-11).

Then goes the most interesting part, when there are some attributes to update (i.e. attributes isn't empty). We instantiate changes hash to pass to #update_all later, and fill it up with attribute keys and time values, and set timestamp attributes on the record to time (lines 14-19). Then we setup a scope we'll be using to match this very record that we #touch, using primary key (lines 21-22). Then we deal with the case when locking is on, updating both record, changes hash and the scope (lines 24-28).

And then clear #changed, so that after we've updated/touched the record, the stuff we've updated in db is no longer marked as changed (line 30). Then we run #update_all. If we've failed to find the record, because of locking issues, we raise a StaleObjectError (lines 33-35). And at last, we setup a special flag telling Rails whether we've actually changed data in the db.

Take 1: an easy way to split

Now, an easy way to split a method is to see how it's structured, and split at the seams. I'd expect a junior developer to do just that. And that's what I did:

unless attributes.empty?
  changes = {}

  update_record_and_changes_with_time(attributes, time, changes)
  scope = scope_by_primary_key

  if locking_enabled?
    scope = extend_scope_to_match_locking_column_value(scope)
    update_record_and_changes_with_lock_increment(changes)
  end

  clear_attribute_changes(changes.keys)
  result = scope.update_all(changes) == 1

  if !result && locking_enabled?
    raise ActiveRecord::StaleObjectError.new(self, "touch")
  end

  @_trigger_update_callback = result
  result
else
  true
end

As you can see, I just replaced more verbose code with explanations. The same level of abstractions is used. The main problem here is whether it's still clear what's going on. I've shown this code to a person and he says it's still clear.

However, this is really my 3rd take on refactoring #touch. The first two refactoring were pretty bad. It took me awhile to understand all the intricacies of this method.

Fed up working on bad code? Here's a way out!

For people that that want to stop suffering from bad code I’ve made a FREE course

Hidden concepts

There are hidden concepts in the original code that the reader has to figure out by themselves. I describe them beneath the code (no need to read the code below without explanations).

unless attributes.empty?
  changes = {}

  attributes.each do |column|
    column = column.to_s
    changes[column] = write_attribute(column, time)
  end

  primary_key = self.class.primary_key
  scope = self.class.unscoped.where(primary_key => _read_attribute(primary_key))

  if locking_enabled?
    locking_column = self.class.locking_column
    scope = scope.where(locking_column => _read_attribute(locking_column))
    changes[locking_column] = increment_lock
  end

  clear_attribute_changes(changes.keys)
  result = scope.update_all(changes) == 1

  if !result && locking_enabled?
    raise ActiveRecord::StaleObjectError.new(self, "touch")
  end

  @_trigger_update_callback = result
  result
else
  true
end

So, the hidden concepts are:

  1. changes and record attributes are updated with the same information at the same time (lines 4-7 and 15). It follows that the record attributes and changes hold the same information afterwards.
  2. scope is used to match the record #touch is called on (lines 9-10 and 14).
  3. Scope extension to include locking information has to happen before #increment_lock is called.

Also, there is understanding of why we need #clear_attribute_changes (on the line 18) after we're done with adding locking information to changes. In fact, that statement used to be placed before the code that deals with locking, and was fixed later. The bug was that locking column was updated, but not cleared off #changed. Of course, now the tests prevent from regressions, but it'd be great to understand all the intricacies easily.

I believe that if we could put all the code that deals with updating record attributes into one place, and place #clear_attribute_changes afterwards, it'll be a bit clearer. This brings us to another refactoring take:

Take 2: split by activity

In this take I've clearly separated scope building and updating of record attributes and changes hash:

unless attributes.empty?
  scope = prepare_scope_to_match_this_record

  changes = {}
  update_record_and_changes_with_same_data(attributes, time, changes)

  result = scope.update_all(changes) == 1

  if !result && locking_enabled?
    raise ActiveRecord::StaleObjectError.new(self, "touch")
  end

  @_trigger_update_callback = result
  result
else
  true
end

private

def update_record_and_changes_with_same_data(attributes, time, changes)
  attributes.each do |column|
    column = column.to_s
    changes[column] = write_attribute(column, time)
  end

  if locking_enabled?
    changes[self.class.locking_column] = increment_lock
  end

  clear_attribute_changes(changes.keys)
end

I think the line 2 is as good as it gets, but the line 5 could be named better. The level of abstraction used is essentially the same as the original code, dealing with records and scopes.

As for revealing of the hidden concepts, we have:

  1. Clear indication that the record attributes and changes are updated with the same data (see the line 5).
  2. It's clear that scope is meant to match the record #touch is called on.
  3. It's not clear why scope should be prepared before changing record, at least not in a prominent way. I considered naming scope preparing method #prepare_scope_to_match_this_unchanged_record, but _unchanged_ would just add cognitive overhead. It's not every day that you're moving things around. And, match this record hints that values used for that should be such that the record can be found, i.e. unchanged.

Take 3: split by the domain concepts

The first two takes haven't really deviated from the existing code structure, the first take especially. But there's another way to look at it, from the domain perspective. You could say it's thinking out of the box.

What does the code tell us about domain? Do we see domain concepts manifesting in the code?

If I look at #touch with that in mind, I can't help noticing that domain logic for #touch should be:

  1. Updating the record attributes with new time. Standard timestamp attributes and passed timestamp attributes are updated.
  2. Saving those attributes in db.

That's the level of abstraction that should be used, when looking from the domain context perspective:

unless attributes.empty?
  attributes.each do |column|
    column = column.to_s
    write_attribute(column, time)
  end

  touch_columns(*attributes)
else
  true
end

So, here we update the standard and passed timestamp attributes on the record and tell #touch_columns to save them in db.

I like this take the most. There's no word about locking as it doesn't really belong in the domain logic, and we don't know how the attributes are going to get saved, all we care about is that they get saved.

The name #touch_columns is not very good. In Rails, there is #update_attributes that #saves the record, and there is #update_columns that just executes SQL and doesn't call any callbacks. #touch_columns is in-between, skipping validations, but calling callbacks. I just don't know Rails enough to come up with a better name. But otherwise, it's a good take.

Conclusion

I've selected code that I considered to be hard to split into smaller methods. It had high level of abstraction, which increased the difficulty level.

So, Is it always a good idea to split a long method into smaller ones? The conclusion I've made is that it depends on your refactoring skills and tenacity. I could have stopped after my first two attempts (take 1 is actually my 3rd attempt) and declared it impossible to split #touch, because the first two attempts were bad.

All three takes described in this post are fitting as a replacement of the original code. Take 1 is better than the original code because it allows to understand what's going on faster, and provides the same level of understanding of the implementation. Take 2 is better than the original code because it highlights hidden concepts that the reader would have to take more time to get otherwise. And finally, take 3 is better than the original code because it uses domain-level concepts, and that makes #touch much easier to reason about.

I would also like to note that doing just extract methods as the original question seems to imply, is very limiting. Refactoring is much more than just extracting methods.

Hire me to help your team ship features faster.

The post Is it always a good idea to split long methods into smaller ones? An experiment. first appeared on Ruby clarity.

]]>
https://rubyclarity.com/2017/07/is-it-always-a-good-idea-to-split-long-methods-into-smaller-ones-an-experiment/feed/ 5
acts_as_list refactoring part 3 https://rubyclarity.com/2017/06/acts_as_list-refactoring-part-3/?utm_source=rss&utm_medium=rss&utm_campaign=acts_as_list-refactoring-part-3 https://rubyclarity.com/2017/06/acts_as_list-refactoring-part-3/#respond Fri, 02 Jun 2017 01:07:13 +0000 https://rubyclarity.com/?p=382 I refactor acts_as_list Ruby gem again: watch as I choose better names, strip unnecessary variables, work with some ActiveRecord internals and make code intent clearer. In this refactoring adventure I'm going to focus on just one 11-line method, and surprisingly, there's a lot of things that can be improved in just one method. You don't need to read part 2 and part 1 to understand this article. acts_as_list is a Rails gem. It allows you to treat Rails model records

The post acts_as_list refactoring part 3 first appeared on Ruby clarity.

]]>
I refactor acts_as_list Ruby gem again: watch as I choose better names, strip unnecessary variables, work with some ActiveRecord internals and make code intent clearer. In this refactoring adventure I'm going to focus on just one 11-line method, and surprisingly, there's a lot of things that can be improved in just one method.

You don't need to read part 2 and part 1 to understand this article.

acts_as_list is a Rails gem. It allows you to treat Rails model records as part of an ordered list and offers methods like #move_to_bottom and #move_higher.

Step 1: a hairy method using #send

.update_all_with_touch method caught my attention as it's a somewhat long (11 lines) and hairy method. This method executes passed SQL, as Rails' #update_all does, but also updates standard timestamps like updated_at.

define_singleton_method :update_all_with_touch do |updates|                
  record = new                                                             
  attrs = record.send(:timestamp_attributes_for_update_in_model)           
  now = record.send(:current_time_from_proper_timezone)                    

  attrs.each do |attr|                                                     
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end                                                                      

  update_all(updates)                                                      
end

Let's have a look. On the line 2 (see above ↑) it creates a new model instance (acts_as_list is supposed to extend ActiveRecord models, so naturally new would create one). Then it sends two messages to the created model instance record via #send. The reason it uses #send is because both those methods are private, so it can't just say record.timestamp_attributes_for_update_in_model. Now, this is some cognitive load, because every time I read these lines I can't help think of why #send has to be used. But I'll get to it later, let's look at the rest of the method now.

One the lines 6-10 (see above ↑), SQL is built and appended to updates argument, modifying it. Each of the timestamp_attributes_for_update_in_model is updated with current time. And after the SQL was built, it's executed with Rails' standard #update_all.

So, this method does two things - build SQL and execute it. And most of the method is taken up by building SQL.

Is anything wrong with this method? For my taste, it's too hairy, and it goes into too much detail about details of building SQL. So, the first thing I want to do is to go to a higher level of abstraction on building SQL:

define_singleton_method :update_all_with_touch do |updates|
  update_all(updates << touch_record_sql)
end

private

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  updates = ""
  attrs.each do |attr|
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  updates
end

The result on the line 2 (see above ↑) allows us to grasp what's going on much faster. updates is being appended with SQL of touch_record_sql. It took me more than two pomodoro to figure out a decent name for the method. I've tried many, including update_standard_timestamps_to_current_time_sql (if only it wasn't that long). I prefer touch_record_sql because it uses a well known term touch, which is inherited from Unix touch(1) command and Rails' #touch. Touch means update appropriate timestamps.

Fed up working on bad code? Here's a way out!

For people that that want to stop suffering from bad code I’ve made a FREE course

Step 1.1: a misnomer

I've copied the code from above for easier reference:

define_singleton_method :update_all_with_touch do |updates|
  update_all(updates << touch_record_sql)
end

private

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  updates = ""
  attrs.each do |attr|
    updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  updates
end

On the lines 12-17 (see above ↑), updates variable we inherited from .update_all_with_touch method, doesn't explain what's going on well enough. It may mean updates we want to do to db records, but it's far from being obvious. It's not a bad name, but I prefer sql, to be in tune with the method's name touch_record_sql:

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  sql = ""
  attrs.each do |attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end

  sql
end

Step 1.2: #each that collects data into a variable

One the lines 6-11 (see above ↑) #each loops over timestamp attributes and collects SQL fragments into sql variable. It's a typical misuse of #each and it could be replaced with #map(...).join(", ") if we didn't need the leading ,. In this case, #each can be replaced with #inject:

define_singleton_method :touch_record_sql do
  record = new
  attrs = record.send(:timestamp_attributes_for_update_in_model)
  now = record.send(:current_time_from_proper_timezone)

  attrs.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

Step 1.3: using #send to execute private methods

As I mentioned previously, #send is used here to run private methods on a model instance (see the lines 3-4 above ↑). And, it incurs cognitive load, because you have to wonder why #send is used here. So, I chose to move this code to an instance method:

define_singleton_method :touch_record_sql do
  new.touch_record_sql
end

...

define_method :touch_record_sql do
  connection = self.class.connection
  attrs = timestamp_attributes_for_update_in_model
  now = current_time_from_proper_timezone

  attrs.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

The line 2 (see above ↑) raises a question though: Why do we need to create model instance to build SQL?. Oh well, it's not perfect.

On the line 8 (see above ↑) I had to add connection variable, so even though record = new is no longer needed, we've not reduced the number of lines. But #send is gone! (see the lines 9-10 above ↑).

Step 1.4: a redundant variable

On the line 9 (see above ↑) there's a variable attrs that is used on the line 12 only. And, since we no longer have #send, i.e. the right side of the variable assignment isn't hairy, we can just do away with it. After all, what new does attrs tells us that timestamp_attributes_for_update_in_model does not?

define_method :touch_record_sql do
  connection = self.class.connection
  now = current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
  end
end

Step 1.5: another redundant variable

On the line 3 (see above ↑), there's now variable that is only used in one place, at the line 6. It could be said that there's a performance benefit to keeping current_time_from_proper_timezone call out of the #inject loop, but it also could be said that it's premature optimisation.

It looks like a stalemate, but thankfully, there's another angle we can use here - readability. From readability perspective, having now out of the loop clarifies that the now value doesn't depend on the loop.

But on the line 6 there's also some code that doesn't depend on the loop - #{connection.quote(connection.quoted_date(now))}, and it's hard to reason about two parts of the value that doesn't depend on the loop. So, I'm still going to inline now, and see how it goes:

define_method :touch_record_sql do
  connection = self.class.connection

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(current_time_from_proper_timezone))}"
  end
end

Step 1.6: expression that doesn't depend on loop

Have to say, I like less lines here, but the right value in the SQL assignment doesn't depend on loop values, so it should be moved out of the loop for clarity:

define_method :touch_record_sql do
  connection = self.class.connection
  quoted_now = connection.quote(connection.quoted_date(
    current_time_from_proper_timezone))

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{quoted_now}"
  end
end

Step 1.7: a hairy assignment

At this point, when looking at the lines 3-4 (see above ↑) I'm asking myself why not extract quoted_now into a method. I did just that:

define_method :touch_record_sql do
  connection = self.class.connection
  quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{quoted_now}"
  end
end

private

def quoted_current_time_from_proper_timezone
  self.class.connection.quote(self.class.connection.quoted_date(
    current_time_from_proper_timezone))
end

Step 1.8: an unclear moment

On the lines 3-7 (see above ↑) we can see that quoted_now is used only on the line 6, and a question arises Why not inline it and live happily ever after?. We already discussed that quoted_now must be the same for all its uses within SQL, but I've failed to encode this knowledge into words. So, I'm going to use a variable name that clearly explains that - cached_quoted_now:

define_method :touch_record_sql do
  connection = self.class.connection
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end
end

I quite like the result. Have to say, I first got the idea of using a different name for quoted_now when I imagined that the extracted method would be called #quoted_now, so I had to invent a new name for the variable as quoted_now = self.quoted_now would suck.

Step 1.9: another redundant variable

On the line 2 (see above ↑) you can see a remnant of the beginning of this refactoring, connection variable. Now that it's used only once on the line 6, it can be inlined. So, typing self.class.connection sucks, so why not have #connection method? I'd say that it goes agains the Single Responsibility Principle, but convenience trumps it in this case!

define_method :touch_record_sql do
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
    sql << ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end
end

private

delegate :connection, to: self

Not that bad.

Step 1.10: #inject isn't the best way

I got a suggestion from reader Alex Piechowski that #map and #join can be used here, instead of #inject:

define_method :touch_record_sql do
  cached_quoted_now = quoted_current_time_from_proper_timezone

  timestamp_attributes_for_update_in_model.map do |attr|
    ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
  end.join
end

Thank you, Alex!

Afterword

I planned on doing more stuff in this refactoring, but it's quite a lot as it is. And most impressive to me, it all came from refactoring a single method. Just how much can you draw from a single method? Turns out, quite a lot.

Happy hacking!

P.S. my PR was accepted by acts_as_list project!

The post acts_as_list refactoring part 3 first appeared on Ruby clarity.

]]>
https://rubyclarity.com/2017/06/acts_as_list-refactoring-part-3/feed/ 0
acts_as_list refactoring part 2 https://rubyclarity.com/2017/01/acts_as_list-refactoring-part-2/?utm_source=rss&utm_medium=rss&utm_campaign=acts_as_list-refactoring-part-2 https://rubyclarity.com/2017/01/acts_as_list-refactoring-part-2/#respond Fri, 27 Jan 2017 15:57:48 +0000 https://rubyclarity.com/?p=196 In this post I'm continuing refactoring of acts_as_list gem I started in part 1. As you might remember, I've split .acts_as_list method into several modules, each module dedicated to an option passed to the method. E.g. ColumnMethodDefiner module defines methods related to the column option (the option defines column name for storing record's list position). This post is dedicated to refactoring of the ColumnMethodDefiner module. Improving ColumnMethodDefiner module So, I've extracted code related to column option of .acts_as_list to ColumnMethodDefiner.

The post acts_as_list refactoring part 2 first appeared on Ruby clarity.

]]>
In this post I'm continuing refactoring of acts_as_list gem I started in part 1.

As you might remember, I've split .acts_as_list method into several modules, each module dedicated to an option passed to the method. E.g. ColumnMethodDefiner module defines methods related to the column option (the option defines column name for storing record's list position).

This post is dedicated to refactoring of the ColumnMethodDefiner module.

Improving ColumnMethodDefiner module

So, I've extracted code related to column option of .acts_as_list to ColumnMethodDefiner. Here's an excerpt:

Step 1: what is "column"?

Line 7 (see above ↑) references column, but what column is that? Line 6 hints that we're talking about position column, i.e. column means "name of the column that holds record's position in the list". I.e. position_column_name. Unfortunately, it's too hard to read, so I opted for position_column, which is easier to read:

I like that the method defined on the line 6 (see above ↑) has the same name as #position_column method. Earlier, we had to reason as to why column argument and #position_column method contained the same data, were named differently. But no more! One concept less!

Fed up working on bad code? Here's a way out!

For people that that want to stop suffering from bad code I’ve made a FREE course

Step 2: inconsistent module name

At this point, ColumnMethodDefiner module's mission is to define methods related to position_column, but the module is named as if it works with just Column. It is inconsistent, so I'm going to rename it to PositionColumnMethodDefiner:

On the line 4 (see above ↑), we still use column argument though, but from the module name, we can infer that we talk about position column.

I would have liked to deprecate the column argument and introduce position_column to replace it, but that would be changing functionality, and refactoring is all about restructuring code and keeping functionality intact.

Step 3: a method that's too long

PositionColumnMethodDefiner.call is 46 lines long and starts with defining some instance methods:

Since the method is too long, I'm going to extract #define_instance_methods:

Because in part 1 I've chosen to extract stuff related to position column to a separate module, I can now extract methods from .call method and not be afraid to pollute the namespace (as opposed to a single module for all .acts_as_list options).

A sidenote on what not to do

An interesting thing to note is that line 3 (see above ↑) doesn't need to be inside .class_eval block that starts on line 5. At first, I made a mistake of putting the .define_instance_methods method call inside the block, and it led to a problem. The problem was that inside .class_eval block, self points not to the PositionColumnMethodDefiner module, and I had to do a hack to call .define_instance_methods. It was ugly! Feast your eyes on this:

Yuck!

Step 3.1: extract class method definitions

Starting at the line 12 (see below ↓), there are several class methods defined via #define_singleton_method:

I'm going to extract those class method definitions into a method:

Sidenote about Object#define_singleton_method

It was my first time encountering #define_singleton_method, and the docs didn't explain it well: "Defines a singleton method in the receiver". WTF is a singleton method? I know the singleton pattern, but that doesn't make any sense here.

It turns out, a singleton method is a method defined on an object instance. A class, for example, Object class, is an instance of class Class, so a class method foo on Object (Object.foo) is a singleton method too. As well as a method defined on a string:

s = "abc"
s.define_singleton_method :foo
s.foo

So, in Ruby def self.foo method is a class method, and at the same time, a singleton method.

If you feel like diving into this a bit more, there's a great article Ways to Define Singleton Methods in Ruby.

Step 4: mass assignment protection

After I've extracted class and instance method definition we're left with adding position_column as an accessible attribute on line 10 (see below ↓). attr_accessible allows to specify a white list of model attributes that can be set via mass-assignment.

Step 4.1: redundand interpolation

At the line 10 (see above ↑), position_column is interpolated and then converted to a Symbol. We can do away with the interpolation here (see the line 10 below ↓):

Step 4.2: comments

One of the worst things you can find in code is comments, and I hate them with passion. Sometimes you can't help but have comments, sometimes it's a necessary evil, but not in this case. On the lines 7-8 (see above ↑) the comments explain that we only protect position_column from mass-assignment if the user already uses mass-assignment protection. Can we say the same thing without comments? Absolutely!

So, instead of a long conditional, we have a method call .mass_assignment_protection_was_used_by_user?, that is much easier to understand and is at the right level of abstraction.

However, lines 7-9 (see above ↑) are still at the wrong level of abstraction, so I'm going to extract them into a method:

So, I've extracted protecting position_column attribute into .protect_attributes_from_mass_assignment method (see line 7 above ↑).

I feel it reads much better without any comments now.

Step 4.3: .mass_assignment_protection_was_used_by_user?

Let's see whether the code that I've extracted can be improved:

On the line 3 (see above ↑) we check whether accessible_attributes is defined. But what is accessible_attributes? It seems that it's an undocumented Rails method.

In Rails 2.3.8 accessible_attributes used to reference attr_accessible attribute (used to store those attributes that allow mass assignment). In Rails 4, attr_accessible was removed in favour of strong parameters and thus, would no longer be defined.

This explains why accessible_attributes may not be defined, and I will not dive deeper into undocumented Rails stuff.

Step 4.3.1: gratuitous use of defined?

defined?(accessible_attributes) returns a truthful value if . accessible_attributes is defined. However, it would also return a truthful value if a variable named accessible_attributes was defined. It's not very likely that such variable would be defined, but for somebody reading it thoroughly, it makes code harder to understand. "Did the author really mean that accessible_attributes variable counts as mass protection defined?". Thus, it's better to replace defined? with #respond_to?:

In this way, it's clear that we're looking for a method .accessible_attributes, and there are no further questions.

Step 4.3.2: gratuitous negation

But we're not done with the .mass_assignment_protection_was_used_by_user? method yet. On the line 3 (see above ↑) we check whether accessible_attributes is not #blank?. It's probably always better to avoid using negation. In this case, we can use #present?:

Now I'm happy with the method.

Step 5: too much of passing caller_class around

To remind you what the state of .call method is:

We are passing caller_class to each method call here. We could define a class instance variable and reference it in class methods later:

Voila! Reads much better!

Step 6: but it's not thread safe!

But alas, using a class instance variable is not thread safe :(

I have two choices here:

  1. Use a service object.
  2. Use a thread variable.

Step 6.1: using a service object

Long story short, I've refactored to this:

And, I can't stand it. The cure is worse than the disease. In the #call method (see the lines 12-17 above ↑) I'm passing an instance variable @position_column as a method argument. It's awful, but it's that or I have to say something like position_column = @ position_column for the variable to be picked up by a #class_eval block. Neither of the options are good. So, it's a no-go.

Step 6.2: using a thread variable

So, I've refactored to use a thread variable:

Much better than service object, but the cognitive load is there. It's just far from being standard to say self.caller_class = caller_class. And thread variable instead of just another method argument? That takes much more thinking. "Why a thread variable?", "What does self.caller_class = caller_class assignment mean?". It's a no-go either.

Step 7: back to the functional solution

So, in the end I was unable to improve on this:

Can you think of a way to improve it?

What to expect from part 3?

In part 3 I'll dive into methods defined with #define_singleton_method in .define_class_methods. Some of them use class instance variables, so they may not be thread safe. I'm looking forward to finding out.

That's all for today, and, happy hacking!

P.S. my PR was accepted by acts_as_list project!

The post acts_as_list refactoring part 2 first appeared on Ruby clarity.

]]>
https://rubyclarity.com/2017/01/acts_as_list-refactoring-part-2/feed/ 0
acts_as_list refactoring part 1 https://rubyclarity.com/2016/11/acts_as_list-refactoring-part-1/?utm_source=rss&utm_medium=rss&utm_campaign=acts_as_list-refactoring-part-1 https://rubyclarity.com/2016/11/acts_as_list-refactoring-part-1/#comments Mon, 28 Nov 2016 17:20:27 +0000 https://rubyclarity.com/?p=184 Today I'm going to refactor acts_as_list Rails library. It allows to treat Rails model records as part of an ordered list and offers methods like #move_to_bottom and #move_higher. Step 1: .acts_as_list introduction .acts_as_list is available as a class method in ActiveRecord::Base when acts_as_list gem is loaded. Here's an excerpt from .acts_as_list definition: Using ClassMethods module is customary in Rails, but it's not a requirement to be familiar with it to read this article. All you need to know is that

The post acts_as_list refactoring part 1 first appeared on Ruby clarity.

]]>
Today I'm going to refactor acts_as_list Rails library. It allows to treat Rails model records as part of an ordered list and offers methods like #move_to_bottom and #move_higher.

Step 1: .acts_as_list introduction

.acts_as_list is available as a class method in ActiveRecord::Base when acts_as_list gem is loaded. Here's an excerpt from .acts_as_list definition:

Using ClassMethods module is customary in Rails, but it's not a requirement to be familiar with it to read this article. All you need to know is that .acts_as_list is a class method when used in a Rails model.

As you can see on the line 3 above ↑, there are 4 options that can be passed to .acts_as_list:

  • column: db column to store position in the list.
  • scope: restricts what is to be considered a list. For example, enabled = true SQL could be used as scope, to limit list items to those that are enabled.
  • top_of_list: a number the first element of the list will have as position.
  • add_new_at: specifies whether new items get added to the :top or :bottom of the list.

Fed up working on bad code? Here's a way out!

For people that that want to stop suffering from bad code I’ve made a FREE course

Step 2: the problem with passing options

Options are passed as options argument, and a hash is expected (see the line 2 above ↑). Then, the default configuration hash is updated with the passed options on the line 4, thus overriding defaults with the passed options.

The problem here is that the caller can make mistakes:

  • By passing not a Hash, but something else:

acts_as_list :column
acts_as_list 1

  • By passing a wrongly spelled option (:columm instead of :column):

acts_as_list columm: "order"

In both cases, .acts_as_list will fail silently, leaving the user to figure out what went wrong by themselves.

Using Ruby 2 keyword arguments solves both described problems:

Step 3: configuration variable

Using configuration variable after using keyword arguments does look confusing, and it's so much shorter without it:

I realise that it puts cognitive load on us, to figure out that scope is part of configuration, but if the method is short (and currently it's not short), it'll be ok. Meanwhile, I'll enjoy shorter names :)

Step 4: the problem with ad-hoc solution

As you can see on the line 3 above ↑, _id suffix is added to scope. The problem with this line is twofold:

  1. It tells the story of what it does, but doesn't tell why.
  2. It's an ad-hoc solution, and is harder to read than a standard solution.

I thought of extracting that into a method (thus, solving the 1st problem), but fortunately, I guessed that there must be a method out there doing that already. And indeed, there is: ActiveSupport::Inflector.foreign_key. So, I'm going to use it:

The #foreign_key method fits perfectly here, because, scope is described in the comments as Given a symbol, it'll attach _id and use that as the foreign key restriction. Not only it's a standard solution, the story it tells, fits well into what .acts_as_list does.

As you can see on the line 2 above ↑, I've chosen to include ActiveSupport::Inflector into ClassMethods, thus polluting all classes ClassMethods will be extending. But this is temporary, and I'll figure out later, how to fix that.

Step 4.1: a hairy conditional

On the lines 5-7 (see above ↑), we add _id suffix to scope if it's a Symbol and doesn't end with _id already. This code is ripe for extracting a method:

On the line 2 (see above ↑) you can see that I haven't extracted the check of whether scope is a Symbol. I believe, it would be less readable to have just scope = idify(scope) as it'd look like we add _id suffix always. But this is not the case, the suffix is added only for symbols (strings are left untouched).

However, there's one problem with this setup. Having #idify in the module ClassMethods pollutes namespace of ActiveRecord::Base.

Step 5: split .acts_as_list into smaller pieces

At this stage, .acts_as_list method is 118 lines long. Here's a short snippet:

The code in .acts_as_list defines methods and Rails callbacks, related to column, scope, top_of_list, add_new_at arguments. It seems like a good idea to group code by those arguments, putting scope-related stuff into one place and column-related, into some other place.

Step 5.1: ways to split .acts_as_list

I see 3 approaches to split .acts_as_list, and I'm going to describe them below.

Approach 1: a module with methods

To avoid polluting ClassMethods namespace, add a module AuxMethods and split .acts_as_list into multiple methods. It'd look something like this:

The problem with this approach is that methods names aren't very readable. Also, since we can't include AuxMethods to ClassMethods, we can't get rid of AuxMethods. prefix. And it doesn't read that well too.

Approach 2: a service object

A service object could look like this:

I think, it's even worse than the approach 1. It looks like the methods that are defined when #define_column_methods is called, are defined on the definer object. And, it's one line longer.

Approach 3: multiple modules

This is my favourite of the three, because:

  • The first thing you read is what argument the defined methods belong to, e.g. ColumnMethodDefiner.
  • The modules' .call methods only take the arguments the modules need (better than the approach 2).

Step 5.2: the result of splitting into multiple modules

After the module extraction (I chose the approach 3), .acts_as_list looks like this:

So, instead of 118 lines, .acts_as_list is 30 lines now, and fits into a page.

Step 5.3: redundant class_eval

Exactly because I have reduced the number of lines, I can now pay more attention to what's left. And, on the line 9 (see above ↑) there's a redundant .class_eval call. This call changes execution context from self to, well, self. That's why it's redundant. After removal, we get (see the lines 9-11 below ↓):

Step 5.4: Rails callbacks

On the lines 13-25 (see above ↑), there are lots of Rails callbacks created. I've already added ColumnMethodDefiner.call, etc, so having callback code here breaks Single Level of Abstraction. I've extracted the Rails callbacks into a separate module (see the line 13 below ↓):

Step 5.5: #acts_as_list_class method

If Rails callbacks break Single Level of Abstraction, doesn't code on the lines 9-11 (see above ↑) break it too? It does. Because it's so small, it seems that there's no harm in having it there as it is, but I don't really care to read that #acts_as_list_class is added, I'd rather read a high-leveled description of what kind of functionality it provides.

So, I've looked up the rest of the code and, #acts_as_list_class is just used internally by the gem. So, it's an auxiliary method. I've extracted it into its own module (see the line 9 below ↓):

Is this the best I can do?

I could possibly treat definers as plugins and load them with:

But I think it'd be an overkill. My main argument against that is that these modules aren't really plugins. If there was a standard way to add plugins in Ruby, that might have been plausible, but adding an ad-hoc plugin system would only make things more complicated. And instead of reading a number of .calls, reader would have to figure out the plugin system. A no-go.

So, that's the best I can do with this method (as per step 4.5).

What to expect from part 2?

In part 2 I'll reap the consequences of choosing the approach 3 to split .acts_as_list into modules, and will refactor one of those modules. I've already started on that, so I can say that it's interesting to see how the choice to use a separate module allowed to further improve the code by extracting methods. Single Responsibility Principle isn't there for nothing after all :)

If you want to know when the part 2 is out, sign up for my email list.

Happy hacking!

The post acts_as_list refactoring part 1 first appeared on Ruby clarity.

]]>
https://rubyclarity.com/2016/11/acts_as_list-refactoring-part-1/feed/ 3
CreateSend refactoring part 1 https://rubyclarity.com/2016/03/createsend-refactoring-part-1/?utm_source=rss&utm_medium=rss&utm_campaign=createsend-refactoring-part-1 https://rubyclarity.com/2016/03/createsend-refactoring-part-1/#respond Sun, 20 Mar 2016 02:01:56 +0000 https://rubyclarity.com/?p=98 Today I'm going to refactor a part of Ruby library called CreateSend. I'll work on lib/createsend/createsend.rb, which contains CreateSend::CreateSend class and some other stuff. Introduction CreateSend is a base class for accessing CampaignMonitor API. It provies .user_agent to set HTTP user agent and .exchange_token to get OAuth access token. In summary, it's a Ruby wrapper for accessing API. Classes like Campaign inherit from CreateSend to add specific methods to work with campaigns, etc. First look The first things declared are

The post CreateSend refactoring part 1 first appeared on Ruby clarity.

]]>
Today I'm going to refactor a part of Ruby library called CreateSend. I'll work on lib/createsend/createsend.rb, which contains CreateSend::CreateSend class and some other stuff.

Introduction

CreateSend is a base class for accessing CampaignMonitor API. It provies .user_agent to set HTTP user agent and .exchange_token to get OAuth access token.

In summary, it's a Ruby wrapper for accessing API. Classes like Campaign inherit from CreateSend to add specific methods to work with campaigns, etc.

First look

The first things declared are default user agent constant (line 3) and some exceptions.

What's good:

  • long string split into two (lines 12-13), instead of having overly long string.
  • thorough comments for error classes, detailing which HTTP error codes they represent (lines 17-26)

Step 1: concise public interface

When reading CreateSendError class (lines 6-15 ↑) I notice that there's too much information there that I don't really need to know (lines 10-13 ↑). I believe, formatting of error message can be moved to a private method, and people reading the code after that can just skip any private methods, because they aren't essential for understanding.

So, I extract method:

Step 2: minor fixes

The comment on line 13 (see above ↑) doesn't really add anything to understanding. It's clear that the code references ResultData, Code and Message, so the comment just repeats what code tells. It is safe to remove it.

On line 14 (see above ↑), extra gets assigned an empty string if ResultData isn't present. nil interpolated into string will yield a "" as well, so there's no need to assign it.

Here's the result:

Step 2.1

On line 2 (see above ↑), extra variable is used to store extra result data. It's kinda easy to remember what's stored in the variable, since it's used on the next line (and we could have used a as the variable name). But I want to make it not necessary to remember at all!

What we really have here is not extra, but result data, to be precise, formatted result data. So, I'm going to use formatted_result_data instead:

I realise that that formatted isn't ideal name for variable contents concantenated with description. If you have a better idea, please leave it in the comments.

Step 2.2

Taking one last look at the CreateSendError class, I realise that it's not DRY, as it's inside CreateSend module, which makes it CreateSend::CreateSendError. Thus, I'm going to make it CreateSend::Error.

Step 3: introducing CreateSend class

Take a look:

We can see that the class uses a HTTP library HTTParty, has some notion of authentication, uses a certificate and can set HTTP user agent.

First thing I don't like about it, is that it's CreateSend::CreateSend class, not DRY at all. From the class comment and from looking at other code, I know that this class is used as a base (for example, Subscriber class is a descendent of it). So, it seems it'd be better to call it Base.

Next up, on line 7 is certificate setup. Too much going on that I don't need to know. Thus, I'm going to move that code to a separate module.

Here's the result:

Step 3.1: setting user agent

Take a look:

.user_agent sets up User-Agent HTTP header. It accepts a single argument, and if it's falsey, the default user agent is used instead.
user_agent nil or user_agent false is a pretty obscure way to say "please present yourself as Createsend to HTTP servers". I think .default_user_agent expresses it better.

The result:

Step 4: class methods

There are 3 class methods left to refactor: .authorize_url, .exchange_token and .refresh_access_token. They all have to do with OAuth. Even though I've never had to deal with OAuth, I'll have no problem refactoring them.

Step 4.1: authorize_url method

authorize_url sounds like it authorizes something, but in fact, it just returns an URL constructed from its arguments. Take a look:

At the very least, the method name should be authorization_url, at most construct_authorization_url. I'll opt for construct_authorization_url as the most descriptive.

On line 2 (see above ↑), the comment repeats argument list. It is redundant information that only leads to longer reading time and doesn't add anything to understanding. After removing that, I ended up with "Construct authorization URL for your application", and that essentially repeats method name construct_authorization_url. So, I've removed the comment altogeter.

Next, there's a lot of duplication with CGI.escape and #to_s. It turns out, there's a Hash#to_query in ActiveSupport that can build a HTTP query string. So I'm going to use that.

The result:

I would like to simplify it further, by treating arguments not as named attributes, but as an array, but that'd probably be an overkill.

Step 4.2: exchange_token

exchange_token does something OAuth-related. It constructs ULR parameters, makes a call to HTTP API and returns result (raising exception on failure). Take a look:

Lines 4-8 (see above ↑) construct URL parameters, same thing as in the step 4.1, I'm going to use Hash#to_query here as well:

Step 4.2.2

Line 9 (see above ↑) is used to add options variable, used on the next line. I think it makes code harder to read. If I inline it into line 10, it won't be required to read it, which is a win:

Step 4.2.3

It's still hard to read, so I'm going to extract some methods:

.request_token constructs URL and makes a HTTP call, returning a hash. Since I've already refactored the code that comprises it, there's nothing to change there.

.fail_on_erroneous_response accepts message argument and raises an exception on erroneous response. One reason to have message argument is to know the error message inside of .exchange_token as it helps to understand what's going on.

Step 4.2.4: fail_on_erroneous_response

At this point (lines 12-18 above ↑) we have two variables (message and err) for essentially the same thing - error message. err can be changed to full_error_message:

I still find it hard to read, the code tells how it does its job, instead of expressing intent. Thus, I'll use extract method again:

Step 4.2.5

After refactoring extracted parts of .exchange_token, there's still that part where it returns the result (line 6):

I dislike to use such APIs because I have to remember what it returns before I need to use it. E.g.:

access_token, expires_in, refresh_token = Base.exchange_token(...)
...
do_stuff(access_token)

So, I replaced array with a data clump:

At this point I feel happy about .exchange_token.

Step 4.3: refresh_access_token

.refresh_access_token has the same ailments as .exchange_token:

And is easy to fix in one go:

There's a lot left to be refactored in Base class, and I will do it in part 2!

Happy hacking!

The post CreateSend refactoring part 1 first appeared on Ruby clarity.

]]>
https://rubyclarity.com/2016/03/createsend-refactoring-part-1/feed/ 0