Musings - Ruby clarity
http://rubyclarity.com/
Refactorings of Ruby/Rails projects

A real-world example of technical debt
https://rubyclarity.com/2018/05/a-real-world-example-of-technical-debt/
Thu, 17 May 2018 22:04:31 +0000

The post A real-world example of technical debt first appeared on Ruby clarity.

Creating technical debt can be as simple as writing a draft and not refactoring it. That's exactly what I've done here. Now, I want to go through my code and see if it does, in fact, have technical debt.

I wrote a script to show me a quote from a light novel I'm a fan of. The script is called as overlord_quote <search_term>, and it prints a random quote made of 6 sentences. First, it finds all sentences that match the search_term, then selects a random sentence from the matches, and then prints it along with the 5 sentences right after it. It also tells you how many quotes there are for the search_term. As simple as that.
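To make that flow concrete, here is a minimal sketch of it. Only TOTAL_SENTENCES_TO_SHOW comes from the script; the method names, the plain substring matching, and the sample sentences are assumptions made for illustration.

```ruby
# Sketch of the flow described above. TOTAL_SENTENCES_TO_SHOW appears in the
# script; the method names and sample sentences are made up for illustration.
TOTAL_SENTENCES_TO_SHOW = 6

# Indexes of every sentence that contains search_term.
def matching_sentence_indexes(sentences, search_term)
  sentences.each_index.select { |index| sentences[index].include?(search_term) }
end

# The sentence at start_index followed by the next five (fewer near the end).
def quote_starting_at(sentences, start_index)
  sentences[start_index, TOTAL_SENTENCES_TO_SHOW].join(" ")
end

sentences = [
  "The cat sleeps.", "A dog barks.", "The cat wakes.", "Rain falls.",
  "Wind blows.", "Snow melts.", "Sun rises.", "The day ends."
]

indexes = matching_sentence_indexes(sentences, "cat")
puts "#{indexes.size} quotes found"
puts quote_starting_at(sentences, indexes.sample)
```

Keeping the indexes rather than the matching sentences themselves is what makes it possible to print the sentences that follow a match.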

Here it is, if you'd like to see the whole script.

Bad code

Here's some bad code from the script:

def random_quote
  quote_index = @quotes[rand @quotes.size]
  @sentences[quote_index...quote_index + TOTAL_SENTENCES_TO_SHOW].map { |quote| prettify quote }
    .join(" ")
end

So, this part of the code selects a random sentence from the sentences matching search_term. On line 2, a random quote is selected from @quotes and it becomes a quote_index? That's hard to understand without knowing that @quotes doesn't contain the matching sentences. It actually contains indexes of all the matching sentences. I chose to call the matching sentences quotes, but it's actually a misnomer. @quotes should be called @matching_sentence_indexes or something like that.

Line 3 isn't too bad, but #prettify is a misnomer too. It actually removes all the \ns from the passed sentence, and doesn't do any formatting or coloring.

Overall, the code suffers from bad naming and it's too low-level.

Is it technical debt?

So, is it technical debt? If it is, why?

Like a financial debt, the technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice.
From TechnicalDebt by Martin Fowler.

If I need to fix a bug or add a feature in 3 months' time, it will take me longer to understand the script than if it were clean code with thoughtful naming. Thus, my quick and dirty code will have cost me extra effort to understand it. And that will happen every time I have to read it after a pause in development. If I had other people working on it, they'd have to put in extra effort too. So, I can conclude that it is indeed technical debt.

Also, my code doesn't have any tests, so for every change I would spend extra effort testing it manually. You might consider this the interest payment on my technical debt.
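As a rough idea of what the missing tests could look like, here is a small Minitest sketch. The remove_newlines helper is a hypothetical stand-in for the script's #prettify, based only on the description that it strips newlines; it is not the script's actual code.

```ruby
require "minitest/autorun"

# Hypothetical helper standing in for the script's #prettify, which per the
# description above only strips newlines.
def remove_newlines(sentence)
  sentence.delete("\n")
end

class RemoveNewlinesTest < Minitest::Test
  def test_strips_every_newline
    assert_equal "one two", remove_newlines("one\n two\n")
  end

  def test_leaves_clean_sentences_alone
    assert_equal "already clean", remove_newlines("already clean")
  end
end
```

With even a couple of tests like these in place, a change no longer has to be checked by hand.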

How would this code look without the debt?

To give you an example of paying off some of the technical debt, I've refactored #random_quote by extracting some methods:

def random_quote
  format_quote \
    fetch_quote(starting_sentence_index: random_matching_sentence_index)
end

Here it is in full. Please note that I have paid off just a part of the debt, and it's not the best possible code, but it's better.
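For readers without the full script at hand, the extracted methods might look something like the sketch below. The class name QuoteBook, its constructor, and the bodies of the helpers are assumptions for illustration, not the actual refactoring; only #random_quote is copied from above. The sketch also assumes that at least one sentence matches.

```ruby
# Hypothetical sketch: the class name, constructor and helper bodies are
# assumptions for illustration; only #random_quote comes from the post.
class QuoteBook
  TOTAL_SENTENCES_TO_SHOW = 6

  def initialize(sentences, search_term)
    @sentences = sentences
    @matching_sentence_indexes =
      sentences.each_index.select { |index| sentences[index].include?(search_term) }
  end

  def random_quote
    format_quote \
      fetch_quote(starting_sentence_index: random_matching_sentence_index)
  end

  private

  # Assumes at least one sentence matched the search term.
  def random_matching_sentence_index
    @matching_sentence_indexes.sample
  end

  # The starting sentence plus the ones right after it.
  def fetch_quote(starting_sentence_index:)
    @sentences[starting_sentence_index, TOTAL_SENTENCES_TO_SHOW]
  end

  def format_quote(sentences)
    sentences.map { |sentence| remove_newlines(sentence) }.join(" ")
  end

  def remove_newlines(sentence)
    sentence.delete("\n")
  end
end

book = QuoteBook.new(["One\n two.", "Three.", "Four."], "One")
puts book.random_quote
```

Each helper name now says what the old one-liner left the reader to work out: which index is picked, which sentences are fetched, and what "formatting" really does.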

More examples of technical debt

You can find a list of code smells (they are technical debt too) in Martin Fowler's book Refactoring: Ruby Edition. The smells are explained in plain English, without code, and are easy to understand.

How to pay off technical debt

If you're fed up paying interest on your technical debt, would you like to learn how to pay off the principal? My FREE course can help you start. You will learn how to refactor technical debt into clean code and how to keep its amount low in your code.


If you need help paying off technical debt to go faster, I can help.

Is it always a good idea to split long methods into smaller ones? An experiment.
https://rubyclarity.com/2017/07/is-it-always-a-good-idea-to-split-long-methods-into-smaller-ones-an-experiment/
Fri, 07 Jul 2017 20:40:33 +0000

The post Is it always a good idea to split long methods into smaller ones? An experiment. first appeared on Ruby clarity.

There's a long discussion on Reddit titled "Is it OK to split long functions and methods into smaller ones even though they won't be called by anything else?" Some people in that discussion hold the opinion that it's not always the best idea to split a long method into smaller methods. Well, of course, you can refactor code in many different ways, including ways that read worse than the original long method. But I think that in most cases splitting long methods into smaller ones makes for better readability.

So, I want to do an experiment and try to split a long method that's difficult to split into smaller methods. I know that it's easy to extract methods when the code uses several levels of abstraction or a low level of abstraction. In such a case it's easy to achieve better readability by extracting some methods. So, the hardest long method I can think of would use a high level of abstraction, from which it wouldn't be easy to go to a level higher.

Such a method can be found in Rails' ActiveRecord::Persistence module. It's the #touch method, and it deals with timestamp attributes that need to be updated with the current time, scopes, primary keys, locks and SQL UPDATE. In other words, the level of abstraction is already high: it's the level that deals with database-level concepts. I expect it won't be that easy to go to a higher level of abstraction here.

Introducing the #touch method

def touch(*names, time: nil)
  unless persisted?
    raise ActiveRecordError, <<-MSG.squish
      cannot touch on a new or destroyed record object. Consider using
      persisted?, new_record?, or destroyed? before touching
    MSG
  end

  time ||= current_time_from_proper_timezone
  attributes = timestamp_attributes_for_update_in_model
  attributes.concat(names)

  unless attributes.empty?
    changes = {}

    attributes.each do |column|
      column = column.to_s
      changes[column] = write_attribute(column, time)
    end

    primary_key = self.class.primary_key
    scope = self.class.unscoped.where(primary_key => _read_attribute(primary_key))

    if locking_enabled?
      locking_column = self.class.locking_column
      scope = scope.where(locking_column => _read_attribute(locking_column))
      changes[locking_column] = increment_lock
    end

    clear_attribute_changes(changes.keys)
    result = scope.update_all(changes) == 1

    if !result && locking_enabled?
      raise ActiveRecord::StaleObjectError.new(self, "touch")
    end

    @_trigger_update_callback = result
    result
  else
    true
  end
end

At 42 lines, #touch is quite long and requires some explaining. The arguments #touch accepts are names (the timestamp attributes to update) and time (the time to set the timestamps to). First, we decline to work on non-persisted records (lines 2-7). Then we default time to the current time and merge the standard timestamp attributes and the passed timestamp attributes (names) into the attributes array (lines 9-11).

Then comes the most interesting part, when there are some attributes to update (i.e. attributes isn't empty). We instantiate the changes hash to pass to #update_all later, fill it up with attribute keys and time values, and set the timestamp attributes on the record to time (lines 14-19). Then we set up a scope that we'll use to match the very record we #touch, using its primary key (lines 21-22). Then we deal with the case when locking is on, updating the record, the changes hash and the scope (lines 24-28).

And then we clear #changed, so that after we've updated/touched the record, the stuff we've updated in the db is no longer marked as changed (line 30). Then we run #update_all (line 31). If it didn't update exactly one row and locking is enabled, the record has gone stale, and we raise a StaleObjectError (lines 33-35). And at last, we set up a special flag telling Rails whether we've actually changed data in the db (line 37).
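The locking part is easier to follow with a toy model. The sketch below is not ActiveRecord: ToyTable, StaleRow and touch_row are hypothetical names, and the in-memory array only imitates the shape of an UPDATE with a WHERE clause. It shows why comparing the affected row count to 1 is enough to detect a stale record.

```ruby
# Toy stand-in for a database table; this is not ActiveRecord. ToyTable,
# StaleRow and touch_row are hypothetical names.
class StaleRow < StandardError; end

class ToyTable
  def initialize(rows)
    @rows = rows
  end

  # Like UPDATE ... WHERE: changes every row matching all conditions and
  # returns the number of rows affected.
  def update_all(conditions, changes)
    matching = @rows.select do |row|
      conditions.all? { |column, value| row[column] == value }
    end
    matching.each { |row| row.merge!(changes) }
    matching.size
  end
end

# Touches the row only if it still has the lock_version the caller read.
def touch_row(table, id:, lock_version:, time:)
  conditions = { id: id, lock_version: lock_version }
  changes = { updated_at: time, lock_version: lock_version + 1 }
  updated = table.update_all(conditions, changes) == 1
  raise StaleRow, "row #{id} changed since it was read" unless updated
  updated
end

table = ToyTable.new([{ id: 1, lock_version: 0, updated_at: nil }])
touch_row(table, id: 1, lock_version: 0, time: "12:00") # succeeds

begin
  # A second writer that also read lock_version 0 now matches no rows.
  touch_row(table, id: 1, lock_version: 0, time: "12:01")
rescue StaleRow => error
  puts error.message
end
```

The second call fails because the first one already bumped lock_version, so its conditions no longer match any row.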

Take 1: an easy way to split

Now, an easy way to split a method is to see how it's structured, and split at the seams. I'd expect a junior developer to do just that. And that's what I did:

unless attributes.empty?
  changes = {}

  update_record_and_changes_with_time(attributes, time, changes)
  scope = scope_by_primary_key

  if locking_enabled?
    scope = extend_scope_to_match_locking_column_value(scope)
    update_record_and_changes_with_lock_increment(changes)
  end

  clear_attribute_changes(changes.keys)
  result = scope.update_all(changes) == 1

  if !result && locking_enabled?
    raise ActiveRecord::StaleObjectError.new(self, "touch")
  end

  @_trigger_update_callback = result
  result
else
  true
end

As you can see, I just replaced the more verbose code with explanations. The same level of abstraction is used. The main question here is whether it's still clear what's going on. I've shown this code to a person and he says it's still clear.

However, this is really my 3rd take on refactoring #touch. The first two refactorings were pretty bad. It took me a while to understand all the intricacies of this method.

Fed up working on bad code? Here's a way out!

For people who want to stop suffering from bad code, I've made a FREE course.

Hidden concepts

There are hidden concepts in the original code that the reader has to figure out by themselves. I describe them beneath the code (no need to read the code below without explanations).

unless attributes.empty?
  changes = {}

  attributes.each do |column|
    column = column.to_s
    changes[column] = write_attribute(column, time)
  end

  primary_key = self.class.primary_key
  scope = self.class.unscoped.where(primary_key => _read_attribute(primary_key))

  if locking_enabled?
    locking_column = self.class.locking_column
    scope = scope.where(locking_column => _read_attribute(locking_column))
    changes[locking_column] = increment_lock
  end

  clear_attribute_changes(changes.keys)
  result = scope.update_all(changes) == 1

  if !result && locking_enabled?
    raise ActiveRecord::StaleObjectError.new(self, "touch")
  end

  @_trigger_update_callback = result
  result
else
  true
end

So, the hidden concepts are:

  1. changes and record attributes are updated with the same information at the same time (lines 4-7 and 15). It follows that the record attributes and changes hold the same information afterwards.
  2. scope is used to match the record #touch is called on (lines 9-10 and 14).
  3. Scope extension to include locking information has to happen before #increment_lock is called.

There's also the matter of understanding why #clear_attribute_changes (on line 18) has to come after we're done adding locking information to changes. In fact, that statement used to be placed before the code that deals with locking, and was fixed later. The bug was that the locking column was updated, but not cleared from #changed. Of course, the tests now guard against regressions, but it'd be great to be able to understand all the intricacies easily.

I believe that if we could put all the code that deals with updating record attributes into one place, and place #clear_attribute_changes after it, the code would be a bit clearer. This brings us to another refactoring take:
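The ordering bug can be reproduced with a toy dirty tracker. The sketch below is not ActiveRecord: DirtyToyRecord and both touch_clearing_* methods are hypothetical, and the tracking is reduced to an array of changed attribute names.

```ruby
# Toy dirty tracking, not ActiveRecord; DirtyToyRecord and both
# touch_clearing_* methods are hypothetical.
class DirtyToyRecord
  attr_reader :changed

  def initialize
    @attributes = { "updated_at" => nil, "lock_version" => 0 }
    @changed = []
  end

  # Marks the attribute as changed and returns the written value.
  def write_attribute(name, value)
    @changed |= [name]
    @attributes[name] = value
  end

  def increment_lock
    write_attribute("lock_version", @attributes["lock_version"] + 1)
  end

  def clear_attribute_changes(names)
    @changed -= names
  end
end

# Old order: changes are cleared before the lock column is added to them.
def touch_clearing_early(record, time)
  changes = { "updated_at" => record.write_attribute("updated_at", time) }
  record.clear_attribute_changes(changes.keys)
  changes["lock_version"] = record.increment_lock
  record.changed
end

# Fixed order: the lock column is in changes by the time they are cleared.
def touch_clearing_late(record, time)
  changes = { "updated_at" => record.write_attribute("updated_at", time) }
  changes["lock_version"] = record.increment_lock
  record.clear_attribute_changes(changes.keys)
  record.changed
end

p touch_clearing_early(DirtyToyRecord.new, "12:00") # ["lock_version"]
p touch_clearing_late(DirtyToyRecord.new, "12:00")  # []
```

With the early clear, the lock column is still reported as changed even though it has just been written to the db.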

Take 2: split by activity

In this take I've clearly separated scope building and updating of record attributes and changes hash:

unless attributes.empty?
  scope = prepare_scope_to_match_this_record

  changes = {}
  update_record_and_changes_with_same_data(attributes, time, changes)

  result = scope.update_all(changes) == 1

  if !result && locking_enabled?
    raise ActiveRecord::StaleObjectError.new(self, "touch")
  end

  @_trigger_update_callback = result
  result
else
  true
end

private

def update_record_and_changes_with_same_data(attributes, time, changes)
  attributes.each do |column|
    column = column.to_s
    changes[column] = write_attribute(column, time)
  end

  if locking_enabled?
    changes[self.class.locking_column] = increment_lock
  end

  clear_attribute_changes(changes.keys)
end

I think line 2 is as good as it gets, but the method on line 5 could be named better. The level of abstraction used is essentially the same as in the original code, dealing with records and scopes.

As for revealing the hidden concepts, we have:

  1. Clear indication that the record attributes and changes are updated with the same data (see line 5).
  2. It's clear that scope is meant to match the record #touch is called on.
  3. It's not clear why scope should be prepared before changing the record, at least not in a prominent way. I considered naming the scope-preparing method #prepare_scope_to_match_this_unchanged_record, but _unchanged_ would just add cognitive overhead. It's not every day that you're moving things around. And "match this record" hints that the values used for matching should be such that the record can be found, i.e. unchanged.
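The reason for the ordering in the last point can be shown with plain hashes. This is a toy illustration, not ActiveRecord, and all the names in it are hypothetical: the scope is just a hash of conditions, and the db row is a hash that the in-memory record was copied from.

```ruby
# Toy illustration, not ActiveRecord; all names are hypothetical.

# True when the stored row satisfies every condition in the scope.
def row_matches?(db_row, scope)
  scope.all? { |column, value| db_row[column] == value }
end

# Conditions built from the record's current in-memory values.
def scope_for(record)
  { "id" => record["id"], "lock_version" => record["lock_version"] }
end

def scope_built_before_increment(db_row)
  record = db_row.dup
  scope = scope_for(record)     # captures the lock_version still in the db
  record["lock_version"] += 1
  row_matches?(db_row, scope)
end

def scope_built_after_increment(db_row)
  record = db_row.dup
  record["lock_version"] += 1
  scope = scope_for(record)     # captures a lock_version the db has not seen
  row_matches?(db_row, scope)
end

db_row = { "id" => 1, "lock_version" => 3 }
p scope_built_before_increment(db_row) # true
p scope_built_after_increment(db_row)  # false
```

A scope built from the already incremented record looks for a lock version that isn't in the db yet, so the update would find nothing.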

Take 3: split by the domain concepts

The first two takes haven't really deviated from the existing code structure, the first take especially. But there's another way to look at it, from the domain perspective. You could say it's thinking outside the box.

What does the code tell us about the domain? Do we see domain concepts manifesting in the code?

If I look at #touch with that in mind, I can't help noticing that the domain logic for #touch should be:

  1. Updating the record attributes with the new time. Standard timestamp attributes and passed timestamp attributes are updated.
  2. Saving those attributes in db.

That's the level of abstraction that should be used, when looking from the domain context perspective:

unless attributes.empty?
  attributes.each do |column|
    column = column.to_s
    write_attribute(column, time)
  end

  touch_columns(*attributes)
else
  true
end

So, here we update the standard and passed timestamp attributes on the record and tell #touch_columns to save them in db.

I like this take the most. There's no word about locking as it doesn't really belong in the domain logic, and we don't know how the attributes are going to get saved, all we care about is that they get saved.

The name #touch_columns is not very good. In Rails, there is #update_attributes that #saves the record, and there is #update_columns that just executes SQL and doesn't call any callbacks. #touch_columns is in between, skipping validations, but calling callbacks. I just don't know Rails well enough to come up with a better name. But otherwise, it's a good take.
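To show where the rest of the work would go, here is a toy sketch of the split. It is not ActiveRecord: TouchableToyRecord, its in-memory store and the body of #touch_columns are assumptions made for illustration, and the sketch deliberately leaves out scopes, locking and callbacks.

```ruby
# Toy sketch, not ActiveRecord. TouchableToyRecord, its in-memory store and
# the body of touch_columns are assumptions made for illustration.
class TouchableToyRecord
  attr_reader :attributes

  def initialize(store, attributes)
    @store = store            # Hash of id => row, standing in for the db
    @attributes = attributes
  end

  # Domain level: set the timestamps, then ask for them to be saved.
  def touch(*names, time:)
    columns = ["updated_at", *names.map(&:to_s)]
    columns.each { |column| @attributes[column] = time }
    touch_columns(*columns)
  end

  private

  # Persistence level: the real method would also handle scopes, locking
  # and callbacks, which this sketch leaves out.
  def touch_columns(*columns)
    row = @store[@attributes["id"]]
    return false unless row

    columns.each { |column| row[column] = @attributes[column] }
    true
  end
end

store = { 1 => { "id" => 1, "updated_at" => nil, "seen_at" => nil } }
record = TouchableToyRecord.new(store, store[1].dup)
record.touch(:seen_at, time: "12:00")
p store[1]
```

The point of the split is that #touch reads as domain logic, while everything about how the columns reach the db stays behind one method name.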

Conclusion

I've selected code that I considered to be hard to split into smaller methods. It had a high level of abstraction, which increased the difficulty level.

So, is it always a good idea to split a long method into smaller ones? The conclusion I've reached is that it depends on your refactoring skills and tenacity. I could have stopped after my first two attempts (take 1 is actually my 3rd attempt) and declared it impossible to split #touch, because the first two attempts were bad.

All three takes described in this post are fit to replace the original code. Take 1 is better than the original code because it lets you understand what's going on faster, and provides the same level of understanding of the implementation. Take 2 is better than the original code because it highlights hidden concepts that would otherwise take the reader more time to work out. And finally, take 3 is better than the original code because it uses domain-level concepts, and that makes #touch much easier to reason about.

I would also like to note that limiting yourself to extracting methods, as the original question seems to imply, is very restrictive. Refactoring is much more than just extracting methods.

Hire me to help your team ship features faster.
