The post Refactoring a wall of JavaScript from dev.to project first appeared on Ruby clarity.
Also, we're experimenting with the naming process, inspired by Naming as a process.
This video was originally streamed on my Twitch channel. If you'd like to be notified when I stream, you can subscribe on Twitch or on Twitter.
Happy hacking!
The post A real-world example of technical debt first appeared on Ruby clarity.
I wrote a script to show me a quote from a light novel I'm a fan of. The script is called as overlord_quote <search_term>, and it prints a random quote made of 6 sentences. First, it matches all sentences with the search_term, then selects a random sentence from the matches, and then prints it and the 5 sentences right after it. It also tells you how many quotes there are for the search_term. As simple as that.
Here it is, if you'd like to see the whole script.
Here's some bad code from the script:
    def random_quote
      quote_index = @quotes[rand @quotes.size]
      @sentences[quote_index...quote_index + TOTAL_SENTENCES_TO_SHOW].map { |quote| prettify quote }
        .join(" ")
    end

So, this part of the code selects a random sentence from the sentences matching search_term. On line 2, a random quote is selected from @quotes and it becomes a quote_index? That's hard to understand without knowing that @quotes doesn't contain the matching sentences. It actually contains the indexes of all the matching sentences. I chose to call the matching sentences quotes, but that's a misnomer. @quotes should be called @matching_sentence_indexes or something like that.
Line 3 isn't too bad, but #prettify is a misnomer too. It actually removes all the \n characters from the passed sentence, and doesn't do any formatting or coloring.
Overall, the code suffers from bad naming and it's too low-level.
So, is it technical debt? If it is, why?
Like a financial debt, the technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice.
From TechnicalDebt by Martin Fowler.
If I need to fix a bug or add a feature in 3 months' time, it will take more time to understand the script than if it were clean code with thoughtful naming. Thus, my quick and dirty code will have incurred extra effort to understand it. And that will happen every time I have to read it after a pause in development. If I had other people working on it, they'd have to put in extra effort too. So, I can conclude that it is indeed technical debt.
Also, my code doesn't have any tests, so for every change, I would spend extra effort testing it manually. You might consider this the interest payment on my technical debt.
To give you an example of paying off some of the technical debt, I've refactored #random_quote by extracting some methods:

    def random_quote
      format_quote \
        fetch_quote(starting_sentence_index: random_matching_sentence_index)
    end
Here it is in full. Please note that I have paid off just a part of the debt, and it's not the best possible code, but it's better.
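To make the shape of the refactored version concrete, here is a minimal, self-contained sketch. It is not the author's script: the QuoteFinder class, the helper bodies and the sample sentences are assumptions made up for illustration. Only random_quote, format_quote, fetch_quote, random_matching_sentence_index and TOTAL_SENTENCES_TO_SHOW come from the post.

```ruby
# Hypothetical stand-in for the script. Helper bodies are assumptions.
class QuoteFinder
  TOTAL_SENTENCES_TO_SHOW = 6

  def initialize(sentences, search_term)
    @sentences = sentences
    # Indexes (not the sentences themselves) of every matching sentence.
    @matching_sentence_indexes =
      sentences.each_index.select { |i| sentences[i].include?(search_term) }
  end

  def random_quote
    format_quote \
      fetch_quote(starting_sentence_index: random_matching_sentence_index)
  end

  private

  def random_matching_sentence_index
    @matching_sentence_indexes.sample
  end

  # The matching sentence plus the sentences right after it.
  def fetch_quote(starting_sentence_index:)
    @sentences[starting_sentence_index, TOTAL_SENTENCES_TO_SHOW]
  end

  # Strips newlines and joins the sentences into one line.
  def format_quote(sentences)
    sentences.map { |sentence| sentence.delete("\n") }.join(" ")
  end
end

# Made-up sample data with a single match, so the result is deterministic.
sentences = ["The gate opened.\n", "Nobody moved.", "The guard bowed.", "Then he left."]
puts QuoteFinder.new(sentences, "guard").random_quote
```

With names like these, random_quote reads as a sentence, and the index-versus-sentence confusion is confined to one helper. The sketch does not handle a search term with no matches.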
You can find a list of bad code smells (they are technical debt too) explained in Martin Fowler's book Refactoring: Ruby Edition. The smells are explained in plain English, without code, and are easy to understand.
If you're fed up paying interest on your technical debt, would you like to learn how to pay off the principal? My FREE course can help you start. You will learn how to refactor technical debt into clean code and how to keep its amount low in your code.
If you need help paying off technical debt to go faster, I can help.
The post Is it always a good idea to split long methods into smaller ones? An experiment. first appeared on Ruby clarity.
So, I want to do an experiment and try to split a long method that's difficult to split into smaller methods. I know that it's easy to extract methods when the code uses several levels of abstraction or a low level of abstraction. In such a case it's easy to achieve better readability by extracting some methods. So, the hardest long method I can think of would use a high level of abstraction, from which it wouldn't be easy to go a level higher.
Such a method can be found in Rails' ActiveRecord::Persistence module. It's the #touch method, and it deals with timestamp attributes that need to be updated with the current time, scopes, primary keys, locks and SQL UPDATE. In other words, the level of abstraction is high enough and it's the level that deals with database-level concepts. I expect it won't be that easy to go to a higher level of abstraction here.
    def touch(*names, time: nil)
      unless persisted?
        raise ActiveRecordError, <<-MSG.squish
          cannot touch on a new or destroyed record object. Consider using
          persisted?, new_record?, or destroyed? before touching
        MSG
      end

      time ||= current_time_from_proper_timezone
      attributes = timestamp_attributes_for_update_in_model
      attributes.concat(names)

      unless attributes.empty?
        changes = {}

        attributes.each do |column|
          column = column.to_s
          changes[column] = write_attribute(column, time)
        end

        primary_key = self.class.primary_key
        scope = self.class.unscoped.where(primary_key => _read_attribute(primary_key))

        if locking_enabled?
          locking_column = self.class.locking_column
          scope = scope.where(locking_column => _read_attribute(locking_column))
          changes[locking_column] = increment_lock
        end

        clear_attribute_changes(changes.keys)
        result = scope.update_all(changes) == 1

        if !result && locking_enabled?
          raise ActiveRecord::StaleObjectError.new(self, "touch")
        end

        @_trigger_update_callback = result
        result
      else
        true
      end
    end
At 42 lines, #touch is quite long and requires some explaining. The arguments #touch accepts are names (the timestamp attributes to update) and time (the time to set the timestamps to). Then we decline to work on non-persisted records (lines 2-7). Then we default time to the current time and merge the standard timestamp attributes and the passed timestamp attributes (names) into the attributes array (lines 9-11).
Then comes the most interesting part, which runs when there are some attributes to update (i.e. attributes isn't empty). We instantiate the changes hash to pass to #update_all later, fill it with attribute keys and time values, and set the timestamp attributes on the record to time (lines 14-19). Then we set up a scope we'll use to match the very record that we #touch, using the primary key (lines 21-22). Then we deal with the case when locking is on, updating the record, the changes hash and the scope (lines 24-28).
And then we clear #changed, so that after we've updated/touched the record, the stuff we've updated in the db is no longer marked as changed (line 30). Then we run #update_all. If we've failed to find the record because of locking issues, we raise a StaleObjectError (lines 33-35). And at last, we set a special flag telling Rails whether we've actually changed data in the db.
Now, an easy way to split a method is to see how it's structured, and split at the seams. I'd expect a junior developer to do just that. And that's what I did:
    unless attributes.empty?
      changes = {}

      update_record_and_changes_with_time(attributes, time, changes)
      scope = scope_by_primary_key

      if locking_enabled?
        scope = extend_scope_to_match_locking_column_value(scope)
        update_record_and_changes_with_lock_increment(changes)
      end

      clear_attribute_changes(changes.keys)
      result = scope.update_all(changes) == 1

      if !result && locking_enabled?
        raise ActiveRecord::StaleObjectError.new(self, "touch")
      end

      @_trigger_update_callback = result
      result
    else
      true
    end

As you can see, I just replaced more verbose code with explanations. The same level of abstraction is used. The main question here is whether it's still clear what's going on. I've shown this code to a person and he says it's still clear.
However, this is really my 3rd take on refactoring #touch. The first two refactorings were pretty bad. It took me a while to understand all the intricacies of this method.
There are hidden concepts in the original code that the reader has to figure out by themselves. I describe them beneath the code (no need to read the code below without explanations).
    unless attributes.empty?
      changes = {}

      attributes.each do |column|
        column = column.to_s
        changes[column] = write_attribute(column, time)
      end

      primary_key = self.class.primary_key
      scope = self.class.unscoped.where(primary_key => _read_attribute(primary_key))

      if locking_enabled?
        locking_column = self.class.locking_column
        scope = scope.where(locking_column => _read_attribute(locking_column))
        changes[locking_column] = increment_lock
      end

      clear_attribute_changes(changes.keys)
      result = scope.update_all(changes) == 1

      if !result && locking_enabled?
        raise ActiveRecord::StaleObjectError.new(self, "touch")
      end

      @_trigger_update_callback = result
      result
    else
      true
    end

So, the hidden concepts are:
- changes and record attributes are updated with the same information at the same time (lines 4-7 and 15). It follows that the record attributes and changes hold the same information afterwards.
- scope is used to match the record #touch is called on (lines 9-10 and 14).
- #increment_lock is called.

Also, there is the question of why we need #clear_attribute_changes (on line 18) after we're done adding locking information to changes. In fact, that statement used to be placed before the code that deals with locking, and was fixed later. The bug was that the locking column was updated, but not cleared from #changed. Of course, the tests now prevent regressions, but it'd be great to understand all the intricacies easily.
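The ordering issue described above can be reproduced with a toy model of dirty tracking. This is only a sketch: ToyRecord and its internals are invented for illustration and are not how ActiveRecord implements #changed. Only the method names write_attribute, increment_lock and clear_attribute_changes are borrowed from the code above.

```ruby
# Toy dirty-tracking model. NOT ActiveRecord; all internals are assumptions.
class ToyRecord
  def initialize
    @attributes = { "updated_at" => nil, "lock_version" => 0 }
    @changed = []
  end

  def changed
    @changed.dup
  end

  # Marks the attribute as changed and returns the written value.
  def write_attribute(name, value)
    @changed |= [name]
    @attributes[name] = value
  end

  def increment_lock
    write_attribute("lock_version", @attributes["lock_version"] + 1)
  end

  def clear_attribute_changes(names)
    @changed -= names
  end
end

# Buggy order: clear the changes first, then touch the locking column.
buggy = ToyRecord.new
changes = { "updated_at" => buggy.write_attribute("updated_at", Time.now) }
buggy.clear_attribute_changes(changes.keys)
changes["lock_version"] = buggy.increment_lock
p buggy.changed # => ["lock_version"]

# Fixed order: finish every write, then clear.
fixed = ToyRecord.new
changes = { "updated_at" => fixed.write_attribute("updated_at", Time.now) }
changes["lock_version"] = fixed.increment_lock
fixed.clear_attribute_changes(changes.keys)
p fixed.changed # => []
```

In the buggy order the locking column stays marked as changed even though it is about to be saved, which matches the bug described above.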
I believe that if we could put all the code that deals with updating record attributes into one place, and place #clear_attribute_changes afterwards, it'll be a bit clearer. This brings us to another refactoring take:

In this take I've clearly separated scope building from the updating of record attributes and the changes hash:

    unless attributes.empty?
      scope = prepare_scope_to_match_this_record

      changes = {}
      update_record_and_changes_with_same_data(attributes, time, changes)

      result = scope.update_all(changes) == 1

      if !result && locking_enabled?
        raise ActiveRecord::StaleObjectError.new(self, "touch")
      end

      @_trigger_update_callback = result
      result
    else
      true
    end

    private

    def update_record_and_changes_with_same_data(attributes, time, changes)
      attributes.each do |column|
        column = column.to_s
        changes[column] = write_attribute(column, time)
      end

      if locking_enabled?
        changes[self.class.locking_column] = increment_lock
      end

      clear_attribute_changes(changes.keys)
    end

I think line 2 is as good as it gets, but the method on line 5 could be named better. The level of abstraction used is essentially the same as in the original code, dealing with records and scopes.
As for revealing the hidden concepts, we have:
- changes are updated with the same data (see line 5).
- scope is meant to match the record #touch is called on.
- scope should be prepared before changing record, at least not in a prominent way. I considered naming the scope-preparing method #prepare_scope_to_match_this_unchanged_record, but _unchanged_ would just add cognitive overhead. It's not every day that you're moving things around. And, match this record hints that the values used for that should be such that the record can be found, i.e. unchanged.

The first two takes haven't really deviated from the existing code structure, the first take especially. But there's another way to look at it, from the domain perspective. You could say it's thinking outside the box.
What does the code tell us about the domain? Do we see domain concepts manifesting in the code?
If I look at #touch with that in mind, I can't help noticing that the domain logic for #touch should be:
- time. Standard timestamp attributes and passed timestamp attributes are updated.

That's the level of abstraction that should be used when looking from the domain perspective:
    unless attributes.empty?
      attributes.each do |column|
        column = column.to_s
        write_attribute(column, time)
      end

      touch_columns(*attributes)
    else
      true
    end

So, here we update the standard and passed timestamp attributes on the record and tell #touch_columns to save them in the db.
I like this take the most. There's no word about locking, as it doesn't really belong in the domain logic, and we don't know how the attributes are going to get saved; all we care about is that they get saved.
The name #touch_columns is not very good. In Rails, there is #update_attributes, which #saves the record, and there is #update_columns, which just executes SQL and doesn't call any callbacks. #touch_columns is in between, skipping validations but calling callbacks. I just don't know Rails well enough to come up with a better name. But otherwise, it's a good take.
I've selected code that I considered to be hard to split into smaller methods. It had a high level of abstraction, which increased the difficulty.
So, is it always a good idea to split a long method into smaller ones? The conclusion I've come to is that it depends on your refactoring skills and tenacity. I could have stopped after my first two attempts (take 1 is actually my 3rd attempt) and declared it impossible to split #touch, because the first two attempts were bad.
All three takes described in this post are fit to replace the original code. Take 1 is better than the original code because it lets you understand what's going on faster, and provides the same level of understanding of the implementation. Take 2 is better than the original code because it highlights hidden concepts that the reader would otherwise need more time to work out. And finally, take 3 is better than the original code because it uses domain-level concepts, and that makes #touch much easier to reason about.
I would also like to note that only extracting methods, as the original question seems to imply, is very limiting. Refactoring is much more than just extracting methods.
The post acts_as_list refactoring part 3 first appeared on Ruby clarity.
You don't need to read part 2 or part 1 to understand this article.
acts_as_list is a Rails gem. It allows you to treat Rails model records as part of an ordered list and offers methods like #move_to_bottom and #move_higher.
The .update_all_with_touch method caught my attention as it's a somewhat long (11 lines) and hairy method. This method executes the passed SQL, as Rails' #update_all does, but also updates standard timestamps like updated_at.
    define_singleton_method :update_all_with_touch do |updates|
      record = new
      attrs = record.send(:timestamp_attributes_for_update_in_model)
      now = record.send(:current_time_from_proper_timezone)

      attrs.each do |attr|
        updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
      end

      update_all(updates)
    end
Let's have a look. On line 2 (see above ↑) it creates a new model instance (acts_as_list is supposed to extend ActiveRecord models, so naturally new would create one). Then it sends two messages to the created model instance record via #send. It uses #send because both of those methods are private, so it can't just say record.timestamp_attributes_for_update_in_model. Now, this is some cognitive load, because every time I read these lines I can't help wondering why #send has to be used. But I'll get to it later; let's look at the rest of the method now.
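For readers who haven't run into this, here is a small plain-Ruby sketch of why #send is needed. The Model class and its current_time method are hypothetical stand-ins, not the Rails methods named above.

```ruby
class Model
  private

  # Hypothetical stand-in for a private helper such as
  # current_time_from_proper_timezone.
  def current_time
    Time.at(0).utc
  end
end

record = Model.new

begin
  record.current_time # private methods reject an explicit receiver
rescue NoMethodError => e
  puts "direct call failed: #{e.class}"
end

# #send bypasses the visibility check.
puts record.send(:current_time).year # prints 1970
```

The workaround does its job, but the reader has to stop and work out why it is there, which is the cognitive load mentioned above.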
On lines 6-10 (see above ↑), SQL is built and appended to the updates argument, modifying it. Each of the timestamp_attributes_for_update_in_model is set to the current time. And after the SQL is built, it's executed with Rails' standard #update_all.
So, this method does two things - build SQL and execute it. And most of the method is taken up by building SQL.
Is anything wrong with this method? For my taste, it's too hairy, and it goes into too much detail about building SQL. So, the first thing I want to do is to go to a higher level of abstraction on building SQL:
    define_singleton_method :update_all_with_touch do |updates|
      update_all(updates << touch_record_sql)
    end

    private

    define_singleton_method :touch_record_sql do
      record = new
      attrs = record.send(:timestamp_attributes_for_update_in_model)
      now = record.send(:current_time_from_proper_timezone)

      updates = ""
      attrs.each do |attr|
        updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
      end

      updates
    end

The result on line 2 (see above ↑) allows us to grasp what's going on much faster. The SQL from touch_record_sql is appended to updates. It took me more than two pomodoros to figure out a decent name for the method. I've tried many, including update_standard_timestamps_to_current_time_sql (if only it wasn't that long). I prefer touch_record_sql because it uses the well-known term touch, which is inherited from the Unix touch(1) command and Rails' #touch. Touch means updating the appropriate timestamps.
I've copied the code from above for easier reference:
    define_singleton_method :update_all_with_touch do |updates|
      update_all(updates << touch_record_sql)
    end

    private

    define_singleton_method :touch_record_sql do
      record = new
      attrs = record.send(:timestamp_attributes_for_update_in_model)
      now = record.send(:current_time_from_proper_timezone)

      updates = ""
      attrs.each do |attr|
        updates << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
      end

      updates
    end

On lines 12-17 (see above ↑), the updates variable, which we inherited from the .update_all_with_touch method, doesn't explain what's going on well enough. It may mean the updates we want to make to db records, but that's far from obvious. It's not a bad name, but I prefer sql, to be in tune with the method's name, touch_record_sql:

    define_singleton_method :touch_record_sql do
      record = new
      attrs = record.send(:timestamp_attributes_for_update_in_model)
      now = record.send(:current_time_from_proper_timezone)

      sql = ""
      attrs.each do |attr|
        sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
      end

      sql
    end

On lines 6-11 (see above ↑), #each loops over the timestamp attributes and collects SQL fragments into the sql variable. It's a typical misuse of #each, and it could be replaced with #map(...).join(", ") if we didn't need the leading comma. In this case, #each can be replaced with #inject:
    define_singleton_method :touch_record_sql do
      record = new
      attrs = record.send(:timestamp_attributes_for_update_in_model)
      now = record.send(:current_time_from_proper_timezone)

      attrs.inject("") do |sql, attr|
        sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
      end
    end

As I mentioned previously, #send is used here to run private methods on a model instance (see lines 3-4 above ↑). And it incurs cognitive load, because you have to wonder why #send is used here. So, I chose to move this code to an instance method:
    define_singleton_method :touch_record_sql do
      new.touch_record_sql
    end

    ...

    define_method :touch_record_sql do
      connection = self.class.connection
      attrs = timestamp_attributes_for_update_in_model
      now = current_time_from_proper_timezone

      attrs.inject("") do |sql, attr|
        sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
      end
    end

Line 2 (see above ↑) raises a question though: why do we need to create a model instance to build SQL? Oh well, it's not perfect.
On line 8 (see above ↑) I had to add the connection variable, so even though record = new is no longer needed, we've not reduced the number of lines. But #send is gone! (See lines 9-10 above ↑.)
On line 9 (see above ↑) there's a variable attrs that is used on line 12 only. And, since we no longer have #send, i.e. the right side of the variable assignment isn't hairy, we can just do away with it. After all, what does attrs tell us that timestamp_attributes_for_update_in_model does not?
    define_method :touch_record_sql do
      connection = self.class.connection
      now = current_time_from_proper_timezone

      timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
        sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(now))}"
      end
    end

On line 3 (see above ↑), there's the now variable, which is only used in one place, on line 6. It could be said that there's a performance benefit to keeping the current_time_from_proper_timezone call out of the #inject loop, but it could also be said that it's premature optimisation.
It looks like a stalemate, but thankfully, there's another angle we can use here - readability. From the readability perspective, keeping now out of the loop makes it clear that the now value doesn't depend on the loop.
But on line 6 there's also some other code that doesn't depend on the loop - #{connection.quote(connection.quoted_date(now))} - and it's hard to reason about two separate parts of the value that don't depend on the loop. So, I'm still going to inline now, and see how it goes:
    define_method :touch_record_sql do
      connection = self.class.connection

      timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
        sql << ", #{connection.quote_column_name(attr)} = #{connection.quote(connection.quoted_date(current_time_from_proper_timezone))}"
      end
    end

Have to say, I like having fewer lines here, but the right-hand value in the SQL assignment doesn't depend on loop values, so it should be moved out of the loop for clarity:
    define_method :touch_record_sql do
      connection = self.class.connection
      quoted_now = connection.quote(connection.quoted_date(
        current_time_from_proper_timezone))

      timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
        sql << ", #{connection.quote_column_name(attr)} = #{quoted_now}"
      end
    end

At this point, when looking at lines 3-4 (see above ↑), I'm asking myself why not extract quoted_now into a method. I did just that:
    define_method :touch_record_sql do
      connection = self.class.connection
      quoted_now = quoted_current_time_from_proper_timezone

      timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
        sql << ", #{connection.quote_column_name(attr)} = #{quoted_now}"
      end
    end

    private

    def quoted_current_time_from_proper_timezone
      self.class.connection.quote(self.class.connection.quoted_date(
        current_time_from_proper_timezone))
    end

On lines 3-7 (see above ↑) we can see that quoted_now is used only on line 6, and a question arises: why not inline it and live happily ever after? We already discussed that quoted_now must be the same for all its uses within the SQL, but so far the code doesn't put this knowledge into words. So, I'm going to use a variable name that clearly explains that - cached_quoted_now:
    define_method :touch_record_sql do
      connection = self.class.connection
      cached_quoted_now = quoted_current_time_from_proper_timezone

      timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
        sql << ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
      end
    end
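The reason the value has to be read once, outside the loop, can be shown with a deliberately exaggerated sketch. TickingClock is a hypothetical clock that advances on every read. A real clock would typically differ by only a tiny amount between reads, but the principle is the same.

```ruby
# Hypothetical clock that advances on every read, to make the effect visible.
class TickingClock
  def initialize
    @ticks = 0
  end

  def now
    @ticks += 1
  end
end

columns = %w[created_at updated_at]

# Clock read inside the loop: each column can end up with a different value.
clock = TickingClock.new
inlined = columns.map { |column| "#{column} = #{clock.now}" }.join(", ")

# Clock read once before the loop: every column gets the same value.
clock = TickingClock.new
cached_now = clock.now
cached = columns.map { |column| "#{column} = #{cached_now}" }.join(", ")

puts inlined # created_at = 1, updated_at = 2
puts cached  # created_at = 1, updated_at = 1
```

A name like cached_quoted_now tells the reader that the single read is intentional, not a leftover.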
I quite like the result. Have to say, I first got the idea of using a different name for quoted_now when I imagined that the extracted method would be called #quoted_now, so I had to invent a new name for the variable, as quoted_now = self.quoted_now would suck.
On line 2 (see above ↑) you can see a remnant of the beginning of this refactoring, the connection variable. Now that it's used only once, on line 6, it can be inlined. But typing self.class.connection sucks, so why not have a #connection method? I'd say that it goes against the Single Responsibility Principle, but convenience trumps it in this case!

    define_method :touch_record_sql do
      cached_quoted_now = quoted_current_time_from_proper_timezone

      timestamp_attributes_for_update_in_model.inject("") do |sql, attr|
        sql << ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
      end
    end

    private

    delegate :connection, to: self
Not that bad.
I got a suggestion from reader Alex Piechowski that #map and #join can be used here, instead of #inject:

    define_method :touch_record_sql do
      cached_quoted_now = quoted_current_time_from_proper_timezone

      timestamp_attributes_for_update_in_model.map do |attr|
        ", #{connection.quote_column_name(attr)} = #{cached_quoted_now}"
      end.join
    end
Thank you, Alex!
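As a quick sanity check, the three loop styles used along the way (#each, #inject, and #map with #join) can be compared in plain Ruby. The quote_column_name helper and the quoted_now string below are stand-ins invented so the sketch runs without a database connection.

```ruby
# Stand-in for connection.quote_column_name; an assumption for this sketch.
def quote_column_name(name)
  %("#{name}")
end

attrs = %w[updated_at updated_on]
quoted_now = "'2017-01-01 00:00:00'" # stand-in for the cached quoted time

# 1. #each appending to an accumulator declared outside the loop.
with_each = +""
attrs.each { |attr| with_each << ", #{quote_column_name(attr)} = #{quoted_now}" }

# 2. #inject carrying the accumulator through the loop.
with_inject = attrs.inject(+"") do |sql, attr|
  sql << ", #{quote_column_name(attr)} = #{quoted_now}"
end

# 3. #map building the fragments, #join gluing them together.
with_map = attrs.map { |attr| ", #{quote_column_name(attr)} = #{quoted_now}" }.join

puts with_map
```

All three build the same string, leading comma included, because each fragment carries its own ", " prefix and #join is called without a separator.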
I planned on doing more stuff in this refactoring, but it's quite a lot as it is. And most impressive to me, it all came from refactoring a single method. Just how much can you draw from a single method? Turns out, quite a lot.
Happy hacking!
P.S. my PR was accepted by acts_as_list project!
The post acts_as_list refactoring part 2 first appeared on Ruby clarity.
As you might remember, I've split the .acts_as_list method into several modules, each module dedicated to an option passed to the method. E.g. the ColumnMethodDefiner module defines methods related to the column option (the option defines the column name for storing a record's list position).
This post is dedicated to refactoring the ColumnMethodDefiner module.
So, I've extracted the code related to the column option of .acts_as_list to ColumnMethodDefiner. Here's an excerpt:
    module ActiveRecord::Acts::List::ColumnMethodDefiner #:nodoc:
      def self.call(caller_class, column)
        caller_class.class_eval do
          attr_reader :position_changed

          define_method :position_column do
            column
          end

          define_method :"#{column}=" do |position|
            write_attribute(column, position)
            @position_changed = true
          end

          # only add to attr_accessible
          # if the class has some mass_assignment_protection
          if defined?(accessible_attributes) and !accessible_attributes.blank?
            attr_accessible :"#{column}"
          end

          ...
Line 7 (see above ↑) references column, but what column is that? Line 6 hints that we're talking about the position column, i.e. column means "name of the column that holds the record's position in the list", i.e. position_column_name. Unfortunately, that's too hard to read, so I opted for position_column, which is easier to read:
    module ActiveRecord::Acts::List::ColumnMethodDefiner #:nodoc:
      def self.call(caller_class, position_column)
        caller_class.class_eval do
          attr_reader :position_changed

          define_method :position_column do
            position_column
          end

          define_method :"#{position_column}=" do |position|
            write_attribute(position_column, position)
            @position_changed = true
          end

          # only add to attr_accessible
          # if the class has some mass_assignment_protection
          if defined?(accessible_attributes) and !accessible_attributes.blank?
            attr_accessible :"#{position_column}"
          end

          ...
I like that the position_column argument now has the same name as the #position_column method defined on line 6 (see above ↑). Earlier, we had to reason about why the column argument and the #position_column method, which contained the same data, were named differently. But no more! One concept less!
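If it isn't obvious how a method defined with define_method can return the position_column argument, the following standalone sketch may help. PositionColumnSketch and TodoItem are hypothetical, and write_attribute is stubbed with a plain hash so the example runs without ActiveRecord.

```ruby
# Simplified, hypothetical version of the definer module.
module PositionColumnSketch
  def self.call(caller_class, position_column)
    caller_class.class_eval do
      attr_reader :position_changed

      # The block is a closure: it still sees the position_column argument
      # after .call has returned. Inside the block the local variable wins
      # over the method of the same name, so there is no recursion.
      define_method :position_column do
        position_column
      end

      define_method :"#{position_column}=" do |position|
        write_attribute(position_column, position)
        @position_changed = true
      end
    end
  end
end

# Stub model: write_attribute just stores values in a hash.
class TodoItem
  def attributes
    @attributes ||= {}
  end

  def write_attribute(name, value)
    attributes[name] = value
  end
end

PositionColumnSketch.call(TodoItem, "rank")

item = TodoItem.new
item.rank = 3
p item.position_column  # => "rank"
p item.position_changed # => true
```

Because the argument and the generated method share a name, the reader only has to track one concept, which is the point made above.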
At this point, the ColumnMethodDefiner module's mission is to define methods related to position_column, but the module is named as if it works with just any Column. That's inconsistent, so I'm going to rename it to PositionColumnMethodDefiner:

    def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom)
      caller_class = self

      PositionColumnMethodDefiner.call(caller_class, column)
      ...

On line 4 (see above ↑), we still use the column argument, but from the module name we can infer that we're talking about the position column.
I would have liked to deprecate the column argument and introduce position_column to replace it, but that would be changing functionality, and refactoring is all about restructuring code while keeping functionality intact.
PositionColumnMethodDefiner.call is 46 lines long and starts with defining some instance methods:
    module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc:
      def self.call(caller_class, position_column)
        caller_class.class_eval do
          attr_reader :position_changed

          define_method :position_column do
            position_column
          end

          define_method :"#{position_column}=" do |position|
            write_attribute(position_column, position)
            @position_changed = true
          end

          ...
Since the method is too long, I'm going to extract .define_instance_methods:

    module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc:
      def self.call(caller_class, position_column)
        define_instance_methods(caller_class, position_column)

        caller_class.class_eval do
          ...
      end

      private

      def self.define_instance_methods(caller_class, position_column)
        caller_class.class_eval do
          attr_reader :position_changed

          ...
Because in part 1 I chose to extract the stuff related to the position column into a separate module, I can now extract methods from the .call method without being afraid of polluting the namespace (as I would be with a single module for all .acts_as_list options).
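One caveat about the excerpt above that may be worth checking against the real code: in Ruby, a bare private only affects instance methods defined after it, so a method defined with def self. below it stays public. If the intent is to hide the helper, private_class_method does that. The Definer module and its method names below are made up for this sketch.

```ruby
module Definer
  def self.call
    helper
  end

  private

  # The bare `private` above does not apply to `def self.` methods,
  # so this one is still public.
  def self.helper
    :helped
  end

  def self.hidden_helper
    :hidden
  end
  # This is what actually makes a module-level method private.
  private_class_method :hidden_helper
end

p Definer.helper # => :helped

begin
  Definer.hidden_helper
rescue NoMethodError
  puts "hidden_helper is private"
end
```

Either way, the extracted method still lives in its own small module, so the namespace argument above holds.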
An interesting thing to note is that line 3 (see above ↑) doesn't need to be inside the .class_eval block that starts on line 5. At first, I made the mistake of putting the .define_instance_methods call inside the block, and it led to a problem. The problem was that inside the .class_eval block, self doesn't point to the PositionColumnMethodDefiner module, and I had to use a hack to call .define_instance_methods. It was ugly! Feast your eyes on this:

    module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc:
      SELF = self

      def self.call(caller_class, position_column)
        caller_class.class_eval do
          SELF.define_instance_methods(caller_class, position_column)
          ...
        end
Yuck!
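The underlying behaviour is easy to reproduce in isolation. In this sketch (Definer and Item are hypothetical names), self is captured outside and inside a class_eval block:

```ruby
module Definer
  def self.call(target_class)
    outside = self # here self is the Definer module
    inside = nil

    target_class.class_eval do
      inside = self # inside the block, self is the class being evaluated
    end

    [outside, inside]
  end
end

class Item; end

outside, inside = Definer.call(Item)
p outside # => Definer
p inside  # => Item
```

So a receiver-less method call inside the block is sent to the target class, not to the module, which is why the call had to move outside the block (or go through something like the SELF constant).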
Starting at line 12 (see below ↓), there are several class methods defined via #define_singleton_method:

    module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc:
      def self.call(caller_class, position_column)
        define_instance_methods(caller_class, position_column)

        caller_class.class_eval do
          # only add to attr_accessible
          # if the class has some mass_assignment_protection
          if defined?(accessible_attributes) and !accessible_attributes.blank?
            attr_accessible :"#{position_column}"
          end

          define_singleton_method :quoted_position_column do
            @_quoted_position_column ||= connection.quote_column_name(position_column)
          end

          ...
I'm going to extract those class method definitions into a method:
    module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc:
      def self.call(caller_class, position_column)
        define_class_methods(caller_class, position_column)
        define_instance_methods(caller_class, position_column)

        ...
It was my first time encountering #define_singleton_method, and the docs didn't explain it well: "Defines a singleton method in the receiver". WTF is a singleton method? I know the singleton pattern, but that doesn't make any sense here.
It turns out, a singleton method is a method defined on a single object instance. A class, for example the Object class, is an instance of class Class, so a class method foo on Object (Object.foo) is a singleton method too. So is a method defined on a string:

    s = "abc"
    # the method body here is just a placeholder
    s.define_singleton_method(:foo) { "bar" }
    s.foo
So, in Ruby a def self.foo method is a class method and, at the same time, a singleton method.
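A short sketch can confirm both halves of that statement. Greeter and the method names are made up for the example. String.new is used so the string is guaranteed to be mutable regardless of frozen-string settings.

```ruby
class Greeter
  # A "class method" is a singleton method on the Greeter object.
  def self.hello
    "hello from the class"
  end
end

s = String.new("abc")
# A singleton method on one particular string instance.
s.define_singleton_method(:shout) { upcase + "!" }

p Greeter.singleton_methods(false) # => [:hello]
p s.singleton_methods              # => [:shout]
p s.shout                          # => "ABC!"
p "xyz".respond_to?(:shout)        # => false
```

The last line shows the defining property: the method exists on that one object only, not on other strings.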
If you feel like diving into this a bit more, there's a great article Ways to Define Singleton Methods in Ruby.
After I've extracted the class and instance method definitions, we're left with adding position_column as an accessible attribute on line 10 (see below ↓). attr_accessible lets you specify a whitelist of model attributes that can be set via mass-assignment.
    module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc:
      def self.call(caller_class, position_column)
        define_class_methods(caller_class, position_column)
        define_instance_methods(caller_class, position_column)

        caller_class.class_eval do
          # only add to attr_accessible
          # if the class has some mass_assignment_protection
          if defined?(accessible_attributes) and !accessible_attributes.blank?
            attr_accessible :"#{position_column}"
          end
        end
      end

      ...
At the line 10 (see above ↑), position_column
is interpolated and then converted to a Symbol
. We can do away with the interpolation here (see the line 10 below ↓):
module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc: | |
def self.call(caller_class, position_column) | |
define_class_methods(caller_class, position_column) | |
define_instance_methods(caller_class, position_column) | |
caller_class.class_eval do | |
# only add to attr_accessible | |
# if the class has some mass_assignment_protection | |
if defined?(accessible_attributes) and !accessible_attributes.blank? | |
attr_accessible position_column.to_sym | |
end | |
end | |
end | |
... |
One of the worst things you can find in code is comments, and I hate them with a passion. Sometimes you can't help but have comments, sometimes they're a necessary evil, but not in this case. On the lines 7-8 (see above ↑) the comments explain that we only whitelist position_column for mass-assignment if the user already uses mass-assignment protection. Can we say the same thing without comments? Absolutely!
module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc: | |
def self.call(caller_class, position_column) | |
define_class_methods(caller_class, position_column) | |
define_instance_methods(caller_class, position_column) | |
if mass_assignment_protection_was_used_by_user?(caller_class) | |
caller_class.class_eval do | |
attr_accessible position_column.to_sym | |
end | |
end | |
end | |
... |
So, instead of a long conditional, we have a method call, .mass_assignment_protection_was_used_by_user?, which is much easier to understand and is at the right level of abstraction.
However, lines 7-9 (see above ↑) are still at the wrong level of abstraction, so I'm going to extract them into a method:
module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc: | |
def self.call(caller_class, position_column) | |
define_class_methods(caller_class, position_column) | |
define_instance_methods(caller_class, position_column) | |
if mass_assignment_protection_was_used_by_user?(caller_class) | |
protect_attributes_from_mass_assignment(caller_class, position_column) | |
end | |
end | |
... |
So, I've extracted protecting position_column
attribute into .protect_attributes_from_mass_assignment
method (see line 7 above ↑).
I feel it reads much better without any comments now.
Let's see whether the code that I've extracted can be improved:
def self.mass_assignment_protection_was_used_by_user?(caller_class) | |
caller_class.class_eval do | |
defined?(accessible_attributes) and !accessible_attributes.blank? | |
end | |
end |
On the line 3 (see above ↑) we check whether accessible_attributes
is defined. But what is accessible_attributes
? It seems that it's an undocumented Rails method.
In Rails 2.3.8 accessible_attributes
used to reference attr_accessible
attribute (used to store those attributes that allow mass assignment). In Rails 4, attr_accessible
was removed in favour of strong parameters and thus, would no longer be defined.
This explains why accessible_attributes
may not be defined, and I will not dive deeper into undocumented Rails stuff.
defined?(accessible_attributes) returns a truthy value if the .accessible_attributes method is defined. However, it would also return a truthy value if a local variable named accessible_attributes was defined. It's not very likely that such a variable would be defined, but for somebody reading the code thoroughly, it makes it harder to understand: "Did the author really mean that an accessible_attributes variable counts as mass-assignment protection being defined?" Thus, it's better to replace defined? with #respond_to?:
def self.mass_assignment_protection_was_used_by_user?(caller_class) | |
caller_class.class_eval do | |
respond_to?(:accessible_attributes) and !accessible_attributes.blank? | |
end | |
end |
In this way, it's clear that we're looking for a method .accessible_attributes
, and there are no further questions.
But we're not done with the .mass_assignment_protection_was_used_by_user?
method yet. On the line 3 (see above ↑) we check whether accessible_attributes
is not #blank?
. It's probably always better to avoid using negation. In this case, we can use #present?
:
def self.mass_assignment_protection_was_used_by_user?(caller_class) | |
caller_class.class_eval do | |
respond_to?(:accessible_attributes) and accessible_attributes.present? | |
end | |
end |
Now I'm happy with the method.
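To see why defined? is the looser of the two checks, here is a small sketch. Model and both method names are hypothetical; the class deliberately has no accessible_attributes method, only a local variable with that name.

```ruby
class Model
  # A local variable named accessible_attributes is enough to satisfy defined?.
  def self.check_with_defined
    accessible_attributes = []
    defined?(accessible_attributes)
  end

  # respond_to? only looks for a method, so the local variable is ignored.
  def self.check_with_respond_to
    accessible_attributes = []
    respond_to?(:accessible_attributes)
  end
end

with_defined    = Model.check_with_defined     # a truthy description string
with_respond_to = Model.check_with_respond_to  # false, there is no such method
```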
To remind you what the state of .call
method is:
module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc: | |
def self.call(caller_class, position_column) | |
define_class_methods(caller_class, position_column) | |
define_instance_methods(caller_class, position_column) | |
if mass_assignment_protection_was_used_by_user?(caller_class) | |
protect_attributes_from_mass_assignment(caller_class, position_column) | |
end | |
end | |
... |
We are passing caller_class
to each method call here. We could define a class instance variable and reference it in class methods later:
module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc:
  def self.call(caller_class, position_column)
    @caller_class = caller_class
    define_class_methods(position_column)
    define_instance_methods(position_column)
    if mass_assignment_protection_was_used_by_user?
      protect_attributes_from_mass_assignment(position_column)
    end
  end
  ...
  def self.mass_assignment_protection_was_used_by_user?
    @caller_class.class_eval do
      respond_to?(:accessible_attributes) and accessible_attributes.present?
    end
  end
  ...
Voila! Reads much better!
But alas, using a class instance variable is not thread safe :(
I have two choices here: a service object or a thread variable.
Long story short, I've refactored to this:
module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc: | |
def self.call(caller_class, position_column) | |
Definer.new(caller_class, position_column).call | |
end | |
class Definer | |
def initialize(caller_class, position_column) | |
@caller_class, @position_column = caller_class, position_column | |
end | |
def call | |
define_class_methods(@position_column) | |
define_instance_methods(@position_column) | |
if mass_assignment_protection_was_used_by_user? | |
protect_attributes_from_mass_assignment(@position_column) | |
end | |
end | |
... |
And I can't stand it. The cure is worse than the disease. In the #call method (see the lines 12-17 above ↑) I'm passing an instance variable @position_column as a method argument. It's awful, but it's either that or I have to say something like position_column = @position_column for the variable to be picked up by a #class_eval block. Neither of the options is good. So, it's a no-go.
So, I've refactored to use a thread variable:
module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc: | |
def self.call(caller_class, position_column) | |
self.caller_class = caller_class | |
define_class_methods(position_column) | |
define_instance_methods(position_column) | |
if mass_assignment_protection_was_used_by_user? | |
protect_attributes_from_mass_assignment(position_column) | |
end | |
end | |
private | |
def self.caller_class=(value) | |
Thread.current.thread_variable_set :acts_as_list_caller_class, value | |
end | |
def self.caller_class | |
Thread.current.thread_variable_get :acts_as_list_caller_class | |
end | |
def self.define_class_methods(position_column) | |
caller_class.class_eval do | |
define_singleton_method :quoted_position_column do | |
@_quoted_position_column ||= connection.quote_column_name(position_column) | |
end | |
... |
Much better than the service object, but the cognitive load is still there. It's far from standard to say self.caller_class = caller_class. And a thread variable instead of just another method argument? That takes much more thinking: "Why a thread variable?", "What does the self.caller_class = caller_class assignment mean?". It's a no-go too.
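For readers who haven't met thread variables before, this sketch contrasts the two kinds of storage discussed above. SharedState, PerThreadState and the :example_caller_class key are invented for the example; Thread#thread_variable_set and Thread#thread_variable_get are the same calls used in the refactoring.

```ruby
# Class instance variable: one slot shared by every thread.
module SharedState
  class << self
    attr_accessor :caller_class
  end
end

# Thread variable: each thread gets its own slot.
module PerThreadState
  def self.caller_class=(value)
    Thread.current.thread_variable_set(:example_caller_class, value)
  end

  def self.caller_class
    Thread.current.thread_variable_get(:example_caller_class)
  end
end

SharedState.caller_class = :main
PerThreadState.caller_class = :main

Thread.new do
  SharedState.caller_class = :other     # overwrites what the main thread sees
  PerThreadState.caller_class = :other  # stays local to this thread
end.join

shared_after     = SharedState.caller_class     # changed by the other thread
per_thread_after = PerThreadState.caller_class  # untouched in the main thread
```

The thread variable is safe, but as noted above, the reader pays for that safety with extra thinking.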
So, in the end I was unable to improve on this:
module ActiveRecord::Acts::List::PositionColumnMethodDefiner #:nodoc: | |
def self.call(caller_class, position_column) | |
define_class_methods(caller_class, position_column) | |
define_instance_methods(caller_class, position_column) | |
if mass_assignment_protection_was_used_by_user?(caller_class) | |
protect_attributes_from_mass_assignment(caller_class, position_column) | |
end | |
end | |
... |
Can you think of a way to improve it?
In part 3 I'll dive into methods defined with #define_singleton_method
in .define_class_methods
. Some of them use class instance variables, so they may not be thread safe. I'm looking forward to finding out.
That's all for today, and, happy hacking!
P.S. My PR was accepted by the acts_as_list project!
The post acts_as_list refactoring part 2 first appeared on Ruby clarity.
]]>The post acts_as_list refactoring part 1 first appeared on Ruby clarity.
]]>#move_to_bottom
and #move_higher
.
.acts_as_list
is available as a class method in ActiveRecord::Base
when acts_as_list
gem is loaded. Here's an excerpt from .acts_as_list
definition:
module ClassMethods | |
def acts_as_list(options = {}) | |
configuration = { column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom} | |
configuration.update(options) if options.is_a?(Hash) | |
... |
Using a ClassMethods module is customary in Rails, but you don't need to be familiar with it to read this article. All you need to know is that .acts_as_list is a class method when used in a Rails model.
As you can see on the line 3 above ↑, there are 4 options that can be passed to .acts_as_list:
- column: db column to store position in the list.
- scope: restricts what is to be considered a list. For example, enabled = true SQL could be used as scope, to limit list items to those that are enabled.
- top_of_list: a number the first element of the list will have as position.
- add_new_at: specifies whether new items get added to the :top or :bottom of the list.
Options are passed as options argument, and a hash is expected (see the line 2 above ↑). Then, the default configuration hash is updated with the passed options on the line 4, thus overriding defaults with the passed options.
The problem here is that the caller can make mistakes:
- Pass not a Hash, but something else:
acts_as_list :column
acts_as_list 1
- Misspell an option name (:columm instead of :column):
acts_as_list columm: "order"
In both cases, .acts_as_list will fail silently, leaving the user to figure out what went wrong by themselves.
Using Ruby 2 keyword arguments solves both described problems:
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
configuration = { column: column, scope: scope, top_of_list: top_of_list, add_new_at: add_new_at } |
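To check that keyword arguments really turn both mistakes into loud failures, here is a sketch with a stand-in method that has the same signature but only returns the configuration hash. The error_for helper is hypothetical, added just to capture the exception.

```ruby
# Stand-in with the same signature as the refactored .acts_as_list.
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom)
  { column: column, scope: scope, top_of_list: top_of_list, add_new_at: add_new_at }
end

# Returns the ArgumentError raised by the block, or nil if the call succeeds.
def error_for
  yield
  nil
rescue ArgumentError => e
  e
end

typo_error     = error_for { acts_as_list(columm: "order") }  # unknown keyword
non_hash_error = error_for { acts_as_list(1) }                # unexpected positional argument
valid_config   = acts_as_list(column: "order")                # defaults fill in the rest
```

Both bad calls now raise ArgumentError at the call site instead of being silently ignored.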
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
configuration = { column: column, scope: scope, top_of_list: top_of_list, add_new_at: add_new_at } | |
if configuration[:scope].is_a?(Symbol) && configuration[:scope].to_s !~ /_id$/ | |
configuration[:scope] = :"#{configuration[:scope]}_id" | |
end |
Using configuration
variable after using keyword arguments does look confusing, and it's so much shorter without it:
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
if scope.is_a?(Symbol) && scope.to_s !~ /_id$/ | |
scope = :"#{scope}_id" | |
end |
I realise that it puts cognitive load on us, to figure out that scope
is part of configuration, but if the method is short (and currently it's not short), it'll be ok. Meanwhile, I'll enjoy shorter names :)
As you can see on the line 3 above ↑, _id
suffix is added to scope
. The problem with this line is twofold:
I thought of extracting that into a method (thus, solving the 1st problem), but fortunately, I guessed that there must be a method out there doing that already. And indeed, there is: ActiveSupport::Inflector.foreign_key. So, I'm going to use it:
module ClassMethods | |
include ActiveSupport::Inflector | |
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
if scope.is_a?(Symbol) && scope.to_s !~ /_id$/ | |
scope = foreign_key(scope).to_sym | |
end |
The #foreign_key method fits perfectly here, because scope is described in the comments as "Given a symbol, it'll attach _id and use that as the foreign key restriction". Not only is it a standard solution, but the story it tells also fits well into what .acts_as_list does.
As you can see on the line 2 above ↑, I've chosen to include ActiveSupport::Inflector
into ClassMethods
, thus polluting all classes ClassMethods
will be extending. But this is temporary, and I'll figure out later, how to fix that.
On the lines 5-7 (see above ↑), we add _id
suffix to scope
if it's a Symbol
and doesn't end with _id
already. This code is ripe for extracting a method:
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
scope = idify(scope) if scope.is_a?(Symbol) | |
.... | |
end | |
def idify(name) | |
return name.to_sym if name.to_s =~ /_id$/ | |
foreign_key(name).to_sym | |
end |
On the line 2 (see above ↑) you can see that I haven't extracted the check of whether scope
is a Symbol
. I believe, it would be less readable to have just scope = idify(scope)
as it'd look like we add _id
suffix always. But this is not the case, the suffix is added only for symbols (strings are left untouched).
However, there's one problem with this setup. Having #idify
in the module ClassMethods
pollutes namespace of ActiveRecord::Base
.
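To see the Symbol-only behaviour of #idify in isolation, here is a sketch that runs without Rails. The foreign_key below is a simplified stand-in for ActiveSupport::Inflector.foreign_key (the real one also underscores and demodulizes the name), and normalize_scope is a hypothetical wrapper around the one-liner from .acts_as_list.

```ruby
# Simplified stand-in for ActiveSupport::Inflector.foreign_key.
def foreign_key(name)
  "#{name}_id"
end

def idify(name)
  return name.to_sym if name.to_s.end_with?("_id")
  foreign_key(name).to_sym
end

# Only symbols are idified; strings (raw SQL conditions) pass through as-is.
def normalize_scope(scope)
  scope.is_a?(Symbol) ? idify(scope) : scope
end

from_symbol    = normalize_scope(:todo_list)     # gets the _id suffix
already_has_id = normalize_scope(:todo_list_id)  # left alone
from_string    = normalize_scope("1 = 1")        # left alone
```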
At this stage, .acts_as_list
method is 118 lines long. Here's a short snippet:
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
scope = idify(scope) if scope.is_a?(Symbol) | |
caller_class = self | |
class_eval do | |
define_singleton_method :acts_as_list_top do | |
top_of_list.to_i | |
end | |
define_method :acts_as_list_top do | |
top_of_list.to_i | |
end | |
define_method :acts_as_list_class do | |
caller_class | |
end | |
... |
The code in .acts_as_list
defines methods and Rails callbacks, related to column
, scope
, top_of_list
, add_new_at
arguments. It seems like a good idea to group code by those arguments, putting scope
-related stuff into one place and column
-related, into some other place.
I see 3 approaches to split .acts_as_list
, and I'm going to describe them below.
To avoid polluting ClassMethods
namespace, add a module AuxMethods
and split .acts_as_list
into multiple methods. It'd look something like this:
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom)
  caller_class = self
  AuxMethods.define_column_methods(caller_class, column)
  AuxMethods.define_scope_methods(caller_class, scope)
  AuxMethods.define_top_of_list_methods(caller_class, top_of_list)
  AuxMethods.define_add_new_at_methods(caller_class, add_new_at)
  ...
The problem with this approach is that the method names aren't very readable. Also, since we can't include AuxMethods into ClassMethods, we can't get rid of the AuxMethods. prefix, and that doesn't read well either.
A service object could look like this:
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
caller_class = self | |
definer = MethodDefiner.new(caller_class, column, scope, top_of_list, add_new_at) | |
definer.define_column_methods | |
definer.define_scope_methods | |
definer.define_top_of_list_methods | |
definer.define_add_new_at_methods | |
... |
I think it's even worse than approach 1. It looks like the methods that are defined when #define_column_methods is called are defined on the definer object. And it's one line longer.
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
caller_class = self | |
ColumnMethodDefiner.call(caller_class, column) | |
ScopeMethodDefiner.call(caller_class, scope) | |
TopOfListMethodDefiner.call(caller_class, top_of_list) | |
AddNewAtMethodDefiner.call(caller_class, add_new_at) | |
... |
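To make approach 3 concrete, one of these definer modules might look roughly like the sketch below. The body is an assumption for illustration (a single position_column reader) and TodoItem is a made-up class; the real modules define many more methods.

```ruby
# Hypothetical definer in the style of approach 3: one module, one .call.
module ColumnMethodDefiner
  def self.call(caller_class, column)
    caller_class.class_eval do
      # The block closes over column, so the generated method can return it.
      define_method(:position_column) { column }
    end
  end
end

class TodoItem; end

ColumnMethodDefiner.call(TodoItem, "position")
column_name = TodoItem.new.position_column
```

The methods land on the class that was passed in, not on the definer, which is the property that approach 2 made hard to see.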
This is my favourite of the three, because:
ColumnMethodDefiner.
.call methods only take the arguments the modules need (better than the approach 2).
After the module extraction (I chose the approach 3), .acts_as_list looks like this:
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom)
  caller_class = self
  ColumnMethodDefiner.call(caller_class, column)
  ScopeMethodDefiner.call(caller_class, scope)
  TopOfListMethodDefiner.call(caller_class, top_of_list)
  AddNewAtMethodDefiner.call(caller_class, add_new_at)
  class_eval do
    define_method :acts_as_list_class do
      caller_class
    end
  end
  before_validation :check_top_position
  before_destroy :lock!
  after_destroy :decrement_positions_on_lower_items
  before_update :check_scope
  after_update :update_positions
  after_commit :clear_scope_changed
  if add_new_at.present?
    before_create "add_to_list_#{add_new_at}".to_sym
  end
  include ::ActiveRecord::Acts::List::InstanceMethods
end
So, instead of 118 lines, .acts_as_list
is 30 lines now, and fits into a page.
Exactly because I have reduced the number of lines, I can now pay more attention to what's left. And, on the line 9 (see above ↑) there's a redundant .class_eval
call. This call changes execution context from self
to, well, self
. That's why it's redundant. After removal, we get (see the lines 9-11 below ↓):
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
caller_class = self | |
ColumnMethodDefiner.call(caller_class, column) | |
ScopeMethodDefiner.call(caller_class, scope) | |
TopOfListMethodDefiner.call(caller_class, top_of_list) | |
AddNewAtMethodDefiner.call(caller_class, add_new_at) | |
define_method :acts_as_list_class do | |
caller_class | |
end | |
before_validation :check_top_position | |
before_destroy :lock! | |
after_destroy :decrement_positions_on_lower_items | |
before_update :check_scope | |
after_update :update_positions | |
after_commit :clear_scope_changed | |
if add_new_at.present? | |
before_create "add_to_list_#{add_new_at}".to_sym | |
end | |
include ::ActiveRecord::Acts::List::InstanceMethods | |
end |
On the lines 13-25 (see above ↑), there are lots of Rails callbacks created. I've already added ColumnMethodDefiner.call
, etc, so having callback code here breaks Single Level of Abstraction. I've extracted the Rails callbacks into a separate module (see the line 13 below ↓):
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
caller_class = self | |
ColumnMethodDefiner.call(caller_class, column) | |
ScopeMethodDefiner.call(caller_class, scope) | |
TopOfListMethodDefiner.call(caller_class, top_of_list) | |
AddNewAtMethodDefiner.call(caller_class, add_new_at) | |
define_method :acts_as_list_class do | |
caller_class | |
end | |
CallbackDefiner.call(caller_class, add_new_at) | |
include ::ActiveRecord::Acts::List::InstanceMethods | |
end |
If Rails callbacks break Single Level of Abstraction, doesn't the code on the lines 9-11 (see above ↑) break it too? It does. Because it's so small, it seems that there's no harm in having it there as it is, but I don't really care to read that #acts_as_list_class is added; I'd rather read a high-level description of what kind of functionality it provides.
So, I've looked up the rest of the code and, #acts_as_list_class
is just used internally by the gem. So, it's an auxiliary method. I've extracted it into its own module (see the line 9 below ↓):
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
caller_class = self | |
ColumnMethodDefiner.call(caller_class, column) | |
ScopeMethodDefiner.call(caller_class, scope) | |
TopOfListMethodDefiner.call(caller_class, top_of_list) | |
AddNewAtMethodDefiner.call(caller_class, add_new_at) | |
AuxMethodDefiner.call(caller_class) | |
CallbackDefiner.call(caller_class, add_new_at) | |
include ::ActiveRecord::Acts::List::InstanceMethods | |
end |
I could possibly treat definers as plugins and load them with:
def acts_as_list(column: "position", scope: "1 = 1", top_of_list: 1, add_new_at: :bottom) | |
load_definer_plugins dir: "definers" | |
include ::ActiveRecord::Acts::List::InstanceMethods | |
end |
But I think it'd be overkill. My main argument against it is that these modules aren't really plugins. If there was a standard way to add plugins in Ruby, that might have been plausible, but adding an ad-hoc plugin system would only make things more complicated. And instead of reading a number of .calls, the reader would have to figure out the plugin system. A no-go.
So, that's the best I can do with this method (as per step 4.5).
In part 2 I'll reap the consequences of choosing approach 3 to split .acts_as_list into modules, and will refactor one of those modules. I've already started on that, so I can say that it's interesting to see how the choice to use a separate module allowed me to further improve the code by extracting methods. The Single Responsibility Principle isn't there for nothing after all :)
If you want to know when the part 2 is out, sign up for my email list.
Happy hacking!
]]>The post CreateSend refactoring part 2 first appeared on Ruby clarity.
In part 1 I've finished all class methods of the Base class, and now I'm going to refactor instance methods.
def initialize(*args) | |
if args.size > 0 | |
auth args.first # Expect auth details as first argument | |
end | |
end |
initialize
(see above ↑) chooses not to use named arguments, and treats method arguments as an array of many arguments. And yet, it only processes one argument from the whole args
array.
It's misleading to accept arguments and throw them away. It will require a look into source code to find out why some passed arguments caused no change in behaviour. On the other hand, if we state that initialize
only accepts one argument, Ruby will complain if we pass any other parameters. Easier to debug. Thus:
def initialize(new_auth = nil) | |
auth new_auth if new_auth | |
end |
This change required changing subclasses of Base
, and they all pass in auth
argument.
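The difference between the two signatures is easy to demonstrate. SplatClient and StrictClient are hypothetical stand-ins for Base before and after the change, reduced to the constructor.

```ruby
# Before: extra arguments are accepted and silently thrown away.
class SplatClient
  attr_reader :auth_details

  def initialize(*args)
    @auth_details = args.first if args.size > 0
  end
end

# After: exactly one optional argument, anything more is an error.
class StrictClient
  attr_reader :auth_details

  def initialize(new_auth = nil)
    @auth_details = new_auth if new_auth
  end
end

auth = { :api_key => "secret" }

splat_client = SplatClient.new(auth, :ignored)  # no complaint, :ignored is lost

strict_error = begin
  StrictClient.new(auth, :extra)
  nil
rescue ArgumentError => e
  e
end
```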
# Authenticate using either OAuth or an API key. | |
def auth(auth_details) | |
@auth_details = auth_details | |
end |
auth
is probably short for authenticate
, and if so, it's misleading. No authentication is happening here, just an assignment. It can be replaced with an attr_accessor
. It could be even replaced with nothing (meaning, @auth_details
could be enough to have, and instance variables don't need to be declared), but I don't know enough, maybe it's part of public API. Thus:
# Holds either OAuth or an API key. | |
attr_accessor :auth_details |
# Refresh the current OAuth token using the current refresh token. | |
def refresh_token | |
if not @auth_details or | |
not @auth_details.has_key? :refresh_token or | |
not @auth_details[:refresh_token] | |
raise '@auth_details[:refresh_token] does not contain a refresh token.' | |
end | |
tokens = Base.refresh_access_token @auth_details[:refresh_token] | |
@auth_details = { | |
:access_token => tokens.access_token, | |
:refresh_token => tokens.refresh_token} | |
tokens | |
end |
The conditional on the lines 3-7 (see above ↑) is overly complex, using many nots and an unnecessary has_key? (line 4). has_key? :refresh_token is redundant because we check the value of @auth_details[:refresh_token] right after it. If the key isn't present, the lookup returns nil, which is falsy, so the value check raises exactly when the key check would. And if the key is present, the value check alone decides the outcome. Thus:
# Refresh the current OAuth token using the current refresh token. | |
def refresh_token | |
raise '@auth_details[:refresh_token] does not contain a refresh token.' \ | |
unless @auth_details && @auth_details[:refresh_token] | |
tokens = Base.refresh_access_token @auth_details[:refresh_token] | |
@auth_details = { | |
:access_token => tokens.access_token, | |
:refresh_token => tokens.refresh_token} | |
tokens | |
end |
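If you'd rather not take the equivalence of the two conditionals on faith, it can be checked over a handful of inputs. This sketch assumes plain hashes without a default value; both method names are hypothetical.

```ruby
# The original three-part check, written with ! and || instead of not/or.
def missing_token_verbose?(auth_details)
  !auth_details ||
    !auth_details.has_key?(:refresh_token) ||
    !auth_details[:refresh_token]
end

# The simplified check from the refactored method.
def missing_token_short?(auth_details)
  !(auth_details && auth_details[:refresh_token])
end

samples = [
  nil,
  {},
  { :refresh_token => nil },
  { :refresh_token => "abc" },
  { :access_token => "x" }
]

# Each entry pairs the verbose result with the short result for one sample.
results = samples.map { |d| [missing_token_verbose?(d), missing_token_short?(d)] }
```

Only the sample that actually carries a refresh token passes either check.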
I'd use a variable for @auth_details[:refresh_token]
result, to avoid querying hash twice, but I couldn't think of a good name for it, as refresh_token
is already taken by method name.
# Gets your clients. | |
def clients | |
response = get('/clients.json') | |
response.map{|item| Hashie::Mash.new(item)} | |
end | |
# Get your billing details. | |
def billing_details | |
response = get('/billingdetails.json') | |
Hashie::Mash.new(response) | |
end |
These methods (see above ↑) wrap access to JSON API. I've included two methods, but there are more of them. I'm going to skip them as I don't see how to improve them.
def get(*args) | |
args = add_auth_details_to_options(args) | |
handle_response Base.get(*args) | |
end | |
alias_method :cs_get, :get | |
def post(*args) | |
args = add_auth_details_to_options(args) | |
handle_response Base.post(*args) | |
end | |
alias_method :cs_post, :post | |
def put(*args) | |
args = add_auth_details_to_options(args) | |
handle_response Base.put(*args) | |
end | |
alias_method :cs_put, :put | |
def delete(*args) | |
args = add_auth_details_to_options(args) | |
handle_response Base.delete(*args) | |
end | |
alias_method :cs_delete, :delete |
All these methods (see above ↑) are almost the same, and can be reduced to something like cs_method :get, :post, ...
using metaprogramming. Like this:
class << self | |
private | |
def cs_method(*names) | |
names.each { |name| define_cs_method name } | |
end | |
def define_cs_method(name) | |
define_method(name) do |*args| | |
args = add_auth_details_to_options(args) | |
handle_response Base.send(name, *args) | |
end | |
alias_method "cs_#{name}", name | |
end | |
... | |
cs_method :get, :post, :put, :delete | |
... |
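Here is the same metaprogramming in a form you can run on its own. FakeClient is a made-up class, and the method body returns the verb and arguments instead of calling handle_response and Base.send, which need the real HTTP layer.

```ruby
class FakeClient
  class << self
    private

    def cs_method(*names)
      names.each { |name| define_cs_method(name) }
    end

    def define_cs_method(name)
      define_method(name) do |*args|
        [name, *args]  # stand-in for handle_response Base.send(name, *args)
      end
      alias_method "cs_#{name}", name
    end
  end

  cs_method :get, :post, :put, :delete
end

client = FakeClient.new
plain_call   = client.get("/clients.json")
aliased_call = client.cs_post("/clients.json", { :body => "{}" })
```

Because cs_method is private on the singleton class, it reads like a declaration inside the class body and isn't callable from outside.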
add_auth_details_to_options is used in the step 5 methods to "add auth details to options", though at the moment it's not clear why it's options and not args. Here it is:
def add_auth_details_to_options(args)
  if @auth_details
    options = {}
    if args.size > 1
      options = args[1]
    end
    if @auth_details.has_key? :access_token
      options[:headers] = {
        "Authorization" => "Bearer #{@auth_details[:access_token]}" }
    elsif @auth_details.has_key? :api_key
      if not options.has_key? :basic_auth
        options[:basic_auth] = {
          :username => @auth_details[:api_key], :password => 'x' }
      end
    end
    args[1] = options
  end
  args
end
It does look overwhelming at first glance. But I have no intention of being overwhelmed by it. I'm going to simplify it step by step.
The first thing I see is that at line 18 (see above ↑), args is returned unchanged if there's no @auth_details present. This is a typical guard clause case.
def add_auth_details_to_options(args) | |
return args unless @auth_details | |
options = {} | |
if args.size > 1 | |
options = args[1] | |
end | |
if @auth_details.has_key? :access_token | |
options[:headers] = { | |
"Authorization" => "Bearer #{@auth_details[:access_token]}" } | |
elsif @auth_details.has_key? :api_key | |
if not options.has_key? :basic_auth | |
options[:basic_auth] = { | |
:username => @auth_details[:api_key], :password => 'x' } | |
end | |
end | |
args[1] = options | |
args | |
end |
At lines 4-7 (see above ↑) we see that the so-called options are expected as the 2nd element of the args array, and we use it if present. It's actually quite hard to reason about this code: if args has a second element but that element is nil, options becomes nil, and we might get an exception later on, when @auth_details has certain data in it and the code tries to use nil as a Hash. I want to simplify it!
The lines 8-16 (see above ↑) add stuff to options
. And the line 17 (see above ↑) assigns options
back as the second element of args
.
It's way too complicated. According to the Single Responsibility Principle, a method should have one responsibility only. For this method, that means adding stuff to options. Here's the result:
def add_auth_details_to_options(options) | |
return unless @auth_details | |
if @auth_details.has_key? :access_token | |
options[:headers] = { | |
"Authorization" => "Bearer #{@auth_details[:access_token]}" } | |
elsif @auth_details.has_key? :api_key | |
if not options.has_key? :basic_auth | |
options[:basic_auth] = { | |
:username => @auth_details[:api_key], :password => 'x' } | |
end | |
end | |
end |
None of that nasty business with args[1]
is present anymore, much simpler! And look how all the args[1]
business is in one place here:
def define_cs_method(name) | |
define_method(name) do |*args| | |
options = {} | |
if args.size > 1 | |
options = args[1] | |
end | |
args[1] = options | |
add_auth_details_to_options(options) | |
handle_response Base.send(name, *args) | |
end | |
alias_method "cs_#{name}", name | |
end |
It is the same code (lines 3-7 ↑), same functionality, but it's all in one place! Being in one place means there's no switching of contexts required, which means it's easier to read.
Let's continue with improving add_auth_details_to_options
. I'm placing the same code here again, so it's easier to compare with the code I'll have refactored:
def add_auth_details_to_options(options) | |
return unless @auth_details | |
if @auth_details.has_key? :access_token | |
options[:headers] = { | |
"Authorization" => "Bearer #{@auth_details[:access_token]}" } | |
elsif @auth_details.has_key? :api_key | |
if not options.has_key? :basic_auth | |
options[:basic_auth] = { | |
:username => @auth_details[:api_key], :password => 'x' } | |
end | |
end | |
end |
Lines 4-6 (see above ↑) are pretty straightforward, just adding an entry into options
if :access_token
is present in @auth_details
. For some reason the code is really careful to put even nil
values of @auth_details[:access_token]
into authorization header (if it checked for value instead of key presence, it'd not put the authorization header in at all).
Lines 7-12 (see above ↑) are similar, but line 8 can be merged into line 7. I guess it's the most boring refactoring in this article:
def add_auth_details_to_options(options) | |
return unless @auth_details | |
if @auth_details.has_key? :access_token | |
options[:headers] = { | |
"Authorization" => "Bearer #{@auth_details[:access_token]}" } | |
elsif @auth_details.has_key?(:api_key) && !options.has_key?(:basic_auth) | |
options[:basic_auth] = { | |
:username => @auth_details[:api_key], :password => 'x' } | |
end | |
end |
Now, add_auth_details_to_options
looks neat and tidy.
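To see what the tidy version does for each kind of credentials, here it is wrapped in a hypothetical AuthExample class so it can run outside CreateSend. The method body is the refactored one; only the class and the sample values are invented.

```ruby
class AuthExample
  def initialize(auth_details = nil)
    @auth_details = auth_details
  end

  def add_auth_details_to_options(options)
    return unless @auth_details
    if @auth_details.has_key? :access_token
      options[:headers] = {
        "Authorization" => "Bearer #{@auth_details[:access_token]}" }
    elsif @auth_details.has_key?(:api_key) && !options.has_key?(:basic_auth)
      options[:basic_auth] = {
        :username => @auth_details[:api_key], :password => 'x' }
    end
  end
end

oauth_options = {}
AuthExample.new({ :access_token => "tok" }).add_auth_details_to_options(oauth_options)

api_key_options = {}
AuthExample.new({ :api_key => "key" }).add_auth_details_to_options(api_key_options)

# An existing :basic_auth entry is left untouched.
preset_options = { :basic_auth => { :username => "me", :password => "pw" } }
AuthExample.new({ :api_key => "key" }).add_auth_details_to_options(preset_options)

# Without auth details the options are not modified at all.
anonymous_options = {}
AuthExample.new.add_auth_details_to_options(anonymous_options)
```

Note that the method now works purely by mutating the hash it is given, so callers must pass a real Hash.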
def define_cs_method(name) | |
define_method(name) do |*args| | |
options = {} | |
if args.size > 1 | |
options = args[1] | |
end | |
args[1] = options | |
add_auth_details_to_options(options) | |
handle_response Base.send(name, *args) | |
end | |
alias_method "cs_#{name}", name | |
end |
So, we have a variable number of args at line 2 (see above ↑), but why? It turns out, line 10 calls method name
on Base
class, meaning, it'll execute Base.get
, Base.post
, etc. Those are methods from HTTParty, and HTTParty uses .get(*args, &block)
-like API, so it's understandable that CreateSend also uses it. Thus, my plan to introduce named arguments is foiled and I have to find another way to improve the code.
From what I know about args
(from looking at HTTParty examples), the 1st argument is path and the 2nd argument is options. Lines 3-7 (see above ↑) can be made to better explain arguments they're dealing with:
def define_cs_method(name) | |
define_method(name) do |*args| | |
path, options, *rest = *args | |
options ||= {} | |
add_auth_details_to_options(options) | |
handle_response Base.send(name, path, options, *rest) | |
end | |
alias_method "cs_#{name}", name | |
end |
The code at lines 3-4 (see above ↑) isn't equivalent to the original code. The original code would fail if nil was passed as options (because options would never be assigned {}). Whether that behaviour was intentional or accidental, I have no idea, so I rely on the fact that the tests still pass.
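Here's a small sketch of that difference. Both helpers (original_options and refactored_options) are made up for illustration and only reproduce the options handling, not the HTTParty call:

```ruby
# Original-style handling: an explicitly passed nil stays nil.
def original_options(*args)
  options = {}
  options = args[1] if args.size > 1
  options
end

# Refactored handling: destructure the args, then default a missing
# or nil options value to an empty hash.
def refactored_options(*args)
  _path, options, *_rest = *args
  options ||= {}
  options
end
```

Calling original_options('/clients', nil) returns nil, so anything that later writes into options would blow up, while refactored_options('/clients', nil) returns an empty hash.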
That's all that's of interest left in the Base
class. Hope you enjoyed it.
Happy hacking!
The post CreateSend refactoring part 2 first appeared on Ruby clarity.
The post Be lazy and don’t keep context in your head first appeared on Ruby clarity.
def process_invitation(invitation) | |
if user = User.find_by(email: invitation['email']) | |
unless user.belongs_to_team?(@team.id) | |
membership = add_membership(user, invitation['role']) | |
send_mail(membership.id, user: false) | |
end | |
return | |
end | |
user = User.new(user_data(invitation['email'])) | |
unless user.save | |
Rails.logger.info \ | |
'Notify inviter that the user could not be invited for a reason' | |
return | |
end | |
user.add_to_default_team unless @team.hide_default_team? | |
membership = add_membership(user, invitation['role']) | |
user.switch_team @team.id | |
user.save! | |
send_mail(membership.id, user: true) | |
end |
Lines 2-8 (see above ↑) deal with the case of an existing user, and the rest deal with the case of a new user.
When reading lines 3-5 you have to remember that you're dealing with an existing user; same with lines 12-22, only there you deal with a new user. After all, both cases use the same variable name: user.
But there's a better way than keeping it in your head. Keep it in the code:
def process_invitation(invitation) | |
if existing_user = User.find_by(email: invitation['email']) | |
unless existing_user.belongs_to_team?(@team.id) | |
membership = add_membership(existing_user, invitation['role']) | |
send_mail(membership.id, new_user: false) | |
end | |
return | |
end | |
new_user = User.new(user_data(invitation['email'])) | |
unless new_user.save | |
Rails.logger.info \ | |
'Notify inviter that the user could not be invited for a reason' | |
return | |
end | |
new_user.add_to_default_team unless @team.hide_default_team? | |
membership = add_membership(new_user, invitation['role']) | |
new_user.switch_team @team.id | |
new_user.save! | |
send_mail(membership.id, new_user: true) | |
end |
Reading lines 3-5 (see above ↑), you clearly see that you're dealing with an existing user. And same for the lines dealing with a new user.
Happy hacking!
The post CreateSend refactoring part 1 first appeared on Ruby clarity.
CreateSend::CreateSend
class and some other stuff.
CreateSend is a base class for accessing the CampaignMonitor API. It provides .user_agent to set the HTTP user agent and .exchange_token to get an OAuth access token.
In summary, it's a Ruby wrapper for accessing the API. Classes like Campaign inherit from CreateSend to add specific methods to work with campaigns, etc.
module CreateSend | |
USER_AGENT_STRING = "createsend-ruby-#{VERSION}-#{RUBY_VERSION}-p#{RUBY_PATCHLEVEL}-#{RUBY_PLATFORM}" | |
# Represents a CreateSend API error. Contains specific data about the error. | |
class CreateSendError < StandardError | |
attr_reader :data | |
def initialize(data) | |
@data = data | |
# @data should contain Code, Message and optionally ResultData | |
extra = @data.ResultData ? "\nExtra result data: #{@data.ResultData}" : "" | |
super "The CreateSend API responded with the following error"\ | |
" - #{@data.Code}: #{@data.Message}#{extra}" | |
end | |
end | |
# Raised for HTTP response codes of 400...500 | |
class ClientError < StandardError; end | |
# Raised for HTTP response codes of 500...600 | |
class ServerError < StandardError; end | |
# Raised for HTTP response code of 400 | |
class BadRequest < CreateSendError; end | |
# Raised for HTTP response code of 401 | |
class Unauthorized < CreateSendError; end | |
# Raised for HTTP response code of 404 | |
class NotFound < ClientError; end |
The first things declared are default user agent constant (line 3) and some exceptions.
What's good:
When reading the CreateSendError class (lines 6-15 ↑), I notice that there's too much information there that I don't really need to know (lines 10-13 ↑). I believe the formatting of the error message can be moved to a private method, and people reading the code afterwards can just skip the private methods, because they aren't essential for understanding.
So, I extract a method:
# Represents a CreateSend API error. Contains specific data about the error. | |
class CreateSendError < StandardError | |
attr_reader :data | |
def initialize(data) | |
@data = data | |
super format_data_as_message | |
end | |
private | |
def format_data_as_message | |
# @data should contain Code, Message and optionally ResultData | |
extra = @data.ResultData ? "\nExtra result data: #{@data.ResultData}" : "" | |
"The CreateSend API responded with the following error"\ | |
" - #{@data.Code}: #{@data.Message}#{extra}" | |
end | |
end |
The comment on line 13 (see above ↑) doesn't really add anything to understanding. It's clear that the code references ResultData, Code and Message, so the comment just repeats what the code says. It is safe to remove it.
On line 14 (see above ↑), extra gets assigned an empty string if ResultData isn't present. nil interpolated into a string yields "" as well, so there's no need to assign it.
Here's the result:
def format_data_as_message | |
extra = "\nExtra result data: #{@data.ResultData}" if @data.ResultData | |
"The CreateSend API responded with the following error"\ | |
" - #{@data.Code}: #{@data.Message}#{extra}" | |
end |
On line 2 (see above ↑), the extra variable is used to store extra result data. It's kinda easy to remember what's stored in the variable, since it's used on the next line (and we could have used a as the variable name). But I want to make it unnecessary to remember at all!
What we really have here is not extra, but result data; to be precise, formatted result data. So, I'm going to use formatted_result_data instead:
def format_data_as_message | |
formatted_result_data = "\nExtra result data: #{@data.ResultData}" \ | |
if @data.ResultData | |
"The CreateSend API responded with the following error"\ | |
" - #{@data.Code}: #{@data.Message}#{formatted_result_data}" | |
end |
I realise that formatted isn't an ideal name for variable contents concatenated with a description. If you have a better idea, please leave it in the comments.
Taking one last look at the CreateSendError class, I realise that it's not DRY, as it's inside the CreateSend module, which makes it CreateSend::CreateSendError. Thus, I'm going to make it CreateSend::Error.
# Represents a CreateSend API error. Contains specific data about the error. | |
class Error < StandardError | |
... | |
end |
Take a look:
# Provides high level CreateSend functionality/data you'll probably need. | |
class CreateSend | |
include HTTParty | |
attr_reader :auth_details | |
# Specify cert authority file for cert validation | |
ssl_ca_file File.expand_path(File.join(File.dirname(__FILE__), 'cacert.pem')) | |
# Set a custom user agent string to be used when instances of | |
# CreateSend::CreateSend make API calls. | |
# | |
# user_agent - The user agent string to use in the User-Agent header when | |
# instances of this class make API calls. If set to nil, the | |
# default value of CreateSend::USER_AGENT_STRING will be used. | |
def self.user_agent(user_agent) | |
headers({'User-Agent' => user_agent || USER_AGENT_STRING}) | |
end |
We can see that the class uses the HTTP library HTTParty, has some notion of authentication, uses a certificate and can set the HTTP user agent.
The first thing I don't like about it is that it's the CreateSend::CreateSend class, which is not DRY at all. From the class comment and from looking at other code, I know that this class is used as a base (for example, the Subscriber class is a descendant of it). So, it seems it'd be better to call it Base.
Next up, on line 7 is the certificate setup. There's too much going on that I don't need to know. Thus, I'm going to move that code to a separate module.
Here's the result:
# Provides high level CreateSend functionality/data you'll probably need. | |
class Base | |
include HTTParty | |
extend Certificate | |
attr_reader :auth_details | |
# Specify cert authority file for cert validation | |
ssl_ca_file cert_path('cacert.pem') |
module Certificate | |
def cert_path(file_name) | |
File.expand_path(file_name, File.dirname(__FILE__)) | |
end | |
end |
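If extend is unfamiliar: it adds the module's methods to the class itself rather than to its instances, which is why cert_path can be called right inside the class body. A minimal sketch, where ExampleClient is a hypothetical class standing in for Base (without HTTParty):

```ruby
module Certificate
  # Resolves file_name relative to the directory of the current file.
  def cert_path(file_name)
    File.expand_path(file_name, File.dirname(__FILE__))
  end
end

# Hypothetical class used only for this illustration.
class ExampleClient
  extend Certificate # cert_path becomes a class-level method

  CA_FILE = cert_path('cacert.pem')
end
```

ExampleClient.cert_path is available, but ExampleClient.new.cert_path is not, since the module was extended rather than included.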
Take a look:
# Set a custom user agent string to be used when instances of | |
# CreateSend::Base make API calls. | |
# | |
# user_agent - The user agent string to use in the User-Agent header when | |
# instances of this class make API calls. If set to nil, the | |
# default value of CreateSend::USER_AGENT_STRING will be used. | |
def self.user_agent(user_agent) | |
headers({'User-Agent' => user_agent || USER_AGENT_STRING}) | |
end |
.user_agent sets up the User-Agent HTTP header. It accepts a single argument, and if it's falsey, the default user agent is used instead.
user_agent nil or user_agent false is a pretty obscure way to say "please present yourself as CreateSend to HTTP servers". I think .default_user_agent expresses it better.
The result:
# Set a custom user agent string to be used when instances of | |
# CreateSend::Base make API calls. | |
# | |
# user_agent - The user agent string to use in the User-Agent header when | |
# instances of this class make API calls. | |
def self.user_agent(user_agent) | |
headers('User-Agent' => user_agent) | |
end | |
# Set user agent to be CreateSend. | |
def self.default_user_agent | |
user_agent USER_AGENT_STRING | |
end |
There are 3 class methods left to refactor: .authorize_url, .exchange_token and .refresh_access_token. They all have to do with OAuth. Even though I've never had to deal with OAuth, I'll have no problem refactoring them.
authorize_url sounds like it authorizes something, but in fact it just returns a URL constructed from its arguments. Take a look:
# Get the authorization URL for your application, given the application's | |
# client_id, redirect_uri, scope, and optional state data. | |
def self.authorize_url(client_id, redirect_uri, scope, state=nil) | |
qs = "client_id=#{CGI.escape(client_id.to_s)}" | |
qs << "&redirect_uri=#{CGI.escape(redirect_uri.to_s)}" | |
qs << "&scope=#{CGI.escape(scope.to_s)}" | |
qs << "&state=#{CGI.escape(state.to_s)}" if state | |
"#{@@oauth_base_uri}?#{qs}" | |
end |
At the very least, the method name should be authorization_url, at most construct_authorization_url. I'll opt for construct_authorization_url as the most descriptive.
On line 2 (see above ↑), the comment repeats the argument list. It is redundant information that only leads to longer reading time and doesn't add anything to understanding. After removing that, I ended up with "Construct authorization URL for your application", and that essentially repeats the method name construct_authorization_url. So, I've removed the comment altogether.
Next, there's a lot of duplication with CGI.escape and #to_s. It turns out there's a Hash#to_query in ActiveSupport that can build an HTTP query string. So I'm going to use that.
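If ActiveSupport isn't available, Ruby's standard library offers URI.encode_www_form, which does a similar job. The sketch below only illustrates the idea and is not what the gem does; build_authorization_url and the example URLs are made up. One difference to keep in mind: as far as I know, Hash#to_query sorts the keys, while URI.encode_www_form keeps insertion order.

```ruby
require 'uri'

# Hypothetical helper: builds a query string with the standard library
# instead of manual CGI.escape calls.
def build_authorization_url(base_uri, client_id, redirect_uri, scope, state = nil)
  params = { client_id: client_id, redirect_uri: redirect_uri, scope: scope }
  params[:state] = state if state
  # encode_www_form escapes each key and value and joins the pairs with "&".
  "#{base_uri}?#{URI.encode_www_form(params)}"
end
```

Non-string values such as an integer client_id are converted with to_s during encoding, so the explicit #to_s calls aren't needed here either.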
The result:
def self.construct_authorization_url(client_id, redirect_uri, scope, | |
state=nil) | |
params = { | |
client_id: client_id, redirect_uri: redirect_uri, scope: scope | |
} | |
params[:state] = state if state | |
"#{@@oauth_base_uri}?#{params.to_query}" | |
end |
I would like to simplify it further by treating the arguments not as named attributes but as an array, but that would probably be overkill.
exchange_token does something OAuth-related. It constructs URL parameters, makes a call to the HTTP API and returns the result (raising an exception on failure). Take a look:
# Exchange a provided OAuth code for an OAuth access token, 'expires in' | |
# value, and refresh token. | |
def self.exchange_token(client_id, client_secret, redirect_uri, code) | |
body = "grant_type=authorization_code" | |
body << "&client_id=#{CGI.escape(client_id.to_s)}" | |
body << "&client_secret=#{CGI.escape(client_secret.to_s)}" | |
body << "&redirect_uri=#{CGI.escape(redirect_uri.to_s)}" | |
body << "&code=#{CGI.escape(code.to_s)}" | |
options = {:body => body} | |
response = HTTParty.post(@@oauth_token_uri, options) | |
if response.has_key? 'error' and response.has_key? 'error_description' | |
err = "Error exchanging code for access token: " | |
err << "#{response['error']} - #{response['error_description']}" | |
raise err | |
end | |
r = Hashie::Mash.new(response) | |
[r.access_token, r.expires_in, r.refresh_token] | |
end |
Lines 4-8 (see above ↑) construct URL parameters, the same thing as in step 4.1, so I'm going to use Hash#to_query here as well:
def self.exchange_token(client_id, client_secret, redirect_uri, code) | |
body = { | |
grant_type: 'authorization_code', | |
client_id: client_id, | |
client_secret: client_secret, | |
redirect_uri: redirect_uri, | |
code: code | |
}.to_query | |
options = {:body => body} | |
response = HTTParty.post(@@oauth_token_uri, options) | |
if response.has_key? 'error' and response.has_key? 'error_description' | |
err = "Error exchanging code for access token: " | |
err << "#{response['error']} - #{response['error_description']}" | |
raise err | |
end | |
r = Hashie::Mash.new(response) | |
[r.access_token, r.expires_in, r.refresh_token] | |
end |
Line 9 (see above ↑) only introduces the options variable, which is used on the next line. I think it makes the code harder to read. If I inline it into line 10, there's one less line to read, which is a win:
def self.exchange_token(client_id, client_secret, redirect_uri, code) | |
body = { | |
grant_type: 'authorization_code', | |
client_id: client_id, | |
client_secret: client_secret, | |
redirect_uri: redirect_uri, | |
code: code | |
}.to_query | |
response = HTTParty.post(@@oauth_token_uri, body: body) | |
if response.has_key? 'error' and response.has_key? 'error_description' | |
err = "Error exchanging code for access token: " | |
err << "#{response['error']} - #{response['error_description']}" | |
raise err | |
end | |
r = Hashie::Mash.new(response) | |
[r.access_token, r.expires_in, r.refresh_token] | |
end |
It's still hard to read, so I'm going to extract some methods:
def self.exchange_token(client_id, client_secret, redirect_uri, code) | |
response = request_token(client_id, client_secret, redirect_uri, code) | |
fail_on_erroneous_response(response, | |
"Error exchanging code for access token") | |
r = Hashie::Mash.new(response) | |
[r.access_token, r.expires_in, r.refresh_token] | |
end | |
class << self | |
private | |
def fail_on_erroneous_response(response, message) | |
if response.has_key? 'error' and response.has_key? 'error_description' | |
err = "#{message}: " | |
err << "#{response['error']} - #{response['error_description']}" | |
raise err | |
end | |
end | |
def request_token(client_id, client_secret, redirect_uri, code) | |
body = { | |
grant_type: 'authorization_code', | |
client_id: client_id, | |
client_secret: client_secret, | |
redirect_uri: redirect_uri, | |
code: code | |
}.to_query | |
HTTParty.post(@@oauth_token_uri, body: body) | |
end | |
end |
.request_token constructs the URL parameters and makes an HTTP call, returning a hash. Since I've already refactored the code that comprises it, there's nothing to change there.
.fail_on_erroneous_response accepts a message argument and raises an exception on an erroneous response. One reason to have the message argument is to keep the error message visible inside .exchange_token, as it helps to understand what's going on.
At this point (lines 12-18 above ↑) we have two variables (message and err) for essentially the same thing: the error message. err can be changed to full_error_message:
def fail_on_erroneous_response(response, message) | |
return unless response.has_key?('error') && | |
response.has_key?('error_description') | |
full_error_message = "#{message}: " << | |
"#{response['error']} - #{response['error_description']}" | |
raise full_error_message | |
end |
I still find it hard to read: the code tells how it does its job instead of expressing intent. Thus, I'll use extract method again:
def fail_on_erroneous_response(response, message) | |
return unless erroneous?(response) | |
raise format_response_error_message(response, message) | |
end | |
def erroneous?(response) | |
response.has_key?('error') && response.has_key?('error_description') | |
end | |
def format_response_error_message(response, message) | |
"#{message}: #{response['error']} - #{response['error_description']}" | |
end |
After refactoring extracted parts of .exchange_token
, there's still that part where it returns the result (line 6):
def self.exchange_token(client_id, client_secret, redirect_uri, code) | |
response = request_token(client_id, client_secret, redirect_uri, code) | |
fail_on_erroneous_response(response, | |
"Error exchanging code for access token") | |
r = Hashie::Mash.new(response) | |
[r.access_token, r.expires_in, r.refresh_token] | |
end |
I dislike using such APIs because I have to remember the order of the returned values before I can use them. E.g.:
access_token, expires_in, refresh_token = Base.exchange_token(...)
...
do_stuff(access_token)
So, I replaced the array, which was a data clump, with a dedicated class:
# Exchange a provided OAuth code for an OAuth access token, 'expires in' | |
# value, and refresh token. | |
def self.exchange_token(client_id, client_secret, redirect_uri, code) | |
response = request_token(client_id, client_secret, redirect_uri, code) | |
fail_on_erroneous_response(response, | |
"Error exchanging code for access token") | |
TokenResponse.from_hash(response) | |
end |
class TokenResponse | |
ATTRS = %w[access_token expires_in refresh_token] | |
attr_reader *ATTRS | |
def self.from_hash(hash) | |
self.new *hash.values_at(*ATTRS) | |
end | |
def initialize(access_token, expires_in, refresh_token) | |
@access_token, @expires_in, @refresh_token = | |
access_token, expires_in, refresh_token | |
end | |
end |
At this point I feel happy about .exchange_token
.
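To show what callers gain, here's a minimal usage sketch. TokenResponse is repeated so the snippet runs on its own; sample_token_response and its values are made up and stand in for a parsed HTTP response. The Struct at the end is just an alternative I'd consider, not what the refactoring uses.

```ruby
class TokenResponse
  ATTRS = %w[access_token expires_in refresh_token]
  attr_reader(*ATTRS)

  def self.from_hash(hash)
    new(*hash.values_at(*ATTRS))
  end

  def initialize(access_token, expires_in, refresh_token)
    @access_token, @expires_in, @refresh_token =
      access_token, expires_in, refresh_token
  end
end

# Hypothetical parsed response with string keys.
def sample_token_response
  TokenResponse.from_hash(
    'access_token' => 'abc', 'expires_in' => 1_209_600, 'refresh_token' => 'def'
  )
end

# Alternative sketch: a Struct gives named readers plus #to_a for callers
# that still want positional values.
TokenStruct = Struct.new(:access_token, :expires_in, :refresh_token)
```

Callers can now write sample_token_response.access_token instead of remembering that the token is the first element of an array.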
.refresh_access_token
has the same ailments as .exchange_token
:
# Refresh an OAuth access token, given an OAuth refresh token. | |
# Returns a new access token, 'expires in' value, and refresh token. | |
def self.refresh_access_token(refresh_token) | |
options = { | |
:body => "grant_type=refresh_token&refresh_token=#{CGI.escape(refresh_token)}" } | |
response = HTTParty.post(@@oauth_token_uri, options) | |
if response.has_key? 'error' and response.has_key? 'error_description' | |
err = "Error refreshing access token: " | |
err << "#{response['error']} - #{response['error_description']}" | |
raise err | |
end | |
r = Hashie::Mash.new(response) | |
[r.access_token, r.expires_in, r.refresh_token] | |
end |
And it's easy to fix in one go:
# Refresh an OAuth access token, given an OAuth refresh token. | |
# Returns a new access token, 'expires in' value, and refresh token. | |
def self.refresh_access_token(refresh_token) | |
response = request_access_token(refresh_token) | |
fail_on_erroneous_response(response, 'Error refreshing access token') | |
TokenResponse.from_hash response | |
end | |
class << self | |
... | |
def request_access_token(refresh_token) | |
body = { | |
grant_type: 'refresh_token', | |
refresh_token: refresh_token | |
}.to_query | |
HTTParty.post(@@oauth_token_uri, body: body) | |
end | |
end |
There's a lot left to be refactored in Base
class, and I will do it in part 2!
Happy hacking!
The post PaginatedResource refactoring first appeared on Ruby clarity.
The class is called PaginatedResource
and you can find the original source here. The idea behind the class is that it fetches elements from an external source on demand, and you just call #each
and don't worry about fetching.
Take a look:
module DropletKit | |
class PaginatedResource | |
include Enumerable | |
PER_PAGE = 20 | |
attr_reader :action, :resource, :collection | |
attr_accessor :total | |
def initialize(action, resource, *args) | |
@current_page = 0 | |
@total = nil | |
@action = action | |
@resource = resource | |
@collection = [] | |
@args = args | |
@options = args.last.kind_of?(Hash) ? args.last : {} | |
end |
Elements are fetched from the external source one page at a time, and on line 5 the PER_PAGE constant defines the default number of elements per page. When there's a need, a page full of elements will be fetched from the external source.
* @current_page tells us the number of the last fetched page.
* @total tells us how many elements there are altogether at the external source.
* @collection is an array that holds already fetched elements.
* #initialize can be given a hash of options as the last argument, and the only option it supports is :per_page, overriding the PER_PAGE constant.
Based on the meanings above, I will do the following renames:
* @current_page -> @last_fetched_page. current_page explains that it's the current page number, but what does that mean? We have an external source and already fetched elements, and it's not clear at this point which of them current_page refers to. last_fetched_page, on the other hand, explains right away that it refers to the external element source.
* @total -> @total_remote_elements. At first glance, it's not at all clear what total represents. Total of what? total_remote_elements conveys that it's about "remote" elements. Knowing that this class is about fetching elements from an external resource should help to understand "remote".
* @collection -> @fetched_elements. I feel it's by far the best rename, especially because it's part of the public interface (from outside, collection looks like another way of getting elements, instead of using #each).
I have also added comments to explain what the class does:
module DropletKit | |
# PaginatedResource provides an Enumerable interface to external resource, | |
# fetching elements as needed. | |
# | |
# #each is able to start at specified index, i.e. #each(5) will start yielding | |
# elements starting at 6th element. | |
class PaginatedResource | |
include Enumerable | |
PER_PAGE = 20 | |
attr_reader :action, :resource, :fetched_elements | |
attr_accessor :total_remote_elements | |
def initialize(action, resource, *args) | |
@last_fetched_page = 0 | |
@total_remote_elements = nil | |
@action = action | |
@resource = resource | |
@fetched_elements = [] | |
@args = args | |
@options = args.last.kind_of?(Hash) ? args.last : {} | |
end |
#initialize is somewhat haphazard, and I'd like to change a few things:
* Move simple argument assignments to the top, so they are brain-dead easy to skim over.
* Remove @total_remote_elements = nil because all unassigned attributes are nil by default, so there's no need to assign them.
* Bundle fetch-related attributes together.
I considered bundling @options with the top group of assignments, because it's still assigned from arguments, but then the top group wouldn't be so easy to read.
Here are my changes:
def initialize(action, resource, *args) | |
@action = action | |
@resource = resource | |
@args = args | |
@options = args.last.kind_of?(Hash) ? args.last : {} | |
@last_fetched_page = 0 | |
@fetched_elements = [] | |
end |
As you may have noticed, there's attr_accessor :total_remote_elements. Assigning total_remote_elements from outside doesn't make much sense because:
* Enumerable#first(n) can be used to get the first n elements.
* If total_remote_elements was set from outside to a bigger number than the number of remote elements, that would probably cause an error when fetching non-existing elements. Not very useful.
So, I've removed it, and the tests still pass. It was only used internally by PaginatedResource.
Take a look:
def each(start = 0) | |
# Start off with the first page if we have no idea of anything yet | |
fetch_next_page if @total_remote_elements.nil? | |
return to_enum(:each, start) unless block_given? | |
Array(@fetched_elements[start..-1]).each do |element| | |
yield(element) | |
end | |
unless last? | |
start = [@fetched_elements.size, start].max | |
fetch_next_page | |
each(start, &Proc.new) | |
end | |
self | |
end |
The first thing #each does (on line 3) is fetch the next page if @total_remote_elements is nil. When I first read it, it wasn't clear why a nil @total_remote_elements causes a fetch, and the comment didn't help much. As I read more code, I understood that @total_remote_elements gets assigned on the first fetch, so if it's nil, nothing has been fetched yet and we fetch the first page to set things up. And that's what I want to convey on line 3:
def each(start = 0) | |
# Start off with the first page if we have no idea of anything yet | |
fetch_next_page if nothing_fetched_yet? | |
return to_enum(:each, start) unless block_given? | |
Array(@fetched_elements[start..-1]).each do |element| | |
yield(element) | |
end | |
unless last? | |
start = [@fetched_elements.size, start].max | |
fetch_next_page | |
each(start, &Proc.new) | |
end | |
self | |
end | |
def nothing_fetched_yet? | |
@total_remote_elements.nil? | |
end |
On line 5 (see the code above ↑) we return an Enumerator if a block wasn't provided. I feel that lines 3 and 6-8 belong together, as they do the actual fetching and yielding work, and to_enum between them just gets in the way. So, I move the enumerator creation to the top:
def each(start = 0) | |
return to_enum(:each, start) unless block_given? | |
# Start off with the first page if we have no idea of anything yet | |
fetch_next_page if nothing_fetched_yet? | |
Array(@fetched_elements[start..-1]).each do |element| | |
yield(element) | |
end | |
unless last? | |
start = [@fetched_elements.size, start].max | |
fetch_next_page | |
each(start, &Proc.new) | |
end | |
self | |
end |
* yield already fetched elements. If start is beyond what was fetched, we'd get nil as the result of @fetched_elements[start..-1], so Array() converts nil to [].
* each.
So, altogether, new elements are fetched on demand and yielded.
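The Array() trick is easy to check in isolation. A minimal sketch, with elements_from as a hypothetical helper wrapping the same expression:

```ruby
# Hypothetical helper: returns the already fetched elements from start on,
# or an empty array when start is past the end.
def elements_from(fetched, start)
  # A range slice that starts beyond the array size returns nil;
  # Array(nil) turns that into [].
  Array(fetched[start..-1])
end
```

Note that a slice starting exactly at the array size already returns [], and only a start beyond the size returns nil, so Array() matters for the second case.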
I'd like to change the abstraction here from pages (last?) to elements (more_elements_to_fetch?), as it's easier to understand and easier to calculate. The only place I see the page abstraction being useful is in retrieving new pages. But for calculating whether we can fetch more elements, it's overkill. Here are my changes (line 10):
def each(start = 0) | |
return to_enum(:each, start) unless block_given? | |
# Start off with the first page if we have no idea of anything yet | |
fetch_next_page if nothing_fetched_yet? | |
Array(@fetched_elements[start..-1]).each do |element| | |
yield(element) | |
end | |
if more_elements_to_fetch? | |
start = [@fetched_elements.size, start].max | |
fetch_next_page | |
each(start, &Proc.new) | |
end | |
self | |
end | |
def more_elements_to_fetch? | |
@total_remote_elements > @fetched_elements.size | |
end |
On line 11 (see the code above ↑) we update start to omit yielding already yielded elements (lines 6-8 take care of yielding whatever was fetched before). It's a bit hard though to get that meaning from the code. So, I tried to explain it better (lines 11-12):
def each(start = 0) | |
return to_enum(:each, start) unless block_given? | |
# Start off with the first page if we have no idea of anything yet | |
fetch_next_page if nothing_fetched_yet? | |
Array(@fetched_elements[start..-1]).each do |element| | |
yield(element) | |
end | |
if more_elements_to_fetch? | |
# Ensure we omit from yielding already yielded elements | |
start = after_fetched_elements unless start > after_fetched_elements | |
fetch_next_page | |
each(start, &Proc.new) | |
end | |
self | |
end | |
def after_fetched_elements | |
@fetched_elements.size | |
end |
On line 14 (see the code above ↑), #each is called recursively, passing Proc.new as the block. I had to look it up, and apparently, Proc.new without a block of its own picks up the block passed to the current method (this worked in the Ruby versions of the time; since Ruby 3.0 it raises an ArgumentError instead). But recursion isn't needed here, and each recursive call does some extra work on lines 2-8, which is only really needed for the first #each call. So, I replaced recursion with a loop:
def each(start = 0, &block) | |
return to_enum(:each, start) unless block_given? | |
# Start off with the first page if we have no idea of anything yet | |
fetch_next_page if nothing_fetched_yet? | |
yield_fetched_elements(start, &block) | |
while more_elements_to_fetch? | |
# Ensure we omit from yielding already yielded elements | |
start = after_fetched_elements unless start > after_fetched_elements | |
fetch_next_page | |
yield_fetched_elements(start, &block) | |
end | |
self | |
end | |
def yield_fetched_elements(start) | |
Array(@fetched_elements[start..-1]).each do |element| | |
yield(element) | |
end | |
end |
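To try the loop-based #each without a real API, here's a self-contained sketch. LazyPages is a hypothetical class that pages through an in-memory array instead of calling a remote resource, and fetch_count exists only so the laziness can be observed; neither is part of DropletKit.

```ruby
# Hypothetical in-memory stand-in for a paginated remote source.
class LazyPages
  include Enumerable

  attr_reader :fetch_count

  def initialize(remote_elements, per_page)
    @remote_elements = remote_elements
    @per_page = per_page
    @fetched_elements = []
    @last_fetched_page = 0
    @fetch_count = 0
  end

  def each(start = 0, &block)
    return to_enum(:each, start) unless block_given?

    fetch_next_page if @fetch_count.zero?
    yield_fetched_elements(start, &block)
    while more_elements_to_fetch?
      # Skip elements that were already yielded.
      start = [@fetched_elements.size, start].max
      fetch_next_page
      yield_fetched_elements(start, &block)
    end
    self
  end

  private

  def more_elements_to_fetch?
    @remote_elements.size > @fetched_elements.size
  end

  def fetch_next_page
    @last_fetched_page += 1
    @fetch_count += 1
    offset = (@last_fetched_page - 1) * @per_page
    @fetched_elements.concat(@remote_elements[offset, @per_page] || [])
  end

  def yield_fetched_elements(start, &block)
    Array(@fetched_elements[start..-1]).each(&block)
  end
end
```

Because the block is passed explicitly with &block, there's no need for Proc.new, and Enumerable#first(n) stops fetching as soon as it has enough elements.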
Next is #total_pages
method:
def total_pages | |
return nil if nothing_fetched_yet? | |
(@total_remote_elements.to_f / per_page.to_f).ceil | |
end |
Not sure why #total_pages is part of the public interface, perhaps because of tests referencing it. The only thing I've changed here is replacing return nil with just return. There's no need to specify nil because return without an argument produces nil. Here it is:
def total_pages | |
return if nothing_fetched_yet? | |
(@total_remote_elements.to_f / per_page.to_f).ceil | |
end |
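A quick check of the bare return, as a standalone sketch. This total_pages takes its inputs as arguments (a simplification for illustration) instead of reading instance state:

```ruby
# Simplified, hypothetical version of #total_pages with explicit arguments.
def total_pages(total_elements, per_page)
  return if total_elements.nil? # bare return yields nil

  (total_elements.to_f / per_page).ceil
end
```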
Next is #==
method. It compares PaginatedResource
with objects, responding to #[]
:
def ==(other) | |
each_with_index.each.all? {|object, index| object == other[index] } | |
end |
each
is redundant, so I removed it:
def ==(other) | |
each_with_index.all? { |object, index| object == other[index] } | |
end |
Next is the #retrieve method (see the code below ↓). It fetches a page of elements from the resource we were passed in #initialize. Then it adds the newly fetched elements to @fetched_elements and, on the first retrieve only, sets @total_remote_elements:
def retrieve(page, per_page = self.per_page) | |
invoker = ResourceKit::ActionInvoker.new(action, resource, *@args) | |
invoker.options[:per_page] ||= per_page | |
invoker.options[:page] = page | |
@fetched_elements += invoker.handle_response | |
if @total_remote_elements.nil? | |
meta = MetaInformation.extract_single(invoker.response.body, :read) | |
@total_remote_elements = meta.total.to_i | |
end | |
end |
On line 6 (see the code above ↑), += is used to add newly fetched elements to @fetched_elements. It translates to a call to Array#+, which means that a new array is created every time elements are retrieved. A more efficient way is to use Array#concat, which adds elements to the existing array.
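The difference is visible through object identity. A minimal sketch with two hypothetical helpers:

```ruby
# += builds a new array and rebinds the local variable to it;
# the array passed in is left untouched.
def append_with_plus(elements, page)
  elements += page
  elements
end

# concat mutates the receiver and returns that same array.
def append_with_concat(elements, page)
  elements.concat(page)
end
```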
def retrieve(page, per_page = self.per_page) | |
invoker = ResourceKit::ActionInvoker.new(action, resource, *@args) | |
invoker.options[:per_page] ||= per_page | |
invoker.options[:page] = page | |
@fetched_elements.concat(invoker.handle_response) | |
if @total_remote_elements.nil? | |
meta = MetaInformation.extract_single(invoker.response.body, :read) | |
@total_remote_elements = meta.total.to_i | |
end | |
end |
The last change I want to make (see lines 8-11 above ↑) is to replace if @total_remote_elements.nil? then assign @total_remote_elements with @total_remote_elements ||=. I think it makes it clear that @total_remote_elements is assigned here, and you can stop reading right away if you're not interested in that.
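Here's how ||= with a begin...end block behaves, as a small sketch. TotalTracker is a hypothetical class; the extractions counter is there only to show that the block runs once.

```ruby
# Hypothetical class illustrating `||= begin ... end`.
class TotalTracker
  attr_reader :total, :extractions

  def initialize
    @extractions = 0
  end

  def record(meta_total)
    # The block is evaluated only while @total is nil (or false).
    @total ||= begin
      @extractions += 1
      meta_total.to_i
    end
  end
end
```

One thing to keep in mind: ||= re-evaluates the block whenever the current value is nil or false. A total of 0 is still truthy in Ruby, so it would be kept.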
def retrieve(page, per_page = self.per_page) | |
invoker = ResourceKit::ActionInvoker.new(action, resource, *@args) | |
invoker.options[:per_page] ||= per_page | |
invoker.options[:page] = page | |
@fetched_elements.concat(invoker.handle_response) | |
@total_remote_elements ||= begin | |
meta = MetaInformation.extract_single(invoker.response.body, :read) | |
meta.total.to_i | |
end | |
end |
Here are all the changes put together:
module DropletKit
  # PaginatedResource provides an Enumerable interface to external resource,
  # fetching elements as needed.
  #
  # #each is able to start at specified index, i.e. #each(5) will start yielding
  # elements starting at 6th element.
  class PaginatedResource
    include Enumerable

    PER_PAGE = 20

    attr_reader :action, :resource, :fetched_elements

    def initialize(action, resource, *args)
      @action = action
      @resource = resource
      @args = args
      @options = args.last.kind_of?(Hash) ? args.last : {}

      @last_fetched_page = 0
      @fetched_elements = []
    end

    def per_page
      @options[:per_page] || PER_PAGE
    end

    def each(start = 0, &block)
      return to_enum(:each, start) unless block_given?

      # Start off with the first page if we have no idea of anything yet
      fetch_next_page if nothing_fetched_yet?
      yield_fetched_elements(start, &block)

      while more_elements_to_fetch?
        # Ensure we omit from yielding already yielded elements
        start = after_fetched_elements unless start > after_fetched_elements
        fetch_next_page
        yield_fetched_elements(start, &block)
      end

      self
    end

    def total_pages
      return if nothing_fetched_yet?

      (@total_remote_elements.to_f / per_page.to_f).ceil
    end

    def ==(other)
      each_with_index.all? { |object, index| object == other[index] }
    end

    private

    def after_fetched_elements
      @fetched_elements.size
    end

    def fetch_next_page
      @last_fetched_page += 1
      retrieve(@last_fetched_page)
    end

    def more_elements_to_fetch?
      @total_remote_elements > @fetched_elements.size
    end

    def nothing_fetched_yet?
      @total_remote_elements.nil?
    end

    def retrieve(page, per_page = self.per_page)
      invoker = ResourceKit::ActionInvoker.new(action, resource, *@args)
      invoker.options[:per_page] ||= per_page
      invoker.options[:page] = page

      @fetched_elements.concat(invoker.handle_response)

      @total_remote_elements ||= begin
        meta = MetaInformation.extract_single(invoker.response.body, :read)
        meta.total.to_i
      end
    end

    def yield_fetched_elements(start)
      Array(@fetched_elements[start..-1]).each do |element|
        yield(element)
      end
    end
  end
end
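If you'd like to play with the lazy fetching without a real API, here is a rough sketch that follows the same each logic. FakePaginatedList is a hypothetical stand-in: it slices pages out of an in-memory array instead of calling ResourceKit, and its requests counter exists only for this demo.

```ruby
# Hypothetical stand-in for PaginatedResource, not part of DropletKit.
# Pages are sliced from an in-memory array instead of fetched over HTTP.
class FakePaginatedList
  include Enumerable

  attr_reader :requests

  def initialize(remote_elements, per_page)
    @remote_elements = remote_elements
    @per_page = per_page
    @fetched_elements = []
    @last_fetched_page = 0
    @requests = 0 # how many "pages" were requested so far
  end

  def each(start = 0, &block)
    return to_enum(:each, start) unless block_given?

    fetch_next_page if @total_remote_elements.nil?
    yield_fetched_elements(start, &block)

    while @total_remote_elements > @fetched_elements.size
      # Skip elements that were already yielded
      start = @fetched_elements.size unless start > @fetched_elements.size
      fetch_next_page
      yield_fetched_elements(start, &block)
    end

    self
  end

  private

  def fetch_next_page
    @last_fetched_page += 1
    @requests += 1
    offset = (@last_fetched_page - 1) * @per_page
    @fetched_elements.concat(@remote_elements[offset, @per_page] || [])
    @total_remote_elements ||= @remote_elements.size
  end

  def yield_fetched_elements(start)
    Array(@fetched_elements[start..-1]).each { |element| yield(element) }
  end
end

list = FakePaginatedList.new((1..5).to_a, 2)
first_three = list.first(3)          # [1, 2, 3]
requests_after_first = list.requests # 2, the third page was never requested
from_fourth = list.each(3).to_a      # [4, 5]
```

Enumerable#first stops iterating as soon as it has enough elements, so only the pages that are actually needed get requested.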
Happy hacking!
The post PaginatedResource refactoring first appeared on Ruby clarity.
]]>