Today I'm going to refactor a class from Digital Ocean Droplet Kit, a library to control droplets (that's what Digital Ocean calls virtual machines).
The class is called PaginatedResource
and you can find the original source here. The idea behind the class is that it fetches elements from an external source on demand, and you just call #each
and don't worry about fetching.
How PaginatedResource works
Take a look:
Elements are fetched from external source one page at a time, and on line 5, PER_PAGE
constant defines default number of elements per page. When there's a need, a page full of elements will be fetched from external source.
@current_page
tells us number of the last fetched page.@total
tells us how many elements are there altogether at the external source.@collection
is an array that holds already fetched elements.#initialize
can be given a hash of options as the last argument, and the only option it supports is:per_page
, overridingPER_PAGE
constant.
Step 1: meaningful names
Based on the meanings above, I will do the following renames:
@current_page
->@last_fetched_page
.current_page
explains that it's current page number, but what does it mean? We have external source and already fetched elements, and to which doescurrent_page
refer is not clear at this point.last_fetched_page
on the other hand, explains right away that it refers to external element source.@total
->@total_remote_elements
. On the first glance, totally not clear whattotal
represents. Total of what?total_remote_elements
conveys that it's about "remote" elements. Knowing that this class is about fetching elements from external resource should help understand "remote".@collection
->@fetched_elements
. I feel it's by far the best rename. Especially because it's part of public interface (from outside,collection
looks as another way of getting elements, instead of using#each
).
I have also added comments, to explain what the class does:
Step 2: tidy up initialize
#initialize
is somewhat haphazard, and I'd like to change a few things:
* Move simple argument assignments to the top, so they are brain-dead easy to skim over.
* Remove @total_remote_elements = nil
because all unassigned attributes are nil
by default, there's no need to assign them.
* Bundle fetch-related attributes together.
I considered bundling @options
with the top group of assignments, because it's still assigned from arguments, but then, the top group wouldn't be so easy to read.
Here are my changes:
Step 3: disable write-level access to internals
As you may have noticed, there's attr_accessor :total_remote_elements
. Assigning total_remote_elements
from outside doesn't make much sense because:
- If not all elements are needed, then
Enumerable#first(n)
can be used to get first n elements. - If
total_remote_elements
was set from outside to a bigger number than number of remote elements, that would probably cause an error when fetching non-existing elements. Not very useful.
So, I've removed it, and tests still pass. It only was used internally by PaginatedResource
.
Step 4: #each method
Take a look:
Step 4a
The first thing #each
does (on line 3) is fetch next page if @total_remote_elements
is nil
. When I first read it, it wasn't clear why @total_remote_elements
of nil
causes a fetch, and the comment didn't help much. As I read more code I understood that @total_remote_elements
gets assigned on the first fetch, so, if it's nil
, it means that nothing was fetched yet and we fetch the first page for setting up stuff. And that's what I want to convey on line 3:
Step 4b
On line 5 (see the code above ↑) we return an Enumerator, if block wasn't provided. I feel that lines 3 and 6-8 belong together, as they do the actual fetching and yielding work, and to_enum
between them just gets in the way. So, I move enumerator creation to the top:
Step 4c
- On line 6 (see the code above ↑), we
yield
already fetched elements. Ifstart
is beyond what was fetched, we'd getnil
as the result of@fetched_elements[start..-1]
, soArray()
convertsnil
to[]
. - On lines 10-14, if there are more pages to fetch, we update start to omit yielding already yielded elements, fetch next page and recursively call
each
.
So, altogether, new elements are fetched on demand and yielded.
I'd like to change the abstraction here from pages (last?
) to elements (more_elements_to_fetch?
) as it's easier to understand and easier to calculate. The only place I see the page abstraction useful is in retrieving new pages. But for calculating whether we can fetch more elements, it's overkill. Here are my changes (line 10):
Step 4d
On line 11 (see the code above ↑) we update start to omit yielding already yielded elements (lines 6-8 take care of yielding whatever was fetched before). It's a bit hard though to get that meaning from the code. So, I tried to explain it better (lines 11-12):
Step 4e
On line 14 (see the code above ↑), #each
is called recursively, passing Proc.new
as block. I had to look it up, and apparently, Proc.new
translates to the current passed block. But recursion isn't needed here and each recursive call does some extra work on lines 2-8, which are only really needed for the first #each
call. So, I replaced recursion with a loop:
Step 5
Next is #total_pages
method:
Not sure why #total_pages
is part of public interface, perhaps because of tests referencing it. The only thing I've changed here is replacing return nil
with just return
. There's no need to specify nil
because return
without argument will produce nil
. Here it is:
Step 6
Next is #==
method. It compares PaginatedResource
with objects, responding to #[]
:
each
is redundant, so I removed it:
Step 7
Next is #retrieve
method (see the code below ↓). It fetches a page of elements from the resource we got passed in #initialize
. Then it adds newly fetched elements to @fetched_elements
and, on the first retrieve only, it sets @total_remote_elements
:
Step 7a
On line 6 (see the code above ↑), +=
is used to add newly fetched elements to @fetched_elements
. It translates to call Array#+
, and that means that a new array is created every time elements are retrieved. A more efficient way is to use Array#concat
, which adds elements to the existing array.
Step 7b
The last change I want to make (see lines 8-11 above ↑) is to replace if @total_remote_elements.nil? then assign @total_remote_elements
with @total_remote_elements ||=
. I think it makes clear that @total_remote_elements
is assigned here, and you can stop reading right away if you're not interested in that.
End result
Here are all the changes put together:
Happy hacking!