After my RbCAW-related tasks are (mostly) done on the last sprint, my current responsibility as a back-back-end developer is to implement caching so hopefully our homepage will load in less than half a minute
crosses finger. Unfortunately, testing the caching mechanism is a pain in the back.
Our current data flow schema is as follows:
“What is that ‘Phrefetch’ thing?” you may ask. Following the silly module name that I gave to the API wrapper, ‘Phrefetch’ is what I call the caching mechanism. Fancy, eh?
So, TL;DR: Phrefetch fetch information such as Projects, Tasks, and Users from Phabricator through RbCAW and store those in the database. The magic that is frontend will then turn the raw DB entries into grid of colored pixels that is user interface. We use PostgreSQL as relational DB is more suitable on storing the information. Also, it is free on Heroku.
As almost every information on each table (Projects, Tasks, and Users) are fetched on one query, cache invalidation is performed not based on expiry but on a set interval. That means Phrefetch will need to run alongside Rails’ web server (Puma) and invoke the caching logic every K seconds or so.
This calls for concurrency, where a set of code runs alongside the others in parallel. Phrefetch will run as a background job, and is started right after Rails had finished initializing. I use
concurrency-ruby as its
TimerTask is exactly what I need.
Logically, Phrefetch can only be instantiated once, so I applied the singleton design pattern with the help of Ruby’s
require 'concurrent' require 'singleton' module Phrefetch class Phrefetch include Singleton def initialize @running = false @task = nil # Do other stuff end def start raise StandardError, 'Phrefetch already running' if running? @running = true @task = Concurrent::TimerTask.new(run_now: true) do |task| # Do the caching end @task.execute end def stop return unless running? @task.shutdown @running = false end def restart stop start end def running? return @running end end end
That’s the basic outline of Phrefetch. The
Concurrent::TimerTask will spawn a new thread when it’s invoked. As the caching logic will do HTTP requests and DB operations, it is bound to be rather slow, so I also placed a safeguard using
Timeout::timeout() to forcefully stop the caching logic if it took too long so it won’t block the next scheduled
TimerTask from running.
I’ve been rambling for 386 words and you might be wondering “What’s the problem then? It seems to be okay”.
I wanted to test that the periodical caching works properly: 1) it can be started, 2) it will finish if everything went well, and 3) it can be shut down and restarted properly. The problem lies in #2, we don’t really know how long the caching will run before it is finished.
So I opted for the simplest solution: sleep while the task is not done or timed out yet. It actually works fine until it does not — the test will randomly fail due to the caching took way too long to finish. This happens irregularly — it might fail out of nowhere and somehow finish properly after I put a debugging breakpoint.
This raised some questions because the tests for caching logic (without the concurrency stuff) don’t take that long. Even more questions and eyebrows were raised as stubbing the
ActiveRecord methods ain’t do nothing.
After some research, apparently what I was trying to do is kind of redundant because RSpec stubbed
ActiveRecord methods by default. Putting two and two together, this is why the caching logic tests run faster than the concurrency tests: RSpec failed to stub the
ActiveRecord methods on the concurrency tests, thus those tests will do a real write operation to the database. Read/Write operation on the database are slower than what RSpec stubs do, which I suspect is using in-memory database akin to SQLite as a substitute to real DB.
So another TL;DR: concurrency tests might be slow because those involve the database directly. Why? Because threads.
Apparently mocking libraries like Mocha or the one used by RSpec are not thread safe. It means that what I mocked or stubbed on the main testing thread will not be mocked/stubbed on another thread. Great.
So even if RSpec stubbed the
ActiveRecord magics on its own thread, the DB operations which inside the
Concurrent::TimerTask will not be stubbed, thus accessing the DB directly.
After hours and hours of googling, I’ve yet to found a solution to this problem.
Some suggest using NullDB to mock the actual database rather than just the DB operation methods. But apparently, NullDB have compatibility issue with PostgreSQL’s schema.
Some suggestion revolves on mocking the multithreading completely, leaving only the test about logic that happen inside the concurrent thread. But, that defeats the whole point of my test — to ensure the background job works properly. Also, I already done the synchronous test with the caching logic tests.
The moral of this story is: I suck at googling. I believe there must be some solution to this problem, but alas I’ve not find it yet. So, until then I think we will have to bear with restarting the RSpec tests whenever that one concurrency test fail. Peculiarly, this have not happened yet on our Gitlab’s pipeline, but I think it will, sooner or later.