This article shows how to scale your app by caching your database queries.
Active Record makes Ruby on Rails pleasant to use: relationships, find_by methods, and where queries all make life simpler for a developer. Sooner or later, though, your application becomes slow, and at that point you need to work on optimizations. There are many ways to optimize, and one of them is caching queries. I work at StudyPad, and our product, splashmath, has over 9 million users across multiple platforms. We get a lot of traffic, and caching is one of the things that helped us solve our scaling problems. Here is a quick way to cache your queries in memcached.
Basic requirements
There are a few requirements: the dalli gem, and memcached installed on your machine.
Basic structure of the application
We have a lot of static/seed data that rarely changes but is queried very often, so the first thing to handle is that static data. As a running example, take two models: a Grade has many skills, and each Skill belongs to a grade.
So getting the skills for a grade is simple: call skills on a grade object.
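The original listings for the models and the lookup are not preserved here, so the following is only a minimal plain-Ruby sketch of the situation. `FakeDB`, the `Skill` struct, and the query counter are hypothetical stand-ins so the example runs without Rails; in a real app the models would be Active Record classes declared with has_many :skills and belongs_to :grade.

```ruby
# Hypothetical stand-in for a row in the skills table.
Skill = Struct.new(:id, :grade_id, :name)

# Hypothetical stand-in for the database; it counts simulated queries.
class FakeDB
  attr_reader :query_count

  def initialize(skills)
    @skills = skills
    @query_count = 0
  end

  # Each call simulates one SELECT against the skills table.
  def skills_for(grade_id)
    @query_count += 1
    @skills.select { |skill| skill.grade_id == grade_id }
  end
end

class Grade
  attr_reader :id

  def initialize(id, db)
    @id = id
    @db = db
  end

  # Plays the role of the has_many :skills association.
  def skills
    @db.skills_for(id)
  end
end

db = FakeDB.new([
  Skill.new(1, 1, "Counting"),
  Skill.new(2, 1, "Addition"),
  Skill.new(3, 2, "Fractions")
])
grade = Grade.new(1, db)

grade.skills
grade.skills
puts "queries: #{db.query_count}" # each lookup ran its own simulated query
```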
With this approach, every time we fetch the skills for a grade, the query hits the database. So it is time to cache this query and its results.
Assuming you have configured the dalli gem and have memcached up and running, a simple cached_skills method on Grade can wrap the query in a cache fetch and store the results.
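A minimal sketch of such a method, runnable without Rails. `TinyCache` is a hypothetical in-memory stand-in for Rails.cache (it accepts expires_in but ignores it), and `CACHE_TTL_SECONDS` is an assumed name for the expiry constant. In a Rails app the same shape would call Rails.cache.fetch with expires_in: 240.hours.

```ruby
# Hypothetical stand-in for Rails.cache: fetch returns the stored value,
# or runs the block once and stores its result.
class TinyCache
  def initialize
    @store = {}
  end

  # expires_in is accepted to mirror the Rails call, but ignored here.
  def fetch(key, expires_in: nil)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

CACHE = TinyCache.new
CACHE_TTL_SECONDS = 240 * 60 * 60 # 240 hours, kept in one place

class Grade
  attr_reader :id, :query_count

  def initialize(id, skill_names)
    @id = id
    @skill_names = skill_names
    @query_count = 0
  end

  # Stand-in for the has_many :skills association; counts simulated queries.
  def skills
    @query_count += 1
    @skill_names.dup
  end

  # The key follows the pattern [self.class.name, id, :skills].
  def cached_skills
    CACHE.fetch([self.class.name, id, :skills], expires_in: CACHE_TTL_SECONDS) do
      skills.to_a
    end
  end
end

grade = Grade.new(1, ["Counting", "Addition"])
grade.cached_skills # first call runs the simulated query and fills the cache
grade.cached_skills # second call is served from the cache
puts grade.query_count # prints 1
```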
So instead of calling grade.skills, we call grade.cached_skills.
So what does this do? The first time, it fetches the results from the database and stores them in the cache. The next time, it returns the results from the cache. Note a few things here:
Understand the pattern of the key: [self.class.name, id, :skills] is your key here.
The cache expires in 240.hours. You can customize this to suit your needs. It is better to keep a constant for it somewhere in your application.
In the cached_skills method we keep the records, not the Active Record relation. That is why we convert the result into an array with to_a; otherwise the relation itself is cached and the database query is still executed.
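The to_a point can be illustrated with a small plain-Ruby sketch. `LazyRelation` is a hypothetical stand-in for an Active Record relation: it holds a query and only runs it when the records are asked for. The cache is just a Hash here.

```ruby
# Hypothetical stand-in for a lazy relation: nothing runs until to_a is called.
class LazyRelation
  attr_reader :runs

  def initialize(rows)
    @rows = rows
    @runs = 0
  end

  # Running the query; counts each simulated database hit.
  def to_a
    @runs += 1
    @rows.dup
  end
end

cache = {}

# Caching the relation stores the unexecuted query, so every read still runs it.
relation = LazyRelation.new(["Counting", "Addition"])
cache[:lazy] ||= relation
cache[:lazy].to_a
cache[:lazy].to_a
puts relation.runs # prints 2

# Caching to_a stores the loaded records; later reads never run the query.
loaded = LazyRelation.new(["Counting", "Addition"])
cache[:loaded] ||= loaded.to_a
cache[:loaded].first
cache[:loaded].size
puts loaded.runs # prints 1
```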
Expiring the cache
We are caching the query results, but we are not expiring them. What if a skill changes? The Grade object is not notified, so the cache becomes stale and we need to expire it. We can write an after_commit hook on Skill that expires its grade object's cache.
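Here is a rough plain-Ruby sketch of that expiry flow. `TinyCache`, `expire_skills_cache`, and `rename` are hypothetical names, and the private `after_commit` method only imitates the Rails callback. In a Rails app the hook would be registered with after_commit on Skill and the entry removed with Rails.cache.delete.

```ruby
# Hypothetical stand-in for Rails.cache with fetch and delete.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end

  def delete(key)
    @store.delete(key)
  end
end

CACHE = TinyCache.new

class Grade
  attr_reader :id, :skills

  def initialize(id)
    @id = id
    @skills = []
  end

  def skills_cache_key
    [self.class.name, id, :skills]
  end

  # Caches a snapshot of the skill names.
  def cached_skills
    CACHE.fetch(skills_cache_key) { skills.map(&:name) }
  end

  def expire_skills_cache
    CACHE.delete(skills_cache_key)
  end
end

class Skill
  attr_reader :name, :grade

  def initialize(grade, name)
    @grade = grade
    @name = name
    grade.skills << self
  end

  # Stand-in for updating and saving the record.
  def rename(new_name)
    @name = new_name
    after_commit
  end

  private

  # Imitates the after_commit hook: expire the parent grade's cached skills.
  def after_commit
    grade.expire_skills_cache
  end
end

grade = Grade.new(1)
skill = Skill.new(grade, "Counting")
grade.cached_skills          # caches ["Counting"]
skill.rename("Counting to 20")
puts grade.cached_skills.inspect # rebuilt after expiry, so it shows the new name
```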
This is enough to keep the cached skills from going stale. There is another way to handle cache expiry; let's look at it.
Another way
We redefine the models like this
Note that we added touch: true to the belongs_to association in Skill. We also redefine our cached_skills method so that its key includes the grade's updated_at value.
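A minimal sketch of this key-based approach in plain Ruby. The exact key layout is an assumption, since the original listing is not preserved; the article only says that updated_at is part of the key. `TinyCache` is a hypothetical stand-in for Rails.cache, `touch` imitates what belongs_to :grade, touch: true triggers on save, and updated_at is a simple counter here rather than a timestamp.

```ruby
# Hypothetical stand-in for Rails.cache.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

CACHE = TinyCache.new

class Grade
  attr_reader :id, :updated_at, :skills, :query_count

  def initialize(id)
    @id = id
    @updated_at = 0 # a counter standing in for a timestamp
    @skills = []
    @query_count = 0
  end

  # Imitates Active Record's touch: bumps updated_at.
  def touch
    @updated_at += 1
  end

  # Assumed key layout: updated_at is part of the key, so a touch changes the key.
  def cached_skills
    CACHE.fetch([self.class.name, id, updated_at, :skills]) do
      @query_count += 1
      skills.map(&:name)
    end
  end
end

class Skill
  attr_reader :name, :grade

  def initialize(grade, name)
    @grade = grade
    @name = name
    grade.skills << self
  end

  # Stand-in for saving a record whose belongs_to has touch: true.
  def rename(new_name)
    @name = new_name
    grade.touch
  end
end

grade = Grade.new(1)
skill = Skill.new(grade, "Counting")
grade.cached_skills
grade.cached_skills            # served from the cache
skill.rename("Counting to 20") # touches the grade, which changes the key
puts grade.cached_skills.inspect
puts grade.query_count # prints 2
```

The old entry is never deleted explicitly; it is simply never read again and is left for the cache to evict.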
With this key we don't need to expire the cache manually. Whenever a skill is updated, it touches its parent grade object, which updates the grade's updated_at value. Because updated_at is part of the key, the old cache entry is never used again.
The problem with the second approach
But there is a problem. Assume Grade has 10 different has_many relationships and you cache them all. Now every time a skill changes, the cache keys for all the other Grade relationships become useless too. For example, suppose Grade also has many topics, cached the same way.
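The side effect can be shown with a small plain-Ruby sketch. Everything here is a hypothetical stand-in: `TinyCache` for Rails.cache, the generic `cached` method for the per-association cached_skills and cached_topics methods, and the `loads` counter for database queries.

```ruby
# Hypothetical stand-in for Rails.cache.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

CACHE = TinyCache.new

class Grade
  attr_reader :id, :updated_at, :loads

  def initialize(id, skills, topics)
    @id = id
    @updated_at = 0
    @data = { skills: skills, topics: topics }
    @loads = Hash.new(0) # simulated query count per association
  end

  # Imitates the touch triggered when any skill is saved.
  def touch
    @updated_at += 1
  end

  # Every association shares updated_at in its key.
  def cached(association)
    CACHE.fetch([self.class.name, id, updated_at, association]) do
      @loads[association] += 1
      @data[association].dup
    end
  end
end

grade = Grade.new(1, ["Counting"], ["Numbers"])
grade.cached(:skills)
grade.cached(:topics)
grade.touch            # a skill changed; no topic did
grade.cached(:skills)
grade.cached(:topics)
puts grade.loads[:topics] # prints 2: topics were reloaded although none changed
```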
In this case, changing any skill makes the topics cache useless as well, which does not happen when you expire the cache manually. So both approaches have pros and cons: the first asks you to write more code, and the second expires the cache more often. You have to make that choice based on your needs.
What else?
Using the same basic principle, you can cache many other queries.
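The article's own examples are not preserved here, so the following is only an illustration of the same principle applied to a single-record lookup by id. `SkillStore`, `cached_find`, and the key layout are all hypothetical, and `TinyCache` again stands in for Rails.cache.

```ruby
# Hypothetical stand-in for Rails.cache.
class TinyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

CACHE = TinyCache.new

Skill = Struct.new(:id, :name)

# Hypothetical stand-in for the skills table.
class SkillStore
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # Simulates a lookup by primary key.
  def find(id)
    @query_count += 1
    @rows.find { |row| row.id == id }
  end

  # Same fetch-or-compute pattern, keyed by model name and id.
  def cached_find(id)
    CACHE.fetch(["Skill", id]) { find(id) }
  end
end

store = SkillStore.new([Skill.new(1, "Counting"), Skill.new(2, "Addition")])
store.cached_find(2)
store.cached_find(2)
puts store.query_count # prints 1
```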
This approach helped us reduce the load on RDS and made things pretty fast. I hope it helps you too. Let me know your feedback, or any tips that made your system faster.