Rails Developer for a Large Startup: My Vision of an Ideal Candidate
7 Feb 2009

A few days ago we were chatting in our corporate Campfire room and one of the guys asked me what I think about our hiring process for Rails developers, what questions I’d ask a candidate if I were interviewing and so on. Those questions sparked a really long and interesting discussion and I would like to share my thoughts on the topic in this post.

So, first of all I would like to explain what kind of interviews I hate the most. 🙂

Ever since I started thinking of myself as a software developer (many years ago) and started going to “software developer position” interviews, I have really hated questions like “What is the name and possible values of the third parameter for the function some_weird_func() in some_weird.h” or “How do you declare a virtual destructor and when could it be useful?”… All my life I have had a pretty practical way of thinking and never bothered to memorize APIs or really deep language concepts that are useful in maybe 1% of the time I spend on development, in some really specific edge cases. I have always believed that a Real Engineer should know where to find the answers to those questions and should be able to find a way to solve a task without knowing every possible way to do it. A real engineering mind should be flexible!

So, if I were to interview a developer for our company, I would try never to ask those useless theoretical questions because, honestly speaking, web development is not rocket science and most of the time you don’t need a master’s degree in CS to be a good developer.

Now, let’s get closer to our discussion at Scribd. During our conversation we discussed our standard hiring practices, the questions we use for screening our candidates and so on. As a result, I decided to write down a set of questions I believe an ideal candidate should be able to answer. Many of those questions are not Rails or Ruby specific, but rather specific to high-load and high-scale web development and architecture work. Here are the most interesting questions on my list:

  1. You have a generic web application (it does not matter what language and framework you use) with the following set of layers and parties involved in each request:

    • Database
    • Application
    • External Services
    • User

    What security problems could occur on each level and what action should a developer take to prevent those kinds of problems from happening? As an example, take a login action in your application and explain possible problems.

  2. You have a database table with 100M records, you need to make some changes in each record and it is clear to you that you won’t be able to make those changes using just SQL queries, so you need to do some application-level processing on each record (a one-time script or a rake task). Your database is MySQL.

    • Where would you put the code in your application: model, module with a bunch of methods, migration, some dedicated script? Why?
    • How would you process all those records: Model.find(:all).each { ... }, a loop with different offsets and a limit, a loop walking through primary key values with some step, something else?
    • What would you choose for this task and why: Model.find(), Model.find_by_sql(), something else? What is the major difference between those options?
    • What would you use: update_attribute, attribute assignments + saves, Model.connection.execute() or something else to update the table? When and why would you use each of those options?
  3. You have a Rails action that accepts uploads and then transfers them to a pretty slow storage (S3, some slow NFS storage, something else). If you have a lot of uploads, pretty soon many of your backend application processes (if not all) will be busy with uploads and your application will become unusable. What options do you see to solve this problem?
  4. You have a huge table in MySQL and you need to fetch a few randomly selected records. This is not a one-time action: it’ll be used pretty often, for example on the home page. How would you do that?
  5. You have a document model with the following fields: id, user_id, title, body, created_at, views_count, download_count. There are 10_000_000 documents. You need to implement download counters on your site, so that when someone downloads a document, you increment the download_count field. How would you implement counter management?
  6. How would you write an SQL query to find all users in a system that have no docs (standard User has_many docs relationship)?
  7. Caching:

    • What kinds of caching techniques do you know?
    • What caching options does Rails provide, and when would you use them?
    • What caching options does the HTTP protocol provide?
    • You have a pretty popular action on your site that uses a heavy SQL query whose result should be cached. The problem is that when the cache expires while many users are hitting the action, all their requests start running that SQL query against the database at the same time to refresh the cache (the classic thundering herd problem), which makes your system unstable. How would you try to solve the problem?
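
To make that last question a bit more concrete, here is a minimal sketch of one possible direction: let only one caller rebuild an expired entry while everybody else waits for its result. It is plain Ruby and works only inside a single process; the class name SingleFlightCache, the "popular_docs" key and the sleep that stands in for the heavy SQL query are all made up for illustration. With many backend processes you would need a lock shared between them (for example memcached’s atomic add), and this is certainly not the only acceptable answer.

```ruby
# Hypothetical in-process cache: on a miss, only one thread runs the
# expensive block for a given key; the others wait and reuse its result.
class SingleFlightCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(ttl_seconds)
    @ttl_seconds = ttl_seconds
    @entries = {}
    @key_locks = Hash.new { |hash, key| hash[key] = Mutex.new }
    @registry_lock = Mutex.new # guards @entries and @key_locks
  end

  def fetch(key)
    cached = read(key)
    return cached.value if cached

    key_lock = @registry_lock.synchronize { @key_locks[key] }
    key_lock.synchronize do
      # Re-check: another thread may have refreshed the entry while we waited.
      cached = read(key)
      return cached.value if cached

      value = yield
      write(key, value)
      value
    end
  end

  private

  # Returns the entry only if it exists and has not expired yet.
  def read(key)
    entry = @registry_lock.synchronize { @entries[key] }
    entry if entry && entry.expires_at > Time.now
  end

  def write(key, value)
    @registry_lock.synchronize do
      @entries[key] = Entry.new(value, Time.now + @ttl_seconds)
    end
  end
end

# Ten "requests" hit a cold cache at the same time.
cache = SingleFlightCache.new(60)
query_runs = 0
counter_lock = Mutex.new

threads = 10.times.map do
  Thread.new do
    cache.fetch("popular_docs") do
      counter_lock.synchronize { query_runs += 1 }
      sleep 0.05 # stands in for the heavy SQL query
      [1, 2, 3]
    end
  end
end
results = threads.map(&:value)
```

The trade-off is that waiting callers block until the refresh finishes. A good candidate would probably also talk about serving slightly stale data during the refresh, or refreshing the cache in the background before it expires.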

As you can see, none of these questions are about REST, NoSQL or any other popular Rails buzzwords (ActiveResource, named_scope, etc.) because, in my opinion, you can learn those in a week of skimming through any Rails-related mailing list. My questions are aimed at the problems that come up most often in high-traffic startups, where not all common practices can be used and you need to think before writing any code that will be executed millions of times every day.

So, if you have read the questions and you think you know the answers (or are able to find them in Google/books/etc.) and you think you know Rails, please let us know because we’re hiring. At the moment we need a few really great Ruby developers who are not afraid of working on high-load web applications.

P. S. If you’re really good, we could work with you remotely w/o relocation to San Francisco (we have some team members in Canada and Ukraine now).