FastSessions Rails Plugin Released

Posted by Oleksiy Kovyrin under Databases, Development, My Projects

How often do we think about our HTTP session implementation? Do you know how your current session code will behave when the number of sessions in your database grows to millions (or even hundreds of millions) of records? This is one of those things we rarely think about. But if you do think about it, you will notice that 99% of your session-related operations are read-only and 99% of your session writes are unnecessary. Almost all records in your sessions table hold the same information: a session_id and a serialized empty session in the data field.

Looking at this situation, we created a really simple (and, at the same time, really useful for large Rails projects) plugin that replaces the ActiveRecord-based session store and makes sessions much more efficient. Below you can find some details about the implementation and the decisions we made in this plugin, but if you just want to try it, check out our project site.

FastSessions is a session class for the ActiveRecord session store, created to work fast (really fast). It uses some techniques that are not widely known in the developer community and tend to surface only when session storage is already causing huge problems and performance consultants are called in to help.

The Problem

The original ActiveRecord session store is slow. It is fine for low-traffic blogs, but it is too slow for large sites. First of all, it is slow because ActiveRecord is slow. It is a powerful ORM framework, but it is overkill for a task as simple as session management.

That is why people created the SqlSession store. It talks to MySQL directly through the database API and works much faster than the original AR session store. But it is still slow because:

  • it creates/updates a session on each hit: even dumb bots crawling your site create thousands upon thousands of useless records in your sessions table, although 99% of hits do not require any session update!
  • it uses a 32-char string as the key for session records: databases work with string keys MUCH slower than with integer keys, so it would be much better to use integers, but session ids are long strings and all session stores use them as the key.
  • it uses an auto_increment primary key, which causes table-level locks in InnoDB for all MySQL versions prior to 5.1.22. These table-level locks, combined with unnecessary inserts, cause really weird problems on large sites.

The Solution

The FastSessions plugin was born as a hack created for Scribd.com (a large RoR-based web project), which was suffering from InnoDB auto-increment table-level locks on its sessions table.

So, first of all, we removed the id field from the table. The next step was to make lookups faster, and we used the following technique: instead of using (session_id) as the lookup key, we started using (CRC32(session_id), session_id), a two-column key that really helps MySQL find sessions faster because almost all lookups need only the crc32 field to find the right record.
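To make the two-column key concrete, here is a minimal Ruby sketch of how such a lookup could be built. It is not the plugin's actual code: the helper names and the `session_crc` column name are assumptions made for illustration. Ruby's `Zlib.crc32` computes the standard CRC-32 checksum as an unsigned 32-bit integer, which should correspond to what MySQL's `CRC32()` returns for the same string.

```ruby
require 'zlib'
require 'securerandom'

# Hypothetical helper: derive the (crc32, session_id) pair used as the lookup key.
def session_lookup_key(session_id)
  [Zlib.crc32(session_id), session_id]
end

# Hypothetical helper: build a lookup query that filters on the integer
# column first and on the string column second.
# The column name `session_crc` is an assumption, not the plugin's schema.
def session_lookup_sql(table_name, session_id)
  # Only accept 32-char hex ids so the value is safe to interpolate here;
  # real code should use the database driver's quoting instead.
  raise ArgumentError, 'unexpected session id format' unless session_id =~ /\A[0-9a-f]{32}\z/

  crc, sid = session_lookup_key(session_id)
  "SELECT data FROM #{table_name} WHERE session_crc = #{crc} AND session_id = '#{sid}'"
end

sid = SecureRandom.hex(16) # a random 32-char hex session id
puts session_lookup_sql('fast_sessions', sid)
```

The integer comparison narrows the search to (almost always) a single row, and the string comparison only resolves the rare CRC32 collision.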

The last, but most powerful, change was to not create database records for empty sessions and to not save session data back to the database if it has not changed during the current request.
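The idea of skipping unnecessary writes can be sketched as follows. This is an illustration under stated assumptions, not the plugin's implementation: `SketchSessionStore` is a hypothetical in-memory stand-in for the sessions table, and comparing an MD5 fingerprint of the marshaled data is just one possible way to detect changes.

```ruby
require 'digest/md5'

# Hypothetical in-memory stand-in for the sessions table.
class SketchSessionStore
  attr_reader :writes

  def initialize
    @rows = {}   # session_id => marshaled session data
    @writes = 0  # number of writes that actually reached the "table"
  end

  # Load session data and return it together with a fingerprint of
  # what is currently stored.
  def load(session_id)
    raw = @rows[session_id]
    data = raw ? Marshal.load(raw) : {}
    [data, fingerprint(data)]
  end

  # Write only if there is something to store and it differs from
  # what was loaded at the start of the request.
  def save(session_id, data, loaded_fingerprint)
    return false if data.empty? && !@rows.key?(session_id) # no row for an empty session
    return false if fingerprint(data) == loaded_fingerprint # unchanged, skip the write

    @rows[session_id] = Marshal.dump(data)
    @writes += 1
    true
  end

  private

  def fingerprint(data)
    Digest::MD5.hexdigest(Marshal.dump(data))
  end
end

store = SketchSessionStore.new
data, loaded = store.load('abc')
store.save('abc', data, loaded)   # empty session: nothing is written
data[:user_id] = 42
store.save('abc', data, loaded)   # changed session: one write
puts store.writes
```

With this approach a bot that never touches its session produces no rows at all, and a logged-in user who only reads pages produces no updates.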

All of these changes were implemented and you can use them automatically after a simple plugin installation.

Controversial Decisions

Many plugin users would never think about one problem we introduced when we removed the auto-increment primary key, so I’d like to describe it here. The problem is the following.

InnoDB clusters table data by primary key. This means that when we create an auto-increment primary key and insert records into a table, our session records are grouped together and written sequentially on disk. But if we make a pretty random value (like the crc32 of a random session id) the primary key, then every session record is inserted at its own place, which generates random I/O; that is not so good for I/O-bound servers.
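The difference between the two kinds of keys can be illustrated with a toy model. The sketch below does not simulate InnoDB; it only counts how many keys in an insertion sequence are larger than every key inserted before them, which is a rough proxy for "this row can be appended at the end of the primary-key order". The helper name `appends_at_end` is hypothetical.

```ruby
require 'zlib'
require 'securerandom'

# Count how many keys would be appended at the end of a structure
# ordered by primary key, given the order in which they are inserted.
def appends_at_end(keys)
  max = nil
  keys.count do |key|
    at_end = max.nil? || key > max
    max = key if at_end
    at_end
  end
end

session_ids = Array.new(1000) { SecureRandom.hex(16) }

auto_increment_keys = (1..session_ids.size).to_a
crc_keys = session_ids.map { |sid| Zlib.crc32(sid) }

puts "auto-increment keys appended at end: #{appends_at_end(auto_increment_keys)}"
puts "crc32 keys appended at end:          #{appends_at_end(crc_keys)}"
```

Every auto-increment key lands at the end, while only a handful of the CRC32 keys do; the rest have to be placed somewhere in the middle of the existing data, which is where the random I/O comes from.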

So, we decided to let users choose which primary key to use in their deployment of the plugin. If you are going to use this module with MySQL 5.1.22+, you will probably want to set

  CGI::Session::ActiveRecordStore::FastSessions.use_auto_increment = true

because it gives you consecutive data inserts in InnoDB. Another case where you would want it is when your MySQL server is already I/O bound and you do not want to add random I/O from a randomized primary key.

Working With Old AR Sessions Table

If you do not want to lose old sessions created with the default AR session store, you can set

  CGI::Session::ActiveRecordStore::FastSessions.fallback_to_old_table = true

and then all session reads will fall back to the old sessions table if a session_id is not found in the fast sessions table. The old sessions table name can be set using the

  CGI::Session::ActiveRecordStore::FastSessions.old_table_name

variable.
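The fallback read path described above can be sketched like this. It is a simplified illustration, not the plugin's code: plain hashes stand in for the two tables, and the `find_session` helper and its parameters are hypothetical.

```ruby
# Hypothetical sketch of a read with fallback to the old sessions table.
# `fast_table` and `old_table` are hashes mapping session_id => session data.
def find_session(session_id, fast_table, old_table, fallback_to_old_table)
  row = fast_table[session_id]
  return row if row                        # found in the new table
  return nil unless fallback_to_old_table  # fallback disabled: treat as a new session

  old_table[session_id]                    # may still be nil if the id is unknown
end

fast_table = { 'new-id' => { user_id: 1 } }
old_table  = { 'old-id' => { user_id: 2 } }

p find_session('new-id', fast_table, old_table, true)
p find_session('old-id', fast_table, old_table, true)
p find_session('old-id', fast_table, old_table, false)
```

The old table is consulted only after a miss in the new one, so sessions that already live in the fast sessions table pay no extra cost.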

Installation

Installing the plugin is pretty simple and takes a few steps:

  1. Install the plugin sources into your vendor/plugins directory from our SVN repository (with ./script/plugin install or piston import; it is up to you). For example:
    $ piston import http://rails-fast-sessions.googlecode.com/svn/trunk/ vendor/plugins/fast_sessions
  2. Enable ActiveRecord session store in your config/environment.rb file:
    Rails::Initializer.run do |config|
      ......
      config.action_controller.session_store = :active_record_store
      ......
    end
  3. Create migration for your new sessions table:
    $  ./script/generate fast_session_migration AddFastSessions
  4. Open your newly created migration and change the table_name and use_auto_increment parameters of the plugin (if you want to).
  5. Run your migration:
    $  rake db:migrate
  6. Start your application and perform some actions that will definitely save data to your session. Then check your fast_sessions table (if you did not rename it) for records.

Downloading

The most recent version of this plugin can be found on the project’s site or in the SVN repository.

Author

This plugin has been created by Alexey Kovyrin. Development is sponsored by Scribd.com.



13 Responses to this entry

Ivan V. says:

So, this plugin only works with MySQL right? Are you considering supporting other DBs, like postgresql?

Thanks.

Chris says:

Why don’t you just use the cookiestore in rails 2.0? That way you don’t need to store sessions in the database at all!

Matthias says:

the self.delete_old! method did not work for me. I changed it to:

  @@connection.execute "DELETE FROM #{table_name} WHERE UNIX_TIMESTAMP(updated_at) < UNIX_TIMESTAMP(NOW()) - #{seconds}"

Note the UNIX_TIMESTAMP around (updated_at) in the first part of the condition.

Drew Blas says:

You mentioned turning off sessions from bot requests, but didn’t mention how. Here’s my method:

[application.rb]
# turn off sessions if this is a request from a robot
session :off, :if => proc { |request| request.user_agent =~ /\b(Baidu|Gigabot|Googlebot|libwww-perl|lwp-trivial|msnbot|SiteUptime|Slurp|WordPress|ZIBB|ZyBorg)\b/i }

Interesting Rails Tidbits #4 says:

[...] FastSessions (official project page) is a Rails plugin that performs some interesting tricks on the way that Rails handles session storage. It appears only to work on MySQL and no hard performance numbers are given yet (though a “10-15% performance gain” is suggested), but I’ve seen quite a few people linking to it, so it might be worth a look. Scribd.com (YouTube-for-PDFs) supposedly uses this in production. [...]

Andrey says:

Of course, it’s a somewhat artificial case, but when the session size exceeds 70kb I get an error (code 500). The log says that the SQL connection was lost during the update operation.
After that the application doesn’t come back to life until I delete the session record.

Eric says:

What about using MyISAM for the sessions table. Implications, like for instance, table locking during updates/additions?
Unfortunately, my MySQL database was not compiled with InnoDB.

Eric says:

What would the impact be for using fast_sessions with MySQL and MyISAM tables as opposed to InnoDB (table vs. row level locking)? Any thoughts?

Paolo Montrasio says:

There is a plugin called SmartSessionStore that does more or less the same and also works with PostgreSQL. I’ve been using it for almost a year in production with no problems at all.

Andrew Watkins says:

We’re using postgresql and actually ported it over, except for one problem: Postgresql does not have crc32 as a function. You can kind of get there by either writing your own or installing the one with ltree. Postgresql does however have md5. What we did is set up tests against unique(md5_session_string, session_id), unique(crc32, session_id), and unique(md5_session_byte, session_id). The last one is a byte representation of the md5 hash. For random queries, the first two implementations ran at about the same speed, which was basically instantaneous. The byte one definitely showed a little lag, so we would not recommend it. Finally we created a stored procedure to actually handle the insert/update, as postgresql doesn’t have an “ON DUPLICATE KEY UPDATE”. The question is whether or not this is faster than just indexing the session_id itself, or if something screwy is going on where postgresql is optimizing against the hashed values.