Archive for the ‘Databases’ Category

Using Sphinx for Non-Fulltext Queries

Posted by Oleksiy Kovyrin under Admin-tips, Databases, Development, My Projects

How often do you think about the reasons why your favorite RDBMS sucks? :-) Last few months I was doing this quite often and yes, my favorite RDBMS is MySQL. The reason why I was thinking so because one of my recent tasks at Scribd was fixing scalability problems in documents browsing.

The problem with browsing was pretty simple to describe and as hard to fix – we have large data set which consists of a few tables with many fields with really bad selectivity (flag fields like is_deleted, is_private, etc; file_type, language_id , category_id and others). As the result of this situation it becomes really hard (if possible at all) to display documents lists like “most popular 1-10 pages PDF documents in Italian language from the category “Business” (of course, non-deleted, non-private, etc). If you’ll try to create appropriate indexes for each possible filters combination, you’ll end up having tens or hundreds of indexes and every INSERT query in your tables will take ages.

Read the rest of this entry »

MySQL UC 2008 Presentations

Posted by Oleksiy Kovyrin under Databases, Links

Since I wasn’t able to get to this year’s MySQL UC (employer change caused problems with US visa obtaining and I didn’t get visa in time) I’m really interested in all presentations people are posting after their sessions. I decided to collect them all in one place and would like to share with others – maybe someone will find it interesting to read what people have to say about many interesting aspects of MySQL usage.

So, I’ve created a folder in my Scribd.com account which you could use (and track using RSS readers) to find out what interesting presentations were published. You can use either my account or mysqluc08 folder there. One more possible option to track mysqluc presentations/documents is using our tagging (I tag all my docs with mysqluc08 tag).

InnoDB Recovery toolset Version 0.3 Released

Posted by Oleksiy Kovyrin under Databases, Development, My Projects

Even though I didn’t go to MySQL conf this year (really sad about this), this week is gonna be most active in the community so I decided to do some community stuff too :-) Today I’ve released version 0.3 of our innodb recovery toolkit. Now it became much faster, stable and accurate. At this moment it is possible to recover almost any table from corrupted/deleted tablespace without so much effort as it was before. Here is a short changes list (since 0.1 announced here):

  • More MySQL data types added: DECIMAL (both old and new), DATE, TIME
  • CHAR data type handling improved in table definitions generator
  • Indexes filtering added to page_parser
  • 64-bit stat() support added to all tools
  • Linux has no isnumber() function so we define our own implementation (pretty simple)
  • Lots of fixes in create_defs.pl script – now it generates definitions which could recover your data in 80% cases w/o any changes.
  • Min/max record size calculation fixed in constraints-based parser.
  • Nullable fixed-size columns support is fixed.
  • Debug logging is much cleaner now.

As always, if you need any help with your recovery, we would love to help.

Dog-pile Effect and How to Avoid it with Ruby on Rails memcache-client Patch

Posted by Oleksiy Kovyrin under Databases, Development

We were using memcache in our application for a long time and it helped a lot to reduce DB servers load on some huge queries. But there was a problem (sometimes called a “dog-pile effect”) – when some cached value was expired and we had a huge traffic, sometimes too many threads in our application were trying to calculate new value to cache it.

For example, if you have some simple but really bad query like

1
SELECT COUNT(*) FROM some_table WHERE some_flag = X

which could be really slow on a huge tables, and your cache expires, then ALL your clients calling a page with this counter will end up waiting for this counter to be updated. Sometimes there could be tens or even hundreds of such a queries running on your DB killing your server and breaking an entire application (number of application instances is constant, but more and more instances are locked waiting for a counter).

Read the rest of this entry »

FastSessions Rails Plugin Released

Posted by Oleksiy Kovyrin under Databases, Development, My Projects

How often do we think about our http sessions implementation? I mean, do you know, how your currently used sessions-related code will behave when sessions number in your database will grow up to millions (or, even, hundreds of millions) of records? This is one of the things we do not think about. But if you’ll think about it, you’ll notice, that 99% of your session-related operations are read-only and 99% of your sessions writes are not needed. Almost all your sessions table records have the same information: session_id and serialized empty session in the data field.

Looking at this sessions-related situation we have created really simple (and, at the same time, really useful for large Rails projects) plugin, which replaces ActiveRecord-based session store and makes sessions much more effective. Below you can find some information about implementation details and decisions we’ve made in this plugin, but if you just want to try it, then check out our project site.

Read the rest of this entry »

Innodb Locks, ActiveRecord and acts_as_ferret Problem

Posted by Oleksiy Kovyrin under Databases, Development

Last few days one of our customers (one of the largest Ruby on Rails sites on the Net) was struggling to solve some really strange problem – once upon a time they were getting an error from ActiveRecord on their site:

1
(ActiveRecord::StatementInvalid) "Mysql::Error: Lock wait timeout exceeded; try restarting transaction: UPDATE some_table.....

They have innodb_lock_wait_timeout set to 20 seconds. After a few hours of looking for strange transactions we were decided to create s script to dump SHOW INNODB STATUS and SHOW FULL PROCESSLIST commands output to a file every 10 seconds to catch one of those moments when this error occurred.

Today we’ve got next error and started digging in our logs…

Read the rest of this entry »

Data Recovery Toolkit for InnoDB Released

Posted by Oleksiy Kovyrin under Databases, Development, My Projects

I’m returned from my 1-week vacation today and want to say – I’ve never been so productive as I was there ;-) Blue ocean, hot sun and white sand really helped me to finish my work on the first release of one really awesome project.

Today I’m proud to announce our first public release of the Data Recovery Toolkit for InnoDB – set of tools for checking InnoDB tablespaces and recovering data from damaged tablespaces or from dropped/truncated InnoDB tables.

Read the rest of this entry »

Small Tip: How to Enable ActiveRecord Logging in Merb

Posted by Oleksiy Kovyrin under Databases, Development

Today I was developing one small merb application for one of our projects and needed to see ActiveRecord logging on console like I do in Rails. After a short research I’ve found out that merb_active_record plugin passes its MERB_LOGGER to AR by default so I decided to try to change merb log level and here they are – my pretty colored AR logs!

So, if you want to see ActiveRecord logs in your application in development mode, then you need to add one line to your conf/environments/development.rb file:

1
2
puts "Loaded DEVELOPMENT Environment..."
MERB_LOGGER.level = Merb::Logger::DEBUG

That’s it for now. Long live merb! :-)

Why I HATE “smart” Software: Cpanel vs Consultant

Posted by Oleksiy Kovyrin under Admin-tips, Databases

Today I was working on one small consulting task and our client asked for an upgrade from MySQL 5.0 to 5.1. It was pretty easy and task was successfully finished and reported to the customer… But few hours after my report I’ve got an email from customer with something like “WTF? Where is my 5.1?!”. I was shocked when I saw happily running 5.0 on their server w/o anything related to my 5.1 installation…

After some short investigation I’ve found out, that it was cpanel (dumb software for dumb system administrators) – it noticed, that installed mysql version (5.1) is not the same as it thought it should be (5.0), so without any warnings or notices it removed all 5.1 rpms and installed “brand new” 5.0.

Here I’d like to say GREAT THANKS to mysql team for such a great software which did not screwed up user’s data in such situation. But what idiots in cpanel development team decided, that is it appropriate and acceptable to perform such operations?! As an administrator and as a software developer I do not understand them – I just can’t understand such approach….

So, enough complaining – here is a piece of useful information for my readers: If you’re so unlucky to have cpanel installed on your server and you’d like to upgrade your mysql manually, then you can perform following operations:

1
2
3
# touch /etc/mysqlupdisable
# chattr +i /etc/mysqlupdisable
# service cpanel restart

After these small changes your cpanel will forget about mysql upgrades and you’ll be able to do what you want and not what some dumb developers decided you should do.

MySQL Master-Master Replication Manager 1.0 Released

Posted by Oleksiy Kovyrin under Databases, Development, My Projects, Networks

It’s been a long time since we’ve started this project and it is time to make a checkpoint. So, I’ve decided to release final 1.0 version and make 1.X branch stable while all serious development with deep architectural changes will be done 2.X branch (trunk at this moment).

Changes from previous release:

  • Perl semaphores implementation caused huge memory leaks (mmmd_mod).
  • Now we do not send any commands to hard offline hosts with dead TCP/IP stack to prevent mointoring problems for other hosts.
  • Removed legacy StartSlave method from agent code which caused problems on some Perl versions
  • Added a few fixes to prevent non-exclusive roles from moving. This caused internal status structures to be corrupted.
  • Made all mysql checks properly report errors occurring (previously they were resulting in an OK response). Thanks to Phillip Pearson.
  • Some memory leaks found in mysql checkers and as a quick fix we’ve added an idea of “Maximum Checks Before Restart” to all checkers. If you want some checker to restart after N checks, simply add “restart after N” to your checker declaration.
  • Added some more docs to the project site.

New version can be obtained here or from the project’s SVN repository.