Oleksiy Kovyrin

Found an Ideal I/O Scheduler for my MySQL boxes

Posted in: Databases
Tags: i/O, linux, MySQL, performance, scalability, scheduler

20 Jul 2008

Today I was working on one of our database servers (each has 4 SAS disks in RAID 10 on an Adaptec controller), and the job required a huge multi-threaded, I/O-bound read load. Basically, it was a set of parallel full-scan reads from a 300 GB compressed InnoDB table (yes, we use the InnoDB plugin). Looking at iostat, I saw pretty much the expected results: 90-100% disk utilization and lots of read operations per second. Then I decided to play around with the Linux I/O schedulers to try to increase disk subsystem throughput. Here are the results:

Read the rest of this entry →
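
For readers who want to check which scheduler a disk is using before experimenting, here is a minimal sketch. It assumes the usual Linux sysfs layout, where `/sys/block/<device>/queue/scheduler` lists the available schedulers on one line with the active one in square brackets. The device name `sda` and both function names are placeholders chosen for illustration, not something taken from the post.

```python
from pathlib import Path


def parse_schedulers(text):
    """Split a sysfs scheduler line into (available, active).

    The kernel lists every available scheduler on one line and wraps
    the active one in square brackets, e.g. "noop deadline [cfq]".
    """
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return available, active


def read_schedulers(device="sda"):
    # "sda" is a placeholder device name, not one taken from the post.
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_schedulers(path.read_text())
```

Calling `read_schedulers("sda")` on a Linux box returns the list of scheduler names and the active one. Switching is done by writing one of the listed names into the same file as root, so it is worth re-reading the file afterwards to confirm the change took effect.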

Using Sphinx for Non-Fulltext Queries

Posted in: Databases, Development, My Projects
Tags: full-text, index, MySQL, optimization, Ruby, scalability, scribd, sphinx

19 May 2008

How often do you think about the reasons why your favorite RDBMS sucks? 🙂 Over the last few months I have been doing this quite often, and yes, my favorite RDBMS is MySQL. The reason is that one of my recent tasks at Scribd was fixing scalability problems in document browsing.

The problem with browsing was simple to describe but hard to fix: we have a large data set consisting of a few tables with many fields that have really bad selectivity (flag fields like is_deleted and is_private, plus file_type, language_id, category_id and others). As a result, it becomes really hard (if possible at all) to display document lists like “most popular 1-10 page PDF documents in Italian from the category ‘Business’” (non-deleted, non-private, etc., of course). If you try to create appropriate indexes for each possible combination of filters, you’ll end up with tens or hundreds of indexes, and every INSERT query on your tables will take ages.
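
To get a feel for how quickly the number of filter combinations grows, here is a rough sketch that counts them for the five columns named above. It is a simplification made for illustration: it ignores column order inside a composite index, the sort column, and the fact that one composite index can serve its leftmost prefixes, so the number of indexes actually needed will differ. The function name is hypothetical.

```python
from itertools import combinations

# Filter columns named in the post; a real schema would have more.
FILTER_COLUMNS = [
    "is_deleted",
    "is_private",
    "file_type",
    "language_id",
    "category_id",
]


def filter_combinations(columns):
    """Return every non-empty subset of filter columns a query could use."""
    subsets = []
    for size in range(1, len(columns) + 1):
        subsets.extend(combinations(columns, size))
    return subsets


combos = filter_combinations(FILTER_COLUMNS)
print(len(combos))  # 2**5 - 1 = 31 subsets for five columns
```

Every extra filter column doubles the count (plus one), which is how a handful of low-selectivity fields turns into tens or hundreds of candidate indexes.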

Read the rest of this entry →

Command Line History

Posted in: Development, General
Tags: bash, history, top

28 Apr 2008

Inspired by Rail Spikes:

bash-3.2$ history 1000 | awk '{a[$2]++}END{for(i in a){print a[i] " " i}}' | sort -rn | head
228 cd
167 git
10 ssh
10 DEPLOY=production
6 sudo
6 pwd
6 ./script/import_views.rb
5 rm
4 rake
4 mv
bash-3.2$

Really interesting stats. I’d never have guessed that git is used more than ssh on my desktop (I’m a remote worker and a MySQL consultant, so I use ssh really often). 🙂
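
For anyone who finds the awk one-liner hard to read, the same counting can be sketched in Python. This is only an illustration of what the pipeline does, not a replacement for it: `top_commands` is a hypothetical name, the sample lines are made up, and unlike awk it skips lines that have no second field.

```python
from collections import Counter


def top_commands(history_lines, limit=10):
    """Count the second whitespace-separated field of each history line.

    `history` prints "<number> <command> <args...>", so field two is the
    command name, the same value awk sees as $2.
    """
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        if len(fields) >= 2:
            counts[fields[1]] += 1
    # most_common() sorts by count, highest first, like `sort -rn | head`.
    return counts.most_common(limit)


# Made-up sample input in the format printed by `history`.
sample = [
    "  1  cd src",
    "  2  git status",
    "  3  cd ..",
    "  4  cd /tmp",
    "  5  git log",
]
print(top_commands(sample))  # [('cd', 3), ('git', 2)]
```

Because only the first word after the history number is counted, a line such as `DEPLOY=production rake ...` is tallied under `DEPLOY=production`, which is why that entry shows up in the output above.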

MySQL UC 2008 Presentations

Posted in: Databases, Links
Tags: MySQL, mysqluc08, presentations

18 Apr 2008

Since I wasn’t able to get to this year’s MySQL UC (a change of employer caused problems with obtaining a US visa, and I didn’t get it in time), I’m really interested in all the presentations people are posting after their sessions. I decided to collect them all in one place and share them with others; maybe someone will find it interesting to read what people have to say about the many aspects of MySQL usage.

So, I’ve created a folder in my Scribd.com account which you can use (and track with an RSS reader) to find out which presentations have been published. You can follow either my account or the mysqluc08 folder there. Another way to track MySQL UC presentations and documents is our tagging (I tag all my docs with the mysqluc08 tag).

InnoDB Recovery toolset Version 0.3 Released

Posted in: Databases, Development, My Projects
Tags: community, innodb, MySQL, recovery

14 Apr 2008

Even though I didn’t go to the MySQL conference this year (really sad about this), this week is gonna be the most active one in the community, so I decided to do some community stuff too 🙂 Today I’ve released version 0.3 of our InnoDB recovery toolkit. It is now much faster, more stable and more accurate. At this point it is possible to recover almost any table from a corrupted or deleted tablespace with much less effort than before. Here is a short list of changes (since 0.1, announced here):

More MySQL data types added: DECIMAL (both old and new), DATE, TIME
CHAR data type handling improved in table definitions generator
Index filtering added to page_parser
64-bit stat() support added to all tools
Linux has no isnumber() function, so we define our own (pretty simple) implementation
Lots of fixes in the create_defs.pl script: it now generates definitions that can recover your data in 80% of cases without any changes.
Min/max record size calculation fixed in constraints-based parser.
Nullable fixed-size columns support is fixed.
Debug logging is much cleaner now.

As always, if you need any help with your recovery, we would love to help.

Homo-Adminus Blog

Yet Another Admin’s Blog