Category: Admin-tips
Lighttpd Book from Packt – Great Thanksgiving Present
27 Nov2008

Many people know me as a nginx web server evangelist. But as (IMHO) any professional I think that it is really rewarding to know as much as possible about all the tools available on the market so every time you need to make a decision on some technical issue, you’d consider all pros and cons based on my own knowledge.

This is why when I received an email from Packt company asking if I’d like to read and review their book on Lighttpd I decided to give it a shot (I usually do not review any books because I do not always have enough time to read a book thoroughly to be able to write a review). So, here are my impressions from this book.

Read the rest of this entry


ActiveMQ + Ruby Stomp Client: How to process elements one by one
30 Oct2008

Few months ago I’ve switched one of our internal projects from doing synchronous database saves of analytics data to an asynchronous processing using starling + a pool of workers. This was the day when I really understood the power of specialized queue servers. I was using database (mostly, MySQL) for this kind of tasks for years and sometimes (especially under a highly concurrent load) it worked not so fast… Few times I worked with some queue servers, but those were either some small tasks or I didn’t have a time to really get the idea, that specialized queue servers were created just to do these tasks quickly and efficiently.

All this time (few months now) I was using starling noticed really bad thing in how it works: if workers die (really die, or lock on something for a long time, or just start lagging) and queue start growing, the thing could kill your server and you won’t be able to do something about it – it just eats all your memory and this is it. Since then I’ve started looking for a better solution for our queuing, the technology was too cool to give up. I’ve tried 5 or 6 different popular solutions and all of them sucked… They ALL had the same problem – if your queue grows, this is your problem and not queue broker’s :-/ The last solution I’ve tested was ActiveMQ and either I wasn’t able to push it to its limits or it is really so cool, but looks like it does not have this memory problem. So, we’ve started using it recently.

In this small post I’d like to describe a few things that took me pretty long to figure out in ruby Stomp client: how to make queues persistent (really!) and how to process elements one by one with clients’ acknowledgments.
Read the rest of this entry


Found an Ideal I/O Scheduler for my MySQL boxes
20 Jul2008

Today I was doing some work on one of our database servers (each of them has 4 SAS disks in RAID10 on an Adaptec controller) and it required huge multi-thread I/O-bound read load. Basically it was a set of parallel full-scan reads from a 300Gb compressed innodb table (yes, we use innodb plugin). Looking at the iostat I saw pretty expected results: 90-100% disk utilization and lots of read operations per second. Then I decided to play around with linux I/O schedulers and try to increase disk subsystem throughput. Here are the results:

Read the rest of this entry


Using Sphinx for Non-Fulltext Queries
19 May2008

How often do you think about the reasons why your favorite RDBMS sucks? 🙂 Last few months I was doing this quite often and yes, my favorite RDBMS is MySQL. The reason why I was thinking so because one of my recent tasks at Scribd was fixing scalability problems in documents browsing.

The problem with browsing was pretty simple to describe and as hard to fix – we have large data set which consists of a few tables with many fields with really bad selectivity (flag fields like is_deleted, is_private, etc; file_type, language_id , category_id and others). As the result of this situation it becomes really hard (if possible at all) to display documents lists like “most popular 1-10 pages PDF documents in Italian language from the category “Business” (of course, non-deleted, non-private, etc). If you’ll try to create appropriate indexes for each possible filters combination, you’ll end up having tens or hundreds of indexes and every INSERT query in your tables will take ages.

Read the rest of this entry


Puppet – Admin’s Best Friend
12 Mar2008

If you’ve ever worked in companies with 5-10+ servers and it was your responsibility to install new boxes, change some configuration files and install new software on many boxes you definitely know how painful this work is. Every time you need to change something on 3-5-100 boxes, you go there and make those changes. Most experienced of us used some weird scripts to perform some task on many boxes or used some stuff like dsh. Even with those tricks I’d never wish this work to anyone.

While I was working in Galt, I’ve asked our junior admin to check out puppet and try to use it on our servers. After a week of screaming he’s managed to install and configure it and said “Wow! This is what we definitely need!”. Since then he was using it and was happy :-). But, unfortunately (for me) I’ve never tried it and when I’ve changed my job and started doing the same old (but not good) things like dsh and for+ssh I’ve started thinking: “Stop! Try puppet – maybe it is what you need!”. And what can i say? Yes! This is what every admin needs! Today we’ve deployed our next web server (we do it pretty often) and it was done with puppet – pretty painlessly and quickly.

So, what is puppet? Let me quote their site:

Puppet lets you centrally manage every important aspect of your system using a cross-platform specification language that manages all the separate elements normally aggregated in different files, like users, cron jobs, and hosts, along with obviously discrete elements like packages, services, and files.

Puppet’s simple declarative specification language provides powerful classing abilities for drawing out the similarities between hosts while allowing them to be as specific as necessary, and it handles dependency and prerequisite relationships between objects clearly and explicitly. Puppet is written entirely in Ruby.

Basically, puppet is a tool which lets you describe whatever you’d like about your server(s) and deploy your configuration (configs, packages, files, directories, cron jobs, user accounts, ssh keys, etc) to any number of servers you need. You can describe some simple parts of your system (like “apache”, “nginx”, “rails”, “useful_gems”, “admin-ssh-keys”, etc) and then describe your servers using those terms (like machine X = apache+rails+useful_gems+admin_ssh_keys). When it is done, you’ll be able to deploy all this stuff with one command!

So, if you’re struggling with deploying some changes on many servers or, even, have only a few of them, I’d recommend to try puppet – it could save you a lots of time and nerves.

P. S. When you’ll try to use it – do not give up for a few days – it is hard to change your way of thinking and start writing correct and convenient scripts using puppet’s language.