Archive for the ‘Admin-tips’ Category

Nginx-Fu: X-Accel-Redirect From Remote Servers

Posted by Alexey Kovyrin under Admin-tips, Development, Networks

We use nginx and its features a lot in Scribd. Many times in the last year we needed some pretty interesting, but not supported feature – we wanted nginx X-Accel-Redirect functionality to work with remote URLs. Out of the box nginx supports this functionality for local URIs only. In this short post I want to explain how did we make nginx serve remote content via X-Accel-Redirect.

Read the rest of this entry »

Advanced Squid Caching in Scribd: Cache Invalidation Techniques

Posted by Alexey Kovyrin under Admin-tips, Development, My Projects, Networks

Having a reverse-proxy web cache as one of the major infrastructure elements brings many benefits for large web applications: it reduces your application servers load, reduces average response times on your site, etc. But there is one problem every developer experiences when works with such a cache – cached content invalidation.

It is a complex problem that usually consists of two smaller ones: individual cache elements invalidation (you need to keep an eye on your data changes and invalidate cached pages when related data changes) and full cache purges (sometimes your site layout or page templates change and you need to purge all the cached pages to make sure users will get new visual elements of layout changes). In this post I’d like to look at a few techniques we use at Scribd to solve cache invalidation problems.

Read the rest of this entry »

Installing Midnight Commander 4.7 on Mac OS X

Posted by Alexey Kovyrin under Admin-tips, General

Another short post just to remember the procedure for the next time I’ll be setting up a new mac. For those of my readers who do not know what Midnight Commander (aka mc) is, GNU Midnight Commander is a visual file manager, created under a heavy influence of Norton Commander file manager from dark DOS ages :-) For more information, you can visit their web site. Now, get to the installation topic itself.

To install mc on a Mac OS X machine, you need macports installed and then first thing you’ll need to do is to install some prerequisite libraries:

1
$ sudo port install libiconv slang2

Next thing, download the sources from their web site and unpack them. When the sources are ready, you can configure the build:

1
2
3
4
5
6
7
8
$ ./configure \
        --prefix=/opt/mc \
        --with-screen=slang \
        --enable-extcharset \
        --enable-charset \
        --with-libiconv-prefix=/opt/local \
        --with-slang-includes=/opt/local/include \
        --with-slang-libs=/opt/local/lib

Then, normal GNU-style build and install procedure:

1
2
3
$ make
........
$ sudo make install

And the last thing would be to add /opt/mc/bin to your PATH environment variable.

Enabling IPv6 Support in nginx

Posted by Alexey Kovyrin under Admin-tips, Networks

This is going to be a really short post, but for someone it could save an hour of life.

So, you’ve nothing to do and you’ve decided to play around with IPv6 or maybe you’re happened to be an administrator of a web service that needs to support IPv6 connectivity and you need to make your nginx server work nicely with this protocol.

First thing you need to do is to enable IPv6 in nginx by recompiling it with --with-ipv6 configure option and reinstalling it. If you use some pre-built package, check if your nginx already has this key enabled by running nginx -V.

Read the rest of this entry »

Advanced Squid Caching in Scribd: Logged In Users and Complex URLs Handling

Posted by Alexey Kovyrin under Admin-tips, Development, My Projects

It’s been a while since I’ve posted my first post about the way we do document pages caching in Scribd and this approach has definitely proven to be really effective since then. In the second post of this series I’d like to explain how we handle our complex document URLs and logged in users in the caching architecture.

First of all, let’s take a look at a typical Scribd’s document URL: http://www.scribd.com/doc/1/Improved-Statistical-Test.

As we can see, it consists of a document-specific part (/doc/1) and a non-unique human-readable slug part (/Improved-Statistical-Test). When a user comes to the site with a wrong slug in the document URL, we need to make sure we send the user to the correct URL with a permanent HTTP 301 redirect. So, obviously we can’t simply send our requests to the squid because it’d cause few problems:

  • When we change document’s title, we’d create a new cached item and would not be able to redirect users from the old URL to the new one
  • When we change a title, we’d pollute cache with additional document page copies.

One more problem that makes the situation even worse – we have 3 different kinds of users on the site:

  1. Logged in users – active web site users that are logged in and should see their name at the top of the page, should see all kinds of customized parts of the page, etc (especially when a page is their own document).
  2. Anonymous users – all users that are not logged in and visit the site with a flash-enabled browser
  3. Bots – all kinds of crawlers that can’t read flash content and need to see a plain text document version

All three kinds of users should see their own document page versions whether the page is cached or not.

Read the rest of this entry »

ActiveMQ Tips: Flow Control and Stalled Producers Problem

Posted by Alexey Kovyrin under Admin-tips, Development

It’s been a few months since we‘ve started actively using ActiveMQ queue server in our project. For some time we had pretty weird problems with it and even started thinking about switching to something else or even writing our own queue server which would comply with our requirements. The most annoying problem was the following: some time after activemq restart everything worked really well and then activemq started lagging, queue started growing and all producer processes were stalling on push() operations. We rewrote our producers from Ruby to JRuby, then to Java and still – after some time everything was in a bad shape until we restarted the queue server.

Read the rest of this entry »

Using SSH tunnel connection as a SOCKS5 proxy

Posted by Alexey Kovyrin under Admin-tips, Networks

Month ago I was on a vacation and as usual even though our hotel provided us with an internet connection on a pretty decent speeds, I wasn’t able to work there because they’ve banned all tcp ports but some major ones (like 80, 21, etc) and I needed to be able to use ssh, mysql, IMs and other non-web software.

After a short research I’ve found a pretty simple to set up and easy to use approach to such a connection problems I’d like to describe here.

Read the rest of this entry »

Lighttpd Book from Packt – Great Thanksgiving Present

Posted by Alexey Kovyrin under Admin-tips, Networks

Many people know me as a nginx web server evangelist. But as (IMHO) any professional I think that it is really rewarding to know as much as possible about all the tools available on the market so every time you need to make a decision on some technical issue, you’d consider all pros and cons based on my own knowledge.

This is why when I received an email from Packt company asking if I’d like to read and review their book on Lighttpd I decided to give it a shot (I usually do not review any books because I do not always have enough time to read a book thoroughly to be able to write a review). So, here are my impressions from this book.

Read the rest of this entry »

ActiveMQ + Ruby Stomp Client: How to process elements one by one

Posted by Alexey Kovyrin under Admin-tips, Databases, Development, My Projects

Few months ago I’ve switched one of our internal projects from doing synchronous database saves of analytics data to an asynchronous processing using starling + a pool of workers. This was the day when I really understood the power of specialized queue servers. I was using database (mostly, MySQL) for this kind of tasks for years and sometimes (especially under a highly concurrent load) it worked not so fast… Few times I worked with some queue servers, but those were either some small tasks or I didn’t have a time to really get the idea, that specialized queue servers were created just to do these tasks quickly and efficiently.

All this time (few months now) I was using starling noticed really bad thing in how it works: if workers die (really die, or lock on something for a long time, or just start lagging) and queue start growing, the thing could kill your server and you won’t be able to do something about it – it just eats all your memory and this is it. Since then I’ve started looking for a better solution for our queuing, the technology was too cool to give up. I’ve tried 5 or 6 different popular solutions and all of them sucked… They ALL had the same problem – if your queue grows, this is your problem and not queue broker’s :-/ The last solution I’ve tested was ActiveMQ and either I wasn’t able to push it to its limits or it is really so cool, but looks like it does not have this memory problem. So, we’ve started using it recently.

In this small post I’d like to describe a few things that took me pretty long to figure out in ruby Stomp client: how to make queues persistent (really!) and how to process elements one by one with clients’ acknowledgments.
Read the rest of this entry »