Using Nginx As Reverse-Proxy Server On High-Loaded Sites
18 May2006

Two weeks ago we have started new version of one of our primary web projects and have started very massive advertisement campaign to promote this web site. As the result of that advertisements, our outgoing traffic has been increased to 200-250Mbit/s from only one server! In this article I will describe, how to build stable and efficient web site with two-layer architecture (with frontend + backend web servers) or how to modify your current server configuration to get additional resources to handle more requests.

First of all, let me describe general structure of web-server and how it handles clients requests:

  1. Client initiates request to your server.
  2. His browser connects to your server.
  3. Your server (as for Apache) creates new thread/process to handle request.
  4. If client requested dynamic content, web server spawns CGI process or executes dynamic content handling module (i.e. mod_php) and waits while request will be processed. When it receives result web-page, it sends it to client.
  5. If client asked for some static file, web server sends this file to client
  6. Client’s browser receives answer, closes connection to web server and displays content.

As you can see, when there are many requests coming to your server, your server needs to create many parallel threads/processes and keep them running while client will close connection. If client has slow connection, web server process will wait too long and resource consumption will increase very fast.

What we can do in such situation? Simple solution is to buy more memory and more CPUs for your server and wait while web server load will crash your server. But there is more efficient solution! You can simply put some small piece of software (nginx, for example) behind your big web server and let it handle all requests to static content and to pass all dynamic requests to primary web-server. With this solution your big server will spawn additional threads/processes only for static pages and it will return answers to small frontend very fast and then can free resources to use them to handle another queries. Small frontend can wait very long time while client will receive his content and will close connection – backend server will not consume resources for such long time!

Here you can see simple diagram of proposed web server configuration:


General Data Flow Diagram

As additional benefit from such configuration you can get very useful feature of managed downloads that will be described below.

If your server contains some static resources, which can be downloaded not by all users (content provider can provide mp3 files only to users with positive balance or some site can provide downloads only to logged-in users), in generic configuration you need to create some script to handle this downloads and to create some ugly links like http://some.service.com/down.php?file=xxx.mp3 and additionally your users will not be able to resume downloads (except such cases when your script so complex, that it handles Ranges HTTP-header)…

In configuration with nginx frontend, you can create simple URL-rewriting rule, that will pass all requests to pretty URLs like http://your.cool-service.com/files/cool.mp3 to some simple script /down.php automatically and, if this script has returned X-Accel-Redirect header, will send requested file to user automatically with Ranges header support and when user will download his cool content, your backend server can handle other requests. Your users even will not know that your script controls their downloads. Simple diagram for described algorithm is following:


Functional Algorithm

Let me bring your attention to interesting fact: If you only accelerate your site with described technique and do not want to create download control system, you do not need to modify any of your scripts on your backend server! They will work as in original configuration!

So, the last thing you need to boost your web server with nginx reverse proxying technique is following configuration file snipet:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
    server {
        listen       80;
        server_name  some-server.com www.server-name.com;

        access_log  logs/host.access.log  main;

        # Main location
        location / {
            proxy_pass         http://127.0.0.1:8080/;
            proxy_redirect     off;

            proxy_set_header   Host             $host;
            proxy_set_header   X-Real-IP        $remote_addr;
            proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;

            client_max_body_size       10m;
            client_body_buffer_size    128k;

            proxy_connect_timeout      90;
            proxy_send_timeout         90;
            proxy_read_timeout         90;

            proxy_buffer_size          4k;
            proxy_buffers              4 32k;
            proxy_busy_buffers_size    64k;
            proxy_temp_file_write_size 64k;
        }

        # Static files location
        location ~* ^.+.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js)$ {
            root   /spool/www/members_ng;
        }

    }

Full version of sample config file you can get here.

Notice: If your backend scripts are using user IP addresses for some purposes, you will need to install mod_rpaf module to use X-Real-IP header provided by nginx instead of real user’s IP address.

That is all! Now you can simply install nginx on your server, configure it and your server will be able to handle more traffic with the less resources that it uses now! Everything will be done transparently for your currently written scripts and if you want, you will be able to provide download handling with simple trick, that I will describe in my next post 😉

If you have some questions, do not hesitate to ask them here in comments – I will try to answer all of them. If you liked this article, you can support author by taking a look at advertisements on this page or simply vote for it on digg.com.