We use nginx and its features a lot in Scribd. Many times in the last year we needed some pretty interesting, but not supported feature – we wanted nginx X-Accel-Redirect functionality to work with remote URLs. Out of the box nginx supports this functionality for local URIs only. In this short post I want to explain how did we make nginx serve remote content via X-Accel-Redirect.
Read the rest of this entry »
Having a reverse-proxy web cache as one of the major infrastructure elements brings many benefits for large web applications: it reduces your application servers load, reduces average response times on your site, etc. But there is one problem every developer experiences when works with such a cache – cached content invalidation.
It is a complex problem that usually consists of two smaller ones: individual cache elements invalidation (you need to keep an eye on your data changes and invalidate cached pages when related data changes) and full cache purges (sometimes your site layout or page templates change and you need to purge all the cached pages to make sure users will get new visual elements of layout changes). In this post I’d like to look at a few techniques we use at Scribd to solve cache invalidation problems.
Read the rest of this entry »
This is going to be a really short post, but for someone it could save an hour of life.
So, you’ve nothing to do and you’ve decided to play around with IPv6 or maybe you’re happened to be an administrator of a web service that needs to support IPv6 connectivity and you need to make your nginx server work nicely with this protocol.
First thing you need to do is to enable IPv6 in nginx by recompiling it with --with-ipv6 configure option and reinstalling it. If you use some pre-built package, check if your nginx already has this key enabled by running nginx -V.
Read the rest of this entry »
After the previous post in this caching related series I’ve received many questions on hardware and software configuration of our servers so in this post I’ll describe our server’s configs and the motivation behind those configs.
Read the rest of this entry »
It’s been a while since I’ve posted my first post about the way we do document pages caching in Scribd and this approach has definitely proven to be really effective since then. In the second post of this series I’d like to explain how we handle our complex document URLs and logged in users in the caching architecture.
First of all, let’s take a look at a typical Scribd’s document URL: http://www.scribd.com/doc/1/Improved-Statistical-Test.
As we can see, it consists of a document-specific part (/doc/1) and a non-unique human-readable slug part (/Improved-Statistical-Test). When a user comes to the site with a wrong slug in the document URL, we need to make sure we send the user to the correct URL with a permanent HTTP 301 redirect. So, obviously we can’t simply send our requests to the squid because it’d cause few problems:
- When we change document’s title, we’d create a new cached item and would not be able to redirect users from the old URL to the new one
- When we change a title, we’d pollute cache with additional document page copies.
One more problem that makes the situation even worse – we have 3 different kinds of users on the site:
- Logged in users – active web site users that are logged in and should see their name at the top of the page, should see all kinds of customized parts of the page, etc (especially when a page is their own document).
- Anonymous users – all users that are not logged in and visit the site with a flash-enabled browser
- Bots – all kinds of crawlers that can’t read flash content and need to see a plain text document version
All three kinds of users should see their own document page versions whether the page is cached or not.
Read the rest of this entry »
Many people know me as a nginx web server evangelist. But as (IMHO) any professional I think that it is really rewarding to know as much as possible about all the tools available on the market so every time you need to make a decision on some technical issue, you’d consider all pros and cons based on my own knowledge.
This is why when I received an email from Packt company asking if I’d like to read and review their book on Lighttpd I decided to give it a shot (I usually do not review any books because I do not always have enough time to read a book thoroughly to be able to write a review). So, here are my impressions from this book.
Read the rest of this entry »
If you’d take a look at any web site, you will notice, that almost all of the pages on this given site are pretty static in their nature. Or course, this site could have some dynamic elements like login field or link in the header, some customized menu elements and some other things… But entire page could be considered static in many cases.
When I started thinking about my sites from this point of view, I understood, how great it would be to be able to cache entire page somewhere (in memcache for example) and be able to send it to the user without any requests to my applications, which are pretty slow (comparing to memcache
) in content generation. Then I came up with a pretty simple and really powerful idea I’ll describe in this article. An idea of caching entire pages of the site and using my application only to generate small partials of the page. This idea allows me to handle hundreds of queries with one server running pretty slow (yeah! it is slow even after all optimizations on MySQL side and lots of tweaks in site’s code) Ruby on Rails application. Of course, the idea is kind of universal and could be used with any back-end languages, technologies or frameworks because all of them are slower then memcache in content “generation”.
Read the rest of this entry »
So, we did it! New Best Tech Videos site version has been released today. Of course, it could have some issues, and we are really looking forward for your feedback on our support forums.
Let’s review main improvements we’ve made here:
- First of all, I want to mention our first step to socialization of the BTV – all our users could signup now and get their own lists of favorited, commented and voted videos and much more – they could have their own personal RSS feeds.
- Next cool thing we’ve prepared for you is user-generated content! You can find some great videos on the Net and share them with fellow BTV readers. At this moment our posting system works in pre-moderated mode to keep really high level of videos we post here, but later we want to delegate moderation to our most active members.
- As I mentioned before – we have our own video rating system. All our users could vote for videos they like and then we’ll let users to decide, what videos they would like to watch based on these votes.
So, these are just small part (but most valuable) of our ideas. If you want to have a look at all of them, just sign up and start using the service – you gonna like it, we’re sure!
Thanks to all our readers, without your help this release would not happen!
Месяц назад этому блогу исполнился 1 год. Надеюсь, все мои читатели читали его с удовольствием. Я никогда не просил помощи и всегда предлагал помощь людям, которые в ней нуждались. Но сегодня мне приходится просить о помощи, потому что это первый случай в моей долгой практике, когда я не знаю, как решить мою проблему.
Read the rest of this entry »
Иногда вам может быть нужно реализовать т.н. контролируемое скачивание, когда все запросы на скачивание файлов передаются скрипту, который решает, как поступить: отправить пользователю какой-либо файл, или показать стриницу access denied, или, может быть, сделать что-то еще. При использовании сервера lighttpd это может быть реализовано при помощи заголовка X-Sendfile, возвращаемого из скрипта. Nginx имеет свою союственную реализацию описанной идеи с использованием заголовка X-Accel-Redirect. В этой короткой статье я попытаюсь описать, как использовать эту возможность из приложений на PHP или Rails.
Read the rest of this entry »