It’s been a while since I’ve posted my first post about the way we do document pages caching in Scribd and this approach has definitely proven to be really effective since then. In the second post of this series I’d like to explain how we handle our complex document URLs and logged in users in the caching architecture.
First of all, let’s take a look at a typical Scribd’s document URL: http://www.scribd.com/doc/1/Improved-Statistical-Test.
As we can see, it consists of a document-specific part (/doc/1) and a non-unique human-readable slug part (/Improved-Statistical-Test). When a user comes to the site with a wrong slug in the document URL, we need to make sure we send the user to the correct URL with a permanent HTTP 301 redirect. So, obviously we can’t simply send our requests to the squid because it’d cause few problems:
- When we change document’s title, we’d create a new cached item and would not be able to redirect users from the old URL to the new one
- When we change a title, we’d pollute cache with additional document page copies.
One more problem that makes the situation even worse – we have 3 different kinds of users on the site:
- Logged in users – active web site users that are logged in and should see their name at the top of the page, should see all kinds of customized parts of the page, etc (especially when a page is their own document).
- Anonymous users – all users that are not logged in and visit the site with a flash-enabled browser
- Bots – all kinds of crawlers that can’t read flash content and need to see a plain text document version
All three kinds of users should see their own document page versions whether the page is cached or not.
Read the rest of this entry →
Many people know me as a nginx web server evangelist. But as (IMHO) any professional I think that it is really rewarding to know as much as possible about all the tools available on the market so every time you need to make a decision on some technical issue, you’d consider all pros and cons based on my own knowledge.
This is why when I received an email from Packt company asking if I’d like to read and review their book on Lighttpd I decided to give it a shot (I usually do not review any books because I do not always have enough time to read a book thoroughly to be able to write a review). So, here are my impressions from this book.
Read the rest of this entry →
If you’d take a look at any web site, you will notice, that almost all of the pages on this given site are pretty static in their nature. Or course, this site could have some dynamic elements like login field or link in the header, some customized menu elements and some other things… But entire page could be considered static in many cases.
When I started thinking about my sites from this point of view, I understood, how great it would be to be able to cache entire page somewhere (in memcache for example) and be able to send it to the user without any requests to my applications, which are pretty slow (comparing to memcache 😉 ) in content generation. Then I came up with a pretty simple and really powerful idea I’ll describe in this article. An idea of caching entire pages of the site and using my application only to generate small partials of the page. This idea allows me to handle hundreds of queries with one server running pretty slow (yeah! it is slow even after all optimizations on MySQL side and lots of tweaks in site’s code) Ruby on Rails application. Of course, the idea is kind of universal and could be used with any back-end languages, technologies or frameworks because all of them are slower then memcache in content “generation”.
Read the rest of this entry →
So, we did it! New Best Tech Videos site version has been released today. Of course, it could have some issues, and we are really looking forward for your feedback on our support forums.
Let’s review main improvements we’ve made here:
- First of all, I want to mention our first step to socialization of the BTV – all our users could signup now and get their own lists of favorited, commented and voted videos and much more – they could have their own personal RSS feeds.
- Next cool thing we’ve prepared for you is user-generated content! You can find some great videos on the Net and share them with fellow BTV readers. At this moment our posting system works in pre-moderated mode to keep really high level of videos we post here, but later we want to delegate moderation to our most active members.
- As I mentioned before – we have our own video rating system. All our users could vote for videos they like and then we’ll let users to decide, what videos they would like to watch based on these votes.
So, these are just small part (but most valuable) of our ideas. If you want to have a look at all of them, just sign up and start using the service – you gonna like it, we’re sure!
Thanks to all our readers, without your help this release would not happen!
[lang_en]
Sometimes you may need to implement controlled downloads when all downloads requests being sent to your script and then this script decides what to do: to send some file to the user or to show some access denied page or, maybe, do something else. In lighttpd server it can be done by returning X-Sendfile header from script. Nginx have its own implementation of such idea using X-Accel-Redirect header. In this short post I will try to describe how to use this feature from PHP and Rails applications.
[/lang_en]
[lang_ru]
Иногда вам может быть нужно реализовать т.н. контролируемое скачивание, когда все запросы на скачивание файлов передаются скрипту, который решает, как поступить: отправить пользователю какой-либо файл, или показать стриницу access denied, или, может быть, сделать что-то еще. При использовании сервера lighttpd это может быть реализовано при помощи заголовка X-Sendfile, возвращаемого из скрипта. Nginx имеет свою союственную реализацию описанной идеи с использованием заголовка X-Accel-Redirect. В этой короткой статье я попытаюсь описать, как использовать эту возможность из приложений на PHP или Rails.
[/lang_ru]
Read the rest of this entry →