Category: Development
DbCharmer Development: I Give Up
14 Nov 2014

About 6 years ago (which feels like an eternity in the Rails world), while working at Scribd, I started porting our codebase from some old version of Rails to a slightly newer one. That’s when I realized that there wasn’t a Ruby gem to help us manage MySQL connections for our vertically sharded databases (different models on different servers). I started hacking on some code to replace whatever we were using back then, finished the first version of the migration branch, and then decided to open the code for other people to use. That’s how the DbCharmer Ruby gem was born.

Over the next few years, a lot of new functionality we needed was added to the gem, making it more complex and immensely more powerful. I enjoyed working on it, developing those features, and contributing to the community. But then I left Scribd, stopped being a user of DbCharmer, and the situation changed drastically. For quite some time (years) I kept fighting to make the code work with newer and newer versions of Rails, struggling to wrap my head around more and more (sometimes useless) abstractions the Rails Core team decided to throw into ActiveRecord.

Finally, in the last 2 years (while trying to make DbCharmer compatible with Rails 4.0) it has become more and more apparent that I simply do not want to do this anymore. I do not need DbCharmer to support Rails 4.0+, but it is very clear that many users do, and the constant nagging in the issues and on the mailing list asking for updates generated a lot of anxiety for me, anxiety I couldn’t do much about (the worst kind). As a result, since I simply do not see any good reason to keep fighting this uphill battle (and developing stuff like this for ActiveRecord IS a constant battle!), I officially give up.

After some long and painful consideration I’ve decided to officially suspend the project. Here is what the suspension means in this case:

  • I will stop making any changes to the DbCharmer code
  • Pull Requests and Issues functionality on the project repository will be disabled (I will dump the issues somewhere for future reference, but no new messages can be added)
  • There will be a huge message in the project README explaining that no Rails versions beyond the latest 3.2.x are supported and there are no plans to do any development to make the code work with Rails 4.0+
  • The project mailing list will be disabled
  • The project website will be moved to a GitHub domain (with the same message explaining the project status)

I’m really sorry if any users of the project still had hopes for the rails4 branch and a potential upgrade to newer Rails versions, but 3.2.x will be the last version officially supported by the project.


Now, here are my answers to the questions that people have asked me about this decision (I’ve talked to a few of the largest users of the project already):

Are you going to kill the repo? – No, the repository (and even the rails4 branch) is going to stay intact. I’m just going to clearly mark it as inactive to make sure people do not try to use it with new Rails versions or expect the project to support those in the future.

What about Rubygems? – All gem versions released up until this point will stay active and accessible as long as RubyGems is alive. However, no updates will be provided for any Rails versions beyond 3.2.x.

Why not crowdfund the development? – This is a really tough issue I struggled with for a long time. The problem here is that for many years now I have been fortunate enough to be in a position where I’m not motivated by money anymore. So crowdfunding the development would only increase my anxiety 10x, while not really changing the situation on the motivation side. That’s the opposite of what I need at this point.

What if you need DbCharmer functionality at your current or some future job? – This is why I’m not deleting the repo, gems, etc. and why I’m calling it a suspension and not a closure :-) There is a chance that one day I will end up in a situation where I really need all those wonderful features I’ve enjoyed with DbCharmer for years. And I’m pretty sure that, unless another project is available on the market, I will try to revive the project (or build something new upon the most important pieces of the DbCharmer codebase). But nobody knows what will happen, so for now the project is suspended.

Can I help with the development? Maybe send in a patch? – Another very tough issue. Accepting patches still requires a lot of time and dedication to review, understand and test them. And that is not something I want to do at this point. The only real way to resume development of the project at this point would be to transfer the ownership to somebody else. But unless someone creates a fork and shows true dedication to the project (making sure all the incoming changes are 100% test-covered and battle-tested, etc.), I’m not ready to do that. If you have some ideas on this matter, you can ping me any time.


Adding Custom Hive SerDe and UDF Libraries to Cloudera Hadoop 4.3
26 Jul 2013

Yet another small note about Cloudera Hadoop Distribution 4.3.

This time I needed to deploy some custom JAR files to our Hive cluster so that we wouldn’t need to run “ADD JAR” commands in every Hive job (especially useful when using the HiveServer API).

Here is the process of adding a custom SerDe or UDF JAR to your Cloudera Hadoop cluster:

  • First, we built our JSON SerDe and got a json-serde-1.1.6.jar file.
  • To make this file available to Hive CLI tools, we needed to copy it to /usr/lib/hive/lib on every server in the cluster (I have prepared an rpm package to do just that).
  • To make sure Hive map-reduce jobs would be able to read/write JSON tables, we needed to copy our JAR file to the /usr/lib/hadoop/lib directory on all TaskTracker servers in the cluster (the same rpm does that).
  • And the last, really important step: to make sure your TaskTracker servers know about the new JAR, you need to restart your TaskTracker services (we use Cloudera Manager, so that was just a few mouse clicks ;-))
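The copy steps above could be sketched in Python roughly as follows. This is only an illustration, not the rpm package mentioned in the post: the install_jar helper is a hypothetical name, and the script assumes it is run with sufficient privileges on each server. Only the two directory paths and the JAR file name come from the steps above.

```python
import shutil
from pathlib import Path

# Directories named in the steps above.
HIVE_LIB_DIR = Path("/usr/lib/hive/lib")      # read by Hive CLI tools
HADOOP_LIB_DIR = Path("/usr/lib/hadoop/lib")  # read by TaskTracker servers


def install_jar(jar_path, target_dirs):
    """Copy one JAR into each existing target directory.

    Hypothetical helper for illustration; returns the list of installed paths.
    """
    jar = Path(jar_path)
    if not jar.is_file():
        raise FileNotFoundError(f"JAR not found: {jar}")

    installed = []
    for directory in map(Path, target_dirs):
        # A missing lib directory most likely means Hive or Hadoop is not
        # installed on this server, so fail instead of creating it.
        if not directory.is_dir():
            raise NotADirectoryError(f"Target directory missing: {directory}")
        destination = directory / jar.name
        shutil.copy2(jar, destination)  # copies contents and file metadata
        installed.append(destination)
    return installed


# Example call on a cluster node (left commented out so that importing or
# running this file does not touch the system directories):
# install_jar("json-serde-1.1.6.jar", [HIVE_LIB_DIR, HADOOP_LIB_DIR])
```

The script only covers the file copy. The TaskTracker services still have to be restarted afterwards, as described in the last step.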

And that is it for today.


DbCharmer 1.7.0 Release: Rails 3.0 Support and Forced Slave Reads
1 Sep 2011

This week, after 3 months in the works, we’ve finally released version 1.7.0 of the DbCharmer Ruby gem, a Rails plugin that significantly extends ActiveRecord’s ability to work with multiple databases and/or database servers by adding features like multiple database support, master/slave topology support, sharding, etc.

New features in this release:

  • Rails 3.0 support. We’ve worked really hard to bring all the features we supported in Rails 2.x to the new version of Rails, and now I’m proud that we’ve implemented them all and that the implementation looks much cleaner and more universal (all kinds of relations in Rails 3 work in exactly the same way and we do not need to implement connection switching for all kinds of weird corner cases in ActiveRecord).
  • Forced Slave Reads functionality. We can now have models with slaves that are not used by default, but can be turned on globally (per-controller, per-action or in a block). This is a new feature that brings our master/slave routing capabilities to a whole new level: we can now use it for really mission-critical models on demand and not be afraid of breaking major functionality of our applications by switching them to slave reads.
  • Lots of changes were made in the structure of our code and tests to make sure it would be much easier for new developers to understand DbCharmer internals and make changes in its code.

Along with the new release we’ve got a brand new web site. You can find much better, cleaner and, most importantly, correct documentation for the library on the web site. We’ll be adding more examples and will try to add more in-depth explanations of our core functions, etc.

If you have any questions about the release, feel free to ask them on our new mailing list: DbCharmer Users Group.

For more updates on our releases, you can follow @DbCharmer on Twitter.


Nginx-Fu: X-Accel-Redirect From Remote Servers
24 Jul 2010

We use nginx and its features a lot at Scribd. Many times in the last year we needed a pretty interesting but unsupported feature: we wanted nginx X-Accel-Redirect functionality to work with remote URLs. Out of the box, nginx supports this functionality for local URIs only. In this short post I want to explain how we made nginx serve remote content via X-Accel-Redirect.



Advanced Squid Caching in Scribd: Cache Invalidation Techniques
29 May 2010

Having a reverse-proxy web cache as one of the major infrastructure elements brings many benefits for large web applications: it reduces the load on your application servers, reduces average response times on your site, etc. But there is one problem every developer runs into when working with such a cache: cached content invalidation.

It is a complex problem that usually consists of two smaller ones: invalidation of individual cache elements (you need to keep an eye on your data changes and invalidate cached pages when related data changes) and full cache purges (sometimes your site layout or page templates change and you need to purge all the cached pages to make sure users get the new visual elements or layout changes). In this post I’d like to look at a few techniques we use at Scribd to solve cache invalidation problems.
