Archive for the ‘Databases’ Category

DbCharmer 1.7.0 Release: Rails 3.0 Support and Forced Slave Reads

Posted by Oleksiy Kovyrin under Databases, Development, My Projects

This week, after 3 months in the works, we’ve finally released version 1.7.0 of DbCharmer ruby gem – Rails plugin that significantly extends ActiveRecord’s ability to work with multiple databases and/or database servers by adding features like multiple databases support, master/slave topologies support, sharding, etc.

New features in this release:

  • Rails 3.0 support. We’ve worked really hard to bring all the features we supported in Rails 2.X to the new version of Rails and now I’m proud that we’ve implemented them all and the implementation looks much cleaner and more universal (all kinds of relations in rails 3 work in exactly the same way and we do not need to implement connection switching for all kinds of weird corner-cases in ActiveRecord).
  • Forced Slave Reads functionality. Now we could have models with slaves that are not used by default, but could be turned on globally (per-controller, per-action or in a block). This is a new feature that brings our master/slave routing capabilities to a really new level – we could now use it for a really mission-critical models on demand and not be afraid of breaking major functionality of our applications by switching them to slave reads.
  • Lots of changes were made in the structure of our code and tests to make sure it would be much easier for new developers to understand DbCharmer internals and make changes in its code.

Along with the new release we’ve got a brand new web site. You can find much better, cleaner and, most importantly, correct documentation for the library on the web site. We’ll be adding more examples, will try to add more in-depth explanation of our core functions, etc.

If you have any questions about the release, feel free to ask them in our new mailing list: DbCharmer Users Group.

For more updates on our releases, you can follow @DbCharmer on Twitter.

Scribd is Hiring (I’m Looking for an Operations Engineer to Join My Team)

Posted by Oleksiy Kovyrin under Blog, Databases, Development, Links, Networks

Scribd is a top 100 site on the web and one of the largest sites built using Ruby on Rails. As one of the first rails sites to reach scale, we’ve built a lot of infrastructure and solved a lot of challenges to get Scribd to where it is today. We actively try to push the envelope and have contributed substantial work back to the open source community.

Scribd has an agile, startup culture and an unusually close working relationship between engineering and ops. You’ll regularly find cross-over work at Scribd, with ops people writing application-layer code and engineers figuring out operations-level problems. We think we’re able to make that work because of the uniquely talented people we have on the team.

To allow us to keep scaling, we’re now looking to add a strong, experienced operations guru to the team. As a member of Scribd operations, you’ll have tremendous ownership and responsibility for one of the web’s most popular applications. Because Scribd is a startup, you will wear many hats and have broader responsibility than you would at a larger company.

If you read this blog, you should already have a good sense of the kind of work you’ll be doing on this position.

The Ideal Profile

You are an experienced operations professional and have run ops at at least one large-scale website. You have comprehensive knowledge of a broad variety of system tools, from MySQL and Nginx to Squid and Memcached. You should also have strong software development skills and be well-versed in major programming languages. You should be strongly motivated, a creative solution finder, and ready to jump into the thorniest technical problems whenever necessary.

Responsibilities

  • Develop and maintain all aspects of Scribd’s operations infrastructure, including system monitoring, backups, server configuration, databases, and caching systems
  • Collaborate with engineering to create next generation infrastructure to support changing requirements
  • Predict scaling problems before they occur and work with engineering to prevent them
  • Write and debug application level ruby code
  • Participate in an on-call rotation
  • Quickly diagnose server problems and employ preventive measures to maintain high availability servers

Qualifications

  • Bachelors degree in CS or equivalent experience
  • 3-5 years of professional experience in site operations
  • Strong software engineering skills, including knowledge of major programming languages
  • Strong database skills, preferably with MySQL, and overall linux knowledge
  • Experience with most of the following technologies: MySQL, Nginx, Ruby, Memcached, Squid, git, Solr, HBase, Postfix
  • Proven ability to quickly learn and implement unfamiliar technologies
  • Strong desire to work hard at a rapidly growing company

Location: You are preferably located near San Francisco, CA. Relocation assistance is designed on a per-case basis. In short, we’ll be creative to get you here.

Contact: Please send your email cover letter and resume with the subject “Your name – Senior Site Operations Engineer – via Kovyrin.net” to jobs@scribd.com or contact me directly using any of my contacts. All communication and correspondence is held in the strictest confidence to ensure that you can connect and learn more without exposure.

DbCharmer – Rails Can Scale!

Posted by Oleksiy Kovyrin under Databases, Development, My Projects

Back in November 2009 I was working on a project to port Scribd.com code base to Rails 2.2 and noticed that some old plugins we were using in 2.1 were abandoned by their authors. Some of them were just removed from the code base, but one needed a replacement – that was an old plugin called acts_as_readonlyable that helped us to distribute our queries among a cluster of MySQL slaves. There were some alternatives but we didn’t like them for one or another reasons so we’ve decided to go with creating our own ActiveRecord plugin, that would help us scale our databases out. That’s the story behind the first release of DbCharmer.

Today, six months after the first release of the gem and we’ve moved it to gemcutter (which is now the official gems hosting) and we’re already at version 1.6.11. The gem was downloaded more than 2000 times. There are (at least) 10+ large users that rely on this gem to scale their products out. And (this is the most exciting) we’ve added tons of new features to the product.

Here are the main features added since the first release:

  • Much better multi-database migrations support including default migrations connection changing.
  • We’ve added ActiveRecord associations preload support that makes it possible to move eager loading queries to the same connection where your finder queries go to.
  • We’ve improved ActiveRecord’s query logging feature and now you can see what connections your queries executed on (and yes, all those improvements are colorized :-) ).
  • We’ve added an ability to temporary remap any ActiveRecord connections to any other connections for a block of code (really useful when you need to make sure all your queries would go to some non-default slave and you do not want to mess with all your models).
  • The most interesting change: we’ve implemented some basic sharding functionality in ActiveRecord which currently is being used in production in our application.

As you can see now DbCharmer helps you to do three major scalability tasks in your Rails projects:

  1. Master-Slave clusters to scale out your Rails models reads.
  2. Vertical sharding by moving some of your models to a separate (maybe even dedicated) servers and still keep using AR associations
  3. Horizontal sharding by slicing your models data to pieces and placing those pieces into different databases and/or servers.

So, If you didn’t check DbCharmer out yet and you’re working on some large rails project that is (or going to be) facing scalability problems, go read the docs, download/install the gem and prove them that Rails CAN scale!

DB Charmer – ActiveRecord Connection Magic Plugin

Posted by Oleksiy Kovyrin under Databases, Development, My Projects

Today I’m proud to announce the first public release of our ActiveRecord database connection magic plugin: DbCharmer.


DB Charmer – ActiveRecord Connection Magic Plugin

DbCharmer is a simple yet powerful plugin for ActiveRecord that does a few things:

  1. Allows you to easily manage AR models’ connections (switch_connection_to method)
  2. Allows you to switch AR models’ default connections to a separate servers/databases
  3. Allows you to easily choose where your query should go (on_* methods family)
  4. Allows you to automatically send read queries to your slaves while masters would handle all the updates.
  5. Adds multiple databases migrations to ActiveRecord

Read the rest of this entry »

Rails Developer for a Large Startup: My Vision of an Ideal Candidate

Posted by Oleksiy Kovyrin under Databases, Development, General, Links

Few days ago we were chatting in our corporate Campfire room and one of the guys asked me what do I think about Rails developers hiring process, what questions I’d ask a candidate, etc… This question started really long and interesting discussion and I’d like to share my thoughts on this question in this post.

Read the rest of this entry »

ActiveMQ + Ruby Stomp Client: How to process elements one by one

Posted by Oleksiy Kovyrin under Admin-tips, Databases, Development, My Projects

Few months ago I’ve switched one of our internal projects from doing synchronous database saves of analytics data to an asynchronous processing using starling + a pool of workers. This was the day when I really understood the power of specialized queue servers. I was using database (mostly, MySQL) for this kind of tasks for years and sometimes (especially under a highly concurrent load) it worked not so fast… Few times I worked with some queue servers, but those were either some small tasks or I didn’t have a time to really get the idea, that specialized queue servers were created just to do these tasks quickly and efficiently.

All this time (few months now) I was using starling noticed really bad thing in how it works: if workers die (really die, or lock on something for a long time, or just start lagging) and queue start growing, the thing could kill your server and you won’t be able to do something about it – it just eats all your memory and this is it. Since then I’ve started looking for a better solution for our queuing, the technology was too cool to give up. I’ve tried 5 or 6 different popular solutions and all of them sucked… They ALL had the same problem – if your queue grows, this is your problem and not queue broker’s :-/ The last solution I’ve tested was ActiveMQ and either I wasn’t able to push it to its limits or it is really so cool, but looks like it does not have this memory problem. So, we’ve started using it recently.

In this small post I’d like to describe a few things that took me pretty long to figure out in ruby Stomp client: how to make queues persistent (really!) and how to process elements one by one with clients’ acknowledgments.
Read the rest of this entry »

Advanced Squid Caching for Rails Applications: Preface

Posted by Oleksiy Kovyrin under Databases, Development, My Projects, Networks

Since the day one when I joined Scribd, I was thinking about the fact that 90+% of our traffic is going to the document view pages, which is a single action in our documents controller. I was wondering how could we improve this action responsiveness and make our users happier.

Few times I was creating a git branches and hacking this action trying to implement some sort of page-level caching to make things faster. But all the time results weren’t as good as I’d like them to be. So, branches were sitting there and waiting for a better idea.
Read the rest of this entry »

Found an Ideal I/O Scheduler for my MySQL boxes

Posted by Oleksiy Kovyrin under Admin-tips, Databases

Today I was doing some work on one of our database servers (each of them has 4 SAS disks in RAID10 on an Adaptec controller) and it required huge multi-thread I/O-bound read load. Basically it was a set of parallel full-scan reads from a 300Gb compressed innodb table (yes, we use innodb plugin). Looking at the iostat I saw pretty expected results: 90-100% disk utilization and lots of read operations per second. Then I decided to play around with linux I/O schedulers and try to increase disk subsystem throughput. Here are the results:

Read the rest of this entry »

Fwd: Scorching hot Startup Needs Scalability Sorcerer and Optimization Freak

Posted by Oleksiy Kovyrin under Databases, Development, Networks

Question: Do you think you have what it takes to take a service from a few hundred thousand users to tens of millions of users in 1 year flat? If you do read on and perhaps become the next beloved scalability rockstar of our age.

We are looking for a data charmer. A mysql magician. A code hack. A funny man. A mad man. A passionate man. Or perhaps a woman who does all these things and more.

Here’s what you gotta do:

  • Pro-active and reactive performance analysis, monitoring and general database plumbing of all leaky issues.
  • Work with others on the team to help maintain/improve and support the infrastructure for a high traffic, high growth site
  • Optimize and tune the database day to day
  • Algorithmic bent. Develop algos to quicken search times, response times, find shortest paths between various connections on site.
  • Have solid low level networking/protocol/computer security skills
  • Log everything. Usage stats, search stats, user behaviour stats. Draw conclusions. Constantly refine and tinker.
  • Help with periodic large storage migrations
  • Work intimately with operations, development, and strategy team to ensure smooth deployments of new iterations, high availability of database services.
  • Understand capacity planning. Always thinking 10 steps ahead. (Whether it means looking at distributed systems services, cloud computing options, evaluating HA models used in other industries etc)
  • Have a pulse on the state of the web, social media, social networking, different scalability architectures, benefits/negatives of each.
  • Interest in high concurrency, distributed systems architectures.
  • General low level hacking/scripting/optimizations in perl/python.
  • Evaluate changing conditions in the archi
  • Think creatively. No dogmatists.

Ideal skillset:

  • BS in Comp Sci or equivalent
  • 5+ years experiene with Linux/Unix systems
  • 3+ years with MySQL in production environment
  • Knowledge and experience with partitioned architectures and a database sharding techniques
  • Capacity planning/high growth planning/emergency planning experience
  • Passion, bordering on paranoia, for hunting bottlenecks, and optimizing IO operations
  • Experience with MySQL replication
  • Deep experience with MySQL internals
  • Experience with performance analysis tools, storage engines, backup methodologies for MySQL
  • Great perl/shell scripting experience
  • Team player, self motivated, able to handle high stress situations while maintaining a calm disposition
  • Great communication skills, attention for detail, and an interest in the business side of the equation of systems/scale planning
  • Eat/sleep/breathe the web, startups, and the landscape of the social web
  • Insomniac

We’re ready to offer an aggressive salary with tremendous upside by way of stock options, commensurate with your experience, your drive and your results.

Apply directly to:

net ‘dot’ startup ‘at’ googles mail service dot com

by sending us a CV/resume, and optionally, a link to your blog or Linkedin profile.

Ivan Needs Your Help

Posted by Oleksiy Kovyrin under Databases

Please help save Ivan, son of Andrii Nikitin (MySQL Support Engineer), who needs a bone marrow transplant. Andrii’s message is below:

“My family got bad news – doctors said allogenic bone marrow transplantation is the only chance for my son Ivan.

“8 months of heavy and expensive immune suppression brought some positive results so we hoped that recovering is just question of time.

“Ivan is very brave boy – not every human meets so much suffering during whole life, like Ivan already met in his 2,5 years. But long road is still in front of us to get full recover – we are ready to come it through.

“Ukrainian clinics have no technical possibility to do such complex operation, so we need 150-250K EUR for Israel or European or US clinic. The final decision will be made considering amount we able to find. Perhaps my family is able to get ~60% of that by selling the flat where parents leave and some other goods, but we still require external help.”

– Andrii Nikitin, MySQL Engineer

For donation: Donation can be made through PayPal (via MySQL/Sun website)