About 6 years ago (feels like an eternity in Rails world) working at Scribd I’ve started working on porting our codebase from some old version or Rails to a slightly newer one. That’s when I realized, that there wasn’t a ruby gem to help us manage MySQL connections for our vertically sharded databases (different models on different servers). I’ve started hacking on some code to replace whatever we were using back then, finished the first version of the migration branch and then decided to open the code for other people to use. That’s how the DbCharmer ruby gem was born.
For the next few years a lot of new functionality we needed has been added to the gem, making it more complex and immensely more powerful. I’ve enjoyed working on it, developing those features, contributing to the community. But then I left Scribd, stopped being a user of DbCharmer and the situation drastically changed. For quite some time (years) I would keep fighting to make the code work with newer and newer versions of Rails, struggling to wrap my head around more and more (sometimes useless) abstractions Rails Core team decided to throw into ActiveRecord.
Finally, in the last 2 years (while trying to make DbCharmer compatible with Rails 4.0) it has become more and more apparent, that I simply do not want to do this anymore. I do not need DbCharmer to support Rails 4.0+, while it is very clear that many users need it and constant nagging in the issues and the mailing list, asking for updates generated a lot of anxiety for me, anxiety I couldn’t do much about (the worst kind). As the result, since I simply do not see any good reasons to keep fighting this uphill battle (and developing stuff like this for ActiveRecord IS a constant battle!) I officially give up.
Read the rest of this entry →
As a leader of a technical operations team I often have to work on technical operations engineer hiring. This process involves a lot of interviews with candidates and during those interviews along with many challenging practical questions I really love to ask questions like “What are the most important resources you think an Operations Engineer should follow?”, “What books in your opinion are must-read for a techops engineer?” or “Who are your personal heroes in IT community?”. Those questions often give me a lot of information about candidates, their experience, who they are looking up to in the community, what they are interested in, and if they are actively working on improving their professional level.
Recently, one of the candidates asked me to share my lists with him and I thought this information could be valuable to other people so I have decided to share it here on my blog.
Read the rest of this entry →
As you may have heard, last January I have joined Swiftype – an early stage startup focused on changing local site search for the better. It has been a blast for the past 8 months, we have done a lot of interesting things to make our infrastructure more stable and performant, immensely increased visibility into our performance metrics, developed a strong foundation for the future growth of the company. Now we are looking to expand our team with great developers and technical operations people to push our infrastructure and the product even further.
Since I have joined Swiftype, I have been mainly focused on improving the infrastructure through better automation and monitoring, and worked on our backend code. Now I am looking for a few good operations engineers to join my team to work on a few key projects like building a new multi-datacenter infrastructure, creating a new data storage for our documents data, improving high-availability of our core services and much more.
To help us improve our infrastructure we are looking both for senior operations engineers and for more junior techops people that we could help grow and develop within the company. Both positions could be either remote or we could assist you with relocation to San Francisco if you want to work in our office.
If you are interested, you can take a look at an old, but still pretty relevant post I wrote many years ago on what I believe an ops candidate should know. And, of course, if you have any questions regarding these positions in Swiftype, please email me at email@example.com or use any other means for contacting me and I will try to get back to you as soon as possible. If you know someone who may be a great fit for these positions, please let them know!
- Posted in: Admin-tips, Databases, Development
- Tags: cloudera, custom, hadoop, hive, jar, java, json, serde, udf
Yet another small note about Cloudera Hadoop Distribution 4.3.
This time I needed to deploy some custom JAR files to our Hive cluster so that we wouldn’t need to do “
ADD JAR” commands in every Hive job (especially useful when using HiveServer API).
Here is the process of adding a custom SerDE or a UDF jar to your Cloudera Hadoop cluster:
- First, we have built our JSON SerDe and got a
- To make this file available to Hive CLI tools, we need to copy it to
/usr/lib/hive/lib on every server in the cluster (I have prepared an rpm package to do just that).
- To make sure Hive map-reduce jobs would be able to read/write JSON tables, we needed to copy our JAR file to
/usr/lib/hadoop/lib directory on all task tracker servers in the cluster (the same rpm does that).
- And last, really important step: To make sure your TaskTracker servers know about the new jar, you need to restart your tasktracker services (we use Cloudera Manager, so that was just a few mouse clicks ;-))
And this is it for today.
Today, just like many times before, I needed to configure a monitoring server for MySQL using Cacti and awesome Percona Monitoring Templates. The only difference was that this time I wanted to get it to run with 1 min resolution (using ganglia and graphite, both with 10 sec resolution, for all the rest of our monitoring in Swiftype really spoiled me!). And that’s where the usual pain in the ass Cacti configuration gets really amplified by the million things you need to change to make it work. So, this is a short checklist post for those who need to configure a Cacti server with 1 minute resolution and setup Percona Monitoring Plugins on it.
Read the rest of this entry →