As the leader of a technical operations team, I often work on hiring technical operations engineers. The process involves many interviews, and alongside challenging practical questions I love to ask things like “What are the most important resources you think an Operations Engineer should follow?”, “What books in your opinion are must-read for a techops engineer?” or “Who are your personal heroes in the IT community?”. These questions tell me a lot about candidates: their experience, who they look up to in the community, what they are interested in, and whether they are actively working on improving their professional level.
Recently, one of the candidates asked me to share my lists with him, and I thought this information could be valuable to other people, so I decided to share it here on my blog.
Must-Read Books List
First of all, I would like to share a list of books I believe every professional in our field should read at some point in their life. You may notice that many of these books are not very technical or are not really related to the pure systems administration part of a techops job. I still think they are very important, because technical operations work at senior levels involves much more than just making sure things work as expected. A lot of it involves time management, crisis management and many other topics that are equally important for a professional in this field.
So, here is the list (in no particular order, grouped by topic):
Systems and Networks Administration
Advanced Programming in the UNIX Environment
by W. Richard Stevens and Stephen A. Rago
High Performance MySQL: Optimization, Backups, and Replication
by Baron Schwartz, Peter Zaitsev and Vadim Tkachenko
UNIX and Linux System Administration Handbook
by Evi Nemeth, Garth Snyder, Trent R. Hein and Ben Whaley
Technical Operations, Architecture, Scalability
Web Operations: Keeping the Data On Time
by John Allspaw and Jesse Robbins
Release It!: Design and Deploy Production-Ready Software
by Michael T. Nygard
Scalable Internet Architectures
by Theo Schlossnagle
The Art of Capacity Planning: Scaling Web Resources
by John Allspaw
Project, Release and Time Management
The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win
by Gene Kim, Kevin Behr and George Spafford
Kanban: Successful Evolutionary Change for Your Technology Business
by David J. Anderson
Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation
by Jez Humble and David Farley
The Power of Full Engagement: Managing Energy, Not Time, is the Key to High Performance and Personal Renewal
by Jim Loehr and Tony Schwartz
Failure Is Not an Option: Mission Control from Mercury to Apollo 13 and Beyond
by Gene Kranz
Team Geek: A Software Developer’s Guide to Working Well with Others
by Brian W. Fitzpatrick and Ben Collins-Sussman
Antifragile: Things That Gain from Disorder
by Nassim Nicholas Taleb
The Field Guide to Understanding Human Error
by Sidney Dekker
Behind Human Error
by David D. Woods, Sidney Dekker, Richard Cook, Leila Johannesen
For more information on interesting books for technical operations engineers, you can check out the following book lists on GoodReads:
Conferences, in my opinion, are an essential part of the professional development of any engineer. Here is a list of conferences that could be useful for techops engineers:
- Surge Conference – in my opinion, this is definitely one of the best conferences dedicated to building and maintaining large web architectures. If I were to choose one conference a year to go to, it would definitely be Surge. Videos from previous years are freely available online: 2010, 2011, 2012. 2013 videos should be available soon as well.
- O’Reilly’s Velocity Conference – the biggest and probably the oldest web operations and web performance event. In my opinion, it has recently become too focused on web frontend performance, though it is still a really interesting event. Complete video compilations from the conference are available for sale: 2011, 2012, 2013.
- Monitorama Conference – a fairly new but already very popular conference with interesting content for everyone interested in monitoring (which most ops engineers are). Slides and videos from the first ever Monitorama conference in 2013 are available online.
- Percona Live Conference – a really awesome event for anybody who has MySQL in their stack. A huge multi-track event with talks from the best and brightest people in the MySQL community. Slides and keynote videos from the 2013 event are available online.
- DevOps Days – small events happening all around the world and becoming more and more popular. The major topic of these conferences is the DevOps movement, related team/project management practices, etc. Videos and slides from some of the events are available online.
Even if you do not have time to watch any of those conference videos, I think every operations engineer out there would really enjoy the video of the 2011 Surge Conference closing plenary session, where Theo Schlossnagle (one of my personal heroes in the IT community) described a typical debugging session many of us go through every once in a while.
Interesting Web Resources
And last, but certainly not least, I would like to share a list of web resources I like to follow to stay up to date on the most recent news and fresh ideas within the web operations community and related areas:
Leading Industry Sites and Blogs
- MySQL Performance Blog from Percona – one of the best resources on MySQL performance
- High Scalability – awesome resource with a lot of great articles on scalability, performance and design of large-scale systems
- Kitchen Soap – Blog by John Allspaw (another of my personal heroes in the IT field)
- DevOps Community Planet – feed/news aggregator for the DevOps community
- DevOps Community on Reddit – not too active, but still a useful resource for getting interesting news
- Agile Sysadmin – Blog of Stephen Nelson-Smith
- obfuscurity – Blog by Jason Dixon, maintainer of Graphite, author of Descartes, Tasseo and other useful tools for collecting and displaying metrics
- The Agile Admin – Many interesting thoughts on agile web operations and devops
- Operation Bootstrap – Blog of Aaron Nichols talking about many different aspects of working in operations
Engineering Blogs of Large Web Companies
- Code as Craft – Etsy Development and Operations blog
- Twitter Engineering Blog
- Netflix Tech Blog
- LinkedIn Engineering Blog
Podcasts
- Changelog – member-supported podcast on the 5by5 network talking about interesting open source projects
- Food Fight – bi-weekly podcast for the Chef community
- DevOps Cafe – interviews with interesting members of the DevOps community
- The Ship Show – twice-monthly podcast, featuring discussion on everything from build engineering to DevOps to release management, plus interviews, new tools and techniques, and reviews
And that is it! I hope these lists will be useful both for young engineers going into technical operations and for people who already work in this space. I am going to try to update this post regularly to make sure it stays relevant for a long time.