RailsConf 2015 - Atlanta: Day Three

Today, RailsConf concluded here in Atlanta. The day started with the reveal of this year's Ruby Heroes, followed by a Rails Core panel. The audio for the panel was a bit difficult to hear, but I caught a few good snippets here and there. I'll post a link to the talk later.

On Trailblazer

One interesting talk I attended was See You on The Trail by Nick Sutterer, sponsored by Engine Yard, in which he introduced Trailblazer. Trailblazer is an abstraction layer on top of Rails that adds a few new layers to the familiar MVC convention. I appreciated several of the arguments he made during the talk:

  • MVC is a simple level of abstraction that lets developers get up and running quickly. The problem is that everything goes into those three buckets, and as an application grows more complex, MVC's simplified structure doesn't answer how to organize logic like authorization and validation.
  • Nick argued that DHH is wrong when he says that microservices are the answer to troublesome monolithic apps. Nick's answer is a more structured, organized OO application.
  • Rails devs often say "Rails is simple", but Nick argued that Rails is easy (subjective) rather than simple (objective). Rails conventions are meant to make transitioning between developers on a project easy, but in practice there is still so much room for interpretation in how and where to organize business logic in a complex Rails application that the transition is neither straightforward nor simple.
  • Complex Rails apps tend to include fat models (as opposed to fat controllers), and views with [not-so-helpful] helpers and excessive rendering logic.
  • Rails doesn't introduce conventions for where dispatch, authorization, validation, business logic, and rendering logic should live.
  • Trailblazer, an open source framework, introduces a new abstraction layer that adds conventions for some of these steps. It includes Cells to encapsulate the OO approach in views, and Operations to deserialize and validate without touching the model; see the sketch after this list.
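
To give a flavor of those conventions, here is a minimal sketch of a Trailblazer operation, assembled from the project's public documentation rather than from the talk itself; the Comment model and its fields are purely illustrative:

class Comment::Create < Trailblazer::Operation
  include Model
  model Comment, :create

  # Validation lives in the operation's contract, not in the model
  contract do
    property :body
    validates :body, presence: true
  end

  # Deserialize and validate the incoming params; persist only on success
  def process(params)
    validate(params[:comment]) do |contract|
      contract.save
    end
  end
end

The controller then shrinks to dispatching to the operation (e.g. Comment::Create.run(comment: { body: "..." })), giving dispatch and validation the explicit home that stock Rails doesn't prescribe.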

There was a Trailblazer demo during the talk, but as I mentioned above, the takeaway for me was not the specific technical implementation: this buzzworthy debate over microservices is really about good code organization and conventions for increasingly complex applications, which encourage readability and maintainability on the development side.

I went to a handful of other decent talks today, and I'll share a summary of my RailsConf experience, with links to popular talks, here later.

RailsConf 2015 - Atlanta: Day Two

It's day 2 of RailsConf 2015 in Atlanta! I made it through day 1!

The day started with Aaron Patterson's keynote. He covered features he's been working on, including automatic parallel testing, caching compiled views, integration test performance, and "soup to nuts" performance. I'll update this post with links to his talk and slides later. Aaron is always good at opening with self-deprecating humor before diving into his extensive performance work, backed by lots of numbers.

On Hiring

One talk I attended today was "Why We're Bad At Hiring (And How To Fix It)" by @kerrizor of Living Social. I was originally planning on attending a different talk, but a fellow conference attendee suggested this one. A few gems (not Ruby gems) from this talk were:

  • Imagine your company as a small terrarium. On a very small team, a single hire can drastically affect the environment, while at a larger company one person will be less influential. I liked this analogy.
  • Stay away from monocultures (e.g. the banana monoculture) and avoid hiring employees just like you.
  • Understand how your hiring process may bias you against specific candidates. For example, requiring a GitHub account may screen out applicants who work at a company that can't share code (e.g. where security clearance is required). Another example: requiring open source contributions may screen out candidates with very little free time outside of their current job.
  • The interview process should be well organized and well communicated. Organization and communication demonstrate confidence in the hiring process.
  • Hypothetical scenarios or questions are not a good idea. I've been a believer in this since reading some of Malcolm Gladwell's books, where he discusses how strongly circumstances influence behavior.
  • Actionable alternatives that work better than hypothetical scenarios include:
    1. ask an applicant to plan out an app (e.g. let's plan the strategy for an app that does X)
    2. ask an applicant to pair program with a senior developer
    3. ask the applicant to give a lightning talk or short presentation to demonstrate communication skills
  • After a rejected interview, think about what specifically might change your mind about the candidate.
  • Also communicate the reapplication process.
  • Improve your process by measuring with the goal to prevent false negatives. One actionable item here is to keep tabs on people – are there any developers you didn't hire that went on to become very successful & what did you miss?
  • Read this book.

Interview practices that Kerri doesn't recommend include looking at GPA/SAT/ACT scores, requiring a pull request to apply, speed interviews, puzzle questions, whiteboard coding, and FizzBuzz.

While I'm not deeply involved in End Point's hiring process, I am interested in the topic of growing talent within a company. The notes on identifying your own hiring biases were especially compelling.

Testing

I also attended a few talks on testing. My favorite little gem from one of these talks was the idea that when writing tests, one should try to balance readability, maintainability, and performance.

Eduardo Gutierrez gave a talk on Capybara where he went through explicit examples of balancing maintainability, readability, and performance. I'll update this post to include links to all of these talks when they become available.
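
To illustrate the kind of trade-off he covered, here is an example of my own (not from the talk; the table markup and row count are hypothetical). Both snippets assert the same thing but sit at different points on the readability/performance spectrum:

# Readable but slow: fetches every row, then runs a separate
# assertion (each with its own potential wait) per row.
all("table#orders tr").each do |row|
  expect(row).to have_css("td.status")
end

# Faster: one query that leans on Capybara's built-in waiting and
# counting, at some cost in step-by-step readability.
expect(page).to have_css("table#orders tr td.status", count: 25)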

RailsConf 2015 - Atlanta: Day One

I'm here in Atlanta for my sixth RailsConf! RailsConf has always been a conference I enjoy attending because it includes a wide spectrum of talks and people. The days are long, but rich with nitty-gritty tech details, socializing, and broader topics such as the social aspects of coding. Today's keynote started with DHH discussing the shift toward microservices to support different types of integrated systems, and then transitioned to code snippets of what's to come in Rails 5, due to be released this year. Rather than rehash the entire talk, I'll post a link to the keynote when it becomes available.

Open Source & Being a Hero

One of the talks I was really looking forward to attending was "Don't Be a Hero - Sustainable Open Source Dev" by Lillie Chilen, because of my involvement in open source (with Piggybak, RailsAdminImport, Annotator, and Spree, another Ruby on Rails ecommerce framework). In the case of RailsAdminImport, I found a need for a RailsAdmin plugin, developed it for a client, and then released it into the open source world with no plans to maintain a community. I've watched it get forked by a handful of users who needed the same functionality, and I recently gave another developer commit and RubyGems access, since I have historically been a horrible maintainer of the project. I can leverage some of Lillie's advice to help build a group of contributors and maintainers for this project, since it's not something I plan to put a ton of time into.

With Piggybak, while I haven't developed any new features for it in a while, I released it into the open source world with the intention of spending time maintaining a community after being involved in the Spree community. Piggybak was most recently upgraded to Rails 4.2.

Lillie's talk covered actionable items you can do if you find yourself in a hero role in an open source project. She explained that while there are some cool things about being a hero, or a single maintainer on a project, ultimately you are also the single point of failure of the project and your users are in trouble if you get eaten by a dinosaur (or get hit by a bus).

Here are some of her actionable items for recovering from hero status:

  1. Start with documentation on how to get the app running, how to run tests, and how to contribute. Include notes on your workflow, such as whether you prefer squashed commits and how documentation should look.
  2. Write down everything you do as a project maintainer and try to delegate. You might not realize all the little things you do for the project until you write them down.
  3. Try to respond quickly to requests for code reviews (or pull requests). Lillie referenced a study that mentioned if a potential contributor receives a code review within 48 hours, they are much more likely to come back, but if they don't hear back within 7 days, there is little chance they will continue to be involved.
  4. Recruit collaborators through targeted outreach. The audience of potential collaborators will differ depending on whether your open source tool is an app or a library.
  5. Manage your own expectations for contributors. Understand the motivations of contributors and try to figure out ways to encourage specific deliverables.
  6. Have regular retrospectives to analyze what's working and what's not, and encourage introspection.

While Lillie also covered several things that you can do as a contributor, I liked the focus on actionable tasks here for owners of projects. The ultimate goal should be to find other collaborators, grow a team, and figure out what you can do to encourage people to progress in the funnel and transition from user to bug fixer to contributor to maintainer. I can certainly relate to being the single maintainer on an open source project (acting as a silo), with no clear plan as to how to grow the community. I'll share the full set of slides when they are released.

Other Hot Topics

A couple of other hot topics that came up in a few talks were microservices and Docker. I find there are hot topics like this at every RailsConf, so if the trend continues, I'll dig deeper into these topics.

What Did I Miss?

I always like to ask what talks people found memorable throughout the day in case I want to look back at them later. Below are a few from today. I'd like to revisit these later & I'll update to include the slides when I find them.

The `name' attribute is required in cookbook metadata: Solving a Vagrant/Chef Provisioning Issue

When Vagrant/Chef Provisioning Goes South

I recently ran into the following error when provisioning a new Vagrant machine via the `vagrant up` command:

[2015-04-21T17:10:35+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2015-04-21T17:10:35+00:00] ERROR: Cookbook loaded at path(s) [/tmp/vagrant-chef/path/to/my-cookbook] has invalid metadata: The `name' attribute is required in cookbook metadata
[2015-04-21T17:10:35+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

After some googling and digging, I learned that version 12 of chef-client introduced a breaking change: from version 12 on, every cookbook requires a name attribute in its metadata.rb file. A quick grep through the metadata.rb files in the project revealed that several did not include name attributes. You would be correct at this point to suggest that I could have added name attributes to the cookbook metadata files and been done with it. However, in this case I was a consumer of the cookbooks and was not involved in maintaining them, so an alternate solution was preferable.
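
For reference, had I been maintaining the cookbooks, the direct fix would have been a name entry at the top of each metadata.rb, along these lines (the cookbook name and version here are placeholders):

# metadata.rb
name    'my-cookbook'   # required by chef-client 12 and later
version '0.1.0'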

Selecting a Specific Version of Chef in Vagrant

My idea for a solution was to install the most recent chef-client release prior to version 12. I was not sure how to do this initially, but along the way I learned that by default Vagrant installs the most recent release of chef-client. The Vagrant documentation for Chef provisioners described what I needed to do: the Chef version can be specified in the config.vm.provision block in the Vagrantfile:

config.vm.provision :chef_solo do |chef|
  chef.version = "11.18"
  chef.cookbooks_path = "cookbooks"
  chef.data_bags_path = "data_bags"

  # List of recipes to run
  chef.add_recipe "vagrant_main::my_project"
end

With this configuration change, chef-client 11.18 completed the provisioning step successfully.

Handling databases in dev environments for web development

One of the biggest problems for web development environments is copying large amounts of data. Every time a new environment is needed, all that data needs to be copied. Source code should be tracked in version control software, so copying it is usually just a matter of checking it out from the code repository; that is usually not the problem. The main problem area is database data, which can be very large, take a long time to copy, and impact the computers involved in the copy (usually the destination computer gets hammered with I/O, which drives load up).

Often databases for development environments are created by copying a database dump from production and then importing that database dump. And since database dumps are text, they can be highly compressed, which can result in a relatively small file to copy over. But the import of the dump can still take lots of time and cause high load on the dev computer as it rebuilds tables and indexes. As long as your data is relatively small, this process may be perfectly acceptable.
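
To make that cycle concrete, here is a rough sketch in Ruby, assuming PostgreSQL and SSH access to the production host; the host and database names are invented for illustration:

# Hypothetical dev-database refresh script.
def run!(cmd)
  system(cmd) or abort "failed: #{cmd}"
end

# A custom-format dump is compressed, so the transfer stays small...
run! "ssh prod-db pg_dump -Fc ecommerce > /tmp/ecommerce.dump"

# ...but the restore still rebuilds every table and index locally,
# which is where the time and I/O load go.
run! "dropdb --if-exists ecommerce_dev"
run! "createdb ecommerce_dev"
run! "pg_restore --jobs=4 --dbname=ecommerce_dev /tmp/ecommerce.dump"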

Your database WILL get bigger

At some point though your database will get so big that this process will take too long and cause too much load to be acceptable.

To address the problem, you can try to reduce the amount of data involved by dumping only a portion of the database instead of all of it, or possibly by using some "dummy sample data" instead. These techniques may work if you don't care that development environments no longer have the same data as production. One serious problem with this, however, is that a bug or behavior found in production can't be replicated in a development environment because the data involved isn't the same. For example, say a customer can't check out on the live site, but you can't replicate the bug in your development environment in order to fix it. The root cause might be a bug in the code handling certain products that are out of stock, and since the dev database doesn't have the same data, finding and fixing these types of problems becomes a lot harder.

Snapshots

Another option is to use file system snapshots, like LVM snapshots, to quickly make clones of the database without needing to import the database dump each time. This works great if development environments live on the same server, or at least the development databases live on the same server. You would need to create a volume to hold a single copy of the database; this copy would be the origin for all snapshots. Then for each development environment, you could snapshot the origin volume, mount it read-write in a place accessible by the developer, customize the database configuration (like setting a unique port number to listen on), and then start up the database. This then provides a clone of the entire database in a tiny fraction of the time and uses less disk space and other system resources too.

In using snapshots there are some things you'll need to be careful about. Snapshots are usually implemented with copy-on-write tables, and the more snapshots that are mounted read-write, the more I/O overhead is involved for the volumes in question. For this reason it is important to avoid writes to the origin volume as much as possible while snapshots are open. Also, snapshots that receive a lot of writes can fill up their copy-on-write tables, and depending on the file system and database you are using, this can be a big problem. So it is important to monitor how full each open snapshot is and increase its size if needed so it doesn't fill up. Updating the origin database requires shutting down and removing all snapshots first, then updating the origin, then creating and mounting all the snapshots again; otherwise, all the copy-on-write tables would fill up as the origin changed underneath them.

Using snapshots like this may sound more complicated, and it is, but the processes involved can be scripted and automated and the benefits can be pretty significant if you have several developers and a lot of data to copy.
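
As a sketch of what that scripting might look like, here is a hypothetical Ruby helper that opens one developer clone; the volume group, snapshot size, paths, and PostgreSQL specifics are all assumptions for illustration:

# Open a copy-on-write clone of the origin database volume for one developer.
dev   = ARGV.fetch(0)   # e.g. "alice"
port  = ARGV.fetch(1)   # unique port for this clone, e.g. "5433"
snap  = "pgdata_#{dev}"
mount = "/srv/devdb/#{dev}"

def run!(cmd)
  system(cmd) or abort "failed: #{cmd}"
end

# Snapshot the origin; --size bounds how many writes the
# copy-on-write table can absorb before filling up.
run! "lvcreate --snapshot --name #{snap} --size 10G /dev/vg0/pgdata_origin"
run! "mkdir -p #{mount}"
run! "mount /dev/vg0/#{snap} #{mount}"

# Give the clone its own port, then start it.
run! %(sed -i "s/^#\\?port = .*/port = #{port}/" #{mount}/postgresql.conf)
run! "pg_ctl -D #{mount} start"

Tearing a clone down is the reverse: stop the database, unmount, and lvremove the snapshot. Polling lvs shows how full each snapshot's copy-on-write table is, which covers the monitoring mentioned above.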

Nvidia: Invalid or Corrupted Push Buffer Stream

As a high-performance video rendering appliance, the Liquid Galaxy requires really good video cards -- better than your typical on-board integrated video cards. Despite ongoing attempts by competitors to displace them, Nvidia remains the best choice for high-end video, if you use the proprietary Nvidia driver for Linux.

In addition to providing regular security and system updates, End Point typically provides advanced remote monitoring of our customers' systems for issues such as unanticipated application behavior, driver issues, and hardware errors. One particularly persistent issue presents as an error with an Nvidia kernel module. Unfortunately, relying on the proprietary Nvidia drivers to maintain an acceptable performance level limits the available diagnostic information and options for resolution.

The issue presents when the system ceases all video output functions as Xorg crashes. The kernel log contains the following error message:

2015-04-14T19:59:00.000083+00:00 lg2 kernel: [  719.850677] NVRM: Xid (0000:01:00): 32, Channel ID 00000003 intr 02000000

The message is repeated approximately 11000 times every second until the disk fills and the ability to log in to the system is lost. The only known resolution at this time is to power-cycle the affected machine. In the error state, the module cannot be removed from the kernel, which also prevents Linux from shutting down properly. All affected systems were running some version of Ubuntu x86-64. The issue seems to be independent of driver version, but is at least present in 343.36 and 340.65, and affects all Geforce cards. Quadro cards seem unaffected.

The Xid message in the kernel log contains an error code that provides a little more information. The Nvidia docs list the error as "Invalid or corrupted push buffer stream". Possible causes listed include driver error, system memory corruption, bus error, thermal error, or frame buffer error. All affected systems were equipped with ECC RAM and were within normal operating temperature range when the issue presented.

Dealing with bugs like these can be arduous, but until they can be fixed, we cope by monitoring and responding to problems as quickly as possible.
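
As one example of what that monitoring could look like, here is an illustrative watchdog sketch in Ruby (not our actual monitoring stack; the threshold and interval are arbitrary):

# Warn when NVRM Xid errors start flooding the kernel ring buffer,
# since the only known recovery is a power cycle. Crude by design:
# dmesg's ring buffer wraps, so counts are approximate.
require "time"

THRESHOLD = 100   # new log entries per interval before we alert
previous  = 0

loop do
  current = `dmesg | grep -c "NVRM: Xid"`.to_i
  if current - previous > THRESHOLD
    warn "[#{Time.now.iso8601}] Nvidia Xid flood detected; schedule a power cycle"
    # hook for paging/alerting goes here
  end
  previous = current
  sleep 60
end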

Joe Mastey at Mountain West Ruby Conference 2015

A conversation with a co-worker today about the value of improving one's professional skills reminded me of the talk Joe Mastey gave at the 2015 Mountain West Ruby Conference. That then reminded me that I had never finished my write-up on that conference. Blogger won't let me install harp music and an animated soft focus flashback overlay, so please just imagine it's the day after the conference when you're reading this. "That reminds me of the time..."

I've just finished my second MWRC and I have to give this one the same 5-star rating I gave last year's. There were a few small sound glitches here and there, but overall the conference was well-run, inclusive, and packed with great speakers and interesting topics. Rather than summarizing each talk, I want to dig into the one most relevant to my interests: "Building a Culture of Learning" by Joe Mastey.

I was excited to catch Joe's talk because learning and teaching have always been very interesting to me, regardless of the particular discipline. I find it incredibly satisfying to improve my own learning skills, as well as my teaching skills, by teasing out how different individuals learn best and then speaking to that. There's magic in that one-on-one interaction when everything comes together just right. I just really dig that.

Joe's work as the Manager of Internal Learning at Enova has gone way beyond the subtleties of the one-on-one. He's taken the not-so-simple acts of learning and training and scaled them up in an environment that does not sound, on paper, like it would support it. He's created a culture of learning ("oh hey, they just said the title of the movie in the movie!") in a financial company that is federally regulated, saw huge growth due to an IPO, and had very real business-driven deadlines for shipping its software.

Joe broke his adventure down into three general phases after refreshingly admitting that "YMMV" and that you can't ignore the existing corporate culture when trying to build a culture of learning within.

Phase 1 - Building Credibility

I would hazard a guess that most software development shops are perpetually at Phase 1: learning is mostly ad hoc, picked up in the course of one's daily work, with few or no people pushing for more formal training. People probably agree that training is important, but the mandate hasn't come down from the CTO, and there's "no time for training" because there's so much work to do.

How did Joe help his company evolve past Phase 1? Well, he did a lot of things that I think many devs would be happy to get even one or two of at their company. My two favorites from his list probably appeal to polar opposite personality types, but that's part of why I like them.

My first favorite is that books are all-you-can-eat. If a developer asks Joe for a tech book, he'll say yes, and he'll buy a bunch of extra copies for the office. I like having a paper book to read through to get up to speed on a topic, ideally away from my desk and the computer screen. I've also found that for some technologies, the right book can be faster and less frustrating than potentially spotty documentation online.

My second favorite is how Joe implemented "new hire buddies." Each new hire is teamed up with an experienced dev from a different team. Having a specific person to talk to, and get their perspective on company culture, can really help people integrate into the culture much more quickly. When I joined End Point in 2010, I worked through our new hire "boot camp" training like all new hires. I then had the occasionally-maddening honor of working directly with one of the more senior evil super-geniuses at End Point on a large project that I spent 100% of my time on. He became my de facto new hire buddy and I could tell that despite the disparity in our experience levels, being relatively joined at the hip with someone like that improved my ramp-up and cultural integration time greatly.

Phase 2 - Expand Reach and Create Impact

If my initial guess about Phase 1 is correct, it follows that dev shops in Phase 2 are more rare: more people are learning more, and more people are driving that learning, but it's still mostly focused on new hires and the onboarding process.

Phase 2 is where Joe's successful efforts get a little more intimidating to me, especially given my slightly introverted nature. The efforts here scale up and get more people speaking publicly, both internally and externally. It starts with a more formal onboarding process and grows into things like weekly tech talks and half-day internal workshops. Here is where I start to make my "yeah, but…" face. We all have it. It's the face you make when someone says something you don't think can work, and you start formulating your rebuttal immediately. E.g. "Yeah, but how do you get management and internal clients to be OK with 'shutting down development for half a day' for training?" Joe does mention the danger of being perceived as "wasting too much time." You'll want to get ahead of that and communicate the value of what you're spending "all that dev time" on.

Phase 3 - Shift The Culture

It would be interesting to know how many shops are truly in Phase 3, because it sounds pretty intense: learning is considered part of everyone's job, the successes from the first two phases push the culture of learning to think and act bigger, the acts of learning and training others are part of job descriptions, and things like FOSS contributions and that golden unicorn of "20% personal project time" actually happen on company time. Joe describes the dangers and downsides of Phase 3 in a bit of a "with great power comes great responsibility" way. I've never personally worked somewhere that's in Phase 3, but it makes sense that the increased upside comes with increased (potential) downside.

At End Point, we have some elements of all three phases, but we're always looking to improve. Joe's talk at MWRC 2015 has inspired me to work on expanding our own culture of learning. I think his talk is also going to serve as a pretty good road map for how to get to the next phase.