Holistic Engineering

A random assortment of shit with sprinkles.

Gem Activation and You: Part I


Based on recent reading and conversations, I think many people find it confusing what happens between requiring a file in a ruby script and a gem being chosen: which gems are selected and, more importantly, how they’re selected.

The how is called Gem Activation and is a critical point for:

  • Requiring your libraries from gems
  • Covered in Part 2:
    • Using and understanding how Bundler works
    • Understanding how ruby scripts installed from gems (called “binstubs”) work
    • Why bundle exec is important for projects managed by Bundler, even for things you have elsewhere.
  • Will be covered in Part 3:
    • Why packaging independent gems with system packagers like rpm and dpkg may be more trouble than it’s worth, or a horrible idea. You pick your preferred caustic phrase here.

Depending on the level of interest in the first post and my schedule, I will try to cover all of these topics.

We will be covering RubyGems 2.0 as a baseline for most things — I will try and note when things differ in the 1.8 series especially, but be aware that I may miss a few things, and some of this behavior may be surprising to RubyGems 1.8 users.

In short, upgrade already.

Some Philosophy And Why This Is Important

It’s best to understand the tool you’re using. Let’s be honest: RubyGems is definitely not a popular platform for many, especially in the ops world. However, between RubyGems and Bundler, many developers (tens of thousands) are able to not care about a lot of problems which plague developers on other platforms such as Perl and Python, like multi-tenancy of a given library for different projects (we’re going to talk about this a lot!) and hosting binary builds on a platform-specific basis. Hi, Windows and Java users!

RubyGems Is Not Going Away

RubyGems runs by default since the first production release of the 1.9 series, 1.9.2. It is commonly used on 1.8.7, and obviously the default behavior has continued in the 2.0 series, just with RubyGems 2.0 to support it. You can turn it off with this command on all 1.9.2 and greater rubies:

$ ruby --disable-gems my_program.rb

Which is a horrible idea: you’re basically throwing away the ability to run anything on rubygems.org without a lot of manual effort. This is necessary for some projects (logstash, I know, has real, unresolvable problems with rubygems in jar files, relating to emulated disk performance), but these problems are more the exception than the rule.

And let’s be blunt: there have been more than a few attempts to replace RubyGems. Do you remember any of them?

Knowledge Is Power

Want to know how to package your new gem? When you should use Bundler and when you shouldn’t for your next project? How to configure your dependencies? Hopefully this will give you some insight as to how you might accomplish that.

Even if you hate (or still hate) RubyGems after reading all this, you should know your tools, damnit.

Getting Started: The Basics

A Gem Specification is embedded into each gem, which is unpacked when you install a gem and put in a special place where rubygems can refer to it later.
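To make "specification" a little more concrete, here is a minimal sketch that builds one in memory and reads its dependencies back. The gem name example-tool and all of its fields are made up for illustration; only the Gem::Specification calls themselves come from RubyGems.

```ruby
require 'rubygems'

# A hypothetical specification, built in memory for illustration only.
EXAMPLE_SPEC = Gem::Specification.new do |s|
  s.name    = 'example-tool'            # made-up gem name
  s.version = '0.1.0'
  s.summary = 'Illustrative spec; not a real gem'
  s.authors = ['Nobody']
  s.files   = ['lib/example_tool.rb']
  s.add_dependency 'net-ssh', '~> 2.6'  # runtime dependency with a version constraint
end

# Name and version joined with a dash, the same form used in gem directory names.
puts EXAMPLE_SPEC.full_name

# Each dependency carries a name and a requirement; activation consults these.
EXAMPLE_SPEC.runtime_dependencies.each do |dep|
  puts "#{dep.name} #{dep.requirement}"
end

# Directories RubyGems searches for installed specifications on this machine.
puts Gem::Specification.dirs
```

The last line prints the "special place" for your own installation, so the output will differ from machine to machine.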

When you start ruby, the variable $LOAD_PATH contains all the paths for requiring items in the Standard Library, rubygems being one of them. The parts we’re discussing will already be loaded by the time ruby starts executing your code, unless you disabled RubyGems at startup as mentioned above.

When you require something, the call would normally go to Kernel.require, which searches this $LOAD_PATH.
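If you want to check this on your own machine, a quick sketch like the following should do. It only looks at whether the Gem constant exists and at what $LOAD_PATH holds; the file name you save it under is up to you.

```ruby
# RubyGems is normally loaded before the first line of your script runs,
# so the Gem constant should already be defined here.
rubygems_loaded = !defined?(Gem).nil?
puts "RubyGems loaded at startup: #{rubygems_loaded}"

# $LOAD_PATH is a plain Array of directories that require searches in order.
puts "#{$LOAD_PATH.size} entries in $LOAD_PATH"
puts $LOAD_PATH.first(3)
```

Running the same file with ruby --disable-gems should report false on the first line, since the Gem constant is never defined in that case.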

RubyGems intervenes — you can actually see this here. What is happening is that it overrides Kernel.require with its own method which first considers RubyGems, then the standard library. If anything is found in gems which meets the required path and other activation requirements, the specification is activated and so are its dependencies. Otherwise, it fails with a LoadError which is a standard ruby exception class.

What this means is that the specific versions of the gems that are needed are added to the $LOAD_PATH. Again, this happens at require time, not boot.

Time For Some Action

A really good gem that has a fair number of dependencies (and one I think a lot of people reading this will be using) is chef.

Go ahead and gem install chef if you don’t have it installed already. Then, start up irb and follow along (note that the first .dup statement is very important):

[22] erikh@speyside ~mine/blog% irb

irb(main):001:0> orig = $LOAD_PATH.dup
=> [ ... standard ruby stuff (a lot of it) ... ]

irb(main):002:0> require 'chef/config'
=> true

irb(main):003:0> $LOAD_PATH - orig
=> [
"/Users/erikh/.gem/ruby/2.0.0/gems/mixlib-config-1.1.2/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/mixlib-cli-1.3.0/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/mixlib-log-1.6.0/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/mixlib-authentication-1.3.0/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/mixlib-shellout-1.1.0/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/systemu-2.5.2/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/yajl-ruby-1.1.0/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/ipaddress-0.8.0/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/ohai-6.16.0/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/rest-client-1.6.7/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/net-ssh-2.6.7/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/net-ssh-multi-1.1/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/erubis-2.7.0/lib",
"/Users/erikh/.gem/ruby/2.0.0/gems/chef-11.4.4/lib"
]

If you view the dependencies for chef on rubygems.org, you’ll see these match the dependencies declared by chef version 11.4.4, which at the time of this writing is the latest version of chef. This is very important to remember: as we’ll see later, what “latest” means can be manipulated, which has a lot to do with how Bundler works, and it is also related to how activation breakage occurs.

Note that I also have multiple versions of all these gems installed:

[1] erikh@speyside mine/blog% gem list --local | grep net-ssh
net-ssh (2.6.7, 2.2.2)
net-ssh-gateway (1.2.0, 1.1.0)
net-ssh-multi (1.1)

[2] erikh@speyside mine/blog% gem list --local | grep chef
chef (11.4.4, 11.4.0, 10.24.4, 10.18.2)

As you can see, I have 4 versions of chef, a few versions of net-ssh, etc.

What Just Happened

RubyGems is runtime activated and recursive, and consults specifications of the gems to determine what else to activate. Barring any restriction, when a require happens, it activates the latest version on the system for the require. Otherwise, it will fall through to Kernel.require, which may exploit already existing activated gems, or standard libraries.

In our case, we required chef/config, which is registered with the chef gem, and it picked version 11.4.4 because that’s the latest version on my system.

It also activated net-ssh version 2.6.7 and other dependencies that the 11.4.4 version of chef requires. It did not require anything from them; it only made them available, so they are what will be loaded should chef’s own requires, or your program itself, need them. Should 2.6.8 of net-ssh come out, the ~> 2.6 requirement in the 11.4.4 gem means that if it were installed, it would take precedence over 2.6.7 because it’s the latest. This can change, but we’ll talk about it more later.

If we were to require 'net/ssh' above, the require would fall through to Kernel.require and load the 2.6.7 version, since a net-ssh gem has already been activated.
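The "latest version that fits the requirement" rule can be tried out directly with RubyGems’ own version classes. This is a small sketch: latest_matching is a helper name I made up, and the 2.6.8 and 3.0.0 versions are hypothetical, thrown in for contrast.

```ruby
require 'rubygems'

# Return the newest version from `versions` that satisfies `requirement`,
# or nil when none does. The helper name is made up for this example.
def latest_matching(requirement, versions)
  req = Gem::Requirement.new(requirement)
  versions.map { |v| Gem::Version.new(v) }
          .select { |v| req.satisfied_by?(v) }
          .max
end

# What is on my system today: 2.6.7 wins, 2.2.2 is too old for ~> 2.6.
puts latest_matching('~> 2.6', %w[2.2.2 2.6.7])

# If a 2.6.8 were installed it would win; a 3.0.0 would be ignored,
# because ~> 2.6 means ">= 2.6 and < 3.0".
puts latest_matching('~> 2.6', %w[2.2.2 2.6.7 2.6.8 3.0.0])
```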

Therefore, for a single require at runtime:

  • The gem is located that matches the require.
  • It is activated, meaning it is added to the $LOAD_PATH.
  • Dependencies in the gem are also activated at this time, and added to the $LOAD_PATH.
  • Kernel.require is executed now that all the things in the $LOAD_PATH that needed to be activated are.
  • Further requires for activated gems fall through to Kernel.require.
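The steps above can be sketched as a toy model. To be clear, this is not how RubyGems is implemented: INSTALLED is a hand-written stand-in for the specifications on disk, toy_activate is a made-up name, the lib paths are fake, and only the net-ssh dependency from this article is modeled.

```ruby
require 'rubygems'

# Hypothetical stand-in for installed specifications:
# gem name => { version => { dependency name => requirement } }.
INSTALLED = {
  'chef'    => { '11.4.4' => { 'net-ssh' => '~> 2.6' } },
  'net-ssh' => { '2.6.7' => {}, '2.2.2' => {} }
}.freeze

# Activate the newest installed version of `name` that satisfies `requirement`,
# then its dependencies. `activated` maps gem name => chosen version and
# `load_path` collects the (made-up) lib directories.
def toy_activate(name, requirement = '>= 0', activated = {}, load_path = [])
  req = Gem::Requirement.new(requirement)

  if activated.key?(name)
    # Already activated: no second version is chosen, it simply has to fit.
    unless req.satisfied_by?(Gem::Version.new(activated[name]))
      raise "#{name}-#{activated[name]} conflicts with #{name} (#{requirement})"
    end
    return [activated, load_path]
  end

  version = INSTALLED.fetch(name, {}).keys
                     .map { |v| Gem::Version.new(v) }
                     .select { |v| req.satisfied_by?(v) }
                     .max
  raise "no installed #{name} matches #{requirement}" if version.nil?

  activated[name] = version.to_s
  load_path << "/gems/#{name}-#{version}/lib"

  # Dependencies are activated recursively, each against its own requirement.
  INSTALLED[name][version.to_s].each do |dep_name, dep_req|
    toy_activate(dep_name, dep_req, activated, load_path)
  end
  [activated, load_path]
end

activated, load_path = toy_activate('chef')
puts activated.inspect
puts load_path
```

Passing a pre-filled activated hash, such as { 'net-ssh' => '2.2.2' }, makes the same call raise, which is the toy version of the conflict discussed further down.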

RubyGems Is A Runtime System

I know I’ve said this a few times, but I cannot express it enough — at ruby startup, nothing is activated. Only when a require is executed is anything activated, and only if something is not already activated that meets the requirement.
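One way to watch this happen is to compare Gem.loaded_specs, the hash of currently activated specifications, before and after a require. A small sketch follows; newly_activated is a helper name I made up. Note that on newer rubies the first list may not be completely empty, since a few gems that ship with ruby can already be active at startup.

```ruby
require 'rubygems'

# Names of gems activated while the block runs, judged by Gem.loaded_specs
# (a Hash of gem name => activated Gem::Specification).
def newly_activated
  before = Gem.loaded_specs.keys
  yield
  Gem.loaded_specs.keys - before
end

puts "activated so far: #{Gem.loaded_specs.keys.inspect}"

# A block that requires nothing activates nothing.
puts "activated by an empty block: #{newly_activated { nil }.inspect}"

# With chef installed, the experiment from this article would look like:
#   newly_activated { require 'chef/config' }
```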

Multiple Dependencies On The Same Gem

Occasionally you may see something similar to this coming from your programs, usually at startup:

Unable to activate chef-10.24.4, because net-ssh-2.2.2 conflicts with net-ssh (~> 2.6)

This happens when two gems depend on the same thing, but there is a conflict on what version they depend on. In this case, it’s vagrant 1.0.7 and chef 10.24.4, and they depend on different versions of net-ssh (2.2.2 and ~> 2.6 respectively).
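The conflict itself is easy to reproduce with nothing but version arithmetic. In this sketch versions_satisfying_all is a made-up helper, the two constraints are written the way this article states them, and the relaxed '>= 2.2.2' constraint at the end is purely hypothetical.

```ruby
require 'rubygems'

# Return the versions that satisfy every requirement given.
# The helper name is made up for this example.
def versions_satisfying_all(versions, *requirements)
  reqs = requirements.map { |r| Gem::Requirement.new(r) }
  versions.map { |v| Gem::Version.new(v) }
          .select { |v| reqs.all? { |req| req.satisfied_by?(v) } }
end

installed = %w[2.6.7 2.2.2]  # the net-ssh versions on my system

# vagrant 1.0.7 wants 2.2.2, chef 10.24.4 wants ~> 2.6: nothing fits both.
puts versions_satisfying_all(installed, '2.2.2', '~> 2.6').inspect

# A hypothetical relaxed constraint on the vagrant side leaves a candidate.
puts versions_satisfying_all(installed, '>= 2.2.2', '~> 2.6').map(&:to_s).inspect
```

Only one version of a gem can be activated in a process, so an empty result here means one of the two gems cannot be activated alongside the other.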

Let’s act this out with a little exercise that shows off our little love triangle here:

  • gem install vagrant -v 1.0.7
  • gem install chef -v 10.24.4
  • Start irb again:
[32] erikh@speyside ~mine/blog% irb

irb(main):001:0> gem 'vagrant', '1.0.7'
=> true

irb(main):002:0> gem 'chef', '10.24.4'
Gem::LoadError: Unable to activate chef-10.24.4, because net-ssh-2.2.2
                conflicts with net-ssh (~> 2.6)
   (... stack trace here ...)

So, in the rubygems toolkit there’s a Kernel-level method called gem which lets you activate things manually. As you can see here, we used it to simulate an activation that breaks.

This is not quite the same gem method as the one provided by Bundler. It is, however, very similar in goal, and worth remembering for later.

Note again that we didn’t require anything here. A require performs this same activation before it loads the actual file, so we’ve simulated just the part that matters, not the whole standard pipeline.
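If you would rather not have irb throw a stack trace at you, the gem call can be wrapped so the failure comes back as a value. A minimal sketch: try_activate is a name I made up, and the gem name used in the demonstration is one I am assuming nobody has installed.

```ruby
require 'rubygems'

# Try to activate a gem without requiring any of its files.
# Returns :activated, :already_active, or the error message on failure.
def try_activate(name, *requirements)
  gem(name, *requirements) ? :activated : :already_active
rescue Gem::LoadError => e
  e.message
end

# 'no-such-gem-for-this-example' is a made-up name that should not be installed,
# so this prints the error message instead of raising.
puts try_activate('no-such-gem-for-this-example')

# With the gems from this article installed, the conflict could be replayed as:
#   try_activate('vagrant', '1.0.7')
#   try_activate('chef', '10.24.4')
```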

The way to fix this activation is to relax the requirements or match them. I’ve done this in vagrant-fixed-ssh for my own needs, but most people will likely be happier using Chef 11 and the newer Vagrant 1.1+ series.

What Do?

We’ll cover some solutions in our next article. Thanks for reading!
