reedy.in / words

Topics that require more than 140 characters...

Deep Fetch

Sometimes you come across a piece of code that feels more verbose than it needs to be. These fragments stand out against the rest of the codebase in a Ruby file and can often lead to frustration and debugging problems due to unneeded complexity.

I was pairing with a colleague and we had one of these experiences. We both looked at the section of code in question and knew there should be a more “rubyesque” way.

The original code looked something like the following:

Original Code
node = {
  "webserver" => {
    "users" => {
      "admin" => {
        "password" => "some amazing password"
      }
    }
  }
}

if node['webserver'].member?('users') && node['webserver']['users'].member?('admin') && node['webserver']['users']['admin'].member?('password')
  defaults['password'] = node['webserver']['users']['admin']['password']
end

This is a pretty common use case, especially when consuming JSON resources that are several levels deep. You can search Stack Overflow and find any number of solutions. The solution isn’t complex and, while you can certainly download any number of gems to provide one, it turns out it only takes a few lines of Ruby to implement a nested search method.

After writing this post I came across a gem called deep_fetch, which provides a slightly different implementation. Rather than rename the methods throughout this post, I wanted to give its authors a nod for a ready-made gem that provides the same basic functionality.
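For comparison, Ruby 2.3 and later include a built-in Hash#dig that handles the plain lookup. It is not the solution built in this post, and it returns nil for a missing key rather than false or an exception. A minimal sketch, reusing the node hash from above:

```ruby
node = {
  'webserver' => {
    'users' => {
      'admin' => { 'password' => 'some amazing password' }
    }
  }
}

# Hash#dig walks the keys in order and stops with nil at the first gap.
found   = node.dig('webserver', 'users', 'admin', 'password')
missing = node.dig('webserver', 'users', 'jdoe', 'password')

puts found.inspect
puts missing.inspect
```

Because #dig cannot raise on a missing key or take a custom default, the hand-rolled method below is still a useful exercise.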

One Possible Solution

First, this isn’t the only way to solve this problem. That is the beauty of Ruby: there are many ways to implement smart solutions to common problems. Now, let’s dive in.

After a brief conversation we decided we wanted our solution to accept any number of keys, returning the value stored under the last key, or false if any key along the way is missing.

Ideal implementation
node.our_custom_search('webserver', 'users', 'admin', 'password')
#=> "some amazing password"

node.our_custom_search('webserver', 'users', 'jdoe', 'password')
#=> false

I like to start by looking at what methods we already have access to. Since we are working with a Hash object I went to the Ruby docs and saw that #fetch looked promising. The #fetch method takes a key (a String or Symbol in our examples) and returns its value. The reason #fetch is ideal for our solution is that you can optionally provide a default value in the event that the provided key doesn’t exist. A KeyError is raised if you do not provide a default value and the key does not exist.

Fetch Default Value Example
# Test Hash
person = { name: 'Dan Reedy', website: 'http://reedy.in' }

# Valid Fetch
person.fetch(:name)
#=> 'Dan Reedy'

# Invalid Fetch without a default
person.fetch(:email)
#=> KeyError: key not found: :email

# Invalid Fetch with a default
person.fetch(:email, nil)
#=> nil
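One more reason to prefer #fetch over plain [] here: [] returns nil both for a missing key and for a key whose stored value is nil, while #fetch can tell the two apart. The sketch below uses a made-up settings hash (not from the original code) and also shows the block form of #fetch, which computes the fallback only when the key is missing.

```ruby
# Hypothetical hash: 'verbose' exists but holds nil, 'colour' does not exist.
settings = { 'verbose' => nil }

via_brackets_present = settings['verbose']  # nil, the stored value
via_brackets_missing = settings['colour']   # nil as well, so the cases look identical

via_fetch_present = settings.fetch('verbose', :absent)  # nil, the key exists
via_fetch_missing = settings.fetch('colour', :absent)   # :absent, the key is missing

# Block form: the block receives the missing key and its result is returned.
via_block = settings.fetch('colour') { |key| "no #{key} configured" }

puts via_fetch_missing.inspect
puts via_block
```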

Playing off of the name of this method we decided to name our solution #deep_fetch.

Let’s be Good Testers

With the usage defined it is time for tests. Since I want to use standard libraries, the MiniTest module will be the testing framework [1].

hash_deep_fetch_test.rb
require 'minitest/autorun'

describe Hash do
  before do
    @hash = {
      'webserver' => {
        'users' => {
          'admin' => {
            'password' => 'some amazing password'
          }
        }
      }
    }
  end

  describe '#deep_fetch' do
    it 'returns the correct value for the provided keys' do
      _(@hash.deep_fetch('webserver','users','admin','password')).must_equal 'some amazing password'
    end
    it 'returns false if the provided keys do not exist' do
      _(@hash.deep_fetch('webserver','users','jdoe','password')).must_equal false
    end
  end
end

Running the test will result in the expected failures.

Run options: --seed 25720

# Running:

EE

Finished in 0.001193s, 1676.4459 runs/s, 0.0000 assertions/s.

  1) Error:
Hash::#deep_fetch#test_0001_returns the correct value for the provided keys:
NoMethodError: undefined method `deep_fetch' for #<Hash:0x007f944c0ee6f0>
    deep_merge.rb:26:in `block (3 levels) in <main>'


  2) Error:
Hash::#deep_fetch#test_0002_returns false if the provided keys do not exist:
NoMethodError: undefined method `deep_fetch' for #<Hash:0x007f944c0ed1b0>
    deep_merge.rb:29:in `block (3 levels) in <main>'

2 runs, 0 assertions, 0 failures, 2 errors, 0 skips

Implementing the #deep_fetch method

With the knowledge that we are going to take a collection of keys and reduce it down to a single value, I chose to use the Enumerable#reduce method for looping [2]. The added benefit of #reduce is the ability to set the initial value of the memo block variable to self.
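To make the role of the memo variable concrete, here is a small standalone trace of #reduce outside of any Hash method. The trace array is only there for illustration; it records the keys of each hash the memo points at as the walk descends.

```ruby
node = {
  'webserver' => {
    'users' => {
      'admin' => { 'password' => 'some amazing password' }
    }
  }
}
keys = %w[webserver users admin password]

trace = []
result = keys.reduce(node) do |memo, key|
  trace << memo.keys  # memo starts as node, then becomes each nested hash
  memo.fetch(key)     # the block's return value is the next memo
end

puts trace.inspect
puts result
```

The value passed to #reduce seeds the first memo, and the return value of the final block call becomes the result of the whole expression.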

hash_deep_fetch.rb
Hash.class_eval do
  def deep_fetch(*keys)
    keys.reduce(self) do |memo, key|
      memo.fetch(key)
    end
  end
end

This is enough for the first test to pass.

Run options: --seed 60473

# Running:

.E

Finished in 0.001141s, 1752.8484 runs/s, 876.4242 assertions/s.

  1) Error:
Hash::#deep_fetch#test_0002_returns false if the provided keys do not exist:
KeyError: key not found: "jdoe"
    deep_merge.rb:8:in `fetch'
    deep_merge.rb:8:in `block in deep_fetch'
    deep_merge.rb:7:in `each'
    deep_merge.rb:7:in `reduce'
    deep_merge.rb:7:in `deep_fetch'
    deep_merge.rb:31:in `block (3 levels) in <main>'

2 runs, 1 assertions, 0 failures, 1 errors, 0 skips

The #fetch method raises an exception if a key isn’t found. In our implementation we decided that we wanted false if the keys do not exist, so the straightforward solution is to capture the KeyError exception and return false.

hash_deep_fetch.rb
Hash.class_eval do
  def deep_fetch(*keys)
    keys.reduce(self) do |memo, key|
      memo.fetch(key)
    end
  rescue KeyError
    false
  end
end

That’s it! Nine lines of code and we have a method to dive deep into nested hashes and pull out values.
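One limitation worth knowing: only KeyError is rescued, so asking for a key beneath a non-Hash value (for example one level below the password String) ends in a NoMethodError, because String has no #fetch. The following is a sketch of one way to tolerate that case. The helper name lenient_deep_fetch is hypothetical, and it is written as a standalone method rather than a Hash extension to keep the example self-contained.

```ruby
# Hypothetical helper, not part of the original post.
def lenient_deep_fetch(hash, *keys)
  keys.reduce(hash) do |memo, key|
    # Give up as soon as the walk reaches something that is not a Hash.
    return false unless memo.is_a?(Hash)

    memo.fetch(key)
  end
rescue KeyError
  false
end

node = {
  'webserver' => {
    'users' => {
      'admin' => { 'password' => 'some amazing password' }
    }
  }
}

found    = lenient_deep_fetch(node, 'webserver', 'users', 'admin', 'password')
missing  = lenient_deep_fetch(node, 'webserver', 'users', 'jdoe', 'password')
too_deep = lenient_deep_fetch(node, 'webserver', 'users', 'admin', 'password', 'length')

puts found
puts missing
puts too_deep
```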

But wait…there’s more!

There are two ways I think we can improve this method.

Specify a default value if the key isn’t found

First, I think it would be nice to specify the default value, much like the actual #fetch method. Rather than returning false, the code will let the KeyError be raised if there isn’t a default. First, update the test.

hash_deep_fetch_test.rb
# ... the setup code ...
describe '#deep_fetch' do
  it 'returns the correct value for the provided keys' do
    _(@hash.deep_fetch('webserver','users','admin','password')).must_equal 'some amazing password'
  end
  it 'raises KeyError exception if the provided keys do not exist' do
    _(-> { @hash.deep_fetch('webserver','users','jdoe','password') }).must_raise KeyError
  end
  it 'returns the provided default value if the key does not exist' do
    _(@hash.deep_fetch('webserver','users','jdoe','password', default: false)).must_equal false
  end
end
# ... the rest ...

This results in a new failure.

Run options: --seed 47519

# Running:

.E.

Finished in 0.001223s, 2452.9845 runs/s, 1635.3230 assertions/s.

  1) Error:
Hash::#deep_fetch#test_0003_returns the provided default value if the key does not exist:
KeyError: key not found: "jdoe"
    deep_merge.rb:8:in `fetch'
    deep_merge.rb:8:in `block in deep_fetch'
    deep_merge.rb:7:in `each'
    deep_merge.rb:7:in `reduce'
    deep_merge.rb:7:in `deep_fetch'
    deep_merge.rb:40:in `block (3 levels) in <main>'

3 runs, 2 assertions, 0 failures, 1 errors, 0 skips

Now to make it pass. I’ll take advantage of Ruby 2’s keyword arguments for this example.

hash_deep_fetch.rb
Hash.class_eval do
  def deep_fetch(*keys, default: nil) # nil means "no default given", so the KeyError is re-raised
    keys.reduce(self) do |memo, key|
      memo.fetch(key)
    end
  rescue KeyError
    default.nil? ? raise : default
  end
end
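A quick usage sketch of the three behaviours the tests describe. It assumes the default keyword falls back to nil, so that leaving it out re-raises the KeyError; the method is repeated here only so the snippet runs on its own.

```ruby
Hash.class_eval do
  def deep_fetch(*keys, default: nil)
    keys.reduce(self) { |memo, key| memo.fetch(key) }
  rescue KeyError
    raise if default.nil?  # no default supplied: let the KeyError propagate

    default
  end
end

node = {
  'webserver' => {
    'users' => {
      'admin' => { 'password' => 'some amazing password' }
    }
  }
}

found    = node.deep_fetch('webserver', 'users', 'admin', 'password')
fallback = node.deep_fetch('webserver', 'users', 'jdoe', 'password', default: false)

raised = begin
  node.deep_fetch('webserver', 'users', 'jdoe', 'password')
  false
rescue KeyError
  true
end

puts found
puts fallback
puts raised
```

One consequence of using nil as the "no default" marker is that a caller cannot ask for nil itself as the fallback value.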

Allow #deep_fetch to locate values by a key path

Developers who spend time with Objective-C’s dictionaries or the various other implementations of key-value stores will be familiar with the idea of a key path. Simply put, it is a string of dot-separated keys [3]. Rather than changing the existing method, we will create a new method called #fetch_keypath which will leverage #deep_fetch behind the scenes.

As always, test first.

hash_deep_fetch_test.rb
# ... Setup & other tests ...
describe '#fetch_keypath' do
  it 'returns the correct value for the provided keypath' do
    _(@hash.fetch_keypath('webserver.users.admin.password')).must_equal 'some amazing password'
  end
end
# ... the rest ...

Then code the solution. The internals of this method will rely on the splat operator again. This time it is used to pass the elements within the array as individual arguments rather than a single argument.
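The difference the splat makes at the call site can be seen with a tiny helper. The count_args method is purely illustrative and not part of the solution.

```ruby
# Illustrative helper: reports how many arguments it received.
def count_args(*args)
  args.length
end

parts = 'webserver.users.admin.password'.split('.')

without_splat = count_args(parts)   # the whole array arrives as one argument
with_splat    = count_args(*parts)  # each element arrives as its own argument

puts parts.inspect
puts without_splat
puts with_splat
```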

hash_deep_fetch.rb
Hash.class_eval do
  # ... #deep_fetch method

  def fetch_keypath(keypath)
    deep_fetch(*keypath.split('.'))
  end
end

Adding the optional default value is trivial at this point.

hash_deep_fetch_test.rb
# ... Setup & other tests ...
describe '#fetch_keypath' do
  it 'returns the correct value for the provided keypath' do
    _(@hash.fetch_keypath('webserver.users.admin.password')).must_equal 'some amazing password'
  end
  it 'raises KeyError if the provided keys do not exist' do
    _(-> { @hash.fetch_keypath('webserver.users.jdoe.password') }).must_raise KeyError
  end
  it 'returns the provided default value if the key does not exist' do
    _(@hash.fetch_keypath('webserver.users.jdoe.password', default: 'Key Missing')).must_equal 'Key Missing'
  end
end
# ... the rest ...

And the updated #fetch_keypath method:

hash_deep_fetch.rb
  def fetch_keypath(keypath, default: nil)
    deep_fetch(*keypath.split('.'), default: default)
  end
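Putting both methods together, here is a self-contained usage sketch. It assumes default falls back to nil in both methods, and the symbol_keyed hash is made up to show one caveat: String#split always produces String keys, so a hash keyed by Symbols will not match a key path.

```ruby
Hash.class_eval do
  def deep_fetch(*keys, default: nil)
    keys.reduce(self) { |memo, key| memo.fetch(key) }
  rescue KeyError
    raise if default.nil?  # no default supplied: re-raise the KeyError

    default
  end

  def fetch_keypath(keypath, default: nil)
    deep_fetch(*keypath.split('.'), default: default)
  end
end

node = {
  'webserver' => {
    'users' => {
      'admin' => { 'password' => 'some amazing password' }
    }
  }
}

password = node.fetch_keypath('webserver.users.admin.password')
fallback = node.fetch_keypath('webserver.users.jdoe.password', default: 'Key Missing')

# Hypothetical Symbol-keyed hash: the String 'webserver' does not match :webserver.
symbol_keyed = { webserver: { port: 80 } }
port = symbol_keyed.fetch_keypath('webserver.port', default: :not_found)

puts password
puts fallback
puts port.inspect
```

Keys that themselves contain a dot cannot be addressed this way either, since the path is split on every dot.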

You can grab the tests and code for this blog post on GitHub. Also, thanks to Travis Longoria for the inspiration.

  1. Old habits die hard though, so I’ll use the RSpec style for my tests.
  2. The #inject and #reduce methods provide the same functionality. I prefer #reduce as I feel it makes more sense, reducing a collection down to a single result.
  3. Key-Value Coding Programming Guide
