On Getting Better

I did a phone screen for a job interview in 2010 that went extremely poorly. I was hoping to find a job where I could work on something web-facing, and I was trying to learn Ruby in my spare time, but I hadn't gotten very far. Still, I could manage to write a Ruby script that ran, so I told them my current language of choice was Ruby. Which... may have been a mistake.

What follows is just the first step of what they asked me to build, but for illustration's sake here in this blog post I'll tell you about that much: they asked me to write some code that takes in a source text and assembles a word frequency count for the whole text as well as finding the most commonly used word in the text. Here's roughly the solution that I remember writing:

def old(source)
  freq = {}
  source.split(/\s/).each do |word|
    freq[word] ||= 0
    freq[word] += 1
  end
  max = 0
  the_word = ""
  freq.each do |word,count|
    if count > max then
      the_word = word
      max = count
    end
  end
  puts the_word
end

So, uh, I didn't get that job offer. I probably wouldn't bring in the person who came up with that code, in Ruby, for an on-site interview. They said that they knew Ruby and then the most complicated concept they used was a Hash.each method? It's gross. It's also got some weird unnecessary cruft that makes it clear that someone is trying to write a different programming language than the one they're actually using. The good news is that I kept using Ruby for side projects, and the complexity of what I wanted to do increased occasionally, and I actually got to the point where I'm not totally ashamed of the Ruby I write.

But hey, when it comes time to start solving real problems in a new language, Stack Overflow is cool! One way it exists is as a tool to translate your thoughts into a language's collections interface. Unfortunately, once you start putting pieces together you're effectively pulling a Google Translate— just because you've identified the individual steps you need and have strung them all together doesn't mean that what you've written isn't a monstrosity. Check out this bad boy that I might have assembled given the same problem based on a couple years' worth of learning bits and pieces of Ruby's Enumerable/Array/Hash interfaces:

def mid(source)
  source.split(/\s/).group_by {|a| a}.inject({}) {|h,(k,v)| h[k] = v.size; h}.max_by {|k,v| v}.first
end

Woof. Okay, deep breath.

My first problem with this approach, at this point in my experience & preference level, is using inject with maps. Maybe I just need to get over it, but that Eye of Sauron, ({}), certainly isn't helping put me at ease, and needing multiple statements in the block just to return the accumulated hash is... enh. It just smells, reader. More importantly, though, after writing Scala for a while I've been scared away from these daunting one-liners. I love naming shit along the way now! So let's try it again, the way I'd probably do it now:

def new(source)
  puts max_key freq words(source)
end

def max_key(hash)
  hash.max_by {|k,v| v}.first
end

def freq(words)
  Hash[
    words.group_by {|a| a}
         .map {|k, v| [k, v.size]}
  ]
end

def words(source)
  source.split(/\s/)
end

Is this the end? No, probably not. Will I someday start hating all these extraneous methods and feel like one-liners or just method chaining with some reasonable line breaks is the way to go? Maybe. I may even be overlooking two pieces that could be combined to avoid doing an extra pass over the source. This solution is only marginally faster than the original, as near as I can tell (luv u time).
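For what it's worth, one way the grouping and counting steps might be folded into a single walk over the words is a hash that defaults to zero. This is only a sketch and I haven't timed it; count_words and most_common are hypothetical names, not part of any of the solutions above.

```ruby
# Hypothetical helper: builds the frequency table in one pass over the words.
def count_words(source)
  # Hash.new(0) makes missing keys read as 0, so no "||= 0" is needed.
  source.split(/\s/).each_with_object(Hash.new(0)) do |word, counts|
    counts[word] += 1
  end
end

# Hypothetical helper: returns a word with the highest count.
# If several words tie, which one comes back is not something to rely on.
def most_common(counts)
  counts.max_by { |_word, count| count }.first
end

puts most_common(count_words("the cat and the hat")) # prints "the"
```

The trade-off is that each_with_object mutates the accumulator inside the block, which is exactly the kind of thing the named-steps version was trying to avoid.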

There are a few pieces I'm already displeased with, too! They're in the same method, freq. Two things. First! x.f {|a| a} is dumb. Scala has x.f(_), which I quite like, so an elegant solution seems realistic. You could do this:

it = ->x{x}

But I'm not sure that's better, even if you do something like this to make that Proc available everywhere:

class Object
  def it; ->x{x}; end
end

Eww. Luckily, there has been some discussion about adding an "itself" method this past year, so before too much longer we'll have x.f(&:itself) and all will be well on that front.
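On a Ruby that ships Object#itself (2.2 or later), the grouping step could look roughly like this. It's a sketch under that assumption, and freq_with_itself is a made-up name for illustration.

```ruby
# Sketch only: requires a Ruby that defines Object#itself.
# freq_with_itself is a hypothetical name, not part of the solution above.
def freq_with_itself(words)
  # &:itself becomes a block that returns each element unchanged,
  # standing in for the hand-written {|a| a}.
  grouped = words.group_by(&:itself)
  # Still the same pairs-then-Hash[] round trip for the counts.
  Hash[grouped.map { |word, occurrences| [word, occurrences.size] }]
end

p freq_with_itself(%w[a b a])
```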

The second one, though: {|k, v| [k, v.size]} returns an array of array-pairs, which we then have to coerce back into a hash by relying on the Hash constructor that accepts that. It's not great. It would be nice if there were a way to transform the hash's values directly, but there isn't, as far as I can tell! A lot of people seem to recommend adding a map_hash method to Enumerable, which basically hides this grossness from you each time you want it, but adding methods to Enumerable isn't really my idea of a good solution any day of the week. You can also operate on the hash directly, but then you're getting into gross mutability stuff, and down that path lies madness.
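Newer Rubies do have methods that skip the array-of-pairs round trip. A sketch, assuming Ruby 2.4 or later for Hash#transform_values and 2.7 or later for Enumerable#tally; freq_transformed and freq_tallied are made-up names for illustration.

```ruby
# Assumes Ruby >= 2.4: transform_values returns a new hash with the same keys,
# so there's no detour through an array of pairs and no mutation.
def freq_transformed(words)
  words.group_by { |w| w }.transform_values(&:size)
end

# Assumes Ruby >= 2.7: tally counts occurrences of each element directly.
def freq_tallied(words)
  words.tally
end
```

Both return a plain hash of word to count, so either one could stand in for freq without touching max_key.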

So what? Well, I guess I wanted to illustrate how far I've come in just ~3 years, especially when you consider that I write Ruby almost exclusively as a hobby; the number of commits I've submitted to our legacy Rails application at my current job is probably in the low teens. So there's a lot you can learn in what is increasingly starting to feel like a not-particularly-long amount of time, and I'm sure there's a lot further I can go. Maybe in 2016 I'll be able to write an even better implementation of this relatively trivial example— or maybe Ruby 4.0 will just have a word frequency library. Then maybe employers will stop asking this (pretty lame) question for interviews.