Skip to content

Instantly share code, notes, and snippets.

@zverok
Last active December 2, 2023 11:22
Show Gist options
  • Save zverok/501f3e1539045ea0190377c994d6084a to your computer and use it in GitHub Desktop.
Save zverok/501f3e1539045ea0190377c994d6084a to your computer and use it in GitHub Desktop.

Inspired by Lucian Ghinda's article about code review/refactoring with AI, just an experimenting on looking at the code the way I usually do that.

It is not an attempt to demonstrate "how everybody should write code," rather a way of sharing how I approach code style and refatoring. (I.e. not to communicate "you should do this," but just "here are some techniques and considerations I use, maybe you'll find it interesting").

So...

Here's the original code:

class Parser
  def initialize(list, rubygems_client: RubyGems::Client.new)
    @list = list
    @rubygems_client = rubygems_client
  end

  def parse
    return parse_gemfile_format if gemfile_format?

    parse_single_line_format
  end

  private

    attr_reader :list, :rubygems_client

    def gemfile_format? = lines.any? { gem_method_call?(_1) }

    def parse_gemfile_format
      ast = Prism.parse(list)
      root_node = ast.value

      parsed_gems = []

      root_node.statements.body.each do |call_node|
        case call_node.name
        when :gem
          name = gem_name_from_call_node(call_node)
          parsed_gems << name
        when :group
          names = gem_names_from_group(call_node)
          parsed_gems.concat(names) if names.present?
        end
      end

      parsed_gems.compact_blank.map { rubygems_client.info(_1) }
    end

    def parse_single_line_format
      lines.filter_map do |line|
        gem_name = line.strip
        gem_info = rubygems_client.info(gem_name) unless gem_name.empty?
        gem_info
      end.compact_blank
    end

    def gem_names_from_group(call_node)
      statements_node = call_node&.block&.body
      statements_node.body.map { gem_name_from_call_node(_1) }
    end

    def gem_name_from_call_node(call_node) = call_node&.arguments&.arguments&.first&.unescaped

    def gem_method_call?(line) = line.strip.include?("gem ")

    def lines = list.lines
end

Concern 1: I don't like guard conditions to branch by equally possible branches. I prefer to use them to, indeed, guard something (like "if the argument is wrong, quick raise" or "if some exceptional condition happens, quick return"), but in cases like this I prefer bare if to empahsize "here are two equally possible/interesting branches."

  def parse
    if gemfile_format?
      parse_gemfile_format
    else
      parse_single_line_format
    end
  end

Next, my favorite technique to understand, and maybe adjust, relatively complex algorithmic code, is to start with one method. Once upon a time I was a proponent of the "many small methods with descriptive names" culture, but eventually I started to believe that (reasonably) big methods that try to tell the entire story is at least a good starting point to see the whole story.

So... Let's see how it looks:

  def parse
    if lines.any? { gem_method_call?(_1) }
      ast = Prism.parse(list)
      root_node = ast.value

      parsed_gems = []

      root_node.statements.body.each do |call_node|
        case call_node.name
        when :gem
          name = gem_name_from_call_node(call_node)
          parsed_gems << name
        when :group
          names = gem_names_from_group(call_node)
          parsed_gems.concat(names) if names.present?
        end
      end

      parsed_gems.compact_blank.map { rubygems_client.info(_1) }
    else
      lines.filter_map do |line|
        gem_name = line.strip
        gem_info = rubygems_client.info(gem_name) unless gem_name.empty?
        gem_info
      end.compact_blank
    end
  end

This is so far just one round of flattening (three main methods inlined), let's take this a bit further.

Having a condition:

if lines.any? { gem_method_call?(_1) }

...and embedding the call:

if lines.any? { _1.strip.include?("gem ") }

...and now to atomize it into chained operations (which allows to use a form of any? that accepts pattern):

if lines.map(&:strip).any?(/gem /)

Now, the smaller branch:

lines.filter_map do |line|
  gem_name = line.strip
  gem_info = rubygems_client.info(gem_name) unless gem_name.empty?
  gem_info
end.compact_blank

I would also try to atomize the chained calls:

lines.map(&:strip).reject(&:empty?).map { rubygems_client.info(_1) }

Now, I think I saw this lines.map(&:strip) somewhere 🤔

Let's extract it:

lines = list.lines.map(&:strip).reject(&:empty?)

if lines.any?(/gem /)
  # ..ignore Prism branch for a moment...
else
  lines.map { rubygems_client.info(_1) }
end

Now to the Prism branch!

So far, we have this:

ast = Prism.parse(list)
root_node = ast.value

parsed_gems = []

root_node.statements.body.each do |call_node|
  case call_node.name
  when :gem
    name = gem_name_from_call_node(call_node)
    parsed_gems << name
  when :group
    names = gem_names_from_group(call_node)
    parsed_gems.concat(names) if names.present?
  end
end

parsed_gems.compact_blank.map { rubygems_client.info(_1) }

One round of "more functional iterators" and "more chains":

Prism.parse(list).value.statements.body.flat_map { |call_node|
  case call_node.name
  when :gem
    gem_name_from_call_node(call_node)
  when :group
    gem_names_from_group(call_node)
  end
}.compact.map { rubygems_client.info(_1) }

(I tend to "embed every local var that is needed only on the next line" initially, with then, possibly, extracting some non-trivial things to name and debug them.)

Now, one more round of "embed everything":

Prism.parse(list).value.statements.body.flat_map { |call_node|
  case call_node.name
  when :gem
    call_node.arguments&.arguments&.first&.unescaped
  when :group
    call_node.block&.body&.body&.map { _1.arguments&.arguments&.first&.unescaped }
  end
}.compact.map { rubygems_client.info(_1) }

(I made one less &. hre where we definitely should have an object. Without testing and looking deeper into Prism docs, it is not clear what else is impossible to be nil)

Now, the content of "gem name from call node" is actually repeated, so it might be reasonable to have in method... Or, actually, we might notice that it dangles on the end of every case branch, so we just can put it outside:

Prism.parse(list).value.statements.body.flat_map { |call_node|
  case call_node.name
  when :gem then call_node
  when :group then call_node.block&.body&.body
  end
}.compact
 .filter_map { _1.arguments&.arguments&.first&.unescaped }
 .map { rubygems_client.info(_1) }

So, here is more or less the whole picture...

def parse
  lines = list.lines.map(&:strip).reject(&:empty?)

  if lines.any?(/gem /)
    Prism.parse(list).value.statements.body.flat_map { |call_node|
      case call_node.name
      when :gem then call_node
      when :group then call_node.block&.body&.body
      end
    }.compact
     .filter_map { _1.arguments&.arguments&.first&.unescaped }
     .map { rubygems_client.info(_1) }
  else
    lines.map { rubygems_client.info(_1) }
  end
end

...but wait, actually every branch ends with the call to rubygems_client, so...

def parse
  lines = list.lines.map(&:strip).reject(&:empty?)

  if lines.any?(/gem /)
    Prism.parse(list).value.statements.body.flat_map { |call_node|
      case call_node.name
      when :gem then call_node
      when :group then call_node.block&.body&.body
      end
    }.compact.filter_map { _1.arguments&.arguments&.first&.unescaped }
  else
    lines
  end.map { rubygems_client.info(_1) }
end

...but wait, now it is obvious that one of the branches does no processing whatsoever, so...

class Parser
  def initialize(list, rubygems_client: RubyGems::Client.new)
    @list = list
    @rubygems_client = rubygems_client
  end

  private attr_reader :list, :rubygems_client

  def parse
    gem_names = list.lines.map(&:strip).reject(&:empty?)

    if gem_names.any?(/gem /)
      # It is actually a Gemfile, parse it with prism!
      gem_names = Prism.parse(list).value.statements.body.flat_map { |call_node|
        case call_node.name
        when :gem then call_node
        when :group then call_node.block&.body&.body
        end
      }.compact.filter_map { _1.arguments&.arguments&.first&.unescaped }
    end

    gem_names.map { rubygems_client.info(_1) }
  end
end

Now, this code can be split back into methods once it matures (though with this amount of code, I'd let it live for some time), but the sequence/structure of these methods would be different from the original. Now we localized the most non-trivial thing (Prism parsing), and split the whole process into two major steps (guess gem names → get gems info) instead of two branches (get names, then infos from text file / get names, then infos from Gemfile).

We can now also change the body of big flat_map (find all gem "foo" calls) to something more robust, as long as it returns necessary call nodes, without affecting any of the rest of the code logic.

Hope it was entertaining :) For me, it was!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment