What is the fastest way to iterate over a hash in C?

Brian_Takita · 1 October 2006 05:03

Hello,

I'm working on the Fast JSON project and have come up against a puzzling
performance quirk.

Most of the Hash#to_json functionality is implemented in C and performs much
better. However, there is one section that performs about 6 times better when
implemented in Ruby than in C.

I wrote a benchmark that calls to_json on 50000 hashes.
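
For reference, a timing harness along these lines could look like the sketch below. It is not the actual benchmark script: the helper names `build_sample_hashes` and `time_serialization` and the shape of the sample hashes are assumptions made for illustration, and the standard `json` library stands in for Fast JSON's `to_json`.

```ruby
require "json"

# Build `count` small hashes to serialize. The shape of each hash is an
# assumption; the data used in the original benchmark is not shown.
def build_sample_hashes(count)
  Array.new(count) { |i| { "id" => i, "name" => "item#{i}", "tags" => ["a", "b"] } }
end

# Run the given block once per hash and return the elapsed wall-clock seconds.
def time_serialization(hashes)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  hashes.each { |hash| yield hash }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

hashes = build_sample_hashes(50_000)
elapsed = time_serialization(hashes) { |hash| hash.to_json }
puts format("50000 hashes: %.2f seconds", elapsed)
```

Building the hashes outside the timed block keeps allocation of the test data out of the measurement, so only the serialization call is timed.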

Here is the method in Ruby. The benchmark takes around 1.7 seconds:

  def process_internal_json(json, state, depth, delim)
    first = true
    each { |key,value|
      if first
        first = false
      else
        json << delim
      end
      generate_key_value_json(json, state, depth, key, value)
    }
    json
  end

Here is the method in C. The benchmark takes around 9.5 seconds:

static VALUE process_internal_json(VALUE self, VALUE json, VALUE state,
                                   VALUE depth, VALUE delim) {
  int first = 1;
  /* Copy the hash into an array of [key, value] pairs. */
  VALUE key_value_pairs = rb_funcall(self, rb_intern("to_a"), 0);

  VALUE key_value = Qnil;
  /* Pop pairs off the end of the array until it is empty. */
  while((key_value = rb_ary_pop(key_value_pairs)) != Qnil) {
    if(first == 1) {
      first = 0;
    }
    else {
      rb_str_concat(json, delim);
    }
    VALUE value = rb_ary_pop(key_value);
    VALUE key = rb_ary_pop(key_value);
    generate_key_value_json(self, json, state, depth, key, value);
  }
  return json;
}

It seems like there is some optimization in the Hash#each method. I'm trying
to figure out how to get that same performance benefit in C. Perhaps it is
not worth it, though.

Does anybody know what is going on?

Thank you,
Brian Takita

Brian_Takita · 1 October 2006 05:21

I found a better solution in C. This version brings the benchmark down to
about 1.4 seconds.

static VALUE process_internal_json(VALUE self, VALUE json, VALUE state,
                                   VALUE depth, VALUE delim) {
  /* Copy the hash into an array of [key, value] pairs. */
  VALUE key_value_pairs = rb_funcall(self, rb_intern("to_a"), 0);

  int i;
  /* Read each pair by index instead of popping it off the array. */
  for(i = 0; i < RARRAY(key_value_pairs)->len; i++) {
    if(i > 0) {
      rb_str_concat(json, delim);
    }
    VALUE key_value = rb_ary_entry(key_value_pairs, i);
    VALUE key = rb_ary_entry(key_value, 0);
    VALUE value = rb_ary_entry(key_value, 1);
    generate_key_value_json(self, json, state, depth, key, value);
  }
  return json;
}
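
To compare the three traversal strategies from this thread in isolation, here is a rough Ruby sketch of each one. It only illustrates the iteration patterns and is not the Fast JSON code: `append_pair` is a hypothetical stand-in for `generate_key_value_json`, and the `join_with_*` names are made up for this example.

```ruby
# Hypothetical stand-in for generate_key_value_json: appends "key":value.
def append_pair(json, key, value)
  json << key.to_s.inspect << ":" << value.inspect
end

# Strategy 1: Hash#each, which walks the hash without an intermediate array.
def join_with_each(hash, delim)
  json = String.new
  first = true
  hash.each do |key, value|
    if first
      first = false
    else
      json << delim
    end
    append_pair(json, key, value)
  end
  json
end

# Strategy 2: to_a, then pop pairs off the end (mirrors the first C version).
# Popping consumes the copied array and visits the pairs last to first.
def join_with_pop(hash, delim)
  json = String.new
  pairs = hash.to_a
  first = true
  while (pair = pairs.pop)
    if first
      first = false
    else
      json << delim
    end
    value = pair.pop
    key = pair.pop
    append_pair(json, key, value)
  end
  json
end

# Strategy 3: to_a, then read pairs by index (mirrors the second C version).
def join_with_index(hash, delim)
  json = String.new
  pairs = hash.to_a
  pairs.length.times do |i|
    json << delim if i > 0
    key, value = pairs[i]
    append_pair(json, key, value)
  end
  json
end
```

On a Ruby version where hashes keep insertion order, the `each` and index versions emit the pairs in the same order, while the pop version emits them reversed. Each of these functions can be passed to a timing loop to see how the strategies compare on a given Ruby build.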

