Hello Rubyists,
May I show you the results of my benchmark for Perl 5, Ruby, and Scala?
https://blog.cloudcache.net/benchmark-for-scala-ruby-and-perl/
Any suggestions for improving it are welcome.
Thank you.
Jon
Not perfect, but a quick edit of your ruby script:
stopwords = {}
File.open('stopwords.txt').each_line do |s|
s.chomp!
stopwords[s] = 1
end
count = Hash.new(0)
File.open('words.txt').each_line do |s|
s.chomp!
count[s] += 1 unless stopwords[s]
end
count.sort_by{|_,c| -c}.take(20).each do |s|
puts "#{s[0]} -> #{s[1]}"
end
It may even be a little bit faster:
Calculating:
org 0.170 (± 0.0%) i/s - 6.000 in 35.518822s
new 0.214 (± 0.0%) i/s - 7.000 in 33.074879s
new 0.215 (± 0.0%) i/s - 7.000 in 32.760206s
org 0.175 (± 0.0%) i/s - 6.000 in 34.557145s
Comparison:
new: 0.2 i/s
new: 0.2 i/s - 1.01x (± 0.00) slower
org: 0.2 i/s - 1.23x (± 0.00) slower
org: 0.2 i/s - 1.26x (± 0.00) slower
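The figures above look like output from the benchmark-ips gem. For anyone wanting to reproduce this kind of comparison with only the standard library, here is a rough sketch using Benchmark.bmbm (which does a rehearsal pass before the measured pass); the synthetic `words` array is a stand-in for the real words.txt data, so the absolute numbers will differ:

```ruby
require 'benchmark'

# Synthetic stand-ins for words.txt and stopwords.txt
words = Array.new(200_000) { |i| "word#{i % 5_000}\n" }
stop  = { "word0" => 1, "word1" => 1 }

counted = nil
Benchmark.bmbm do |x|
  x.report('org') do
    counted = Hash.new(0)
    # chomp each line, then count it unless it is a stopword
    words.each { |s| w = s.chomp; counted[w] += 1 unless stop[w] }
  end
end
```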
On 1/15/22, Jon Smart <jon@smartown.nl> wrote:
May I show the result of my benchmark for perl5, ruby, and scala?
https://blog.cloudcache.net/benchmark-for-scala-ruby-and-perl/
Welcome you to give any suggestion to me for improving this.
On my machine your ruby script ran in
real 0m2.824s (an average-ish figure from ten runs)
user 0m2.757s
sys 0m0.057s
However a slightly tweaked version ran in
real 0m2.597s (also an average figure from 10 runs)
user 0m2.430s
sys 0m0.146s
stopwords = File.open('stopwords.txt').read.split("\n")
count = Hash.new(0)
File.open('words.txt').read.split("\n").each do |s|
count[s] += 1
end
stopwords.each { |s| count.delete(s) }
z = count.sort_by {|k, v| -v}
z.take(20).each do |s| puts "#{s[0]} -> #{s[1]}" end
A thing of note: simply reading the files accounts for around 46% of the
total time
stopwords = File.open('stopwords.txt').read.split("\n")
words = File.open('words.txt').read.split("\n")
real 0m1.220s
user 0m1.063s
sys 0m0.140s
For giggles I hacked this up in Lua (v5.4.3). Being Lua, there is less by
way of "sugar" / "convenience", so the code is a lot less concise. But in
exchange you get improved performance
real 0m1.541s
user 0m1.515s
sys 0m0.022s
local stopwords = {}
local file1 = io.open("stopwords.txt", "r")
for line in file1:lines() do
stopwords[line] = 1
end
local count = {}
local file2 = io.open("words.txt", "r")
for line in file2:lines() do
if stopwords[line] == nil then
if count[line] == nil then
count[line] = 1
else
count[line] = count[line] + 1
end
end
end
local keys = {}
for key, _ in pairs(count) do
table.insert(keys, key)
end
table.sort(keys, function(lhs, rhs) return count[lhs] > count[rhs] end)
for i=1,20 do
print(keys[i], count[keys[i]])
end
As a further note, just reading the files in Lua took
real 0m1.312s
user 0m1.291s
sys 0m0.017s
So again it suggests that your benchmarks are really testing the performance
of your storage medium / the underlying C interface used to read the files
Frank, your code looks graceful, thanks.
A hacker changed the read to mmap and made it considerably faster, because
it reduces the number of system calls from hundreds to just 2.
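Ruby's standard library has no mmap (it needs a third-party gem), but much of the same syscall reduction can be had by reading the file in large chunks instead of line by line, carrying any partial trailing line over to the next chunk. A minimal sketch, using a temporary file in place of the real words.txt:

```ruby
require 'tempfile'

# Build a sample input file: 10,000 words, one per line
file = Tempfile.new('words')
file.write(Array.new(10_000) { |i| "w#{i % 7}" }.join("\n"))
file.flush

count = Hash.new(0)
carry = +""
File.open(file.path) do |f|
  # Each f.read is one read(2) syscall covering up to 1 MiB
  while (chunk = f.read(1 << 20))
    chunk = carry + chunk
    lines = chunk.split("\n", -1)
    carry = lines.pop              # may be a partial line; finish it next pass
    lines.each { |w| count[w] += 1 }
  end
end
count[carry] += 1 unless carry.empty?  # last line had no trailing newline

puts "distinct words: #{count.size}, total: #{count.values.sum}"
# → distinct words: 7, total: 10000
```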
On 16.01.2022 00:34, Frank J. Cameron wrote:
Peter, do both read_line and read use the same buffered IO in the language?
I ran this benchmark on a subset of my actual data. The full data set has xxx millions of words as input. Reading the whole file into RAM at once would run out of memory.
Thank you.
On 16.01.2022 03:15, Peter Hickman wrote:
Peter, do both read_line and read use the same buffered IO in the language?
Not sure, I just did this for the data as it was
As a calibration I ran your perl version on my machine and got
real 0m1.445s
user 0m1.417s
sys 0m0.022s
If we are looking at really big data then you would need to see how much
memory is being consumed by the process. Something that eats a lot of
memory will behave differently when it has enough RAM vs. when the system
is hitting swap
Here are some figures. I slept 60 seconds at the end so I could run "ps aux
| grep ruby" (or perl or lua) in another window once it had done its thing
USER PID %CPU %MEM VSZ RSS TT STAT STARTED
TIME COMMAND
peterhickman 89328 0.0 0.2 34845928 31456 s002 S+ 8:39pm
0:02.91 ruby script
peterhickman 89355 0.0 2.7 35439260 449224 s002 S+ 8:40pm
0:02.60 ruby script2
peterhickman 89398 0.0 0.1 34675524 23720 s002 S+ 8:43pm
0:01.42 perl perl
peterhickman 89541 100.0 0.1 34663240 16488 s002 R+ 8:56pm
0:03.42 lua fred.lua
"script" is your original Ruby version, "script2" is mine (yup, we are
eating the memory) and "perl" is your original Perl version. "fred.lua" is
mine too.
So my Ruby script is faster(ish) but eats 10x more memory. Perl uses less
memory than your script and Lua does best.
How this would translate to your real data set is anyone's guess,
except that I'm putting money on my Ruby version being the worst.
On Sat, 15 Jan 2022 at 19:54, Jon Smart <jon@smartown.nl> wrote:
For yet more fun, I took a somewhat optimized version of the original, that
still uses each_line, and compared it to a similar implementation in
Crystal.
I didn't use your version, Peter, because the use of File.read means that
you are pulling the whole file into RAM first, and as the OP suggests,
that could be problematic on some systems, with some data sets.
On my system, running Ruby 3.1.0, I got it down to about 1.8 seconds per
iteration with the following code:
stopwords = []
File.open('stopwords.txt').each_line do |s|
stopwords << s
end
count = Hash.new(0)
File.open('words.txt').each_line do |s|
count[s] += 1
end
stopwords.each {|s| count.delete(s)}
count.sort_by{|_,c| -c}.take(20).each do |s|
puts "#{s[0].chomp} -> #{s[1]}"
end
Using YJIT shaves a very tiny amount off of that. It tended to be about 5
hundredths of a second, even when I built it out in a benchmarkable format,
and repeated the count and sort process a couple dozen times. MJIT was
slower for a single iteration by about 50%, but was comparable for 20
iterations.
I then ran almost the same code under Crystal:
stopwords = [] of String
File.open("stopwords.txt").each_line do |s|
stopwords << s
end
count = Hash(String, Int32).new(0)
File.open("words.txt").each_line do |w|
count[w] += 1
end
stopwords.each {|s| count.delete(s)}
count.to_a.sort_by {|_,c| -c}[0..19].each do |s|
puts "#{s[0].chomp} -> #{s[1]}"
end
It consistently runs in about 0.76 to 0.77 seconds.
RAM usage was interesting.
Name VSZ RSS Notes
ruby count.rb 97616 39536 Ruby 3.1.0 without a JIT
ruby --yjit count.rb 360236 302028 Ruby 3.1.0 with YJIT
ruby --mjit count.rb 171464 39780 Ruby 3.1.0 with MJIT
count 162276 19884 Crystal 1.3.1
This task is heavily influenced by the speed of the underlying IO
subsystem, so any gains from YJIT were quite minimal in the face of that,
at the expense of a lot more RAM use.
Kirk Haines
On Sat, Jan 15, 2022 at 2:12 PM Peter Hickman < peterhickman386@googlemail.com> wrote:
I tried this on my system, version 1.3.1, and it came in with
real 0m4.474s
user 0m4.209s
sys 0m0.059s
So pretty much the worst of the bunch time-wise. Memory-wise it was
peterhickman 13455 0.0 0.1 34447596 21212 s002 S+ 12:10am
0:04.26 ./ccc
which puts it between perl and lua for memory usage
Could you post the times for Jon's initial ruby script? It will help me
calibrate your machine against mine
On Sat, 15 Jan 2022 at 22:15, Kirk Haines <wyhaines@gmail.com> wrote:
Compile it with --release.
Kirk
On Sat, Jan 15, 2022, 5:17 PM Peter Hickman <peterhickman386@googlemail.com> wrote:
Now that I have a little more time to write and can make complete
sentences: given the timings that you reported for Ruby, I am pretty sure
that you are compiling the Crystal code in development mode rather than
release mode.
crystal build --release count.cr
You will probably find that it runs in around a second on your machine.
Kirk Haines
On Sat, Jan 15, 2022, 5:17 PM Peter Hickman <peterhickman386@googlemail.com> wrote:
On Sat, 15 Jan 2022 at 22:15, Kirk Haines <wyhaines@gmail.com> wrote:
Oh yeah, that was it. Much faster now
real 0m1.003s
user 0m0.748s
sys 0m0.062s
Same memory usage. I had just downloaded Crystal and run it; I should have
thought about it more
Thanks
On Sun, 16 Jan 2022 at 05:29, Kirk Haines <wyhaines@gmail.com> wrote:
I have to agree with Peter here: it is not very clear what,
*precisely*, you are trying to benchmark.
For example, in your benchmark, you include the time taken to print
the result to the console. It is well known that the console is slow,
and this is a property of the *console*, not the program you are
benchmarking or the language implementation you are using. You also
include the *startup time* of the implementation in your benchmark,
which e.g. for the JVM can be significant. You include in your
benchmark the time spent reading the files from the harddisk and
parsing them – this is at least partially dependent on the performance
of your harddisk, your filesystem, your operating system, and whether
or not the files are in the cache, none of which has anything
*specifically* to do with the language or the program.
However, in your follow-up blog post, you mention that your *actual*
application is about *streaming data*.
This means you are actually including lots of irrelevant operations
in your benchmark:
* The speed of the terminal is irrelevant, since in your real
application you are not printing to the console.
* The startup time of the VM is irrelevant, because it will only be
started once and then constantly process streaming data.
* The time for reading and parsing the stopword list is irrelevant,
since it will only be done once at application startup.
* The filesystem performance is irrelevant, because the data will not
come from the filesystem but some form of message queue.
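Given those points, a benchmark of the counting work alone would build its data up front and time only the hash-counting loop, excluding VM startup, file IO, and printing. A minimal stdlib sketch (synthetic data standing in for the real word stream):

```ruby
require 'benchmark'

# Built outside the timed region: a synthetic word stream and stopword set
words = Array.new(500_000) { |i| "w#{i % 10_000}" }
stop  = Array.new(100) { |i| ["w#{i}", true] }.to_h

count = Hash.new(0)
# Only the counting loop is measured
elapsed = Benchmark.realtime do
  words.each { |w| count[w] += 1 unless stop[w] }
end
puts format("counting only: %.3fs", elapsed)
```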
The *real* takeaway here is that *benchmarking is hard*. It is no
coincidence that the benchmarks which are used in the industry are
written by benchmark engineers who *only* write benchmarks 24/7 and do
nothing else. And even *they* sometimes get it wrong: I can't remember
the details, but there was a famous example of a SPEC benchmark that
was *supposed* to test database performance but *actually* tested
memory allocator performance of the benchmark runner!
If I remember correctly, the problem was that the actual database
operations the benchmark performed were so trivial that the most
expensive operation was not the database query but allocating the
result set object in the test harness. So, the score in this
particular "database benchmark" had absolutely nothing to do with the
database and was purely a measure of how fast the computer that the
test harness was running on could allocate memory.
Another example is an old benchmark from the Computer Language
Benchmark Game, where Haskell beat C, C++, and even hand-optimized
assembly by a *massive* margin. The problem was that the benchmark was
about sorting a gigantic array, but the benchmark never did anything
with the sorted array. It just sorted the array, and then ignored the
result. The Haskell compiler was clever enough to recognize that the
sorted array was never used, so it optimized away the sorting, and
since now the unsorted input array was not used as well, it also
optimized away the array itself, and lastly, it optimized away the
code that reads the array from the input. The result was almost like
`void main() { exit 0; }`. Whereas the C and C++ language
specifications do not allow for such optimizations, and the
hand-optimized assembly did the sort and then ignored the result.
All this is to say that creating *representative* benchmarks with
*statistically significant and robust results* is very hard. I
certainly would never dare to try it myself.
If I were you, as a first step, I would carefully decide what,
*exactly*, I want to measure. In your case, I think it does not make
sense to include VM startup, printing, or building the stopword set.
It might make sense to include reading the wordlist from disk but it
might also make sense to exclude it – that really depends on your use
case.
It might make sense to choose different data structures or different
algorithms altogether. For example, you are only looking at the top 20
elements. Currently, you are sorting the entire thing in order to find
the top 20 elements, which is O(n * log n). But you don't actually
need to sort at all to find the top 20 elements and can do it in O(n).
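In Ruby, Enumerable#max_by with a numeric argument already does this kind of partial selection: it tracks a bounded set of best candidates rather than ordering every element (a strict O(n) bound would need something like quickselect, but this avoids the full sort):

```ruby
# Toy word counts standing in for the real hash
count = { "the" => 120, "mat" => 99, "cat" => 42, "sat" => 7, "on" => 3 }

# Top 3 by count, without sorting the whole hash; result is in
# descending order of the value
top = count.max_by(3) { |_word, c| c }
top.each { |word, c| puts "#{word} -> #{c}" }
# → the -> 120
#   mat -> 99
#   cat -> 42
```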
Cheers.