Optimization help - reading out of /proc on Solaris

Hi all,

Ruby 1.8.6
Solaris 10

I recently converted a C extension that gets process table information on
Solaris into pure Ruby. I knew it would be slower; I just didn't
realize how _much_ slower it would be. I was expecting the pure Ruby
version to be about 1/10th as fast. Instead, it's about 1/70th as
fast. Anticipating the "Is it fast enough?" question, my answer is
"I'm not sure". Besides, tuning can be fun. :)

Anyway, below is the code. I ran it through the profiler, but the top
two most costly ops were Dir.foreach, which I don't see any way to
optimize*, and the loop that gathers environment information, which I
again see no way to optimize.

# sunos.rb

#
# A pure Ruby version of sys-proctable for SunOS 5.8 or later
#--
# Directories under /proc on Solaris 2.8+

# The Sys module serves as a namespace only.
module Sys

   # The ProcTable class encapsulates process table information.
   class ProcTable

      # The version of the sys-proctable library
      VERSION = '0.8.0'

      private

      PRNODEV = -1 # non-existent device

      FIELDS = [
         :flag, # process flags (deprecated)
         :nlwp, # number of active lwp's in the process
         :pid, # unique process id
         :ppid, # process id of parent
         :pgid, # pid of process group leader
         :sid, # session id
         :uid, # real user id
         :euid, # effective user id
         :gid, # real group id
         :egid, # effective group id
         :addr, # address of the process
         :size, # size of process in kbytes
         :rssize, # resident set size in kbytes
         :ttydev, # tty device (or PRNODEV)
         :pctcpu, # % of recent cpu used by all lwp's
         :pctmem, # % of system memory used by process
         :start, # absolute process start time
         :time, # usr + sys cpu time for this process
         :ctime, # usr + sys cpu time for reaped children
         :fname, # name of the exec'd file
         :psargs, # initial characters of argument list
         :wstat, # if a zombie, the wait status
         :argc, # initial argument count
         :argv, # address of initial argument vector
         :envp, # address of initial environment vector
         :dmodel, # data model of the process
         :taskid, # task id
         :projid, # project id
         :nzomb, # number of zombie lwp's in the process
         :poolid, # pool id
         :zoneid, # zone id
         :contract, # process contract
         :lwpid, # lwp id
         :wchan, # wait address for sleeping lwp
         :stype, # synchronization event type
         :state, # numeric lwp state
         :sname, # printable character for state
         :nice, # nice for cpu usage
         :syscall, # system call number (if in syscall)
         :pri, # priority
         :clname, # scheduling class name
         :name, # name of system lwp
         :onpro, # processor which last ran this lwp
         :bindpro, # processor to which lwp is bound
         :bindpset, # processor set to which lwp is bound
         :count, # number of contributing lwp's
         :tstamp, # current time stamp
         :create, # process/lwp creation time stamp
         :term, # process/lwp termination time stamp
         :rtime, # total lwp real (elapsed) time
         :utime, # user level cpu time
         :stime, # system call cpu time
         :ttime, # other system trap cpu time
         :tftime, # text page fault sleep time
         :dftime, # data page fault sleep time
         :kftime, # kernel page fault sleep time
         :ltime, # user lock wait sleep time
         :slptime, # all other sleep time
         :wtime, # wait-cpu (latency) time
         :stoptime, # stopped time
         :minf, # minor page faults
         :majf, # major page faults
         :nswap, # swaps
         :inblk, # input blocks
         :oublk, # output blocks
         :msnd, # messages sent
         :mrcv, # messages received
         :sigs, # signals received
         :vctx, # voluntary context switches
         :ictx, # involuntary context switches
         :sysc, # system calls
         :ioch, # chars read and written
         :path, # hash of symbolic link paths from /proc/<pid>/path
         :contracts, # hash of symbolic link paths from /proc/<pid>/contracts
         :fd, # array of used file descriptors
         :cmd_args, # array of command line arguments
         :environ # hash of environment associated with the process
      ]

      public

      ProcTableStruct = Struct.new("ProcTableStruct", *FIELDS)

      # In block form, yields a ProcTableStruct for each process entry that you
      # have rights to. This method returns an array of ProcTableStruct's in
      # non-block form.
      #
      # If a +pid+ is provided, then only a single ProcTableStruct is yielded or
      # returned, or nil if no process information is found for that +pid+.
      #
      # Example:
      #
      #    # Iterate over all processes
      #    ProcTable.ps do |proc_info|
      #       p proc_info
      #    end
      #
      #    # Print process table information for only pid 1001
      #    p ProcTable.ps(1001)
      #
      def self.ps(pid = nil)
         array = block_given? ? nil : []

         Dir.foreach("/proc") do |file|
            next if file =~ /\D/ # Skip non-numeric entries under /proc

            # Only return information for a given pid, if provided
            if pid
               next unless file.to_i == pid
            end

            # Skip over any entries we don't have permissions to read
            begin
               psinfo = IO.read("/proc/#{file}/psinfo")
            rescue StandardError, Errno::EACCES
               next
            end

            struct = ProcTableStruct.new

            struct.flag = psinfo[0,4].unpack("i")[0] # pr_flag
            struct.nlwp = psinfo[4,4].unpack("i")[0] # pr_nlwp
            struct.pid = psinfo[8,4].unpack("i")[0] # pr_pid
            struct.ppid = psinfo[12,4].unpack("i")[0] # pr_ppid
            struct.pgid = psinfo[16,4].unpack("i")[0] # pr_pgid
            struct.sid = psinfo[20,4].unpack("i")[0] # pr_sid
            struct.uid = psinfo[24,4].unpack("i")[0] # pr_uid
            struct.euid = psinfo[28,4].unpack("i")[0] # pr_euid
            struct.gid = psinfo[32,4].unpack("i")[0] # pr_gid
            struct.egid = psinfo[36,4].unpack("i")[0] # pr_egid
            struct.addr = psinfo[40,4].unpack("L")[0] # pr_addr

            struct.size = psinfo[44,4].unpack("L")[0] * 1024 # pr_size
            struct.rssize = psinfo[48,4].unpack("L")[0] * 1024 # pr_rssize

            # skip pr_pad1

            # TODO: Convert this to a human readable string somehow
            struct.ttydev = psinfo[56,4].unpack("i")[0] # pr_ttydev

            # pr_pctcpu
            struct.pctcpu = (psinfo[60,2].unpack("S")[0] * 100).to_f / 0x8000

            # pr_pctmem
            struct.pctmem = (psinfo[62,2].unpack("S")[0] * 100).to_f / 0x8000

            struct.start = Time.at(psinfo[64,8].unpack("L")[0]) # pr_start
            struct.time = psinfo[72,8].unpack("L")[0] # pr_time
            struct.ctime = psinfo[80,8].unpack("L")[0] # pr_ctime

            struct.fname = psinfo[88,16].strip # pr_fname
            struct.psargs = psinfo[104,80].strip # pr_psargs
            struct.wstat = psinfo[184,4].unpack("i")[0] # pr_wstat
            struct.argc = psinfo[188,4].unpack("i")[0] # pr_argc
            struct.argv = psinfo[192,4].unpack("L")[0] # pr_argv
            struct.envp = psinfo[196,4].unpack("L")[0] # pr_envp
            struct.dmodel = psinfo[200,1].unpack("C")[0] # pr_dmodel

            # skip pr_pad2

            struct.taskid = psinfo[204,4].unpack("i")[0] # pr_taskid
            struct.projid = psinfo[208,4].unpack("i")[0] # pr_projectid
            struct.nzomb = psinfo[212,4].unpack("i")[0] # pr_nzomb
            struct.poolid = psinfo[216,4].unpack("i")[0] # pr_poolid
            struct.zoneid = psinfo[220,4].unpack("i")[0] # pr_zoneid
            struct.contract = psinfo[224,4].unpack("i")[0] # pr_contract

            # skip pr_filler

            ### lwpsinfo struct info

            # skip pr_flag

            struct.lwpid = psinfo[236,4].unpack("i")[0] # pr_lwpid

            # skip pr_addr

            struct.wchan = psinfo[244,4].unpack("L")[0] # pr_wchan
            struct.stype = psinfo[248,1].unpack("C")[0] # pr_stype
            struct.state = psinfo[249,1].unpack("C")[0] # pr_state
            struct.sname = psinfo[250,1] # pr_sname
            struct.nice = psinfo[251,1].unpack("C")[0] # pr_nice
            struct.syscall = psinfo[252,2].unpack("S")[0] # pr_syscall

            # skip pr_oldpri
            # skip pr_cpu

            struct.pri = psinfo[256,4].unpack("i")[0] # pr_pri

            # skip pr_pctcpu
            # skip pr_pad
            # skip pr_start
            # skip pr_time

            struct.clname = psinfo[280,8].strip # pr_clname
            struct.name = psinfo[288,16].strip # pr_name
            struct.onpro = psinfo[304,4].unpack("i")[0] # pr_onpro
            struct.bindpro = psinfo[308,4].unpack("i")[0] # pr_bindpro
            struct.bindpset = psinfo[312,4].unpack("i")[0] # pr_bindpset

            # Get the full command line out of /proc/<pid>/as.
            begin
               fd = File.open("/proc/#{file}/as")

               fd.sysseek(struct.argv, IO::SEEK_SET)
               address = fd.sysread(struct.argc * 4).unpack("L")[0]

               struct.cmd_args = []

               0.upto(struct.argc - 1){ |i|
                  fd.sysseek(address, IO::SEEK_SET)
                  data = fd.sysread(128)[/^[^\0]*/] # Null strip
                  struct.cmd_args << data
                  address += data.length + 1 # Add 1 for the terminating NUL
               }

               # Get the environment hash associated with the process.
               struct.environ = {}

               fd.sysseek(struct.envp, IO::SEEK_SET)

               env_address = fd.sysread(128).unpack("L")[0]

               loop do
                  fd.sysseek(env_address, IO::SEEK_SET)
                  data = fd.sysread(1024)[/^[^\0]*/] # Null strip
                  break if data.empty?
                  key, value = data.split('=')
                  struct.environ[key] = value
                  env_address += data.length + 1 # Add 1 for the terminating NUL
               end
            rescue Errno::EACCES, Errno::EOVERFLOW, EOFError
               # Skip this if we don't have proper permissions, if there's
               # no associated environment, or if there's a largefile issue.
            ensure
               fd.close if fd
            end

            ### struct prusage

            begin
               prusage = 0.chr * 512
               prusage = IO.read("/proc/#{file}/usage")

               # skip pr_lwpid
               struct.count = prusage[4,4].unpack("i")[0] # pr_count
               struct.tstamp = prusage[8,8].unpack("L")[0] # pr_tstamp
               struct.create = prusage[16,8].unpack("L")[0] # pr_create
               struct.term = prusage[24,8].unpack("L")[0] # pr_term
               struct.rtime = prusage[32,8].unpack("L")[0] # pr_rtime
               struct.utime = prusage[40,8].unpack("L")[0] # pr_utime
               struct.stime = prusage[48,8].unpack("L")[0] # pr_stime
               struct.ttime = prusage[56,8].unpack("L")[0] # pr_ttime
               struct.tftime = prusage[64,8].unpack("L")[0] # pr_tftime
               struct.dftime = prusage[72,8].unpack("L")[0] # pr_dftime
               struct.kftime = prusage[80,8].unpack("L")[0] # pr_kftime
               struct.ltime = prusage[88,8].unpack("L")[0] # pr_ltime
               struct.slptime = prusage[96,8].unpack("L")[0] # pr_slptime
               struct.wtime = prusage[104,8].unpack("L")[0] # pr_wtime
               struct.stoptime = prusage[112,8].unpack("L")[0] # pr_stoptime
               struct.minf = prusage[120,4].unpack("L")[0] # pr_minf
               struct.majf = prusage[124,4].unpack("L")[0] # pr_majf
               struct.nswap = prusage[128,4].unpack("L")[0] # pr_nswap
               struct.inblk = prusage[128,4].unpack("L")[0] # pr_inblk
               struct.oublk = prusage[128,4].unpack("L")[0] # pr_oublk
               struct.msnd = prusage[128,4].unpack("L")[0] # pr_msnd
               struct.mrcv = prusage[128,4].unpack("L")[0] # pr_mrcv
               struct.sigs = prusage[128,4].unpack("L")[0] # pr_sigs
               struct.vctx = prusage[128,4].unpack("L")[0] # pr_vctx
               struct.ictx = prusage[128,4].unpack("L")[0] # pr_ictx
               struct.sysc = prusage[128,4].unpack("L")[0] # pr_sysc
               struct.ioch = prusage[128,4].unpack("L")[0] # pr_ioch
            rescue Errno::EACCES
               # Do nothing if we lack permissions. Just move on.
            end

            # Information from /proc/<pid>/path. This is represented as a hash,
            # with the symbolic link name as the key, and the file it links to
            # as the value, or nil if it cannot be found.
            #--
            # Note that cwd information can be gathered from here, too.
            struct.path = {}

            Dir["/proc/#{file}/path/*"].each{ |entry|
               link = File.readlink(entry) rescue nil
               struct.path[File.basename(entry)] = link
            }

            # Information from /proc/<pid>/contracts. This is represented as
            # a hash, with the symbolic link name as the key, and the file
            # it links to as the value.
            struct.contracts = {}

            Dir["/proc/#{file}/contracts/*"].each{ |entry|
               link = File.readlink(entry) rescue nil
               struct.contracts[File.basename(entry)] = link
            }

            # Information from /proc/<pid>/fd. This returns an array of
            # numeric file descriptors used by the process.
            struct.fd = Dir["/proc/#{file}/fd/*"].map{ |f| File.basename(f).to_i }

            if block_given?
               yield struct
            else
               array << struct
            end
         end

         pid ? array[0] : array
      end
   end
end

Thanks,

Dan

* I tried tossing threads at it in a one-thread-per-directory
approach, but they didn't get along with IO.read in MRI, and seemed to
provide no real speed benefit with JRuby.

Anyway, below is the code. I ran it through the profiler, but the top
two most costly ops were Dir.foreach, which I don't see any way to
optimize*, and the loop that gathers environment information, which I
again see no way to optimize.

Could you post your profiling? If you run using "time", how much user
CPU versus system CPU are you using?

Have you tried using Dir.open.each instead of Dir["/foo/*"]? Maybe
globbing is expensive.
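A minimal sketch of the non-globbing variant, in case it helps. The helper name numeric_entries is made up for illustration, and the scratch directory only stands in for something like /proc/<pid>/fd. Whether this actually beats Dir[] on Solaris would need to be measured.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: list the numeric entries of a directory without
# globbing. Dir.foreach yields bare names, so no File.basename is needed.
def numeric_entries(dir)
  result = []
  Dir.foreach(dir) do |name|
    next if /\D/ =~ name   # skips ".", ".." and any other non-numeric name
    result << name.to_i
  end
  result.sort
end

# Demo against a scratch directory standing in for /proc/<pid>/fd.
DEMO_FDS = Dir.mktmpdir do |dir|
  %w[0 1 2 10 notes].each { |n| FileUtils.touch(File.join(dir, n)) }
  numeric_entries(dir)
end

p DEMO_FDS
```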

Your environment loop does a fresh sysread(1024) for each var=val pair,
even if you've only consumed (say) 7 bytes from the previous call. You
would make many fewer system calls if you read a big chunk and chopped
it up afterwards. This may also avoid non-byte-aligned reads.
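The "read a big chunk and chop it up" idea could look roughly like this. It is only a sketch: parse_environ is a hypothetical helper, the sample string stands in for one large read from /proc/<pid>/as, and it assumes the environment strings sit back to back in memory (the original loop makes the same assumption).

```ruby
# Hypothetical helper: turn one NUL-delimited chunk of "KEY=value" strings
# into a hash. An empty string marks the end of the environment block.
def parse_environ(chunk)
  env = {}
  chunk.split("\0").each do |pair|
    break if pair.empty?
    key, value = pair.split('=', 2)   # limit of 2 keeps '=' inside values
    env[key] = value
  end
  env
end

# Synthetic data standing in for a single large read.
sample = "HOME=/export/home/dan\0TERM=xterm\0OPTS=a=b\0\0garbage"
p parse_environ(sample)
```

The split limit is a deliberate choice: with a plain split('='), a value that itself contains '=' would be cut short. A real version would also have to cope with a chunk that ends in the middle of the last string.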

I would also be tempted to write one long unpack instead of lots of
string slicing and unpacking. The overhead here may be negligible, but
the code may end up being smaller and simpler. e.g.

  struct = ProcTableStruct.new(*psinfo.unpack(<<PATTERN))
i i i i
i i i i
i i L L
L x4 i S S
...etc
PATTERN

Perhaps you could combine it with your struct building, e.g.

      FIELDS = [
         [:flag,"i"], # process flags (deprecated)
         [:nlwp,"i"], # number of active lwp's in the process
         ...
         [:size,"L"], # size of process in kbytes
         [:rssize,"L"], # resident set size in kbytes
         [nil,"x4"], # skip pr_pad1
         ... etc
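To make the name/format pairing concrete, here is a small runnable sketch. The field list is shortened and the record is synthetic (built with pack), so the layout is an assumption for illustration, not the real psinfo layout.

```ruby
# Hypothetical, shortened field list: [name, unpack directive] pairs.
# A nil name marks padding, skipped with lowercase "x".
DEMO_FIELDS = [
  [:flag,   "i"],
  [:nlwp,   "i"],
  [:size,   "L"],
  [nil,     "x4"],   # skip four bytes of padding
  [:ttydev, "i"],
  [:pctcpu, "S"]
]

DEMO_NAMES   = DEMO_FIELDS.map { |name, _| name }.compact
DEMO_PATTERN = DEMO_FIELDS.map { |_, fmt| fmt }.join(' ')
DemoStruct   = Struct.new(*DEMO_NAMES)

# Build a fake record with the same layout, then decode it in one call.
DEMO_PACKED = [2, 1, 4096, 7, 0x4000].pack("i i L x4 i S")
DEMO_INFO   = DemoStruct.new(*DEMO_PACKED.unpack(DEMO_PATTERN))

p DEMO_INFO
```

Padding entries produce no value from unpack, which is why the nil names are dropped with compact before the Struct is built.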

HTH,

Brian.

--
Posted via http://www.ruby-forum.com/.

You can optimize the loop body. Profiler output takes a while to get used to.

A few things that caught my attention:

You can combine the first two "next" statements into one. Also, matching
with the regex on the left side is faster AFAIK.

Try to use as few psinfo.unpack as possible, i.e. ideally only 1.

Use the block form of File.open.

Do not use sysread/syswrite/sysseek unless you have to (most of the
time you don't).

You can replace all but the first check of block_given? with a test of
"array". Might be faster.
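Two of these points, sketched on scratch data. The helper names wanted_entry? and read_at are made up for illustration, and no speed claim is intended; this only shows the shape of the code.

```ruby
require 'tempfile'

# Hypothetical predicate merging the two `next` checks: an entry is wanted
# when it is all digits and, if a pid was given, matches that pid.
def wanted_entry?(name, pid = nil)
  return false if /\D/ =~ name      # regex on the left-hand side
  pid.nil? || name.to_i == pid
end

# Block form of File.open: the handle is closed when the block exits,
# even on an exception, so no explicit ensure/close is needed.
# Buffered seek/read is used here instead of sysseek/sysread.
def read_at(path, offset, length)
  File.open(path, 'rb') do |fh|
    fh.seek(offset, IO::SEEK_SET)
    fh.read(length)
  end
end

# Scratch file standing in for a /proc entry.
DEMO_BYTES = Tempfile.create('demo') do |tmp|
  tmp.write('0123456789')
  tmp.flush
  read_at(tmp.path, 3, 4)
end

p wanted_entry?('123'), wanted_entry?('self'), DEMO_BYTES
```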

Have fun!

robert

2008/9/16 Daniel Berger <djberg96@gmail.com>:

I recently converted a C extension to get process table information on
Solaris into a pure Ruby. I knew it would be slower, I just didn't
realize how _much_ slower it would be. I was expecting the pure Ruby
version to be about 1/10th as fast. Instead, it's about 1/70th as
fast. Anticipating the, "Is it fast enough?" question, my answer is,
"I'm not sure". Besides, tuning can be fun. :slight_smile:

Anyway, below is the code. I ran it through the profiler, but the top
two most costly ops were Dir.foreach, which I don't see any way to
optimize*, and the loop that gathers environment information, which I
again see no way to optimize.

--
use.inject do |as, often| as.you_can - without end

Method calls are expensive. Right now he's got 3 per field (array
indexing, unpack, and assignment). He could get that down to 1 per
field (assignment only) pretty easily by following your suggestion.
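One way to check this on your own machine is a quick Benchmark run. The sketch below uses a synthetic 16-byte record rather than real psinfo data, and the method names are made up for illustration; the timings will vary by Ruby version and platform.

```ruby
require 'benchmark'

# Synthetic 16-byte record: four native ints, standing in for psinfo.
DEMO_RECORD = [10, 20, 30, 40].pack("i4")

# One slice, one unpack and one index per field, as in the original post.
def per_field(rec)
  [rec[0, 4].unpack("i")[0], rec[4, 4].unpack("i")[0],
   rec[8, 4].unpack("i")[0], rec[12, 4].unpack("i")[0]]
end

# A single unpack call for the whole record.
def one_shot(rec)
  rec.unpack("i4")
end

n = 200_000
Benchmark.bm(10) do |bm|
  bm.report("per field")  { n.times { per_field(DEMO_RECORD) } }
  bm.report("one unpack") { n.times { one_shot(DEMO_RECORD) } }
end
```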

--
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/

Here's a concept for metaprogramming that I was able to generate
mostly by running regexes to transform the code. I've only tackled
/proc/#{file}/psinfo, but it should be fairly simple to extend to
the other files as well.

# The Sys module serves as a namespace only.
module Sys

   # The ProcTable class encapsulates process table information.
   class ProcTable

      # The version of the sys-proctable library
      VERSION = '0.8.0'

      private

      PRNODEV = -1 # non-existent device

      #Dissecting the format of this, we have a symbol mapping to an unpack format string segment
      #the @ sign followed by a number indicates the offset in the string, and the text following that number is the format
      #of the data to unpack
      FIELDS=[
            [:flag , "@0 i"],
            [:nlwp , "@4 i"],
            [:pid , "@8 i"],
            [:ppid , "@12 i"],
            [:pgid , "@16 i"],
            [:sid , "@20 i"],
            [:uid , "@24 i"],
            [:euid , "@28 i"],
            [:gid , "@32 i"],
            [:egid , "@36 i"],
            [:addr , "@40 L"],
            [:size , "@44 L"],
            [:rssize , "@48 L"],
            [:ttydev , "@56 i"],
            [:pctcpu , "@60 S"],
            [:pctmem , "@62 S"],
            [:start , "@64 L"],
            [:time , "@72 L"],
            [:ctime , "@80 L"],
#note that the A format specifier automatically does what the #strip method does
#so I don't have to call .strip in the ps method
            [:fname , "@88 A16"],
            [:psargs , "@104 A80"],
            [:wstat , "@184 i"],
            [:argc , "@188 i"],
            [:argv , "@192 L"],
            [:envp , "@196 L"],
            [:dmodel , "@200 C"],
            [:taskid , "@204 i"],
            [:projid , "@208 i"],
            [:nzomb , "@212 i"],
            [:poolid , "@216 i"],
            [:zoneid , "@220 i"],
            [:contract , "@224 i"],
            [:lwpid , "@236 i"],
            [:wchan , "@244 L"],
            [:stype , "@248 C"],
            [:state , "@249 C"],
            [:sname , "@250 a1"],
            [:nice , "@251 C"],
            [:syscall , "@252 S"],
            [:pri , "@256 i"],
            [:clname , "@280 A8"],
            [:name , "@288 A16"],
            [:onpro , "@304 i"],
            [:bindpro , "@308 i"],
            [:bindpset , "@312 i"]
      ]

      field_names,format_strings=FIELDS.transpose

      eval <<-"end;"
        def self.first_pass_fill string
          struct=ProcTableStruct.new

          #{ field_names.collect{|x| "struct.#{x}"}.join(", ") } = string.unpack "#{format_strings.join ' '}"
          struct
        end
      end;

      #repeat the above with a new array instead of FIELDS and a new method name
      #for any other file you want to unpack this way

=begin
This eval will define a class method with the following code. The arrays and
metaprogramming are just an easier way to manage the format string and
field names, so that you can understand them when maintenance time comes
around. The method is defined on self because it is called from self.ps, and
it ends with struct so that the filled-in struct is returned.

        def self.first_pass_fill string
          struct=ProcTableStruct.new

          struct.flag, struct.nlwp, struct.pid, struct.ppid, struct.pgid,
          struct.sid, struct.uid, struct.euid, struct.gid, struct.egid, struct.addr,
          struct.size, struct.rssize, struct.ttydev, struct.pctcpu, struct.pctmem,
          struct.start, struct.time, struct.ctime, struct.fname, struct.psargs,
          struct.wstat, struct.argc, struct.argv, struct.envp, struct.dmodel,
          struct.taskid, struct.projid, struct.nzomb, struct.poolid, struct.zoneid,
          struct.contract, struct.lwpid, struct.wchan, struct.stype, struct.state,
          struct.sname, struct.nice, struct.syscall, struct.pri, struct.clname,
          struct.name, struct.onpro, struct.bindpro, struct.bindpset = string.unpack "@0 i
          @4 i @8 i @12 i @16 i @20 i @24 i @28 i @32 i @36 i @40 L @44 L @48 L @56 i
          @60 S @62 S @64 L @72 L @80 L @88 A16 @104 A80 @184 i @188 i @192 L @196 L
          @200 C @204 i @208 i @212 i @216 i @220 i @224 i @236 i @244 L @248 C @249 C
          @250 a1 @251 C @252 S @256 i @280 A8 @288 A16 @304 i @308 i @308 i"
          struct
        end
=end

      public

      ProcTableStruct = Struct.new("ProcTableStruct", *field_names)

      #if you have multiple files you're reading from with their field names
      #in multiple different variables, you'll want to replace field_names
      #with some array concatenation
      
      # In block form, yields a ProcTableStruct for each process entry that you
      # have rights to. This method returns an array of ProcTableStruct's in
      # non-block form.
      #
      # If a +pid+ is provided, then only a single ProcTableStruct is yielded or
      # returned, or nil if no process information is found for that +pid+.
      #
      # Example:
      #
      # # Iterate over all processes
      # ProcTable.ps do |proc_info|
      # p proc_info
      # end
      #
      # # Print process table information for only pid 1001
      # p ProcTable.ps(1001)
      #
      def self.ps(pid = nil)
         array = block_given? ? nil : []

         Dir.foreach("/proc") do |file|
            next if file =~ /\D/ # Skip non-numeric entries under /proc

            # Only return information for a given pid, if provided
            if pid
               next unless file.to_i == pid
            end
               
            # Skip over any entries we don't have permissions to read
            begin
               psinfo = IO.read("/proc/#{file}/psinfo")
            rescue StandardError, Errno::EACCES
               next
            end
               
            #the first pass fill just gets the raw data and unpacks it
            struct = first_pass_fill psinfo
            #now we do the transformations we need on the few fields that need it
            struct.pctcpu= (struct.pctcpu*100).to_f / 0x8000
            struct.pctmem= (struct.pctmem*100).to_f / 0x8000
            struct.start=Time.at(struct.start)
            #the fields that needed stripping were handled by unpack

            #repeat the above for other files that we need to deal with

            if block_given?
               yield struct
            else
               array << struct
            end
         end
            
         pid ? array[0] : array
      end
   end
end
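For anyone unfamiliar with the "@" and "A" directives used in FIELDS above, here is a tiny standalone illustration on a synthetic buffer. The layout is made up for the demo and is not the psinfo layout.

```ruby
# Synthetic 16-byte record: an int, four padding bytes, then an 8-byte
# NUL-padded name ("a8" pads with NULs when packing).
DEMO_BUF = [42].pack("i") + "\0" * 4 + ["init"].pack("a8")

# "@8" jumps to absolute byte offset 8 before the next directive runs.
DEMO_NUM, DEMO_STRIPPED = DEMO_BUF.unpack("@0 i @8 A8")
DEMO_RAW                = DEMO_BUF.unpack("@8 a8")[0]

p DEMO_NUM        # the int at offset 0
p DEMO_STRIPPED   # "A" drops trailing NULs and spaces
p DEMO_RAW        # "a" keeps the NUL padding
```

Because every field carries its own "@" offset, padding never has to be listed explicitly, at the cost of repeating the offsets by hand.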

--
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/