Say I'm parsing stuff like HTTP headers, what is going to give better
performance? Strings with regular expressions? StringIO with
readline? Splitting strings into arrays on a delimiter? Or is it
going to be so close it's not really an issue?
if you try to write your regular expressions badly enough they can surely use
the most cpu
-a
On Tue, 31 Oct 2006, snacktime wrote:
> Say I'm parsing stuff like http headers, what is going to give better
> performance? Strings with regular expressions? StringIO with readline?
> splitting strings into arrays on a delimiter? Or is it going to be so close
> it's not really an issue?
>
> Chris
--
my religion is very simple. my religion is kindness. -- the dalai lama
It all 'depends'. If you're doing HTTP header parsing, why not just
use the header parsing in Mongrel? It's already available as a C
extension, and you're probably not going to get much faster than that.
But if you want to stick with pure Ruby parsing, experiment and see
what works. I was parsing all the Netflix[1] data with Ruby for fun and
I found out some interesting things about text parsing, at least on my
laptop:
- if you only need the data between two delimiters, it was
faster to call String#index twice and slice the data out of the
middle than to split and index into the array
- but if you wanted 3 items out, it was faster to do the
split
- for simple parsing, regexes were overkill, but if you want to use
them, make sure to compile them once and use them MANY times
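
The observations above can be checked with a rough benchmark. This is only a minimal sketch, not the code that was run on the Netflix data: the sample line, the comma delimiter, the iteration count, and the names `between_index`, `between_split`, `between_regex` and `FIELD_RE` are all assumptions made up for illustration, and the timings will differ by machine and Ruby version.

```ruby
require 'benchmark'

# Hypothetical sample record; the real data format is not shown in the thread.
LINE = "1488844,3,2005-09-06"
N = 100_000

# Built once as a constant, then reused on every iteration.
FIELD_RE = /\A[^,]*,([^,]*),/

# Find the two delimiters with String#index and slice out the middle field.
def between_index(line, delim = ",")
  first = line.index(delim)
  second = line.index(delim, first + 1)
  line[(first + 1)...second]
end

# Split the whole line and index into the resulting array.
def between_split(line, delim = ",")
  line.split(delim)[1]
end

# Match with the prebuilt regex and take the first capture group.
def between_regex(line)
  m = FIELD_RE.match(line)
  m && m[1]
end

Benchmark.bm(14) do |bm|
  bm.report("index + slice") { N.times { between_index(LINE) } }
  bm.report("split")         { N.times { between_split(LINE) } }
  bm.report("regex")         { N.times { between_regex(LINE) } }
end
```

On the "compile once" point: a regex literal with no interpolation is built once when the code is parsed, so the cost to watch for is mostly `Regexp.new` or an interpolated literal like `/#{delim}/` evaluated inside a loop.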
On Tue, Oct 31, 2006 at 02:14:42PM +0900, ara.t.howard@noaa.gov wrote:
if you try to write your regular expressions badly enough they can surely use
the most cpu
I am using it actually, but I'm writing a proxy and I also need to parse
the headers the server returns. I was thinking about just adding
a parser class to the Mongrel parser to do this, based on the existing
one, but I haven't decided yet.
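
For comparison, the pure-Ruby route for the response side could look roughly like the sketch below, which combines two of the approaches from the original question: StringIO with `gets` to walk the lines, and a limited `split` on the first colon. It has nothing to do with how Mongrel's C parser works; `parse_response_head` and the sample response are hypothetical, and it deliberately ignores repeated headers and folded continuation lines.

```ruby
require 'stringio'

# Hypothetical helper: parse a raw HTTP response head into
# [status_line, headers_hash]. Header names are downcased.
def parse_response_head(raw)
  io = StringIO.new(raw)
  status_line = io.gets.to_s.chomp   # chomp strips the trailing "\r\n"
  headers = {}
  while (line = io.gets)
    line = line.chomp
    break if line.empty?             # a blank line ends the header block
    name, value = line.split(":", 2) # split on the first colon only
    next if value.nil?               # skip malformed lines with no colon
    headers[name.strip.downcase] = value.strip
  end
  [status_line, headers]
end

# Made-up sample response for illustration.
raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 42\r\n\r\nbody"
status, headers = parse_response_head(raw)
puts status
headers.each { |name, value| puts "#{name} => #{value}" }
```

Later values overwrite earlier ones for a repeated header name here, so a real proxy would need to collect those into a list instead.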
On 10/30/06, Jeremy Hinegardner <jeremy@hinegardner.org> wrote:
It all 'depends'. If you're doing HTTP header parsing, why not just
use the header parsing in Mongrel? It's already available as a C
extension, and you're probably not going to get much faster than that.