Splitting a line by columns

I have a line of text output in columnar form; what’s the best way to split it
into its requisite parts?

Say I have lines of

aaaaabbcccccddeee

I can do something like:

md = /(.....)(..)(.....)(..)(...)/.match(line)  # seems klugy somehow

Thoughts?

I’m not sure what you mean by columns, given your example. Columns,
to me, suggests columns in a newspaper.

But in your example, there’s not much wrong with what you’ve done. A
slight improvement is

data = /(.{5})(.{2})(.{5})(.{2})(.{3})/.match(line).captures

The reason I suggest this is that you can easily generalise it
(replace the literal numbers by variables) to accept different column
widths.
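
A minimal sketch of that generalisation, assuming the column widths arrive as an array. The helper name split_by_widths is hypothetical and not from this thread; the pattern is anchored with \A so that a line that is too short yields nil instead of a partial match:

```ruby
# Build the pattern from a list of column widths instead of literal numbers.
# split_by_widths is a hypothetical helper name used only for illustration.
def split_by_widths(line, widths)
  pattern = Regexp.new("\\A" + widths.map { |w| "(.{#{w}})" }.join)
  match = pattern.match(line)
  match ? match.captures : nil   # nil when the line is shorter than the widths require
end

p split_by_widths("aaaaabbcccccddeee", [5, 2, 5, 2, 3])
# prints ["aaaaa", "bb", "ccccc", "dd", "eee"]
```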

However, I suspect you had something more complicated in mind.

Gavin

···

On Sunday, October 12, 2003, 1:50:16 PM, Mike wrote:

If the data is a fixed-width record, String#unpack is a compact idiom,
and it’s usually fast as well. For instance:

record = "aaaaabbcccccddeee"
fields = record.unpack("a5a2a5a2a3")
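
The format string can be built from a list of widths too. A small sketch, assuming space-padded fields; the widths array and the sample values are made up for illustration. The "a" directive keeps trailing padding, while "A" strips trailing spaces and NULs:

```ruby
widths = [5, 2, 5, 2, 3]                      # illustrative column widths
values = ["ab", "cd", "efg", "hi", "jkl"]     # made-up field values

# Pad each value to its column width to get a fixed-width record.
record = values.zip(widths).map { |value, width| value.ljust(width) }.join

raw     = record.unpack(widths.map { |w| "a#{w}" }.join)  # padding kept
trimmed = record.unpack(widths.map { |w| "A#{w}" }.join)  # padding stripped

p raw
p trimmed
```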

– fxn

···

On Sunday 12 October 2003 05:50, Mike Campbell wrote:

Hi –

I think the main disadvantage of the regex approach is that it’s a bit
awkward to write into a method. So you might want to do something like:

class String
  def columnize(*sizes)
    str = dup
    sizes.map { |s| str.slice!(0...s) }
  end
end

p "aaaabbbbcccddddd".columnize(4, 4, 3, 5)

["aaaa", "bbbb", "ccc", "ddddd"]
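
For comparison, a hedged stand-alone variant that gives the same result without reopening String or copying the string. The method name columns is hypothetical; it walks a running offset rather than slicing pieces off a duplicate:

```ruby
# Hypothetical stand-alone variant of columnize: same fields, but computed
# from a running offset instead of destructively slicing a copy.
def columns(str, *sizes)
  offset = 0
  sizes.map do |size|
    field = str[offset, size]   # nil once offset runs past the end of str
    offset += size
    field
  end
end

p columns("aaaabbbbcccddddd", 4, 4, 3, 5)
# prints ["aaaa", "bbbb", "ccc", "ddddd"]
```

One difference worth knowing: on a line that is too short, this version returns nil for fields that start past the end of the string.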

David

···

On Sun, 12 Oct 2003, Mike Campbell wrote:


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

Hi!

  • Mike Campbell; 2003-10-12, 11:44 UTC:

I have a line of text output in columnar form; what’s the best way
to split it into its requisite parts?

That’s one of those ‘it only takes n programmers to get n+1 results’
questions. It strongly depends on what you mean by ‘best way’.

Say I have lines of

aaaaabbcccccddeee

I can do something like:

md = /(.....)(..)(.....)(..)(...)/.match(line)  # seems klugy somehow

Thoughts?

Here are mine:


#!/usr/bin/env ruby

class Cutter < Array
  def cut(line)
    map { |range| line[range] }
  end
end

# Inclusive ranges, matching the 5/2/5/2/3 layout of the sample line.
knife = Cutter.new([0..4, 5..6, 7..11, 12..13, 14..16])
md = knife.cut(line)


I am a bit surprised how powerful that ad hoc solution is: it
supports overlapping columns, and columns can be arranged in arbitrary
order. The solution also makes it easy to programmatically select
the columns of interest before cutting anything:


knife = Cutter.new()
knife.push(0..4) if rand < 0.5
knife.push(5..6) if rand < 0.5
knife.push(7..11) if rand < 0.5
knife.push(12..13) if rand < 0.5
knife.push(14..16) if rand < 0.5
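
To illustrate the point about overlapping and reordered columns, here is a small sketch. It uses a plain array of ranges rather than the Cutter subclass, and the particular ranges are chosen only for demonstration:

```ruby
line = "aaaaabbcccccddeee"

# Ranges may appear in any order and may overlap; each one is applied to
# the same line independently of the others.
ranges = [14..16, 0..4, 3..8]   # last field, first field, an overlapping slice
fields = ranges.map { |range| line[range] }

p fields
# prints ["eee", "aaaaa", "aabbcc"]
```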


Comments?

Josef ‘Jupp’ Schugt

···

I’m not sure what you mean by columns, given your example. Columns,
to me, suggests columns in a newspaper.

Well, my data is laid out like so:
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee

Where the values in aaaaa (et al.) vary but, in this example, are always 5
text columns wide.

But in your example, there’s not much wrong with what you’ve done. A
slight improvement is

data = /(.{5})(.{2})(.{5})(.{2})(.{3})/.match(line).captures

Mm, forgot about #captures. Thanks.

I knew about the {} enhancement; apologies for being too vague. I was
mainly wondering if there was another way to do this rather than a regex.

The reason I suggest this is that you can easily generalise it
(replace the literal numbers by variables) to accept different column
widths.

However, I suspect you had something more complicated in mind.

Sadly, no. My problems are generally mundane. =)

Thanks for your help.

Michael

···

Perfect, yes. Thanks.

···

— Xavier Noria fxn@hashref.com wrote:

If the data is a fixed-width record, String#unpack is a compact idiom,
and it’s usually fast as well. For instance:

record = "aaaaabbcccccddeee"
fields = record.unpack("a5a2a5a2a3")
