Splitting a line by columns

Michael_Campbell1 · 12 October 2003 03:50

I have a line of text output in columnar form; what’s the best way to split it
into its requisite parts?

Say I have lines of

aaaaabbcccccddeee

I can do something like:

md = /(.....)(..)(.....)(..)(...)/.match(line); # seems klugy somehow

Thoughts?

Gavin_Sinclair · 12 October 2003 04:25

I’m not sure what you mean by columns, given your example. Columns,
to me, suggests columns in a newspaper.

But in your example, there’s not much wrong with what you’ve done. A
slight improvement is

data = /(.{5})(.{2})(.{5})(.{2})(.{3})/.match(line).captures

The reason I suggest this is that you can easily generalise it
(replace the literal numbers by variables) to accept different column
widths.
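A minimal sketch of that generalisation, assuming the column widths are held in an array. The helper name `column_regex` is hypothetical, and the `\A` anchor is an addition that was not in the pattern above:

```ruby
# Hypothetical helper: build a fixed-width pattern from a list of widths.
# "\A" anchors the match at the start of the string (an added assumption).
def column_regex(widths)
  Regexp.new('\A' + widths.map { |w| "(.{#{w}})" }.join)
end

line   = "aaaaabbcccccddeee"
widths = [5, 2, 5, 2, 3]

md   = column_regex(widths).match(line)
data = md ? md.captures : nil   # nil when the line is too short to match
p data
```

Guarding against a `nil` match matters here because a line shorter than the summed widths will not match at all.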

However, I suspect you had something more complicated in mind.

Gavin


Xavier_Noria · 12 October 2003 08:13

If the data is a fixed-width record String#unpack is a compact idiom,
and it’s usually fast as well. For instance:

record = "aaaaabbcccccddeee"
fields = record.unpack("a5a2a5a2a3")
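As a hedged illustration of the same idiom, the format string can be built from a list of widths, and the `A` directive can be used in place of `a` when trailing padding should be dropped. The sample record and the `widths` array are made up for this sketch:

```ruby
widths = [5, 2, 5, 2, 3]

# Illustrative record whose first field is padded with spaces.
record = "ab   bbcccccddeee"

# "a" keeps each field exactly as it appears in the record.
raw = record.unpack(widths.map { |w| "a#{w}" }.join)
# "A" additionally strips trailing spaces and NULs from each field.
trimmed = record.unpack(widths.map { |w| "A#{w}" }.join)

p raw
p trimmed
```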

– fxn


David_A_Black2 · 12 October 2003 12:27

Hi –

I have a line of text output in columnar form; what’s the best way to split it
into its requisite parts?

Say I have lines of

aaaaabbcccccddeee

I can do something like:

md = /(.....)(..)(.....)(..)(...)/.match(line); # seems klugy somehow

Thoughts?

I think the main disadvantage of the above is that it’s a bit awkward
to write into a method. So you might want to do something like:

class String
  def columnize(*sizes)
    str = dup
    sizes.map {|s| str.slice!(0...s)}
  end
end

p "aaaabbbbcccddddd".columnize(4,4,3,5)

# => ["aaaa", "bbbb", "ccc", "ddddd"]
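For anyone who would rather not reopen String, here is a rough equivalent as a plain method that walks an offset instead of slicing a copy. The name `split_columns` is hypothetical and not from this thread:

```ruby
# Hypothetical standalone variant: no monkey-patching and no dup,
# it just advances an offset through the line.
def split_columns(line, *sizes)
  offset = 0
  sizes.map do |size|
    field = line[offset, size]   # nil once offset runs past the end of the line
    offset += size
    field
  end
end

fields = split_columns("aaaabbbbcccddddd", 4, 4, 3, 5)
p fields
```

Note that fields lying wholly beyond the end of a short line come back as `nil` rather than raising, so callers may want to check for that.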

David


–
David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

Josef_Jupp_SCHUGT · 12 October 2003 19:46

Hi!

Mike Campbell; 2003-10-12, 11:44 UTC:

I have a line of text output in columnar form; what’s the best way
to split it into its requisite parts?

That’s one of those ‘it only takes n programmers to get n+1 results’
questions. It strongly depends on what you mean by ‘best way’.

Say I have lines of

aaaaabbcccccddeee

I can do something like:

md = /(.....)(..)(.....)(..)(...)/.match(line); # seems klugy somehow

Thoughts?

Here are mine:

#!/usr/bin/env ruby

class Cutter < Array
  def cut(line)
    map { |range| line[range] }
  end
end

knife = Cutter.new([0..4, 5..6, 7..11, 12..13, 14..16])
md = knife.cut(line)

I am a bit surprised how powerful that ad hoc solution is: It
supports overlapping columns and columns can be arranged in arbitrary
order, … The solution also makes it easy to programmatically select
the columns of interest before cutting anything:

knife = Cutter.new()
knife.push(0..4) if rand < 0.5
knife.push(5..6) if rand < 0.5
knife.push(7..11) if rand < 0.5
knife.push(12..13) if rand < 0.5
knife.push(14..16) if rand < 0.5
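To make the reordering and overlap claims concrete, here is a small deterministic sketch. The class is repeated so the snippet runs on its own, and the particular ranges are chosen only for illustration:

```ruby
class Cutter < Array
  def cut(line)
    map { |range| line[range] }
  end
end

line = "aaaaabbcccccddeee"

# Last column first, then the first column, then a range overlapping
# the first two columns.
knife  = Cutter.new([14..16, 0..4, 0..6])
pieces = knife.cut(line)
p pieces
```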

Comments?

Please take notice of signature!

Josef ‘Jupp’ Schugt

–
If you send me an e-mail > 100 kB without my consent, it will be
silently discarded.

Michael_Campbell1 · 12 October 2003 14:42

I have a line of text output in columnar form; what’s the best way
to split it into its requisite parts?

Say I have lines of

aaaaabbcccccddeee

I can do something like:

md = /(.....)(..)(.....)(..)(...)/.match(line); # seems klugy somehow

I’m not sure what you mean by columns, given your example. Columns,
to me, suggests columns in a newspaper.

Well, my data is laid out like so:
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
aaaaabbcccccddeee
…

Where the values in aaaaa (et al.) vary, but in this example the first field
is always 5 text columns wide.
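A rough sketch of handling a whole block of such lines without a regex, using the String#unpack idiom mentioned elsewhere in this thread. The sample data and the names `data` and `rows` are illustrative, and the squiggly heredoc assumes Ruby 2.3 or later:

```ruby
# Made-up sample data in the same 5/2/5/2/3 layout.
data = <<~TEXT
  aaaaabbcccccddeee
  fffffgghhhhhiijjj
TEXT

# One array of fields per input line; chomp drops the trailing newline
# so it does not end up in the last field.
rows = data.each_line.map { |line| line.chomp.unpack("a5a2a5a2a3") }

rows.each { |fields| p fields }
```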

But in your example, there’s not much wrong with what you’ve done. A
slight improvement is

data = /(.{5})(.{2})(.{5})(.{2})(.{3})/.match(line).captures

Mm, forgot about #captures. Thanks.

I knew about the {} enhancement; apologies for being too vague. I was
mainly wondering if there was another way to do this rather than a regex.

The reason I suggest this is that you can easily generalise it
(replace the literal numbers by variables) to accept different column
widths.

However, I suspect you had something more complicated in mind.

Sadly, no. My problems are generally mundane. =)

Thanks for your help.

Michael


Michael_Campbell1 · 12 October 2003 16:11

Perfect, yes. Thanks.

Xavier Noria fxn@hashref.com wrote:

If the data is a fixed-width record String#unpack is a compact idiom,
and it’s usually fast as well. For instance:

record = "aaaaabbcccccddeee"
fields = record.unpack("a5a2a5a2a3")

