Regexp conditional

Hi,

How could I easily do the following?

input = '111|"aaaa" bbbbb|c'

I would like to get the following output:

output = '111|"""aaaa"" bbbbb"|c'

to correct a wrongly prepared pipe-separated file by enclosing the
field in quotation characters.

thanks
chris

My interpretation of what you need is:

    input = '111|"aaaa" bbbbb|c'

    fields = input.split('|')
    fields[1].gsub!('"', '""')
    fields[1] = %Q{"#{fields[1]}"}

    output = fields.join('|')

Depending on the input, that may not be valid, for example if fields[1] can contain a pipe. In any case, since the input is malformed, the right assumptions depend on the actual data.
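
For what it's worth, here is a minimal sketch of the same split/join idea applied to every field rather than only fields[1]. It assumes that no field contains a literal pipe, which is exactly the caveat above. The method name `quote_fields` is made up for illustration.

```ruby
# Hypothetical helper, not from the thread: quote every field that
# contains a double quote, assuming fields never contain a literal '|'.
def quote_fields(line)
  # The -1 limit keeps trailing empty fields, which split drops by default.
  fields = line.split('|', -1)
  fixed = fields.map do |field|
    if field.include?('"')
      # Double the embedded quotes, then wrap the whole field in quotes.
      %Q{"#{field.gsub('"', '""')}"}
    else
      field
    end
  end
  fixed.join('|')
end

puts quote_fields('111|"aaaa" bbbbb|c')
# prints: 111|"""aaaa"" bbbbb"|c
```

Fields without a double quote pass through untouched, so the position of the damaged field does not need to be known in advance.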

-- fxn

···

On Mar 19, 2008, at 10:00, ciapecki wrote:

> How could I easily do the following?
> [...]

your solution works, and thanks for that,
I am waiting though for a regexp solution,

thanks anyway
chris

···

On 19 Mar., 10:09, Xavier Noria <f...@hashref.com> wrote:

> My interpretation of what you need is:
> [...]

Even 2 regexps:

irb(main):001:0> input = '111|"aaaa" bbbbb|c'
=> "111|\"aaaa\" bbbbb|c"
irb(main):002:0> input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"
irb(main):003:0> puts input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}
111|"""aaaa"" bbbbb"|c
=> nil
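
The irb session above, spelled out as a plain script. This is only a sketch of the same idea: `String#gsub!` returns nil when nothing was replaced, so its return value also answers the question "does this field contain a quote?". The variable names are mine.

```ruby
input = '111|"aaaa" bbbbb|c'

output = input.gsub(/[^|]+/) do |field|
  # gsub! doubles the quotes in place and returns nil if there were none,
  # so only fields that actually contained a quote get wrapped.
  if field.gsub!(/"/, '""')
    %Q{"#{field}"}
  else
    field
  end
end

puts output
# prints: 111|"""aaaa"" bbbbb"|c
```

Because the outer pattern is `[^|]+`, empty fields are never handed to the block and stay as they are.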

Cheers

  robert

···

On 19.03.2008 11:21, ciapecki wrote:

> your solution works, and thanks for that,
> I am waiting though for a regexp solution,

Robert Klemme wrote:

> > I am waiting though for a regexp solution,
>
> Even 2 regexps:
> Cheers
>
>   robert

Robert,

How fixed is the input? If it is always in the same format, then what
about:

input = '111|"aaaa" bbbbb|c'
output = input.gsub(/"/, '""').gsub(/(.*)\|(.*)\|(.*)/, '\1|"\2"|\3')
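
To make the "how fixed is the input" question concrete, here is a small sketch that wraps the same two-step replacement in a method (the name `quote_middle_field` is hypothetical) and runs it on the three-field sample and on a made-up five-field line.

```ruby
# Same two-step replacement as in the post, wrapped in a method for reuse.
# The method name is hypothetical.
def quote_middle_field(line)
  line.gsub(/"/, '""').gsub(/(.*)\|(.*)\|(.*)/, '\1|"\2"|\3')
end

puts quote_middle_field('111|"aaaa" bbbbb|c')
# prints: 111|"""aaaa"" bbbbb"|c

# With five fields the greedy first group swallows everything up to the
# second-to-last pipe, so the fourth field gets wrapped instead.
puts quote_middle_field('a|"b" x|c|d|e')
# prints: a|""b"" x|c|"d"|e
```

So the pattern is tied to exactly three fields with the damaged one in the middle.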

Mac

···

--
Posted via http://www.ruby-forum.com/.

this is just great,
thanks robert for this

chris

···

On 19 Mar., 22:28, Robert Klemme <shortcut...@googlemail.com> wrote:

> Even 2 regexps:
> [...]

Paul Mckibbin wrote:

> Robert,

Oops. I meant Chris of course.

Mac


You're welcome. Btw, this is even better (and also faster):

irb(main):003:0> input = '111|"aaaa" bbbbb|c'
=> "111|\"aaaa\" bbbbb|c"
irb(main):004:0> input.gsub(/"/,'""').gsub(/[^|]*"[^|]*/,'"\\&"')
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"
irb(main):005:0> puts input.gsub(/"/,'""').gsub(/[^|]*"[^|]*/,'"\\&"')
111|"""aaaa"" bbbbb"|c
=> nil

This could even be a bit faster:

irb(main):006:0> input.gsub(/"/,'""').gsub(/[^|"]*"[^|]*/,'"\\&"')
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"
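
A rough way to check the "faster" claims on your own machine. This is a sketch only: the iteration count is arbitrary, the labels are mine, and timings will vary with the Ruby version and with the real data, so no numbers are given here.

```ruby
input = '111|"aaaa" bbbbb|c'
iterations = 20_000 # arbitrary count, picked only for illustration

# The three variants posted in the thread, wrapped in lambdas.
variants = {
  'block + gsub!'      => ->(s) { s.gsub(/[^|]+/) { |m| m.gsub!(/"/, '""') ? %Q{"#{m}"} : m } },
  'two gsubs'          => ->(s) { s.gsub(/"/, '""').gsub(/[^|]*"[^|]*/, '"\\&"') },
  'two gsubs, tighter' => ->(s) { s.gsub(/"/, '""').gsub(/[^|"]*"[^|]*/, '"\\&"') }
}

# All variants should agree on the sample before timing means anything.
outputs = variants.values.map { |fix| fix.call(input) }
raise 'variants disagree' unless outputs.uniq.size == 1

variants.each do |label, fix|
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { fix.call(input) }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  puts format('%-20s %.3fs', label, elapsed)
end
```

A one-line sample is a poor stand-in for a 49-field file, so it is worth timing against a few real lines before drawing conclusions.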

Kind regards

robert

···

2008/3/20, ciapecki <ciapecki@gmail.com>:

> this is just great,
> thanks robert for this

--
use.inject do |as, often| as.you_can - without end

Hi Mac,

The format is fixed, but it contains 49 fields separated by |, so your
one-liner would not fit on one line :)
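
For illustration, a sketch of how Robert's two-gsub version from earlier in the thread could be run over such a file line by line. The 49-field line is made up, and a StringIO stands in for the real file, whose name and contents are not given in the thread.

```ruby
require 'stringio'

# Build a made-up 49-field line; only the second field contains stray quotes.
fields = (1..49).map { |i| "f#{i}" }
fields[1] = '"aaaa" bbbbb'
source = StringIO.new(fields.join('|') + "\n") # stands in for the real file

fixed_lines = source.each_line.map do |line|
  # Double every quote, then wrap each field that contains one.
  line.chomp.gsub(/"/, '""').gsub(/[^|]*"[^|]*/, '"\\&"')
end

puts fixed_lines.first.split('|').first(3).join('|')
# prints: f1|"""aaaa"" bbbbb"|f3
```

The number of fields never appears in the pattern, so the same two calls apply whether a line has 3 fields or 49.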

thanks,
chris

···

On 20 Mar., 01:18, Paul Mckibbin <pmckib...@gmail.com> wrote:

> Oops. I meant Chris of course.

ciapecki wrote:

> The format is fixed but contains 49 fields separated by | so your
> one-liner could not fit into one line :)

A minor change :)

input = '111|"aaaa" bbbbb|c|"22222" asdasd|ddd|"aaaa"qqqqq|jjjj'
output = input.gsub(/"/, '""').gsub(/(.*?)\|(.*?)\|(.*?)/, '\1|"\2"|\3')

=> 111|"""aaaa"" bbbbb"|c|"""22222"" asdasd"|ddd|"""aaaa""qqqqq"|jjjj

This was just to point out that there is no need for multiple replace
options, but you need to know the layout is correct and in a given
pattern. Robert's is the better and more robust solution (and also
shorter).
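
A small sketch of what "in a given pattern" means in practice. The sample line is made up so that the damaged field is third rather than second. The lazy pattern wraps fields by position, while Robert's pattern wraps them by content.

```ruby
line = 'a|b|"c" d|e|f' # made-up sample: the quoted field is third, not second

# Positional: wraps the second field of every "x|y|" pair it walks over.
by_position = line.gsub(/"/, '""').gsub(/(.*?)\|(.*?)\|(.*?)/, '\1|"\2"|\3')

# Content-based: wraps only fields that contain a double quote.
by_content = line.gsub(/"/, '""').gsub(/[^|]*"[^|]*/, '"\\&"')

puts by_position # prints: a|"b"|""c"" d|"e"|f
puts by_content  # prints: a|b|"""c"" d"|e|f
```

In the positional result the damaged field has its quotes doubled but is never wrapped, which is why the layout has to be known up front.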

Mac
