Add to words syntaxes

Dani · 16 March 2006 09:14

Thanks for the answers, but they didn't work. Let me try to explain my
problem more clearly. Here is the input:

0608A;
Teszt Kft;
33445566222;
;
20060101;
20060131;
0A0001C002A 33445566222
0A0001C007A Teszt kft
# this block repeats several times with other data

It should end up looking like this:

<?xml version="1.0" encoding="windows-1250"?>
<nyomtatvanyok>
  <nyomtatvany>
    <nyomtatvanyinformacio>
      <nyomtatvanyazonosito>0608A</nyomtatvanyazonosito>
      <adozo>
        <nev>Teszt kft</nev> # the first line from the txt
        <adoszam>33445566222</adoszam> # second line
        <adoazonosito></adoazonosito> # third line, blank in this txt
      </adozo>
      <idoszak>
        <tol>20060101</tol> # fourth
        <ig>20060131</ig> # fifth
      </idoszak>
    </nyomtatvanyinformacio>
    <mezok>
      <mezo eazon="0A0001C002A">11111111122</mezo> # and the lines...
      <mezo eazon="0A0001C007A">Próba Cég</mezo>
.....
</mezok></nyomtatvany></nyomtatvanyok>

So my problem is: how can I produce XML like the above? For the
two-column lines I already have this script:
outfile = ARGV.shift

lines = ARGF.readlines
marked_up_lines = lines.map do |line|
  words = line.split
  '<mezo eazon="' + words[0] + '">' + words[1] + '</mezo>' + "\n"
end

File.open(outfile,'w') do |file|
  file.write marked_up_lines.join
end

This works fine, but I don't know what to do with the first 6 lines to
make them look like the above, how to make Ruby repeat the process for
each block, or which character I should use to mark where a block ends.

Regards,

Daniel

-----Original Message-----
From: Ross Bamford [mailto:rossrt@roscopeco.co.uk]
Sent: Thursday, March 16, 2006 9:42 AM
To: ruby-talk ML
Subject: Re: add to words syntaxes

On Thu, 2006-03-16 at 16:25 +0900, Dani wrote:

Hi!
I still have a question: I have a txt file with words separated by
semicolons or line breaks. It looks like this:
Teszt Kft
33445566222
;
20060101
20060131
0A0001C002A 33445566222
0A0001C007A Teszt kft

I need Ruby to add several XML tags to each line. So let's say:
<hello>Teszt Kft</hello>
<foo>33445566222</foo>
<bar></bar>
and so on... From line 6 onward (where there are two columns) Ruby
should use another (Ruby) script. And when it reaches the end of the
section (where there could be an escape character or something, which
it changes to e.g. </end>) it should start again from the beginning,
until there are no more escape characters.
So, how can I do this? Please help me if you can...

Perhaps it's me not being awake properly, but it doesn't seem very clear
from your question exactly what you want to do? To clarify:

+ Are the lines you want to process always in sets of 3
(or whatever number)? Or is it single-line, with some way to choose
which tag goes on a given line?

+ What do you mean 'use another script' for the two-column
  lines? I note as well that it could be difficult to determine
  which lines are two column, since spaces seem to be allowed
  in the processed input anyway.

Anyway, I'm sure this isn't what you're after, but maybe it'll point you
in the right direction. Failing that, post a bit more detail and sample
input/output and I'm sure you'll get what you need.

s = "<your sample input, above>"
s.gsub(/^(.*)$/) { "<foo>#$1</foo>" }

# => "<foo>Teszt Kft</foo>
# <foo>33445566222</foo>
# <foo>;</foo>
# <foo>20060101</foo>
# <foo>20060131</foo>
# <foo>0A0001C002A 33445566222</foo>
# <foo>0A0001C007A Teszt kft</foo>"

s.lines.map { |line| "<foo>#{line.chomp}</foo>\n" }.join
# => as above

s.lines.map do |line|
  tag = (line =~ /^\d+$/) ? 'foo' : 'bar'
  "<#{tag}>#{line.chomp}</#{tag}>\n"
end.join
# => "<bar>Teszt Kft</bar>
# <foo>33445566222</foo>
# <bar>;</bar>
# <foo>20060101</foo>
# <foo>20060131</foo>
# <bar>0A0001C002A 33445566222</bar>
# <bar>0A0001C007A Teszt kft</bar>"
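
The examples above pick a tag from each line's content. If the tag should
depend on the line's position instead, which is what the question seems to
ask for, one possible sketch is to pair a list of tag names with the lines.
The tag names are borrowed from the XML sample earlier in the thread; the
helper name tag_by_position and the line-to-tag mapping are assumptions made
for illustration only.

```ruby
# Assumed mapping: one tag name per input line, in order.
TAGS = %w[nev adoszam adoazonosito tol ig].freeze

# Hypothetical helper: wrap the n-th line of +text+ in the n-th tag.
def tag_by_position(text, tags = TAGS)
  # Drop the line break and a trailing semicolon, so a lone ";" becomes "".
  values = text.lines.map { |line| line.chomp.chomp(';') }
  tags.zip(values).map { |tag, value| "<#{tag}>#{value}</#{tag}>" }
end

sample = "Teszt Kft\n33445566222\n;\n20060101\n20060131\n"
puts tag_by_position(sample)
# prints:
# <nev>Teszt Kft</nev>
# <adoszam>33445566222</adoszam>
# <adoazonosito></adoazonosito>
# <tol>20060101</tol>
# <ig>20060131</ig>
```

Array#zip keeps the pairing explicit, so adding or reordering tags only
means editing the TAGS list.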

--
Ross Bamford - rosco@roscopeco.REMOVE.co.uk

Carlos · 16 March 2006 09:45

Dani wrote:

So my problem is how can I get an XML source like above. For the columns
I already have this script:
outfile = ARGV.shift

lines = ARGF.readlines

Well, before you start to loop to fill the "mezo" tags, extract the other lines:

  nyomtatvanyazonosito, nev, adoszam, adoazonosito, tol, ig =
    lines.slice!(0,6).map {|w| w.chomp.chomp(';') }
  first_six_data = <<-EOT
<?xml ...
<nyomtatvanyinformacio>
   <nyomtatvanyazonosito>#{nyomtatvanyazonosito}</nyomtatvanyazonosito>
      <adozo>
        <nev>#{nev}</nev>
  ...
EOT

...And then continue processing the array as before.

marked_up_lines = lines.map do |line|
  words = line.split
  '<mezo eazon="' + words[0] + '">' + words[1] + '</mezo>' + "\n"
end

File.open(outfile,'w') do |file|
  file.write first_six_data
  file.write marked_up_lines.join
end

HTH

(warning: code not tested)
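
For the "repeat" part of the question, here is a minimal sketch that puts the
pieces above together. It rests on assumptions the thread does not confirm:
records are separated by a blank line, each record starts with exactly six
semicolon-terminated header lines, and every remaining line is a code followed
by a value. The helper names (xml_escape, nyomtatvany, to_xml) are made up for
this example. It also splits each field line on the first run of whitespace
only, so a value such as "Teszt kft" keeps its second word, and it escapes
characters that are special in XML.

```ruby
XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Hypothetical helper: escape the characters that are special in XML.
def xml_escape(text)
  text.to_s.gsub(/[&<>"]/, XML_ESCAPES)
end

# Build one <nyomtatvany> element from one record
# (assumed layout: six header lines, then "CODE value" lines).
def nyomtatvany(record)
  lines = record.lines.map(&:strip).reject(&:empty?)
  azonosito, nev, adoszam, adoazonosito, tol, ig =
    lines.shift(6).map { |line| xml_escape(line.chomp(';')) }

  mezok = lines.map do |line|
    # Limit of 2: split off the code, keep the rest of the line as the value.
    code, value = line.split(' ', 2)
    %(    <mezo eazon="#{xml_escape(code)}">#{xml_escape(value)}</mezo>)
  end

  <<~XML
    <nyomtatvany>
      <nyomtatvanyinformacio>
        <nyomtatvanyazonosito>#{azonosito}</nyomtatvanyazonosito>
        <adozo>
          <nev>#{nev}</nev>
          <adoszam>#{adoszam}</adoszam>
          <adoazonosito>#{adoazonosito}</adoazonosito>
        </adozo>
        <idoszak>
          <tol>#{tol}</tol>
          <ig>#{ig}</ig>
        </idoszak>
      </nyomtatvanyinformacio>
      <mezok>
    #{mezok.join("\n")}
      </mezok>
    </nyomtatvany>
  XML
end

# Assumption: a blank line separates one record from the next.
def to_xml(text)
  records = text.split(/\n\s*\n/).reject { |record| record.strip.empty? }
  body = records.map { |record| nyomtatvany(record) }.join
  %(<?xml version="1.0" encoding="windows-1250"?>\n<nyomtatvanyok>\n#{body}</nyomtatvanyok>\n)
end
```

Usage would be along the lines of `xml = to_xml(ARGF.read)`. Since the
declaration says windows-1250, the file should actually be written in that
encoding, for example by opening it with mode 'w:windows-1250'.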
