7stud2
(7stud --)
23 May 2013 16:13
1
I have a large file with lots of gibberish in it, and I'm trying to find
the meaningful sections.
Essentially I'll have something like this:
________________
To: "1313131"
From: "1313131"
random data lines
To: "1313132"
From: "1313132"
random data lines
To: "1313133"
From: "1313132"
random data lines
To: "1313134"
From: "1313134"
random data lines
________________
What I need to do is locate the line(s) where From is different from To.
In this case, the one From "1313132" To "1313133".
I don't know how to do this kind of match, but I assume that Ruby has a
way?
--
Posted via http://www.ruby-forum.com/ .
7stud2
(7stud --)
23 May 2013 17:33
2
Use regex capturing with assignment. Capture the string or number in the
From field using parentheses: /(\d+)/
...then compare the captured value, which is $1, with the next string.
Sorry to do this to you all, but here is a quick Perl example I just
whipped up; it is easy to convert to Ruby:
perl -le '$x = "12345"; $y = "123465";
if ( $x =~ /(\d+)/ ) {
    # eq compares as strings; == would compare numerically
    if ( $y eq $1 ) {
        print "yep"
    }
    else {
        print "nope!"
    }
}
'
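For readers who want the Ruby side of this, here is a minimal sketch of the same capture-and-compare idea. The method name `same_number?` is made up for illustration and is not from the thread; `m[1]` stands in for Perl's `$1`.

```ruby
# Hypothetical helper: capture the digits in `from` and compare them with `to`.
def same_number?(from, to)
  m = from.match(/(\d+)/) # m[1] plays the role of Perl's $1
  return false unless m   # no digits captured, so nothing to compare

  to == m[1]              # String#== compares the values as strings
end

puts(same_number?("12345", "123465") ? "yep" : "nope!")
```

As in the Perl one-liner, the two sample values differ, so this takes the "nope!" branch.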
7stud2
(7stud --)
23 May 2013 18:58
3
I use Rubular a lot, it's great!
Thanks for the ideas. I've decided to loop through the file using 2
variables, similarly to Derrick's suggestion.
Nice trick with "\1" Chris, I haven't tried using that inside the same
expression before. I'll see whether I can use that in this instance.
I was wondering whether Ruby's Regexp had this kind of option built in,
but I guess this scenario is more on the conditional side of
programming.
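A rough sketch of what looping with two variables could look like, assuming the To line always comes before its From line and that values are wrapped in double quotes as in the sample. The method name `mismatched_pairs` is an assumption for illustration only.

```ruby
# Hypothetical helper: walk the lines once, remembering the last To value,
# and collect [to, from] pairs whose values differ.
def mismatched_pairs(lines)
  mismatches = []
  to = nil
  lines.each do |line|
    if (m = line.match(/\ATo: "(.*?)"/))
      to = m[1]
    elsif to && (m = line.match(/\AFrom: "(.*?)"/))
      from = m[1]
      mismatches << [to, from] if from != to
      to = nil # wait for the next To line
    end
  end
  mismatches
end

sample = <<~TEXT
  To: "1313132"
  From: "1313132"
  random data lines
  To: "1313133"
  From: "1313132"
  random data lines
TEXT

p mismatched_pairs(sample.lines)
```

For a large file, the same method should accept `File.foreach(path)` in place of `sample.lines`, which reads one line at a time instead of loading everything into memory.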
Here is a regex that works for your example data.
text = '
To: "1313131"
From: "1313131"
random data lines
To: "1313132"
From: "1313132"
random data lines
To: "1313133"
From: "1313132"
random data lines
To: "1313134"
From: "1313134"
random data lines
To: "abc"
From: "def"
random data lines
'
regex = /To: "(.*?)"\nFrom: "(?!\1)(.*?)"$/
text.scan(regex) # => [["1313133", "1313132"], ["abc", "def"]]
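One caveat worth sketching: `(?!\1)` only rejects a From value that starts with the To value, so a pair such as "123" and "1234" would be skipped even though the values differ. A possible tightening, offered as an assumption rather than something from the thread, is to include the closing quote inside the lookahead. The constant names below are made up for illustration.

```ruby
# LOOSE is the pattern from the post; STRICT adds the closing quote to the
# lookahead so that only an exact repeat of the To value is rejected.
LOOSE  = /To: "(.*?)"\nFrom: "(?!\1)(.*?)"$/
STRICT = /To: "(.*?)"\nFrom: "(?!\1")(.*?)"$/

text = <<~TEXT
  To: "123"
  From: "1234"
  random data lines
  To: "abc"
  From: "abc"
  random data lines
TEXT

p text.scan(LOOSE)  # the "123"/"1234" pair is missed
p text.scan(STRICT) # the "123"/"1234" pair is reported
```

For the sample data in the thread both patterns give the same result, since no From value there begins with a different To value.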
7stud2
(7stud --)
24 May 2013 10:43
5
Excellent! I tried negatives using (?!\1) before but I couldn't get them
to work. Thanks for the help.
7stud2
(7stud --)
26 May 2013 18:06
6
A group within a group, and scan with a block? I had no idea!
Ruby, you continually delight me
Chris6
(Chris)
23 May 2013 18:36
7
Here's a regex that captures all the cases where To matches From:
I can't find an easy switch to find the mismatches you need, but maybe
it'll provide a starting point.
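The pattern itself is not reproduced in this thread. As a rough sketch of the idea, and not necessarily the pattern that was posted, a `\1` backreference can match the pairs where To and From are identical. The constant name `SAME_PAIR` is an assumption.

```ruby
# \1 must repeat exactly what the first group captured on the To line.
SAME_PAIR = /To: "(.*?)"\nFrom: "\1"$/

text = <<~TEXT
  To: "1313131"
  From: "1313131"
  random data lines
  To: "1313133"
  From: "1313132"
  random data lines
TEXT

matching = text.scan(SAME_PAIR).flatten
p matching # => ["1313131"]
```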
Plus Rubular is a great resource for exploring regex
cheers
Stu1
(Stu)
23 May 2013 22:53
8
You can run your regex test against both contexts, ^To: and ^From:, and
use post_match to reveal the contents that follow:
http://ruby-doc.org/core-2.0/MatchData.html#method-i-post_match
~Stu
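A small sketch of the post_match approach, assuming one field per line and double-quoted values as in the sample. The helper name `value_after` is hypothetical.

```ruby
# Hypothetical helper: return what follows "<label>: " on a line, or nil
# when the line does not start with that label.
def value_after(label, line)
  m = line.match(/^#{Regexp.escape(label)}:\s*/)
  m && m.post_match.strip.delete('"')
end

to   = value_after("To",   %(To: "1313133"\n))
from = value_after("From", %(From: "1313132"\n))
puts "mismatch: #{from} -> #{to}" if to != from
```

`MatchData#post_match` returns the part of the string after the match, so the regex only needs to recognise the label.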
On Fri, May 24, 2013 at 12:43 PM, Joel Pearson <lists@ruby-forum.com> wrote:
> Excellent! I tried negatives using (?!\1) before but I couldn't get them
> to work. Thanks for the help.
You can even get the whole line content if you like:
irb(main):053:0> s.scan %r{(To:\s+("\d+")\s*$\s*From:\s+(?!\2).*?(?=To))}m
=> [["To: \"1313133\"\nFrom: \"1313132\"\nrandom data lines\n\n", "\"1313133\""]]
irb(main):054:0> s.scan(%r{(To:\s+("\d+")\s*$\s*From:\s+(?!\2).*?(?=To))}m).map(&:first)
=> ["To: \"1313133\"\nFrom: \"1313132\"\nrandom data lines\n\n"]
irb(main):055:0> puts s.scan(%r{(To:\s+("\d+")\s*$\s*From:\s+(?!\2).*?(?=To))}m).map(&:first)
To: "1313133"
From: "1313132"
random data lines
Or with a block:
irb(main):057:0> s.scan %r{(To:\s+("\d+")\s*$\s*From:\s+(?!\2).*?(?=To))}m do puts $1 end;nil
To: "1313133"
From: "1313132"
random data lines
Kind regards
robert
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
On Sun, May 26, 2013 at 8:06 PM, Joel Pearson <lists@ruby-forum.com> wrote:
> A group within a group,
This is regular regular expression functionality: I don't know of a single
regexp engine with support for groups that can't do that.
> and scan with a block? I had no idea!
That is a fairly old feature of the standard lib - even in 1.8.6 - and so
important when scanning large volumes of text.
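To illustrate why the block form matters for large inputs, here is a minimal sketch: when `String#scan` is given a block, each match is yielded as it is found and no result array is built. The names `PAIR` and `count_mismatches` are assumptions for illustration, and the comparison is done in Ruby rather than with a lookahead.

```ruby
# Two capture groups: the To value and the From value of each pair.
PAIR = /To: "(.*?)"\nFrom: "(.*?)"$/

# Hypothetical helper: count pairs whose values differ without collecting
# every match into an array first.
def count_mismatches(text)
  count = 0
  text.scan(PAIR) { |to, from| count += 1 if to != from }
  count
end

sample = <<~TEXT
  To: "1313132"
  From: "1313132"
  random data lines
  To: "1313133"
  From: "1313132"
  random data lines
TEXT

puts count_mismatches(sample)
```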
> Ruby, you continually delight me
Good!
For the specification of the regexp language, I find this site pretty useful:
http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt
Kind regards
robert