Hi,
I'm using the following regexp to capture a particular string from the
content of a Japanese website.
/<ul id="ownerProfile" class="owner">.*?<li>([^<]*?)<\/li>/m
The following is the match result.
女性 /
Is there a way to remove the slash ('/') from my result by modifying
the above regular expression?
N.B. gsub can be used, but I want to know whether this can be
achieved by modifying the above regexp.
Please help.
Thanks
Arun
--
Posted via http://www.ruby-forum.com/.
7stud
(7stud --)
11 May 2009 15:43
2
In words, describe what just the regex part does.
Your boss doesn't like gsub?
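Try
/<ul id="ownerProfile" class="owner">.*?<li>\s*([^<\/]*?)\s*\/?\s*<\/li>/m
The capture group now excludes '/', and an optional trailing slash (with
any surrounding whitespace) is matched outside the group. That should work,
but it won't work for a case where you have / separating something in the
inner text.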
Jayanth
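For comparison, here is a minimal sketch of the original pattern next to a variant that matches the optional trailing slash outside the capture group. The sample markup and the variable names are assumptions made up for illustration; the real page may be laid out differently.

```ruby
# Hypothetical sample markup; the real page's items and whitespace may differ.
html = <<~HTML
  <ul id="ownerProfile" class="owner">
  <li>女性 /</li>
  <li>東京都</li>
  </ul>
HTML

# Original pattern: the capture runs right up to </li>, so it keeps the slash.
original = /<ul id="ownerProfile" class="owner">.*?<li>([^<]*?)<\/li>/m

# Variant: the capture excludes '/', and an optional trailing slash plus
# surrounding whitespace is consumed outside the group.
adjusted = /<ul id="ownerProfile" class="owner">.*?<li>\s*([^<\/]*?)\s*\/?\s*<\/li>/m

with_slash    = html[original, 1]  # capture still ends in " /"
without_slash = html[adjusted, 1]  # capture is the bare text

puts with_slash
puts without_slash
```

Simply adding '/' to the character class is not enough here: the capture would then have to be followed directly by </li>, so the first item would fail to match and the lazy .*? would move on to a later <li>.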
Phlip1
(Phlip)
11 May 2009 16:05
4
Arun Kumar wrote:
I'm using the following regexp to capture a particular string from the
content of a Japanese website.
/<ul id="ownerProfile" class="owner">.*?<li>([^<]*?)<\/li>/m
Parsing HTML with Regexp makes certain baby deities cry.
Use Nokogiri, with an XPath of '//ul[ @id = "ownerProfile" and @class =
"owner" ]'. Then pull out the .text and you are done!
--
Phlip
7stud
(7stud --)
11 May 2009 15:45
5
7stud -- wrote:
In words, describe what just the regex part does.
I mean the part between the <li> tags.
That bugle's been blown to death mate.
Jayanth
On Mon, May 11, 2009 at 9:35 PM, Phlip <phlip2005@gmail.com> wrote:
> Parsing HTML with Regexp makes certain baby deities cry.