Nokogiri extract text?

Pen_Ttt · 23 June 2010 03:23

I have a simple file, /home/pt/test.html, with the following contents:

<html>

<body>

hallo,world

</body>

</html>

I want to extract the text "hallo,world" from /home/pt/test.html with
Nokogiri. How do I write that?

require 'rubygems'
require 'nokogiri'
html = '/home/pt/test.html'
doc = Nokogiri::HTML(html)

Would you mind finishing it?

--
Posted via http://www.ruby-forum.com/.

Luis1 · 23 June 2010 11:15

The wiki at http://wiki.github.com/tenderlove/nokogiri/ explains how to
find the nodes you need. I think you'll need to use XPath.

Bye

--
Luis Parravicini
http://ktulu.com.ar/blog/

Nils_Haldenwang · 10 April 2011 12:04

If you just want to extract some specific text within a specific tag you
should go with what Luis posted.

If you want to extract the whole plain text from a specific area in your
document, not knowing which tags may occur, you can try this:
http://www.nils-haldenwang.de/frameworks-and-tools/nokogiri/how-to-extract-plain-text-from-html-with-nokogiri

Ted_Flethuseo · 10 April 2011 12:29

I do it like this:

puts doc.search('p').map { |e| e.text }
