How to remove leading &nbsp; from string

Hi,
  I need a little help with removing leading &nbsp; tags.
  My text is:
     str = "<p>Welcome to ruby &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"
  The result I want is:
     str = "<p>Welcome to ruby</p>"

Can anybody help?

--
Posted via http://www.ruby-forum.com/.

You mean trailing, rather than leading?

You probably want String#gsub or String#gsub!. For example:

$ irb --simple-prompt
>> str = "<p>Welcome to ruby &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"
=> "<p>Welcome to ruby &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"
>> str.gsub!(/(&nbsp;|\s)+/, " ")
=> "<p>Welcome to ruby </p> <p> </p>"

Removing empty paragraphs is left as an exercise. For more information
on String and Regexp, see Programming Ruby: The Pragmatic Programmer's Guide.

However, for anything other than the most basic transformations, you are
almost certainly better off with an HTML parser like Nokogiri than
chomping HTML with regexps.

--

simply a.delete!("&nbsp;")

···

Hi,
I am entering multiple paragraphs in an editor; they will be saved into the str
variable.

Ex:
   str is
      <p>test one &nbsp;&nbsp;</p><p>&nbsp;&nbsp;</p><p>test one test
       onetest onetest one</p> <p>test two test
       two test two test two &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>

We have entered it this way.
  The result I want is

      <p>test one &nbsp;&nbsp;</p><p>&nbsp;&nbsp;</p><p>test one test
       onetest onetest one</p> <p>test two test
       two test two test two</p>

Here the &nbsp; tags between paragraphs at the end are removed, and the &nbsp;'s are removed
in "<p>test two test
       two test two test two</p>"

--

simply a.delete!("&nbsp;")

Well, that's almost certainly not what the OP wants!

str = "<p>Welcome to ruby &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"

=> "<p>Welcome to ruby &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"

str.delete('&nbsp;')

=> "<>Welcome to ruy </> <></>"

Look at the documentation for String#delete

Then take a look at String#gsub

I suspect that you want to do two passes: one for '&nbsp;' and one for empty paragraphs.
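To make the difference concrete, here is a minimal sketch using the OP's string. The variable names `deleted` and `replaced` are only for illustration. String#delete treats its argument as a set of characters, while String#gsub with a string pattern replaces whole occurrences of that substring.

```ruby
str = "<p>Welcome to ruby &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"

# String#delete removes every '&', 'n', 'b', 's', 'p' and ';' character,
# wherever it occurs, so the <p> tags and the word "ruby" are damaged too.
deleted = str.delete("&nbsp;")

# String#gsub with a plain string pattern removes only the literal
# substring "&nbsp;" and leaves everything else alone.
replaced = str.gsub("&nbsp;", "")

puts deleted   # <>Welcome to ruy </> <></>
puts replaced  # <p>Welcome to ruby </p> <p></p>
```

The second result still contains an empty paragraph, which is why a second pass is needed.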

-Rob

Rob Biedenharn
Rob@AgileConsultingLLC.com http://AgileConsultingLLC.com/
rab@GaslightSoftware.com http://GaslightSoftware.com/

···

On Aug 2, 2010, at 5:15 PM, BruceL wrote:

On Aug 2, 3:59 am, Lucky Nl <lakshmi2...@gmail.com> wrote:

Lucky Nl wrote:

Hi,
I am entering multiple paragraphs in an editor; they will be saved into the str
variable.

Ex:
   str is
      <p>test one &nbsp;&nbsp;</p><p>&nbsp;&nbsp;</p><p>test one test
       onetest onetest one</p> <p>test two test
       two test two test two &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>

We have entered it this way.
  The result I want is

      <p>test one &nbsp;&nbsp;</p><p>&nbsp;&nbsp;</p><p>test one test
       onetest onetest one</p> <p>test two test
       two test two test two</p>

Here the &nbsp; tags between paragraphs at the end are removed, and the &nbsp;'s are removed
in "<p>test two test
       two test two test two</p>"

Your requirement is unclear. Are you saying you want to remove the
&nbsp;'s within the fourth paragraph only, and remove the fifth
paragraph entirely?

I've shown you how to use gsub, and where to find more documentation on
it. String#scan might be useful too.

I suggest you use them in whatever way you need, since only you
understand what you're trying to achieve.

--

str = str.gsub(/&nbsp;/,"").gsub(/<p>\s*<\/p>/,"")

This will remove every &nbsp; from your HTML and then remove any <p> element that contains only whitespace characters.

It's less than optimal, as you could combine it in one go, probably, but I don't want to spend time on stuff you should be able to do on your own.
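For what it's worth, one possible way to combine the two passes is a single regexp with an alternation. This is only a sketch: the names `two_pass` and `one_pass` are made up for the comparison, and I have only checked that both give the same result on the OP's sample string, not on arbitrary HTML.

```ruby
str = "<p>Welcome to ruby &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"

# Two passes, as in the one-liner above.
two_pass = str.gsub(/&nbsp;/, "").gsub(/<p>\s*<\/p>/, "")

# One pass: the first alternative matches a paragraph that holds only
# &nbsp; entities and whitespace; the second matches any remaining &nbsp;.
one_pass = str.gsub(/<p>(?:&nbsp;|\s)*<\/p>|&nbsp;/, "")

puts two_pass.inspect
puts one_pass.inspect
```

Note that both versions leave ordinary spaces in place, so the result here still has a space before the closing </p> and one after it.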

···

On 2010-08-02 07:22:33 -0400, Lucky Nl said:

<p>test one &nbsp;&nbsp;</p><p>&nbsp;&nbsp;</p><p>test one test
       onetest onetest one</p> <p>test two test
       two test two test two &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>

--
Thank you for your brain.
-MrZombie

Hi,
Let me explain my requirement clearly.
I am using FCKeditor in Ruby on Rails.
I can enter data as multiple paragraphs or as a single paragraph, but
if there are any spaces after the last paragraph, I want to remove them.

So the data I entered looks like this:

---------------------------------------------------------------
<p> pargraph1 pargraph1 &nbsp;&nbsp; </p>
<p>pargraph2 pargraph2 pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14 &nbsp;&nbsp; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;</p>
--------------------------------------------------------

In the above text, the 4th paragraph is the last paragraph that I
entered. After that I pressed the Enter key, so the editor converted it
into "<p>&nbsp;&nbsp;&nbsp;&nbsp;</p>".

I also want to remove the &nbsp;'s after the text in the last paragraph, so the
result should look like this:
-------------------------------------------
<p> pargraph1 pargraph1 &nbsp;&nbsp; </p>
<p>pargraph2 pargraph2 pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14</p>
----------------------------------------------------------------
--

So the approach is:
(1) Write a regular expression which matches just the thing you want to
delete;
(2) Invoke it with gsub to replace that text with the empty string.

For example, to delete *all* empty paragraphs, then you want to match
<p> followed by any mixture of &nbsp; and space followed by </p>. So you
could write:

    str.gsub! /<p>(&nbsp;|\s)*<\/p>/, ''

(x|y) means match x or y, \s means match any whitespace character, and *
means match it 0 or more times.

To delete only the *last* paragraph if it is empty, then you can tweak
it to:

    str.gsub! /<p>(&nbsp;|\s)*<\/p>\s*\z/, ''

where \z matches the end of the string, and \s* allows 0 or more space
characters, including newlines, to precede that.
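Here is a small sketch comparing the two patterns on a made-up sample. The string and the names `all_removed` and `last_removed` are assumptions for illustration only. It uses gsub rather than gsub! so that the original string stays untouched.

```ruby
# Hypothetical sample: an empty paragraph in the middle and one at the end.
str = "<p>one</p> <p>&nbsp; </p> <p>two</p>\n<p>&nbsp;&nbsp;</p>\n"

# Without an anchor, every empty paragraph is removed.
all_removed = str.gsub(/<p>(&nbsp;|\s)*<\/p>/, '')

# With \s*\z, only an empty paragraph at the very end of the string
# (plus any trailing whitespace) is removed.
last_removed = str.gsub(/<p>(&nbsp;|\s)*<\/p>\s*\z/, '')

puts all_removed.inspect
puts last_removed.inspect
```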

Once you're happy with that, then you can do another match and replace
to change the final instance of "&nbsp;&nbsp; </p>" into just "</p>"

But you might want to be sure this is what you really want. How did the
previous &nbsp; entries get there? Do you really want to keep them? It
would be much simpler just to replace all sequences of &nbsp; or space
with a single space.

    str.gsub! /(&nbsp;|\s)+/, ' '

--

Brian Candler wrote:

To delete only the *last* paragraph if it is empty, then you can tweak
it to:

    str.gsub! /<p>(&nbsp;|\s)*<\/p>\s*\z/, ''

Hi, when I used the logic below:
str.gsub! /<p>(&nbsp;|\s)*<\/p>\s*\z/, ''

-------------------------------------------------
str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14 &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"
str = str.gsub! /<p>(&nbsp;|\s)*<\/p>\s*\z/, ''
puts str
---------------------------------
it returns the result below, which is fine.
----------------------------------------------
<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14 &nbsp;&nbsp;</p>
---------------------------------------------
But if I make a small modification so that the string ends with a paragraph
containing characters, e.g. <p>dada&nbsp;&nbsp;ddsa</p>",

str ends up as nil.
--

Oh OK, so basically when the string is not modified it returns nil, right?

Your regular expression is very helpful,
but I need one more bit of help.
str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello3 hell14 &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>"

In the above str, <p>&nbsp;&nbsp;</p> is an empty line from my point of view,
so the text entered in the editor only goes up to

str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello4 hell14 &nbsp;&nbsp;</p>

I want to remove the spaces at the end of the last paragraph as well, but not in the
1st, 2nd and 3rd paragraphs.
The result I want is
str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello4 hell14</p>

--

Lucky Nl wrote:

But if I make a small modification so that the string ends with a paragraph
containing characters, e.g. <p>dada&nbsp;&nbsp;ddsa</p>",

str ends up as nil.

Yes, the result of gsub! is nil if no change is made; but the string
remains as it was.

irb(main):001:0> str = "abc"
=> "abc"
irb(main):002:0> str.gsub!(/d/,"")
=> nil
irb(main):003:0> str
=> "abc"

It's intended so you can say

  if str.gsub! ...
    # it changed
  else
    # it didn't
  end

If you use gsub instead of gsub!, then it always returns the resulting
string.

irb(main):004:0> str2 = str.gsub(/d/,"")
=> "abc"
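Applied to the case from the question, a short sketch of the pitfall and two ways around it. The variable names `lost`, `in_place` and `copy` are just for illustration.

```ruby
str = "<p>dada&nbsp;&nbsp;ddsa</p>"
pattern = /<p>(&nbsp;|\s)*<\/p>\s*\z/

# Pitfall: assigning the return value of gsub! loses the string
# whenever nothing matched, because gsub! returns nil in that case.
lost = str.dup
lost = lost.gsub!(pattern, '')

# Option 1: call gsub! only for its side effect and ignore the result.
in_place = str.dup
in_place.gsub!(pattern, '')

# Option 2: use gsub, which always returns a string, and assign that.
copy = str.gsub(pattern, '')

p lost      # nil
p in_place  # the unchanged string
p copy      # the unchanged string
```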

--

Lucky Nl wrote:

I want to remove the spaces at the end of the last paragraph as well, but not in the
1st, 2nd and 3rd paragraphs.
The result I want is
str = "<p> pargraph1 pargraph1 &nbsp;&nbsp; </p> <p>pargraph2 pargraph2
pargraph2 &nbsp;&nbsp; </p>
<p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
<p> pargraph4 hello4 hell14</p>

So, write a regexp which matches any number of &nbsp; or space, followed
by </p>, followed by end of string. I've given you the tools to do that
already.

If you can't make it work then show what you tried, and we can explain
what needs changing.
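For anyone who wants something to compare their own attempt against, here is one possible sketch of the two steps, under the assumption that the input looks like the sample in this thread (shortened here to its last three paragraphs). The names `step1` and `result` are made up, and sub is used instead of gsub because a pattern anchored with \z can match at most once.

```ruby
str = <<~HTML
  <p> pargraph3 hello3 hell13 &nbsp;&nbsp; </p>
  <p> pargraph4 hello3 hell14 &nbsp;&nbsp;</p> <p>&nbsp;&nbsp;</p>
HTML

# Step 1: drop an empty paragraph at the very end of the string.
step1 = str.sub(/<p>(&nbsp;|\s)*<\/p>\s*\z/, '')

# Step 2: replace a run of &nbsp;/whitespace followed by the final </p>
# (and any trailing whitespace) with just "</p>".
result = step1.sub(/(&nbsp;|\s)+<\/p>\s*\z/, '</p>')

puts result
```

Earlier paragraphs keep their &nbsp; entities, because the \z anchor only lets the pattern match at the end of the string.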

You can test your regexps using irb, or you can use this web site:

···
