[newbie] upper to lower first letter of a word

Are you likely to have numbers in the drug’s name? Don’t forget to
include that as a test case if so. The following puts a space before the
string of digits immediately preceding " mg" if it doesn’t already have
one:

sub(/([^\d\s])(\d+ mg)/, '\1 \2')

martin
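
To make the test-case advice concrete, here is a small sketch. The helper name `add_dose_space` and the sample strings are made up for illustration. It assumes a slightly stricter pattern (`[^\d\s]` rather than `\w`, so a digit or a space is never taken as the character before the dose) and a single-quoted `'\1 \2'` replacement, which `sub` fills in from the current match.

```ruby
# Hypothetical helper, not from the thread: insert a space between a drug
# name and a dose that was typed directly after it.
def add_dose_space(str)
  str.sub(/([^\d\s])(\d+ mg)/, '\1 \2')
end

# Illustrative test cases (made-up data): glued dose, already spaced dose,
# and a name that itself contains digits.
CASES = {
  "clonidine300 mg"  => "clonidine 300 mg",
  "clonidine 300 mg" => "clonidine 300 mg",
  "b12 500 mg"       => "b12 500 mg",
}

CASES.each do |input, expected|
  actual = add_dose_space(input)
  status = actual == expected ? "ok" : "MISMATCH"
  puts "#{input.inspect} -> #{actual.inspect} (#{status})"
end
```

A name that ends in digits and is glued to the dose (say "b12500 mg") is ambiguous; a pattern alone cannot tell where the name stops, so those probably need a list of known names or a manual check.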

···

Thomas A. Reilly w3gat@bellsouth.net wrote:

I would appreciate it if someone could give me a regexp that would
split the following:
for example -
“clonidine300 mg” into “clonidine 300 mg”

I have a bunch of drug data where the dose had been typed together.

Hi –

I would appreciate it if someone could give me a regexp that would
split the following:
for example -
“clonidine300 mg” into “clonidine 300 mg”

I have a bunch of drug data where the dose had been typed together.

Just to add to the list of ideas: in 1.8.0 you can use scanf:

irb(main):016:0> require 'scanf'
=> false
irb(main):017:0> str = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):018:0> name, dose, unit = str.scanf('%[\D]%d%s')
=> ["clonidine", 300, "mg"]

the advantage of which is that it turns your dosage into an integer
rather than a string. (But it doesn’t handle the case where there are
digits in the drug name.)
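
If scanf is not to hand (as far as I know, recent Ruby versions ship it as a separate gem rather than with Ruby itself), much the same result can be had with a regexp plus `to_i`. A rough sketch only: `parse_dose` is a made-up name, and the pattern assumes the dose is a whole number and the unit is purely alphabetic.

```ruby
# Hypothetical helper: split a string into [name, dose, unit], with the
# dose converted to an Integer. Returns nil if the string does not fit.
def parse_dose(str)
  # Lazy name, then the last run of digits before an alphabetic unit.
  m = /\A(.*?)\s*(\d+)\s*([A-Za-z]+)\z/.match(str)
  return nil unless m
  [m[1], m[2].to_i, m[3]]
end

p parse_dose("clonidine300 mg")     # => ["clonidine", 300, "mg"]
p parse_dose("vitamin b12 500 mg")  # => ["vitamin b12", 500, "mg"]
p parse_dose("aspirin")             # => nil
```

Because the unit must follow the digits directly (apart from whitespace), digits inside the name are left alone as long as something separates them from the dose.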

David

···

On Sat, 27 Sep 2003, Thomas A. Reilly wrote:


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

And yet another way:

irb(main):018:0> m = /(\w+?)(\d+)\s+(\w+)/.match(s)
=> #<MatchData:0x81c2200>
irb(main):019:0> m[1]
=> "clonidine"
irb(main):020:0> m[2]
=> "300"
irb(main):021:0> m[3]
=> "mg"
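
One thing to watch with this pattern: it is not anchored, so on input that already has the space it can match in the middle of the number. A small sketch of the difference; the constant names `LOOSE` and `ANCHORED` are mine, and the anchored variant assumes the whole string is just name, dose and unit.

```ruby
LOOSE    = /(\w+?)(\d+)\s+(\w+)/       # pattern from the session above
ANCHORED = /\A(\w+?)(\d+)\s+(\w+)\z/   # same pattern, tied to both ends

# Already-spaced input: the loose pattern starts matching at the "3".
p LOOSE.match("clonidine 300 mg").captures    # => ["3", "00", "mg"]

# The anchored pattern refuses it, and still handles the glued form.
p ANCHORED.match("clonidine 300 mg")          # => nil
p ANCHORED.match("clonidine300 mg").captures  # => ["clonidine", "300", "mg"]
```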

···

On Saturday, 27 September 2003 at 12:44:11 +0900, Jim Freeze wrote:

On Saturday, 27 September 2003 at 12:17:19 +0900, Thomas A. Reilly wrote:

I would appreciate it if someone could give me a regexp that would
split the following:
for example -
“clonidine300 mg” into “clonidine 300 mg”

I have a bunch of drug data where the dose had been typed together.

There’s probably more than one way to do this. Here’s one way:


Jim Freeze

“There is no reason for any individual to have a computer in their
home.”
– Ken Olson, President of DEC, World Future Society
Convention, 1977

Whoops, that’s because I’d already required it in this session :-)

David

···

On Sat, 27 Sep 2003 dblack@superlink.net wrote:

irb(main):016:0> require 'scanf'
=> false



Martin DeMello wrote:

I would appreciate it if someone could give me a regexp that would
split the following:
for example -
“clonidine300 mg” into “clonidine 300 mg”

I have a bunch of drug data where the dose had been typed together.

Are you likely to have numbers in the drug’s name? Don’t forget to
include that as a test case if so. The following puts a space before the
string of digits immediately preceding " mg" if it doesn’t already have
one:

sub(/([^\d\s])(\d+ mg)/, '\1 \2')

martin

Another way to do it:

irb(main):001:0> s = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):002:0> s[/(?=\d+ mg)/] = " "
=> " "
irb(main):003:0> s
=> "clonidine 300 mg"
irb(main):004:0>

···

Thomas A. Reilly w3gat@bellsouth.net wrote:

Thanks a lot everyone.
The suggestions worked fine.

Tom

irb(main):001:0> s = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):002:0> s[/(?=\d+ mg)/] = " "
=> " "

What’s exactly happening here?

thanks,
Rodrigo

Rodrigo B. de Oliveira wrote:

irb(main):001:0> s = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):002:0> s[/(?=\d+ mg)/] = " "
=> " "

What’s exactly happening here?

String#[]=, as in “s1[pat] = s2”, is a destructive slice operator. It
replaces the first match of pat with the r.h.s. (raising IndexError if
no match).

In this case, pat is /(?=\d+ mg)/, which has a lookahead pattern
(?=…). This lookahead expression matches the point in the string just
before the match of /\d+ mg/, but it doesn’t consume the “300 mg”. So
slicing out the match (which is empty) and substituting " " has the
effect of inserting a space before the match of /\d+ mg/.
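
The explanation above can be tried out with a short script. This is only a sketch; the variable names and the "no dose here" string are made up, and `.dup` is there so the strings are mutable whatever the frozen-literal settings are.

```ruby
s = "clonidine300 mg".dup

# The lookahead matches at a position, so the matched text is empty.
matched = s[/(?=\d+ mg)/]
puts matched.inspect           # => ""

# Replacing that empty match inserts the space in place.
s[/(?=\d+ mg)/] = " "
puts s                         # => clonidine 300 mg

# With no match, the assignment raises IndexError and leaves the string alone.
t = "no dose here".dup
raised = false
begin
  t[/(?=\d+ mg)/] = " "
rescue IndexError => e
  raised = true
  puts "IndexError: #{e.message}"
end
```

Note that running the assignment a second time on the already fixed string inserts another space, since the lookahead still matches just before "300 mg".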

I think:
s[/pattern/] returns the matching part of the string.
s[/pattern/] = value assigns ‘value’ to that piece of the string.
(?=regex) is known as “zero-width positive lookahead” and means that
the part of the string it matches is not consumed.
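
A few read-only examples of the s[/pattern/] form, as a sketch (the variable names are mine). The two-argument form with a capture group number is also part of String#[].

```ruby
s = "clonidine300 mg"

whole   = s[/\d+ mg/]         # the whole match
digits  = s[/(\d+) mg/, 1]    # only capture group 1
empty   = s[/(?=\d+ mg)/]     # lookahead alone: matches, but consumes nothing
missing = s[/\d+ ml/]         # no match

p whole    # => "300 mg"
p digits   # => "300"
p empty    # => ""
p missing  # => nil
```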

···

On Sun, 28 Sep 2003 03:01:27 +0900, “Rodrigo B. de Oliveira” rbo@acm.org wrote:

irb(main):001:0> s = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):002:0> s[/(?=\d+ mg)/] = " "
=> " "

What’s exactly happening here?

Thanks! Really beautiful.

Rodrigo

···

----- Original Message -----
From: “Joel VanderWerf” vjoel@PATH.Berkeley.EDU

String#[]=, as in “s1[pat] = s2”, is a destructive slice operator. It
replaces the first match of pat with the r.h.s. (raising IndexError if
no match).

In this case, pat is /(?=\d+ mg)/, which has a lookahead pattern
(?=…)…