Are you likely to have numbers in the drug’s name? Don’t forget to
include that as a test case if so. The following puts a space before the
string of digits immediately preceding " mg" if it doesn’t already have
one:
sub(/([^\d\s])(\d+ mg)/, '\1 \2')
martin
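A quick way to check a pattern like this is to run it over a few awkward inputs. The sketch below is only an illustration: the helper name space_before_dose and the sample drug names are made up. It requires a non-digit, non-space character before the digits, so that a dose that already has its space is left alone.

```ruby
# Hypothetical helper: insert a space between a name and a dose typed together.
def space_before_dose(str)
  # [^\d\s] is the last character of the name (neither digit nor whitespace).
  # '\1 \2' uses backreferences, which sub fills in from the actual match.
  str.sub(/([^\d\s])(\d+ mg)/, '\1 \2')
end

puts space_before_dose("clonidine300 mg")   # name and dose typed together
puts space_before_dose("clonidine 300 mg")  # already spaced, left unchanged
puts space_before_dose("b12complex500 mg")  # digits inside the name stay put
```

The single-quoted '\1 \2' matters: a double-quoted "#{$1} #{$2}" would be interpolated before sub runs, so it would not see the groups from this match.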
···
Thomas A. Reilly w3gat@bellsouth.net wrote:
I would appreciate it if someone could give me the regexp that it would
split the following:
for example -
“clonidine300 mg” into “clonidine 300 mg”
I have a bunch of drug data where the dose had been typed together.
Hi –
I would appreciate it if someone could give me the regexp that it would
split the following:
for example -
“clonidine300 mg” into “clonidine 300 mg”
I have a bunch of drug data where the dose had been typed together.
Just to add to the list of ideas: in 1.8.0 you can use scanf:
irb(main):016:0> require 'scanf'
=> false
irb(main):017:0> str = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):018:0> name, dose, unit = str.scanf('%[\D]%d%s')
=> ["clonidine", 300, "mg"]
the advantage of which is that it turns your dosage into an integer
rather than a string. (But it doesn’t handle the case where there are
digits in the drug name.)
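On Ruby versions where scanf is not available out of the box (it stopped shipping with Ruby in 2.7 and is now a separate gem), much the same result can be had from a plain regexp plus Integer(). This is only a sketch: parse_dose is a made-up name, and like the scanf version it assumes the drug name itself contains no digits.

```ruby
# Hypothetical helper: split "clonidine300 mg" into name, integer dose, and unit.
def parse_dose(str)
  # \D+? is the name (no digits), \d+ the dose, \S+ the unit.
  m = /\A(\D+?)\s*(\d+)\s*(\S+)\z/.match(str)
  return nil unless m            # nothing that looks like a dose
  [m[1], Integer(m[2], 10), m[3]]
end

p parse_dose("clonidine300 mg")
p parse_dose("clonidine 300 mg")
p parse_dose("aspirin")
```

Passing base 10 to Integer() keeps a dose with a leading zero from being read as octal.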
David
···
On Sat, 27 Sep 2003, Thomas A. Reilly wrote:
–
David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav
And yet another way:
irb(main):018:0> m = /(\w+?)(\d+)\s+(\w+)/.match(s)
=> #<MatchData:0x81c2200>
irb(main):019:0> m[1]
=> "clonidine"
irb(main):020:0> m[2]
=> "300"
irb(main):021:0> m[3]
=> "mg"
···
On Saturday, 27 September 2003 at 12:44:11 +0900, Jim Freeze wrote:
On Saturday, 27 September 2003 at 12:17:19 +0900, Thomas A. Reilly wrote:
I would appreciate it if someone could give me the regexp that it would
split the following:
for example -
“clonidine300 mg” into “clonidine 300 mg”
I have a bunch of drug data where the dose had been typed together.
There’s probably more than one way to do this. Here’s one way:
–
Jim Freeze
“There is no reason for any individual to have a computer in their
home.”
– Ken Olson, President of DEC, World Future Society
Convention, 1977
Whoops, that’s because I’d already required it in this session.
David
···
On Sat, 27 Sep 2003 dblack@superlink.net wrote:
irb(main):016:0> require 'scanf'
=> false
Martin DeMello wrote:
I would appreciate it if someone could give me the regexp that it would
split the following:
for example -
“clonidine300 mg” into “clonidine 300 mg”
I have a bunch of drug data where the dose had been typed together.
Are you likely to have numbers in the drug’s name? Don’t forget to
include that as a test case if so. The following puts a space before the
string of digits immediately preceding " mg" if it doesn’t already have
one:
sub(/([^\d\s])(\d+ mg)/, '\1 \2')
martin
Another way to do it:
irb(main):001:0> s = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):002:0> s[/(?=\d+ mg)/] = " "
=> " "
irb(main):003:0> s
=> "clonidine 300 mg"
irb(main):004:0>
···
Thomas A. Reilly w3gat@bellsouth.net wrote:
Thanks a lot everyone.
The suggestions worked fine.
Tom
irb(main):001:0> s = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):002:0> s[/(?=\d+ mg)/] = " "
=> " "
What’s exactly happening here?
thanks,
Rodrigo
Rodrigo B. de Oliveira wrote:
irb(main):001:0> s = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):002:0> s[/(?=\d+ mg)/] = " "
=> " "
What’s exactly happening here?
String#[]=, as in “s1[pat] = s2”, is a destructive slice operator. It
replaces the first match of pat with the r.h.s. (raising IndexError if
there is no match).
In this case, pat is /(?=\d+ mg)/, which has a lookahead pattern
(?=…). This lookahead expression matches the point in the string just
before the match of /\d+ mg/, but it doesn’t consume the “300 mg”. So
slicing out the match (which is empty) and substituting " " has the
effect of inserting a space before the match of /\d+ mg/.
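To see the pieces of that explanation separately, here is a small sketch. The variable names are arbitrary, and the dup calls are only there on the assumption that string literals might be frozen in the reader's setup.

```ruby
s = "clonidine300 mg".dup

# The lookahead matches an empty string at the point just before "300 mg".
m = /(?=\d+ mg)/.match(s)
p m[0]        # the matched text: an empty string
p m.begin(0)  # where it matched: the index of the "3"

# Assigning to that empty slice inserts the space without touching the digits.
s[/(?=\d+ mg)/] = " "
p s

# With no match, the assignment raises IndexError instead of doing nothing.
t = "aspirin".dup
begin
  t[/(?=\d+ mg)/] = " "
rescue IndexError => e
  puts "no dose found: #{e.message}"
end
```

Note that the lookahead alone does not check what comes before the digits, so running it on a string that already has the space would insert a second one.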
I think:
s[/pattern/] returns the matching part of the string.
s[/pattern/] = value replaces that part of the string with 'value'.
(?=regex) is known as "zero-width positive lookahead": the regex must
match at that point, but the part of the string it matches is not consumed.
···
On Sun, 28 Sep 2003 03:01:27 +0900, “Rodrigo B. de Oliveira” rbo@acm.org wrote:
irb(main):001:0> s = "clonidine300 mg"
=> "clonidine300 mg"
irb(main):002:0> s[/(?=\d+ mg)/] = " "
=> " "
What’s exactly happening here?
Thanks! Really beautiful.
Rodrigo
···
----- Original Message -----
From: “Joel VanderWerf” vjoel@PATH.Berkeley.EDU
…
String#[]=, as in “s1[pat] = s2”, is a destructive slice operator. It
replaces the first match of pat with the r.h.s. (raising IndexError if
there is no match).
In this case, pat is /(?=\d+ mg)/, which has a lookahead pattern
(?=…)…