Kev,
given the following input:
"INSERT INTO PROJECT ( PROJECT_ID, PROJECTTITLE,
PROJECTNO, PROJECTDATE, CATEGORY, SECTOR, SECTION,
DEPARTMENT, LOCATION, STATUS, COMPLETIONSTATUS,
CLIENT, TYPE ) VALUES ( 2319, 'SKIM PERPINDAHAN DAN
PERUMAHAN KAMPONG RATAIE, PERUMAHAN, KAMPONG RATAIE
HOUSING CONTRACT H-3 (64 HOUSES)', 'AD520/0105',
TO_Date( '01/01/1981 12:00:00 AM', 'MM/DD/YYYY
HH:MI:SS AM'), '2', '2', 'T', 'G', 'D', 'D', NULL,
NULL, NULL, 3000000, 'Y', 2);"How would you replace the commas embedded in the
strings in the SQL?
First of all, I am assuming that in your database, single quotes are
not escaped by a backslash (\'). If they are, you need to handle that
case separately. Most databases escape a single quote with another
single quote (''), guaranteeing an even number of single quotes.
I would approach this problem by splitting the string apart and then
reassembling it. You can split it apart in such a way that all of the
quoted strings are separate elements (using a shorter example string):
irb(main):001:0> s = "a, 'b', 'c,d', 'e', 'f,g,h', i"
=> "a, 'b', 'c,d', 'e', 'f,g,h', i"
irb(main):002:0> s.scan(/[^']+|'[^']*'/)
=> ["a, ", "'b'", ", ", "'c,d'", ", ", "'e'", ", ", "'f,g,h'", ", i"]
Now you can take these individual elements and replace all the
commas in the ones that are quoted:
irb(main):003:0> s.scan(/[^']+|'[^']*'/).map {|e| if e =~ /'/ then
e.gsub(/,/,';') else e end }
=> ["a, ", "'b'", ", ", "'c;d'", ", ", "'e'", ", ", "'f;g;h'", ", i"]
Now, all you have to do is reassemble the string:
irb(main):004:0> s.scan(/[^']+|'[^']*'/).map {|e| if e =~ /'/ then
e.gsub(/,/,';') else e end }.join
=> "a, 'b', 'c;d', 'e', 'f;g;h', i"
Using your original string (sorry for all the wrapping):
irb(main):005:0> s = "INSERT INTO PROJECT ( PROJECT_ID, PROJECTTITLE,
PROJECTNO, PROJECTDATE, CATEGORY, SECTOR, SECTION, DEPARTMENT, LOCATIO
N, STATUS, COMPLETIONSTATUS, CLIENT, TYPE ) VALUES ( 2319, 'SKIM
PERPINDAHAN DAN PERUMAHAN KAMPONG RATAIE, PERUMAHAN, KAMPONG RATAIE
HOUSING
CONTRACT H-3
irb(main):006:0" (64 HOUSES)', 'AD520/0105', TO_Date( '01/01/1981
12:00:00 AM', 'MM/DD/YYYY HH:MI:SS AM'), '2', '2', 'T', 'G', 'D', 'D',
NU
LL, NULL, NULL, 3000000, 'Y', 2);"
irb(main):006:0> s.scan(/[^']+|'[^']*'/).map {|e| if e =~ /'/ then
e.gsub(/,/,';') else e end }.join
=> "INSERT INTO PROJECT ( PROJECT_ID, PROJECTTITLE, PROJECTNO,
PROJECTDATE, CATEGORY, SECTOR, SECTION, DEPARTMENT, LOCATION, STATUS,
COMPLET
IONSTATUS, CLIENT, TYPE ) VALUES ( 2319, 'SKIM PERPINDAHAN DAN PERUMAHAN
KAMPONG RATAIE; PERUMAHAN; KAMPONG RATAIE HOUSING CONTRACT H-3\n(64
HOUSES)', 'AD520/0105', TO_Date( '01/01/1981 12:00:00 AM', 'MM/DD/YYYY
HH:MI:SS AM'), '2', '2', 'T', 'G', 'D', 'D', NULL, NULL, NULL, 3000
000, 'Y', 2);"
You could also replace the if/then/else with the ternary operator if
you wanted to make the line a little shorter, if somewhat less readable:
s.scan(/[^']+|'[^']*'/).map {|e| e =~ /'/ ? e.gsub(/,/,';') : e }.join
I hope this helps.
- Warren Brown