Regexp to split name?

Does anyone have an example of splitting a name into first and last
names? Or is it just a case of doing string.split(' ')?

···

--
Posted via http://www.ruby-forum.com/.

quoth the Alex MacCaw:

Does anyone have an example of splitting a name into first and last
names? Or is just a case of doing string.split(' ')?

I'd say a regexp is overkill here.

irb(main):001:0> name = "Alex MacCaw"
=> "Alex MacCaw"
irb(main):002:0> first, last = name.split
=> ["Alex", "MacCaw"]
irb(main):003:0> first
=> "Alex"
irb(main):004:0> last
=> "MacCaw"

Note that you will have to do more work to accommodate middle names and
titles, ie: Mr, Mrs, Dr etc...
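
To make that note concrete, here is a minimal sketch of what a bare split does once a middle name or title appears. `naive_split` is a hypothetical helper, not something posted in this thread, and the splat in the middle of the assignment assumes Ruby 1.9 or later.

```ruby
# With only two targets, any extra words are silently discarded.
first, last = "John Joe Smith".split
# first is "John", last is "Joe"; "Smith" is lost.

# Hypothetical helper: first word, last word, and whatever lies between.
# It knows nothing about titles or multi-word surnames.
def naive_split(name)
  first, *middles, last = name.split   # middle splat needs Ruby 1.9+
  [first, middles, last]
end

naive_split("Alex MacCaw")      # => ["Alex", [], "MacCaw"]
naive_split("John Joe Smith")   # => ["John", ["Joe"], "Smith"]
naive_split("Mr John Smith")    # => ["Mr", ["John"], "Smith"]  (title taken as first name)
```

The last call shows why titles need separate handling: the splat only fixes the dropped-word problem.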

-d

···

--
darren kirby :: Part of the problem since 1976 :: http://badcomputer.org
"...the number of UNIX installations has grown to 10, with more expected..."
- Dennis Ritchie and Ken Thompson, June 1972

Hi --

···

On Sun, 24 Jun 2007, darren kirby wrote:

quoth the Alex MacCaw:

Does anyone have an example of splitting a name into first and last
names? Or is just a case of doing string.split(' ')?

I'd say a regexp is overkill here.

irb(main):001:0> name = "Alex MacCaw"
=> "Alex MacCaw"
irb(main):002:0> first, last = name.split
=> ["Alex", "MacCaw"]
irb(main):003:0> first
=> "Alex"
irb(main):004:0> last
=> "MacCaw"

Note that you will have to do more work to accommodate middle names and
titles, ie: Mr, Mrs, Dr etc...

And also last names with spaces in them (von Trapp, Vaughn Williams,
etc.).

David

--
* Books:
   RAILS ROUTING (new! http://www.awprofessional.com/title/0321509242)
   RUBY FOR RAILS (http://www.manning.com/black)
* Ruby/Rails training
     & consulting: Ruby Power and Light, LLC (http://www.rubypal.com)

name = "Mr John Joe Peter Smith"
TITLES = ["Mr", "Mrs", "Ms", "Dr"]
a = name.split
last = a.pop
title = a.shift if TITLES.include? a.first
first = a.shift
middles = a

title #=> "Mr"
first #=> "John"
middles #=> ["Joe", "Peter"]
last #=> "Smith"

···

On 23/06/07, darren kirby <bulliver@badcomputer.org> wrote:

quoth the Alex MacCaw:
> Does anyone have an example of splitting a name into first and last
> names? Or is just a case of doing string.split(' ')?

I'd say a regexp is overkill here.

irb(main):001:0> name = "Alex MacCaw"
=> "Alex MacCaw"
irb(main):002:0> first, last = name.split
=> ["Alex", "MacCaw"]
irb(main):003:0> first
=> "Alex"
irb(main):004:0> last
=> "MacCaw"

Note that you will have to do more work to accommodate middle names and
titles, ie: Mr, Mrs, Dr etc...

-d

dblack@wobblini.net wrote:

Hi --

quoth the Alex MacCaw:

Does anyone have an example of splitting a name into first and last
names? Or is just a case of doing string.split(' ')?

I'd say a regexp is overkill here.

irb(main):001:0> name = "Alex MacCaw"
=> "Alex MacCaw"
irb(main):002:0> first, last = name.split
=> ["Alex", "MacCaw"]
irb(main):003:0> first
=> "Alex"
irb(main):004:0> last
=> "MacCaw"

Note that you will have to do more work to accommodate middle names and
titles, ie: Mr, Mrs, Dr etc...

And also last names with spaces in them (von Trapp, Vaughn Williams,
etc.).

And titles with spaces in them (The Honourable, His Excellency, etc...).

···

On Sun, 24 Jun 2007, darren kirby wrote:

--
Alex

Hi --

name = "Mr John Joe Peter Smith"
TITLES = ["Mr", "Mrs", "Ms", "Dr"]
a = name.split
last = a.pop
title = a.shift if TITLES.include? a.first

Have mercy on us Yanks and allow for a period :-)

first = a.shift
middles = a

title #=> "Mr"
first #=> "John"
middles #=> ["Joe", "Peter"]
last #=> "Smith"

However:

   name = "Mr Andrew Lloyd Webber"

   # etc.

   title #=> "Mr"
   first #=> "Andrew"
   middles #=> ["Lloyd"] (wrong)
   last #=> "Webber" (wrong)
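
A sketch of how the period request might be handled, with the same failure still visible. `parse_name` and its `titles:` keyword are hypothetical names for illustration; the logic is the quoted snippet with one change, namely that the title test ignores a single trailing period.

```ruby
# Hypothetical wrapper around the snippet quoted above.
# Assumption: a title is one word from a fixed list, optionally followed by ".".
def parse_name(name, titles: %w[Mr Mrs Ms Dr])
  parts = name.split
  last  = parts.pop
  # to_s guards against a one-word name, where parts is now empty.
  title = parts.shift if titles.include?(parts.first.to_s.chomp("."))
  first = parts.shift
  { title: title, first: first, middles: parts, last: last }
end

us = parse_name("Mr. John Joe Peter Smith")
us[:title]    # "Mr." (kept as typed, period included)
us[:middles]  # ["Joe", "Peter"]

alw = parse_name("Mr Andrew Lloyd Webber")
alw[:middles] # ["Lloyd"]  (still wrong)
alw[:last]    # "Webber"   (still wrong)
```

Accepting the period is a one-line change; the two-word surname is not, because nothing in the string marks where the surname begins.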

David

···

On Mon, 25 Jun 2007, Dan Stevens (IAmAI) wrote:

And international names (though the US seems to have a broad
assortment of them already)

···

On 6/25/07, Alex Young <alex@blackkettle.org> wrote:

dblack@wobblini.net wrote:
> Hi --
>
> On Sun, 24 Jun 2007, darren kirby wrote:
>
>> quoth the Alex MacCaw:
>>> Does anyone have an example of splitting a name into first and last
>>> names? Or is just a case of doing string.split(' ')?
>>
>> I'd say a regexp is overkill here.
>>
>> irb(main):001:0> name = "Alex MacCaw"
>> => "Alex MacCaw"
>> irb(main):002:0> first, last = name.split
>> => ["Alex", "MacCaw"]
>> irb(main):003:0> first
>> => "Alex"
>> irb(main):004:0> last
>> => "MacCaw"
>>
>> Note that you will have to do more work to accommodate middle names and
>> titles, ie: Mr, Mrs, Dr etc...
>
> And also last names with spaces in them (von Trapp, Vaughn Williams,
> etc.).
>
And titles with spaces in them (The Honourable, His Excellency, etc...).

dblack@wobblini.net wrote:

Hi --

name = "Mr John Joe Peter Smith"
TITLES = ["Mr", "Mrs", "Ms", "Dr"]
a = name.split
last = a.pop
title = a.shift if TITLES.include? a.first

Have mercy on us Yanks and allow for a period :-)

first = a.shift
middles = a

title #=> "Mr"
first #=> "John"
middles #=> ["Joe", "Peter"]
last #=> "Smith"

However:

  name = "Mr Andrew Lloyd Webber"

  # etc.

  title #=> "Mr"
  first #=> "Andrew"
  middles #=> ["Lloyd"] (wrong)
  last #=> "Webber" (wrong)

name = "The Honourable Lord Andrew, the Baron Lloyd-Webber of Sydmonton", you mean? It's hard to come up with a trickier example. Names are just *hard* - the only reliable way of handling them that I've found is to let users control it themselves...

···

On Mon, 25 Jun 2007, Dan Stevens (IAmAI) wrote:

--
Alex

Michael Fellinger wrote:

···

On 6/25/07, Alex Young <alex@blackkettle.org> wrote:

dblack@wobblini.net wrote:
> Hi --
>
> On Sun, 24 Jun 2007, darren kirby wrote:
>
>> quoth the Alex MacCaw:
>>> Does anyone have an example of splitting a name into first and last
>>> names? Or is just a case of doing string.split(' ')?
>>
>> I'd say a regexp is overkill here.
>>
>> irb(main):001:0> name = "Alex MacCaw"
>> => "Alex MacCaw"
>> irb(main):002:0> first, last = name.split
>> => ["Alex", "MacCaw"]
>> irb(main):003:0> first
>> => "Alex"
>> irb(main):004:0> last
>> => "MacCaw"
>>
>> Note that you will have to do more work to accommodate middle names and
>> titles, ie: Mr, Mrs, Dr etc...
>
> And also last names with spaces in them (von Trapp, Vaughn Williams,
> etc.).
>
And titles with spaces in them (The Honourable, His Excellency, etc...).

And international names (though the US seems to have a broad
assortment of them already)

Can open. Worms everywhere. :-)

--
Alex

Names are just *hard* - the only reliable way of handling them that I've
found is to let users control it themselves...

Agreed. My example makes very simple assumptions that I'd imagine
apply to the vast majority of names. However, in many computer
problems there are obscure exceptions that either break the program or
break things for the user.
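
One way to read "let users control it themselves" is to stop parsing altogether and store what the user typed, with optional fields for the cases where software would otherwise have to guess. The sketch below is only an illustration of that idea; `build_person` and the `sort_as` and `greeting` fields are made-up names, not anything proposed in this thread.

```ruby
# Hypothetical record builder: nothing is derived by splitting the name.
# The user supplies the full name and, optionally, how to sort and greet them.
def build_person(full_name:, sort_as: nil, greeting: nil)
  full_name = full_name.strip
  raise ArgumentError, "name must not be empty" if full_name.empty?

  {
    full_name: full_name,
    sort_as:   sort_as  || full_name,  # fall back to the name as typed
    greeting:  greeting || full_name
  }
end

person = build_person(full_name: "Andrew Lloyd Webber",
                      sort_as:   "Lloyd Webber, Andrew",
                      greeting:  "Mr Lloyd Webber")
person[:sort_as]   # "Lloyd Webber, Andrew"
```

The trade-off is a slightly longer form in exchange for never guessing where a surname starts.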

···

On 25/06/07, Alex Young <alex@blackkettle.org> wrote:

dblack@wobblini.net wrote:
> Hi --
>
> On Mon, 25 Jun 2007, Dan Stevens (IAmAI) wrote:
>
>> name = "Mr John Joe Peter Smith"
>> TITLES = ["Mr", "Mrs", "Ms", "Dr"]
>> a = name.split
>> last = a.pop
>> title = a.shift if TITLES.include? a.first
>
> Have mercy on us Yanks and allow for a period :-)
>
>> first = a.shift
>> middles = a
>>
>> title #=> "Mr"
>> first #=> "John"
>> middles #=> ["Joe", "Peter"]
>> last #=> "Smith"
>
> However:
>
> name = "Mr Andrew Lloyd Webber"
>
> # etc.
>
> title #=> "Mr"
> first #=> "Andrew"
> middles #=> ["Lloyd"] (wrong)
> last #=> "Webber" (wrong)
>

name = "The Honourable Lord Andrew, the Baron Lloyd-Webber of
Sydmonton", you mean? It's hard to come up with a trickier example.
Names are just *hard* - the only reliable way of handling them that I've
found is to let users control it themselves...

--
Alex

I worked at an institution that was forced to rewrite a bunch of
name-related code for a legacy system because of a "sanity" check that
was just plain wrong... and nobody realized it until Dr. O came to
work. Now they had to allow one-letter surnames, too (they'd already
allowed one-letter given or middle names, thanks to President Truman's
middle name, S).

Almost any assumption you make about name parsing will be wrong. For
example, take the assumption that names are composed only of letters
and letter-like symbols.

http://en.wikipedia.org/wiki/Nancy_3._Hoffman
http://en.wikipedia.org/wiki/List_of_personal_names_that_contain_numbers
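
For illustration, here is roughly what a letters-only "sanity" check looks like and how it fares against names mentioned in this thread. Both method names are hypothetical, and `String#match?` assumes Ruby 2.4 or later.

```ruby
# The kind of check being warned against: words made of letters only,
# separated by single spaces.
def letters_only?(name)
  name.match?(/\A[[:alpha:]]+(?: [[:alpha:]]+)*\z/)
end

letters_only?("Alex MacCaw")        # => true
letters_only?("Nancy 3. Hoffman")   # => false (digit and period)
letters_only?("Lloyd-Webber")       # => false (hyphen)

# A far more permissive alternative: accept anything that is not blank.
def plausible_name?(name)
  !name.strip.empty?
end

plausible_name?("Nancy 3. Hoffman") # => true
plausible_name?("O")                # => true
```

The permissive version will let junk through, which is the price of not rejecting real people.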

-Alex

···

On 6/25/07, Dan Stevens (IAmAI) <dan.stevens.iamai@gmail.com> wrote:

On 25/06/07, Alex Young <alex@blackkettle.org> wrote:
> Names are just *hard* - the only reliable way of handling them that I've
> found is to let users control it themselves...

Agreed. My example makes very simple assumptions that I'd imagine
apply to the vast majority of names. However, in many computer
problems there are obscure exceptions that either break the program or
break things for the user.

Alex LeDonne wrote:

Almost any assumption you make about name parsing will be wrong. For
example, take the assumption that names are composed only of letters
and letter-like symbols.

Not to mention the assumption that each name consists of symbols that are part of some character set:

http://upload.wikimedia.org/wikipedia/commons/thumb/d/d7/Prince_symbol.svg/20px-Prince_symbol.svg.png

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Reminds me of an old SF story "The Man Whose Name Wouldn't Fit." IIRC
it started with a guy getting fired because his company put in a new
computer personnel data system which had one too many characters in
the field for last name to accommodate him (and it was too expensive to
fix).

I think it ended with a neo-luddite movement with a secret weapon
which dissolved the bond between the magnetic material and the
substrate on magnetic tapes and disks.

Don't know how many here are old enough to remember when most
computers used magnetic tape. <G>

···

On 6/25/07, Alex LeDonne <aledonne.listmail@gmail.com> wrote:

I worked at an institution that was forced to rewrite a bunch of
name-related code for a legacy system because of a "sanity" check that
was just plain wrong... and nobody realized it until Dr. O came to
work. Now they had to allow one-letter surnames, too (they'd already
allowed one-letter given or middle names, thanks to President Truman's
middle name, S).

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Hi --

···

On Thu, 28 Jun 2007, Rick DeNatale wrote:

On 6/25/07, Alex LeDonne <aledonne.listmail@gmail.com> wrote:

I worked at an institution that was forced to rewrite a bunch of
name-related code for a legacy system because of a "sanity" check that
was just plain wrong... and nobody realized it until Dr. O came to
work. Now they had to allow one-letter surnames, too (they'd already
allowed one-letter given or middle names, thanks to President Truman's
middle name, S).

Reminds me of an old SF story "The Man Whose Name Wouldn't Fit." IIRC
it started with a guy getting fired because his company put in a new
computer personnel data system which had one too many characters in
the field for last name to accommodate him (and it was too expensive to
fix).

I think it ended with a neo-luddite movement with a secret weapon
which dissolved the bond between the magnetic material and the
substrate on magnetic tapes and disks.

Don't know how many here are old enough to remember when most
computers used magnetic tape. <G>

I sometimes wonder whether the DECtapes in my attic would still be
readable.

David

Rick DeNatale wrote:

I worked at an institution that was forced to rewrite a bunch of
name-related code for a legacy system because of a "sanity" check that
was just plain wrong... and nobody realized it until Dr. O came to
work. Now they had to allow one-letter surnames, too (they'd already
allowed one-letter given or middle names, thanks to President Truman's
middle name, S).

Reminds me of an old SF story "The Man Whose Name Wouldn't Fit." IIRC
it started with a guy getting fired because his company put in a new
computer personnel data system which had one too many characters in
the field for last name to accommodate him (and it was too expensive to
fix).

I assume you meant that his name was one character too long, not that the field for the name was too long.

I think it ended with a neo-luddite movement with a secret weapon
which dissolved the bond between the magnetic material and the
substrate on magnetic tapes and disks.

Don't know how many here are old enough to remember when most
computers used magnetic tape. <G>

I remember carrying boxes of punched cards. One of my previous bosses told me a humorous horror story about a company he worked for. The company magazine was doing an article on the Data Processing department and wanted a picture of the computers at work. The department decided that the best time would be during the payroll run when all of the tape drives would be in use. They set up the shot and when the flash went off all of the tape drives went off line. Everyone had forgotten about the optical EOT sensors.

···

On 6/25/07, Alex LeDonne <aledonne.listmail@gmail.com> wrote: