[ANN] linguistics 2.0.0 Released

linguistics version 2.0.0 has been released!

* docs: <http://deveiate.org/code/linguistics>
* project: <https://bitbucket.org/ged/linguistics>
* github: <https://github.com/ged/linguistics>

Linguistics is a framework for building linguistic utilities for Ruby
objects in any language. It includes a generic language-independent
front end, a module for mapping language codes into language names, and
a module which contains various English-language utilities.

Version 2.0 has been rewritten to be more modular, easier to extend and
maintain, and to work better under Ruby 1.9.

Version 2 is Ruby 1.9-only.

= Examples

Here are some examples of what the English-language module can do:

== Pluralization

     "box".en.plural
     # => "boxes"

     "mouse".en.plural
     # => "mice"

     "ruby".en.plural
     # => "rubies"

== Indefinite Articles

     "book".en.a
     # => "a book"

     "article".en.a
     # => "an article"

== Present Participles

     "runs".en.present_participle
     # => "running"

     "eats".en.present_participle
     # => "eating"

     "spies".en.present_participle
     # => "spying"

== Ordinal Numbers

     5.en.ordinal
     # => "5th"

     2004.en.ordinal
     # => "2004th"
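
For readers curious about the rule behind those suffixes, here is a rough standalone sketch in plain Ruby. It is not the library's implementation; the method name `sketch_ordinal` is made up for illustration, and it only shows the usual English convention that 11, 12 and 13 take "th" while other numbers follow their last digit.

```ruby
# Hypothetical helper, NOT part of the linguistics gem: appends the
# conventional English ordinal suffix to an Integer.
def sketch_ordinal( number )
  abs = number.abs

  suffix =
    if (11..13).include?( abs % 100 )
      'th'                      # 11th, 12th, 13th, 111th, ...
    else
      case abs % 10
      when 1 then 'st'
      when 2 then 'nd'
      when 3 then 'rd'
      else 'th'
      end
    end

  "#{number}#{suffix}"
end

puts sketch_ordinal( 5 )      # 5th
puts sketch_ordinal( 2004 )   # 2004th
puts sketch_ordinal( 22 )     # 22nd
```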

== Numbers to Words

     5.en.numwords
     # => "five"

     2004.en.numwords
     # => "two thousand and four"

     2385762345876.en.numwords
     # => "two trillion, three hundred and eighty-five billion, seven hundred and
     # sixty-two million, three hundred and forty-five thousand, eight hundred
     # and seventy-six"
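
The output format above (three-digit groups separated by commas, "and" after the hundreds, and "and" before a final group below one hundred) can be reproduced with a small plain-Ruby sketch. This is only an illustration of that convention, not the code the library uses; every name beginning with `SKETCH_` or `sketch_` is hypothetical, and the sketch stops at trillions.

```ruby
# Hypothetical sketch, NOT the linguistics implementation.
SKETCH_ONES = %w[zero one two three four five six seven eight nine ten
                 eleven twelve thirteen fourteen fifteen sixteen seventeen
                 eighteen nineteen].freeze
SKETCH_TENS = %w[_ _ twenty thirty forty fifty sixty seventy eighty
                 ninety].freeze
SKETCH_SCALES = ['', ' thousand', ' million', ' billion', ' trillion'].freeze

# Spell out a single group in the range 1..999.
def sketch_group_words( n )
  hundreds, rest = n.divmod( 100 )
  parts = []
  parts << "#{SKETCH_ONES[hundreds]} hundred" if hundreds > 0

  if rest > 0
    tens, ones = rest.divmod( 10 )
    word = if rest < 20
             SKETCH_ONES[rest]
           elsif ones.zero?
             SKETCH_TENS[tens]
           else
             "#{SKETCH_TENS[tens]}-#{SKETCH_ONES[ones]}"
           end
    parts << word
  end

  parts.join( ' and ' )
end

# Spell out a non-negative Integer below 10**15.
def sketch_numwords( n )
  raise ArgumentError, "unsupported number" unless
    n.is_a?( Integer ) && n >= 0 && n < 10**15
  return SKETCH_ONES[0] if n.zero?

  # Split into three-digit groups, most significant first,
  # remembering each group's scale and skipping empty groups.
  groups = []
  scale = 0
  while n > 0
    n, group = n.divmod( 1000 )
    groups.unshift( [group, scale] ) if group > 0
    scale += 1
  end

  phrases = groups.map {|group, s| sketch_group_words( group ) + SKETCH_SCALES[s] }
  last_group, last_scale = groups.last

  # "two thousand and four": a trailing group under 100 is joined with "and".
  if phrases.size > 1 && last_scale.zero? && last_group < 100
    "#{phrases[0..-2].join( ', ' )} and #{phrases.last}"
  else
    phrases.join( ', ' )
  end
end

puts sketch_numwords( 5 )      # five
puts sketch_numwords( 2004 )   # two thousand and four
```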

== Quantification

     "cow".en.quantify( 5 )
     # => "several cows"

     "cow".en.quantify( 1005 )
     # => "thousands of cows"

     "cow".en.quantify( 20_432_123_000_000 )
     # => "tens of trillions of cows"

== Conjunctions

     animals = %w{dog cow ox chicken goose goat cow dog rooster llama pig goat
                  dog cat cat dog cow goat goose goose ox alpaca}
     "The farm has: " + animals.en.conjunction
     # => "The farm has: four dogs, three cows, three geese, three goats, two
     # oxen, two cats, a chicken, a rooster, a llama, a pig, and an alpaca"

Note that 'goose' and 'ox' are both correctly pluralized, and the correct
indefinite article 'an' has been used for 'alpaca'.
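
To make the steps behind that output concrete (count duplicates, order by frequency, pluralize or add an article, then join with commas and a final "and"), here is a toy version in plain Ruby. It is only a sketch under loud assumptions: the names are hypothetical, the plural table covers just the irregular words in this example, and the article rule looks only at the first letter, so it is far cruder than what the library does.

```ruby
# Hypothetical sketch, NOT the linguistics implementation.
SKETCH_IRREGULAR_PLURALS = { 'goose' => 'geese', 'ox' => 'oxen' }.freeze
SKETCH_COUNT_WORDS = { 2 => 'two', 3 => 'three', 4 => 'four' }.freeze

# Build "an alpaca" for a single item or "three geese" for several.
def sketch_phrase( word, count )
  if count == 1
    # Crude rule: first letter only (wrong for words like "hour").
    article = (word =~ /\A[aeiou]/i) ? 'an' : 'a'
    "#{article} #{word}"
  else
    plural = SKETCH_IRREGULAR_PLURALS.fetch( word ) { "#{word}s" }
    "#{SKETCH_COUNT_WORDS.fetch( count, count.to_s )} #{plural}"
  end
end

def sketch_conjunction( words )
  # Tally occurrences; Ruby 1.9+ hashes keep insertion order.
  counts = Hash.new( 0 )
  words.each {|word| counts[word] += 1 }

  # Most frequent first; ties keep first-appearance order. The index is
  # an explicit tiebreaker because sort_by is not guaranteed to be stable.
  ordered = counts.each_with_index.sort_by {|(_, count), idx| [-count, idx] }
  phrases = ordered.map {|(word, count), _| sketch_phrase( word, count ) }

  case phrases.size
  when 0 then ''
  when 1 then phrases.first
  when 2 then phrases.join( ' and ' )
  else "#{phrases[0..-2].join( ', ' )}, and #{phrases.last}"
  end
end

puts sketch_conjunction( %w{dog cow ox dog goose alpaca goose ox dog} )
```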

You can also use the generalization function of the #quantify method to give
general descriptions of object lists instead of literal counts:

     allobjs = []
     ObjectSpace::each_object {|obj| allobjs << obj.class.name }
     puts "The current Ruby objectspace contains: " +
          allobjs.en.conjunction( :generalize => true )

Outputs:

     The current Ruby objectspace contains: hundreds of thousands of Strings,
     thousands of RubyVM::InstructionSequences, thousands of Arrays, thousands
     of Hashes, hundreds of Procs, hundreds of Regexps, [...], a
     SystemStackError, a Random, an ARGF.class, a Data, a fatal, an
     OptionParser::List, a YAML::EngineManager, a URI::Parser, a Rational, and
     a Gem::Platform

== Infinitives

     "leaving".en.infinitive
     # => "leave"

     "left".en.infinitive
     # => "leave"

     "leaving".en.infinitive.suffix
     # => "ing"

== Conjugation

Conjugate a verb given an infinitive:

     "run".en.past_tense
     # => "ran"

     "run".en.past_participle
     # => "run"

     "run".en.present_tense
     # => "run"

     "run".en.present_participle
     # => "running"

Conjugate an infinitive with an explicit tense and grammatical person:

     "be".en.conjugate( :present, :third_person_singular )
     # => "is"

     "be".en.conjugate( :present, :first_person_singular )
     # => "am"

     "be".en.conjugate( :past, :first_person_singular )
     # => "was"

The functionality is a port of the verb conjugation portion of Morph
Adorner (http://morphadorner.northwestern.edu/).

It includes a good number of irregular verbs, but it's not going to be
100% correct every time.

== WordNet® Integration

If you have the 'wordnet' gem installed, you can look up WordNet synsets using
the Linguistics interface:

Test to be sure the WordNet module loaded okay.

     Linguistics::EN.has_wordnet?
     # => true

Fetch the default synset for the word "balance"

     "balance".en.synset
     # => #<WordNet::Synset:0x7f9fb11012f8 {102777100} 'balance' (noun):
     # [noun.artifact] a scale for weighing; depends on pull of

Fetch the synset for the first verb sense of "balance"

     "balance".en.synset( :verb )
     # => #<WordNet::Synset:0x7f9fb10f3fb8 {201602318} 'balance, poise' (verb):
     # [verb.contact] hold or carry in equilibrium>

Fetch the second noun sense

     "balance".en.synset( 2, :noun )
     # => #<WordNet::Synset:0x7f9fb10ebbd8 {102777402} 'balance, balance wheel'
     # (noun): [noun.artifact] a wheel that regulates the rate of movement in a
     # machine; especially a wheel oscillating against the hairspring of a
     # timepiece to regulate its beat>

Fetch the second noun sense's hypernyms (more-general words, like a
superclass)

     "balance".en.synset( 2, :noun ).hypernyms
     # => [#<WordNet::Synset:0x7f9fb10dd100 {104574999} 'wheel' (noun):
     # [noun.artifact] a simple machine consisting of a circular frame with
     # spokes (or a solid disc) that can rotate on a shaft or axle (as in
     # vehicles or other machines)>]

A simpler way of doing the same thing:

     "balance".en.hypernyms( 2, :noun )
     # => [#<WordNet::Synset:0x7f9fb10d24d0 {104574999} 'wheel' (noun):
     # [noun.artifact] a simple machine consisting of a circular frame with
     # spokes (or a solid disc) that can rotate on a shaft or axle (as in
     # vehicles or other machines)>]

Fetch the first hypernym's hypernyms

     "balance".en.synset( 2, :noun ).hypernyms.first.hypernyms
     # => [#<WordNet::Synset:0x7f9fb10c5190 {103700963} 'machine, simple machine'
     # (noun): [noun.artifact] a device for overcoming resistance at one point by
     # applying force at some other point>]

Find the synset to which both the second noun sense of "balance" and the
default sense of "shovel" belong.

     ("balance".en.synset( 2, :noun ) | "shovel".en.synset)
     # => #<WordNet::Synset:0x7f9fb1091e58 {103183080} 'device' (noun):
     # [noun.artifact] an instrumentality invented for a particular

Fetch words for the specific kinds of (device-ish) "instruments"

     "instrument".en.hyponyms( "device" ).collect( &:words ).flatten.join(', ')
     # => "analyser, analyzer, cauterant, cautery, drafting instrument, engine,
     # extractor, instrument of execution, instrument of punishment, measuring
     # device, measuring instrument, measuring system, medical instrument,
     # navigational instrument, optical instrument, plotter, scientific
     # instrument, sonograph, surveying instrument, surveyor's instrument,
     # tracer, arm, weapon, weapon system, whip"

...or musical instruments

     "instrument".en.hyponyms( "musical" ).collect( &:words ).flatten.join(', ')
     # => "barrel organ, grind organ, hand organ, hurdy-gurdy, hurdy gurdy,
     # street organ, bass, calliope, steam organ, electronic instrument,
     # electronic musical instrument, jew's harp, jews' harp, mouth bow, keyboard
     # instrument, music box, musical box, percussion instrument, percussive
     # instrument, stringed instrument, wind, wind instrument"

There are many more WordNet methods supported--too many to list here. See the
WordNet::Synset API documentation for the complete list.

== LinkParser Integration

If you have the 'linkparser' gem installed, you can create linkages
from English sentences that let you query for parts of speech:

Test to see whether or not the link parser is loaded.

     Linguistics::EN.has_linkparser?
     # => true

Diagram the first linkage for a test sentence

     puts "he is a big dog".en.sentence.linkages.first.diagram

Outputs:

          +-----Ost----+
          |  +----Ds---+
      +-Ss+  |   +--A--+
      |   |  |   |     |
     he is.v a big.a dog.n

Find the verb in the sentence

     "he is a big dog".en.sentence.verb.to_s
     # => "is"

Combined infinitive + LinkParser: Find the infinitive form of the verb of the
given sentence.

     "he is a big dog".en.sentence.verb.en.infinitive
     # => "be"

Find the direct object of the sentence

     "he is a big dog".en.sentence.object.to_s
     # => "dog"

Combine WordNet + LinkParser to find the definition of the direct object of
the sentence

     "he is a big dog".en.sentence.object.en.definition
     # => "a member of the genus Canis (probably descended from the common wolf)
     # that has been domesticated by man since prehistoric times; occurs in many
     # breeds"

= Installation

   gem install linguistics
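
The examples in this announcement assume the English module has already been loaded. A minimal sketch of that step follows; the `Linguistics.use( :en )` call is taken from my reading of the project README rather than from this announcement, so treat it as an assumption and check the docs. The `has_wordnet?` and `has_linkparser?` checks are the ones shown earlier, and the `LINGUISTICS_LOADED` constant is just a name made up for this example.

```ruby
# Try to load the gem and enable the English module.
# ASSUMPTION: Linguistics.use( :en ) is the 2.0 entry point (per the README).
begin
  require 'linguistics'
  Linguistics.use( :en )
  LINGUISTICS_LOADED = true
rescue LoadError
  LINGUISTICS_LOADED = false
end

if LINGUISTICS_LOADED
  puts "box".en.plural
  # The optional integrations depend on the 'wordnet' and 'linkparser' gems.
  puts "WordNet available:    #{Linguistics::EN.has_wordnet?}"
  puts "LinkParser available: #{Linguistics::EN.has_linkparser?}"
else
  warn "linguistics gem not found; run `gem install linguistics` first"
end
```

Checking the two `has_...?` predicates up front avoids calling WordNet or LinkParser methods when the optional gems are missing.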

--
Michael Granger <ged@FaerieMUD.org>
Rubymage, Architect, Believer
The FaerieMUD Consortium <http://faeriemud.org/>

3 cheers for a number library that actually works, thanks!

--
Posted via http://www.ruby-forum.com/.

omg. thank you, Michael.
best regards -botp

On Thu, Oct 11, 2012 at 12:14 AM, Michael Granger <ged@faeriemud.org> wrote:

linguistics version 2.0.0 has been released!

Thank you, very useful.

Hi,

linguistics version 2.0.0 has been released!

docs: http://deveiate.org/code/linguistics
project: https://bitbucket.org/ged/linguistics
github: https://github.com/ged/linguistics

Linguistics is a framework for building linguistic utilities for Ruby

When I read the word "linguistic" I immediately lost interest...

= Examples

Here are some examples of what the English-language module can do:

...but when I saw what came after I was amazed. Incredible. This is going to be incredibly useful for most of my projects. Thanks for sharing.

Regards,

On Oct 11, 2012, at 1:14 AM, "Michael Granger" <ged@FaerieMUD.org> wrote:

--
Javi Lavandeira
Twitter: @javilm

Blog: http://www.lavandeira.net/blog/
Email and hosting: http://www.lavandeira.net/services/