[ANN] Clusterer - 0.1.0 first release

Surendra_Singhi · 22 August 2006 17:51

Hello,

I am pleased to announce the availability of the Ruby library 'clusterer'
which implements the basic K-Means and Hierarchical Clustering algorithms for
text data.

The project Rubyforge page is:

http://rubyforge.org/projects/clusterer/

The library can be installed using:

gem install clusterer

More information can be found in the following blog entry:

http://cuttingtheredtape.blogspot.com/2006/08/ruby-clustering-library-for-text-data.html

Happy hacking.

--
Surendra Singhi
http://ssinghi.kreeti.com, http://www.kreeti.com
Read my blog at: http://cuttingtheredtape.blogspot.com/
,----

"All animals are equal, but some animals are more equal than others."
-- Orwell, Animal Farm, 1945

`----

Lyle_Johnson3 · 22 August 2006 18:17

I've installed the gem but am not getting very good results with my
limited use. In particular, I tried the example you posted on your
blog:

Clusterer::Clustering.kmeans_clustering(["hello world","mea culpa","goodbye world"])

but it appears to have placed all three strings in the same cluster;
the result was [[0, 1, 2]]. I get a similar result ([[1, 0, 2]]) if I
try the hierarchical clustering instead.

This is on Mac OS X 10.4, running Ruby 1.8.4.

A_S_Bradbury · 23 August 2006 10:00

This looks very interesting, great work.

Alex

Surendra_Singhi · 23 August 2006 13:12

Hello,

"Lyle Johnson" <lyle.johnson@gmail.com> writes:

I've installed the gem but am not getting very good results with my
limited use. In particular, I tried the example you posted on your
blog:

Clusterer::Clustering.kmeans_clustering(["hello world","mea culpa","goodbye world"])

but it appears to have placed all three strings in the same cluster;
the result was [[0, 1, 2]]. I get a similar result ([[1, 0, 2]]) if I
try the hierarchical clustering instead.

The examples were just to show how to use the algorithms.

Clustering can also be thought of as looking for representative points for a
given set of points. If you want to preserve all the information, you can make
every point its own cluster; if you want maximum compression, you use just one
cluster. So there is a trade-off.
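
As a rough illustration of that trade-off, here is a minimal sketch on plain
numbers rather than text. It is not code from the clusterer gem: the helper
within_cluster_error is a hypothetical name, and it assumes a Ruby recent
enough to have Array#sum (2.4 or later).

```ruby
# Hypothetical helper, not part of clusterer: sum of squared distances
# from each point to the mean of the cluster it belongs to.
def within_cluster_error(clusters)
  clusters.sum do |points|
    mean = points.sum.to_f / points.size
    points.sum { |x| (x - mean)**2 }
  end
end

points = [1.0, 2.0, 9.0, 10.0]

one_cluster  = [points]                    # maximum compression
two_clusters = [[1.0, 2.0], [9.0, 10.0]]   # a middle ground
every_point  = points.map { |x| [x] }      # all information preserved

puts within_cluster_error(one_cluster)    # 65.0
puts within_cluster_error(two_clusters)   # 1.0
puts within_cluster_error(every_point)    # 0.0
```

Fewer clusters means a more compact summary but a larger error; one cluster
per point has zero error but summarises nothing.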

By default I set the number of clusters to Math.sqrt(no. of docs); for the
three-document example this reduces to the integer 1, hence a single cluster.
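
The arithmetic can be checked with a short sketch. The helper name
default_cluster_count is hypothetical and not part of the clusterer API, and
the sketch assumes the square root is truncated to an integer, which is what
"reduces to integer 1" suggests.

```ruby
# Hypothetical helper, not part of clusterer: default number of clusters
# taken as the integer part of sqrt(number of documents).
def default_cluster_count(doc_count)
  Math.sqrt(doc_count).to_i
end

docs = ["hello world", "mea culpa", "goodbye world"]

puts default_cluster_count(docs.size)  # sqrt(3) is about 1.73, truncated to 1
puts default_cluster_count(100)        # 10
```

Under this assumption, any corpus of fewer than four documents ends up in a
single cluster unless the number of clusters is given explicitly.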

If you want a custom number of clusters, then use

Clusterer::Clustering.kmeans_clustering(["hello world","mea culpa","goodbye world"],2)

and also use it on a larger corpus to really evaluate the merit of the algorithms.

The algorithms may also need some additional customisation depending upon the
problem domain.

Cheers,

--
Surendra Singhi
http://ssinghi.kreeti.com, http://www.kreeti.com
Read my blog at: http://cuttingtheredtape.blogspot.com/
,----

By all means marry; if you get a good wife, you'll be happy. If you
get a bad one, you'll become a philosopher.
-- Socrates

`----
