Name that data structure!

Simon_Chiang · 22 December 2008 22:38

I'm using a data structure that I'm sure has been implemented and
studied to death several times over... but I don't know it's name.
The structure is simple: any given node may have one or more previous
nodes. Nodes can be re-used, so you get structures like this:

  o one
  >
  >-o two (fork of 1)
  > >
  `---o three (fork of 2)
    > >
    > > o four
    > > >
    > `-`-o five (merge of 3,4)
    > >
    `-----`-o six (merge of 2,5)

The whole thing is ordered in the sense that n-1 logically occurs
before n. Kinda reminiscent of git, but I'm using this as a kind of
audit of how values progress though an analysis.

Any ideas? Thanks.

Tom_Cloyd2 · 22 December 2008 23:07

Simon Chiang wrote:

I'm using a data structure that I'm sure has been implemented and
studied to death several times over... but I don't know it's name.
The structure is simple: any given node may have one or more previous
nodes. Nodes can be re-used, so you get structures like this:

  o one
  >
  >-o two (fork of 1)
  > >
  `---o three (fork of 2)
    > >
    > > o four
    > > >
    > `-`-o five (merge of 3,4)
    > >
    `-----`-o six (merge of 2,5)

The whole thing is ordered in the sense that n-1 logically occurs
before n. Kinda reminiscent of git, but I'm using this as a kind of
audit of how values progress though an analysis.

Any ideas? Thanks.

Unless I totally don't get it...which just might be the case...my current major Ruby project is something very much like this. However, I'm only just learning about classes, though, and obviously am not much of a programmer. I do have the design concept of this thing - which I'm calling a node relationship database - very well in hand, and am very excited about its potential. In many ways, it mimics the behavior of neurons in the brain.

I referred to this project a short while back, in connection with a post asking what the point of classes in general might be (got some great answers). I exposed most of my code as it was at that point (and it has considerable documentation in it), and no one remarked about it.

But...depending upon how interesting other people's responses to you are, if you're interested I could share the code and see if what I'm doing is where you're interested in going.

t.

···

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< tc@tomcloyd.com >> (email)
<< TomCloyd.com >> (website) << sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Greg_Brown1 · 22 December 2008 23:10

Except for the ordering, it kind of sounds like a digraph.

-greg

···

On Mon, Dec 22, 2008 at 5:38 PM, Simon Chiang <simon.a.chiang@gmail.com> wrote:

o one
>
>-o two (fork of 1)
> >
`---o three (fork of 2)
   > >
   > > o four
   > > >
   > `-`-o five (merge of 3,4)
   > >
   `-----`-o six (merge of 2,5)

The whole thing is ordered in the sense that n-1 logically occurs
before n. Kinda reminiscent of git, but I'm using this as a kind of
audit of how values progress though an analysis.

--
Technical Blaag at: http://blog.majesticseacreature.com
Non-tech stuff at: http://metametta.blogspot.com
"Ruby Best Practices" Book now in O'Reilly Roughcuts:
http://rubybestpractices.com

Gregory_Seidman · 22 December 2008 23:12

Unless there's something more complicated that I am missing, this is a
directed, acyclic graph, usually abbreviated as a DAG.

--Greg

···

On Tue, Dec 23, 2008 at 07:38:00AM +0900, Simon Chiang wrote:

I'm using a data structure that I'm sure has been implemented and
studied to death several times over... but I don't know it's name.
The structure is simple: any given node may have one or more previous
nodes. Nodes can be re-used, so you get structures like this:

  o one
  >
  >-o two (fork of 1)
  > >
  `---o three (fork of 2)
    > >
    > > o four
    > > >
    > `-`-o five (merge of 3,4)
    > >
    `-----`-o six (merge of 2,5)

The whole thing is ordered in the sense that n-1 logically occurs
before n. Kinda reminiscent of git, but I'm using this as a kind of
audit of how values progress though an analysis.

Any ideas? Thanks.

Tom_Cloyd2 · 22 December 2008 23:22

Gregory Brown wrote:

o one
>
>-o two (fork of 1)
> >
`---o three (fork of 2)
   > >
   > > o four
   > > >
   > `-`-o five (merge of 3,4)
   > >
   `-----`-o six (merge of 2,5)

The whole thing is ordered in the sense that n-1 logically occurs
before n. Kinda reminiscent of git, but I'm using this as a kind of
audit of how values progress though an analysis.

Except for the ordering, it kind of sounds like a digraph.

Directed graph - Wikipedia

-greg

I believe a directed graph could be a fair match, but somehow it misses the essence of it, which is that any node can connect to any other, and more importantly that what it's connecting to can be a GROUP of nodes rather than just one. This sort of thing is critical to the function of brains.

More than that (if one wishes to sweeten the pot significantly) the nature of the relationship is highly variable as well. In my model, it can be anything - a quality, a number (on any sort of scaling), a concept. This model has the potential for storing about anything.

I plan to use it for storing bibleographical references, Internet links. databases of concepts, etc.

The structure is NOT inherent, but rather is emergent, as the database is built. One can certainly input stuff according to a pre-conceived form, and I plan on doing this some of the time, but that's just a variation on the basic design.

It took me a few days of idle thought to realize the power of this thing. It's far beyond relational database thinking, which, after all, has constraints built in for good reason. But they come at a price. And...brains don't work that way.

t.

···

On Mon, Dec 22, 2008 at 5:38 PM, Simon Chiang <simon.a.chiang@gmail.com> wrote:

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< tc@tomcloyd.com >> (email)
<< TomCloyd.com >> (website) << sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Forum · 22 December 2008 23:54

Sounds like a marked graph to me, and not directed at all, or did I
miss the arrows? That of course is a very general
thing, I therefore doubt that this statement is helpfull.
I do not understand with what you mean that a node can be connected to
a group of nodes?
Do you mean that if the nodes are e.g. lower case letters, there might
be an edge X in e(a,{b,c}) (X being the marking)
without implying that there is X in e(a,b) and X in e(a,c)?
If that is the case you still have just a general graph only that you
have to expand the nodeset to its powerset.

Cheers
Robert

···

--
Il computer non è una macchina intelligente che aiuta le persone
stupide, anzi, è una macchina stupida che funziona solo nelle mani
delle persone intelligenti.
Computers are not smart to help stupid people, rather they are stupid
and will work only if taken care of by smart people.

Umberto Eco

Tom_Cloyd2 · 23 December 2008 00:53

Robert Dober wrote:

Sounds like a marked graph to me, and not directed at all, or did I
miss the arrows? That of course is a very general
thing, I therefore doubt that this statement is helpfull.
I do not understand with what you mean that a node can be connected to
a group of nodes?
Do you mean that if the nodes are e.g. lower case letters, there might
be an edge X in e(a,{b,c}) (X being the marking)
without implying that there is X in e(a,b) and X in e(a,c)?
If that is the case you still have just a general graph only that you
have to expand the nodeset to its powerset.

Cheers
Robert

Robert,

I apologize for my mathematical illiteracy. I don't understand your question, because I don't understand your symbolic representation. Sorry.

So, allow me to explain it in terms I do understand - in the terms I'm using to work out the problem as I program the solution. Please realize that this may or may not well relate to the structure Simon's interested in. I think it does, but that's for him to say. For me, it's a practical problem in highly flexible information storage and retrieval, and a good chance for me to advance my Ruby skills. Here's a fundamental specification for the model I'm implementing:

* A "node" is a unit of information. It can be anything: a word, phrase, number, equation, pointer to something else - anything. It's just something that can be connected (related) in some way to another node.
* I indicate a node like this: .n {information}
* The node indication begins with the first character after the space following node indicator, and ends when another indicator or an EOL/CR is read,
* I indicate a relationship similarly: .r {information}

I now can specify node relationships. Some examples:

.n Tom .r is a .n poor programmer
.n apple r. is not .n bridge building material
.n e=mc**2 .r could be .n true

Obviously, these aren't sentences, so we do not use articles, conjunctions, and the like.

Simple enough. Now, consider this:

I start with a database of nodes, and of relationships - note that each gets a unique ascension number:
n1 {}
n2 {}
n3 {}
...

r1 {}
r2 {}
...

When I specify a relationship with a node not in the database of nodes, it gets inserted there; ditto for relationship types (NOT to be confused with relationships - which are sets of two nodes linked with a relationship of a given type).

Relationships ALSO go in a database, and each also gets a unique ascension number.

Here's three complex relationships, each with its number (I preface relationship ascension numbers with a lowercase L - for link):

l43 n1 and n2 and n3 .r are .n true
l44 n4 and n5 and n5 .r are .n not true
l45 l44 and l43 .r are .n true

The last one is the form of a line detector in the human eye, a simple but vital element or our brain. That's not my interest, but I AM interested in being able to build structures LIKE this, and I think this sort of cascading complexity is what Simon is getting at. I think of it as a sequential sparse matrix, but what do I know? I probably have that all garbled, but the logic itself I'm sure of. This WILL work, though it well may not be optimal in my specification of it. I'm simply trying to follow the maxim of 'get it to work, first', then 'get it to work well, later'.

So, I'll stop here, and wait for comment, if any.

t.

···

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< tc@tomcloyd.com >> (email)
<< TomCloyd.com >> (website) << sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Tom_Cloyd2 · 23 December 2008 00:58

Addendum:

This model encompasses directed graphs, and non-directed ones too (are those call "marked" graphs? am not familiar with that term).

E.g.

.n apple .r is a .n fruit <= a directed relationship; as specified it only goes one way.
.n fruit .r contains .n apple <= another directed relationship.
.n apple .r is a / contains .n fruit <= a way of building a bi-relational relationship. Starting at either end of the relationship, you "read" the component of the relation specified which you first encounter, up to the "/"

t.

Simon_Chiang · 23 December 2008 07:56

Unless there's something more complicated that I am missing, this is a directed, acyclic graph, usually abbreviated as a DAG.

Ah, great! I think that's it.

As a followup, does anyone know of libraries that work on DAGs? I'm
looking into the Ruby Graph Library, but any recommendations are
welcome.

http://rgl.rubyforge.org/rgl/index.html

Tom, your project sounds interesting but more involved than what I'm
working on. Thanks everyone!

Ian_Hobson · 23 December 2008 12:19

Tom Cloyd wrote:

I now can specify node relationships. Some examples:

.n Tom .r is a .n poor programmer
.n apple r. is not .n bridge building material
.n e=mc**2 .r could be .n true

Obviously, these aren't sentences, so we do not use articles, conjunctions, and the like.

Hi Tom,

What you have described is the associative model of data (storage).

Each association is three things
    A subject
    The association (almost verb, but more accurately copula) such as "is a", "sells", "knows programming language" etc
    An Object

The subject and object can be either entities (real things) or an association. Thus

((Ian, isa Person), Drives, (Vehicle, RegNo, AB54CDE))

The great strength is the distinction between entity (e.g. people, legal entities, cars), and their relationships (supplier, customer, drives etc).
An entity has a life span and could be destroyed. When it ceases to exist all relationships into which it entered, become void. Relationships usually have start and end times.

Another feature is the implied inverse of the relationship. If I drive a car, that car is driven by me.

So instead of a computer system having a suppliers database, a customers database and an employees database, it has a database containing "legal entities" and "people" who may have "supplier", "employee" or "customer" relationship with the company. (There should be no legal entities as employees - that is a data error). Clearly a supplier or employee could also be a customer, and it is not unknown for an employee to be an occasional supplier for one-off items.

Most accounts departments spend a lot of time sorting out problems with customers who are also suppliers. Now they are easy to handle.

Note the unusual nature of an address change. I and where I live are entities, with a "lives at" relationship between them. To alter where I live, the system must creates a new address and alters my "live at" relationship (dated when I move).

See if you can get hold of a copy of "The associative model of data" by Simon Williams (ISBN 1-903453-00-3). Lazy Software (William's company) are the publishers and gave away copies many years ago for marketing purposes. They may still have copies. It contains a very clear explanation of the what and how, and describes a general purpose user interface for editing such structures.

I have long felt a PIM based on these principles (with some form of schema definition language on top), would be a very powerful and flexible tool.

Regards

Ian

Klaus_Stein1 · 23 December 2008 13:01

If you want to do rather sophisticated things have a look at RSRuby which
connects Ruby to R (I use this for graph analysis and partly for graph
visualization).

(Thanks to the others for telling me about RGL , which looks fine)

Klaus

···

Simon Chiang <simon.a.chiang@gmail.com> wrote:

As a followup, does anyone know of libraries that work on DAGs? I'm
looking into the Ruby Graph Library, but any recommendations are
welcome.

--

The Answer is 42. And I am the Answer. Now I am looking for the Question.

Robert_K1 · 23 December 2008 15:01

The DAG defines a partial order. You can use it for "topological sorting" by stuffing it in a text file formatted suitable to "tsort". That tool then will print out nodes in a total ordering that satisfies the partial ordering.

Kind regards

robert

···

On 23.12.2008 08:56, Simon Chiang wrote:

Unless there's something more complicated that I am missing, this is a directed, acyclic graph, usually abbreviated as a DAG.

Ah, great! I think that's it.

Directed acyclic graph - Wikipedia

As a followup, does anyone know of libraries that work on DAGs? I'm
looking into the Ruby Graph Library, but any recommendations are
welcome.

Tom_Cloyd2 · 23 December 2008 19:21

Ian Hobson wrote:

Tom Cloyd wrote:

I now can specify node relationships. Some examples:

.n Tom .r is a .n poor programmer
.n apple r. is not .n bridge building material
.n e=mc**2 .r could be .n true

Obviously, these aren't sentences, so we do not use articles, conjunctions, and the like.

Hi Tom,

What you have described is the associative model of data (storage).

Each association is three things
   A subject
   The association (almost verb, but more accurately copula) such as "is a", "sells", "knows programming language" etc
   An Object The subject and object can be either entities (real things) or an association. Thus

  ((Ian, isa Person), Drives, (Vehicle, RegNo, AB54CDE))

The great strength is the distinction between entity (e.g. people, legal entities, cars), and their relationships (supplier, customer, drives etc).
An entity has a life span and could be destroyed. When it ceases to exist all relationships into which it entered, become void. Relationships usually have start and end times.

Another feature is the implied inverse of the relationship. If I drive a car, that car is driven by me.

So instead of a computer system having a suppliers database, a customers database and an employees database, it has a database containing "legal entities" and "people" who may have "supplier", "employee" or "customer" relationship with the company. (There should be no legal entities as employees - that is a data error). Clearly a supplier or employee could also be a customer, and it is not unknown for an employee to be an occasional supplier for one-off items.

Most accounts departments spend a lot of time sorting out problems with customers who are also suppliers. Now they are easy to handle.

Note the unusual nature of an address change. I and where I live are entities, with a "lives at" relationship between them. To alter where I live, the system must creates a new address and alters my "live at" relationship (dated when I move).

See if you can get hold of a copy of "The associative model of data" by Simon Williams (ISBN 1-903453-00-3). Lazy Software (William's company) are the publishers and gave away copies many years ago for marketing purposes. They may still have copies. It contains a very clear explanation of the what and how, and describes a general purpose user interface for editing such structures.

I have long felt a PIM based on these principles (with some form of schema definition language on top), would be a very powerful and flexible tool.

Regards

Ian

Wow. Ian, this is my holiday gift - from you. I got to this model from several diverse sources, but a major one was simply me- working things out with sketches made on sheet after sheet of blank paper. It's been a lot of run, and considerably exciting. I really did think I was on to something, if only because I recognized strong similarities to how organic data processing works.

Your orienting me to previous work done with these concepts is simply terrific. I strongly suspected there were other people I should be reading. You given me a real lead in that direction.

I absolutely will continue working on this - I'm learning too much Ruby not to. But I also want to USE the darned thing, and in fact already am.

Thanks again for taking time to share your knowledge!

Tom

···

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< tc@tomcloyd.com >> (email)
<< TomCloyd.com >> (website) << sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Joel_VanderWerf1 · 23 December 2008 19:27

Tom Cloyd wrote:

Ian Hobson wrote:

Tom Cloyd wrote:

I now can specify node relationships. Some examples:

.n Tom .r is a .n poor programmer
.n apple r. is not .n bridge building material
.n e=mc**2 .r could be .n true

Obviously, these aren't sentences, so we do not use articles, conjunctions, and the like.

You may find concept analysis interesting, too:

···

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Tom_Cloyd2 · 23 December 2008 20:51

Joel VanderWerf wrote:

Tom Cloyd wrote:

Ian Hobson wrote:

Tom Cloyd wrote:

I now can specify node relationships. Some examples:

.n Tom .r is a .n poor programmer
.n apple r. is not .n bridge building material
.n e=mc**2 .r could be .n true

Obviously, these aren't sentences, so we do not use articles, conjunctions, and the like.

You may find concept analysis interesting, too:

Formal concept analysis - Wikipedia

Absolutely. I see in that article, right away, my sense of this thing's involving spare matrices. Also prominent is the notion of what we call in statistical factor analysis "latent structure". Lovely. This is really relevant stuff, both to data storage, retrieval and analysis and to the study of functional intelligence. Concept analysis appears to offer another perspective on this "associative" model I'm learning about and developing, and it may give me some additional ideas for what the sorts of functionality I will want my program to have.

Thanks very much!!!!

t.

···

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< tc@tomcloyd.com >> (email)
<< TomCloyd.com >> (website) << sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Clifford_Heath5 · 24 December 2008 00:21

Tom Cloyd wrote:

Joel VanderWerf wrote:

Tom Cloyd wrote:

Ian Hobson wrote:

Tom Cloyd wrote:

I now can specify node relationships. Some examples:

.n Tom .r is a .n poor programmer
.n apple r. is not .n bridge building material
.n e=mc**2 .r could be .n true

Obviously, these aren't sentences, so we do not use articles, conjunctions, and the like.

You may find concept analysis interesting, too:

http://en.wikipedia.org/wiki/Formal_concept_analysis

While you guys are contemplating this stuff, you might like to think about
playing with my ActiveFacts gem, recently released in stealth mode (only
some functionality is present). "gem install activefacts", and read more
at <http://dataconstellation.com/ActiveFacts>\.

Fact-oriented (or semantic) modeling, has been around for decades, and there
have been interesting GUI tools for it also - but until now, no textual
modeling language. That's the gap the ActiveFact's CQL language seeks to
fill, in addition to becoming a viable replacement for SQL, as presented
recently at OSDC.

ActiveFacts also contains a cute Ruby API for representing data and models
in this form.

More (much more) to follow...

Clifford Heath, Data Constellation.

Tom_Cloyd2 · 24 December 2008 02:34

Clifford Heath wrote:

Tom Cloyd wrote:

Joel VanderWerf wrote:

Tom Cloyd wrote:

Ian Hobson wrote:

Tom Cloyd wrote:

I now can specify node relationships. Some examples:

.n Tom .r is a .n poor programmer
.n apple r. is not .n bridge building material
.n e=mc**2 .r could be .n true

Obviously, these aren't sentences, so we do not use articles, conjunctions, and the like.

You may find concept analysis interesting, too:

Formal concept analysis - Wikipedia

While you guys are contemplating this stuff, you might like to think about
playing with my ActiveFacts gem, recently released in stealth mode (only
some functionality is present). "gem install activefacts", and read more
at <http://dataconstellation.com/ActiveFacts>\.

Fact-oriented (or semantic) modeling, has been around for decades, and there
have been interesting GUI tools for it also - but until now, no textual
modeling language. That's the gap the ActiveFact's CQL language seeks to
fill, in addition to becoming a viable replacement for SQL, as presented
recently at OSDC.

ActiveFacts also contains a cute Ruby API for representing data and models
in this form.

More (much more) to follow...

Clifford Heath, Data Constellation.

Clifford,

This is an amazing thing you've done. I need to work my way carefully through your language, but the diagrams you provide are immediately clear to me. This is exactly what I'm trying to do. You very much farther along. I'm going to study your work very carefully. Thanks - for the work and for directing us to it. I'm a bit speechless at this point...

t.

···

--

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC - Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< tc@tomcloyd.com >> (email)
<< TomCloyd.com >> (website) << sleightmind.wordpress.com >> (mental health weblog)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Clifford_Heath5 · 24 December 2008 03:11

Tom Cloyd wrote:

This is an amazing thing you've done. I need to work my way carefully through your language, but the diagrams you provide are immediately clear to me. This is exactly what I'm trying to do. You very much farther along. I'm going to study your work very carefully. Thanks - for the work and for directing us to it. I'm a bit speechless at this point...

Thanks. I've spent fully half of the last two years working on it,
working freelance to fund it and make time free, and though I feel
like what I've achieved so far is a small step in terms of number of
lines of code, it's been a huge intellectual leap. I couldn't have
done it without the work of Nijssen and Halpin, who designed the
graphical languages and formalised the semantics.

After you install the gem, run a gem server and check the API docs
while perusing the examples (online, or CQL only in the gem)

The slides from OSDC show, in the final sections, how to write queries
in CQL - worth looking through. This isn't implemented yet, but isn't
a big step from where I am now. At the moment I'm just solidifying the
SQL schema generation code. There's a huge amount more to be done
still... it might be time for some collaborators.

Clifford Heath.

Topic		Replies	Views
Managing complexity and untangling my thoughts ruby-talk	17	169	2 May 2005
Assoc Class (Hash Pairs) ruby-talk	27	253	7 August 2002
Dependency "trees" - suggestions? ruby-talk	14	158	17 September 2002
Representing Undirected Edges (advice, please) ruby-talk	9	112	12 May 2005
[ANN] ndb Object-Relational Mapper ruby-talk	24	188	19 October 2004

Name that data structure!

Related topics