Data model

From Assothink Wiki
Revision as of 1 February 2015, 22:42

Intro

The Assothink data model is the underlying formal description of all Assothink components.

It has been compared to various formal models (WordNet, Freebase, Wikipedia, Wikidata, BabelNet, ontologies), and it is claimed here that the Assothink model is better because:

  • it is built on clearly distinguished layers (current models often or always mix them, which has disappointing effects on algorithmic issues)
  • it is easily extensible
  • it is suited for advanced, performance-oriented applications, like the Assothink engine

It is thus strongly suggested as the target formal model for future evolutions of the initiatives mentioned above.

It is also strongly suggested that APIs be based on this data model, so that they inherit the benefits listed above.

5 Levels

The Assothink data model includes 5 layers:

  • Concepts
  • KeySets
  • Concept links
  • Language anchors
  • Active jelly

Concepts

Concepts exist without names and without keySet references. This is what makes them difficult to handle at both the formal level and the programming level.

Very few things may be said about concepts.

The only thing that is relevant at this stage is that they are distinct and categorized entities.

The categorization includes typically:

  • categories themselves
  • properties
  • nouns
  • verbs (action descriptors)
  • adjectives (noun qualifiers)
  • adverbs (verb qualifiers)
  • quantities
  • intrinsic concepts

It may be said that concepts exist in non-human brains, i.e. in brains without verbal communication capabilities.
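
The two properties stated above, that concepts are distinct and categorized, can be illustrated with a minimal Python sketch. The names `Category` and `Concept` are hypothetical and do not come from the Assothink code base; the only assumption is that a concept carries a category and nothing else (no name, no key), so that identity alone tells two concepts apart.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Category(Enum):
    """Concept categories from the list above (identifiers are illustrative)."""
    CATEGORY = auto()
    PROPERTY = auto()
    NOUN = auto()
    VERB = auto()
    ADJECTIVE = auto()
    ADVERB = auto()
    QUANTITY = auto()
    INTRINSIC = auto()


@dataclass(eq=False)  # eq=False: equality and hashing fall back to object identity
class Concept:
    """A concept has a category and nothing else: no name, no keySet reference."""
    category: Category


a = Concept(Category.NOUN)
b = Concept(Category.NOUN)
print(a == b)  # False: same category, still two distinct concepts
```

Comparing by identity rather than by field values is a design choice of this sketch: two nameless concepts of the same category would otherwise be indistinguishable.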

KeySets

A keySet is any system allowing a bijective connection between 'unnameable and unreadable' concepts and readable references.

The readable references might be numbers, or combinations of bytes or characters.

KeySet references are not linked to any language. They are language independent.

Bijective connection means bidirectional non-ambiguous mapping.

KeySets already exist and are widely used. Freebase, Wikidata, WordNet, BabelNet, etc. use their own keySets.
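
The bijective mapping described above could be sketched as a pair of dictionaries kept in step. This is only an illustration: the `KeySet` class, its method names, and the key format are assumptions, and plain `object()` instances stand in for concepts.

```python
class KeySet:
    """Hypothetical bijective map between concepts and readable keys."""

    def __init__(self):
        self._key_of = {}      # concept -> key
        self._concept_of = {}  # key -> concept

    def bind(self, concept, key):
        # Refuse any binding that would break the one-to-one mapping.
        if concept in self._key_of or key in self._concept_of:
            raise ValueError("concept or key is already bound")
        self._key_of[concept] = key
        self._concept_of[key] = concept

    def key(self, concept):
        return self._key_of[concept]

    def concept(self, key):
        return self._concept_of[key]


c1, c2 = object(), object()   # stand-ins for two nameless concepts
ks = KeySet()
ks.bind(c1, "K0001")          # the key format is arbitrary
ks.bind(c2, "K0002")
print(ks.concept(ks.key(c1)) is c1)  # True: the mapping round-trips
```

Rejecting a second binding for either side is what keeps the mapping unambiguous in both directions.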

Links

Links are connections between concepts.

Typically a link integrates 3 concepts: a property concept, a source concept and a destination concept.

In the Assothink model, the links form the "passive jelly": a set of connections evolving slowly or not evolving.

The links are the critical components of an active brain.

Links have nothing to do with language.

Qualified and fuzzy links

Links may be qualified or fuzzy. Hypernymy, hyponymy, etc. are typical qualified links. The vast majority of links, however, are fuzzy. For a fuzzy link, the link description also includes a scalar permeability measure. So the formal model of a link integrates 3 concepts (identified through keySet references) and 1 scalar value in the 0.0 to 1.0 range.
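
The formal model of a link (3 keySet references plus 1 scalar) maps naturally onto a small record type. The sketch below is an assumption-laden illustration: the `Link` class and its field names are hypothetical, the keys are placeholders, and defaulting the permeability of a qualified link to 1.0 is a choice made here, not something the model states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical link record: three keySet references and one scalar."""
    prop: str          # key of the property concept
    source: str        # key of the source concept
    destination: str   # key of the destination concept
    permeability: float = 1.0  # assumption: qualified links default to 1.0

    def __post_init__(self):
        # The model restricts permeability to the 0.0 to 1.0 range.
        if not 0.0 <= self.permeability <= 1.0:
            raise ValueError("permeability must lie in [0.0, 1.0]")


fuzzy = Link(prop="K0100", source="K0001", destination="K0002", permeability=0.35)
qualified = Link(prop="K0101", source="K0001", destination="K0003")
print(fuzzy.permeability, qualified.permeability)  # 0.35 1.0
```

Making the record immutable fits the description of the passive jelly as a set of connections that evolves slowly or not at all.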

Languages

Language is probably a human attribute, but all known languages are obviously poorly evolved tools:

  • low data speed
  • high ambiguity
  • limited descriptive power

However, language is the most widely used communication tool, and not only by the writer and the reader of this document.

In our formal model, the language is a complex set of 'language anchors'.

A language anchor is the meeting point of a concept with a human language.

A language anchor typically contains (for the involved concept):

  • definition(s)
  • reference words, aliases, forming a synset
  • examples of use

Besides that, a language involves various components that are not critical parts of this data model:

  • grammar and syntax rules
  • variant word forms
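
A language anchor as described above could be represented as follows. The `LanguageAnchor` class, its field names, the use of a keySet reference to point at the concept, and the two-letter language codes are all assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LanguageAnchor:
    """Hypothetical anchor: where one concept meets one human language."""
    concept_key: str                                       # keySet reference of the concept
    language: str                                          # language code (assumed format)
    definitions: List[str] = field(default_factory=list)
    synset: List[str] = field(default_factory=list)        # reference words and aliases
    examples: List[str] = field(default_factory=list)      # examples of use


# One concept, two anchors: the concept itself stays language independent.
en = LanguageAnchor("K0001", "en",
                    definitions=["a domesticated canine"],
                    synset=["dog", "hound"],
                    examples=["The dog barked."])
fr = LanguageAnchor("K0001", "fr",
                    definitions=["canidé domestique"],
                    synset=["chien"],
                    examples=["Le chien aboie."])
print(en.concept_key == fr.concept_key)  # True: both anchors meet the same concept
```

Grammar rules and variant word forms are deliberately left out, since the text treats them as outside the critical part of the data model.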

Active Jelly

The active jelly is a set of excitation states quickly evolving in time.

An excitation state is attached to each individual concept (several excitation states may be defined, with different time reactivities: short-term excitation, long-term excitation, and possibly a continuous spectral description).

The active jelly is directly connected to the concept layer and to the link layer.

The active jelly has nothing to do with either keySets or languages.

The active jelly is a model of what 'moves' in a human brain. It is also the basis of artificial associative intelligence systems. It is the most critical component of Assothink, which targets software engines able to generate ~10^8 variations of excitation per second on standard sequential computers, using efficient software implementations of concepts and links. Typically, this kind of engine should be realized on parallel computers or parallel analog systems (similar to biological systems).
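
The text does not specify how excitation states evolve, so the following is only a toy sketch of one possible update rule, not the Assothink engine's algorithm. It assumes a single excitation value per concept, a uniform decay factor, and excitation flowing along each link in proportion to its permeability. Concepts are represented by integer indices, the property concept of each link is ignored, and the function name `step` and the `decay` parameter are hypothetical.

```python
def step(excitation, links, decay=0.9):
    """One hypothetical update of the active jelly.

    excitation: list of floats, one excitation state per concept index
    links: iterable of (source, destination, permeability) triples
    """
    # Every excitation state fades a little at each step.
    nxt = [e * decay for e in excitation]
    # Excitation then flows along each link, scaled by its permeability.
    for source, destination, permeability in links:
        nxt[destination] += excitation[source] * permeability
    return nxt


links = [(0, 1, 0.5), (1, 2, 0.25)]   # passive jelly: fixed during the run
state = [1.0, 0.0, 0.0]               # active jelly: only concept 0 is excited
for _ in range(2):
    state = step(state, links)
print(state)  # excitation has reached concept 2 after two steps
```

The passive jelly (the links) is read-only here while the active jelly (the state list) changes at every step, which mirrors the slow versus fast distinction drawn between the two layers.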