"Data model": difference between revisions
Revision as of 1 February 2015, 22:42
Intro
The Assothink data model is the underlying formal description of all Assothink components.
It has been compared to various formal models (WordNet, Freebase, Wikipedia, Wikidata, BabelNet, ontologies), and it is claimed here that the Assothink model is better because:
- it is built on clearly distinguished layers (current models often or always mix them, with disappointing effects on the algorithms built on top)
- it is easily extensible
- it is suited for advanced performance-oriented applications, like the Assothink engine
It is thus strongly suggested as the target formal model for future evolutions of the initiatives mentioned above.
It is also strongly suggested to base APIs on this data model, to obtain the benefits listed above.
5 Levels
The Assothink data model includes 5 layers:
- Concepts
- keySets
- Concept links
- Language anchors
- Active jelly
Concepts
Concepts exist without names and without keySet references; this is what makes them difficult to handle at both the formal level and the programming level.
Very few things may be said about concepts.
The only thing that is relevant at this stage is that they are distinct and categorized entities.
The categorization typically includes:
- categories themselves
- properties
- nouns
- verbs (action descriptors)
- adjectives (noun qualifiers)
- adverbs (verb qualifiers)
- quantities
- intrinsic concepts
It may be said that concepts exist in non-human brains, i.e. in brains without verbal communication capabilities.
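The idea of nameless but distinct, categorized entities can be sketched as follows. This is only an illustration: the `Category` and `Concept` names are assumptions, not part of the Assothink model, and identity-based equality is one possible way to express "distinct without a name".

```python
from dataclasses import dataclass
from enum import Enum, auto


class Category(Enum):
    """Concept categories from the list above (identifier names are illustrative)."""
    CATEGORY = auto()
    PROPERTY = auto()
    NOUN = auto()
    VERB = auto()
    ADJECTIVE = auto()
    ADVERB = auto()
    QUANTITY = auto()
    INTRINSIC = auto()


@dataclass(eq=False)  # eq=False: two concepts are equal only if they are the same object
class Concept:
    """A concept has a category and nothing else: no name, no keySet reference."""
    category: Category


a = Concept(Category.NOUN)
b = Concept(Category.NOUN)
print(a is b, a == b)  # two nouns with identical content are still distinct concepts
```

Because a concept carries no readable identifier, anything that must refer to it from outside the program needs a separate naming layer.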
KeySets
A keySet is any system allowing a bijective connection between 'unnameable and unreadable' concepts and readable references.
The readable references might be numbers, byte sequences or character strings.
KeySet references are not linked to any language. They are language independent.
Bijective connection means bidirectional non-ambiguous mapping.
KeySets already exist and are widely used. Freebase, Wikidata, WordNet, BabelNet, etc. use their own keySets.
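A minimal sketch of a keySet as a bijective mapping is given below. The `KeySet` class, its method names and the keys `"K1"` and `"K2"` are hypothetical; plain `object()` instances stand in for concepts, which have no content of their own.

```python
class KeySet:
    """Bijective mapping between readable keys and concepts (illustrative sketch)."""

    def __init__(self):
        self._key_to_concept = {}
        self._concept_to_key = {}

    def bind(self, key, concept):
        # Refuse any binding that would make the mapping ambiguous in either direction.
        if key in self._key_to_concept or concept in self._concept_to_key:
            raise ValueError("key or concept is already bound")
        self._key_to_concept[key] = concept
        self._concept_to_key[concept] = key

    def concept_for(self, key):
        return self._key_to_concept[key]

    def key_for(self, concept):
        return self._concept_to_key[concept]


ks = KeySet()
c1, c2 = object(), object()   # stand-ins for two nameless concepts
ks.bind("K1", c1)             # "K1" and "K2" are made-up keys
ks.bind("K2", c2)
print(ks.key_for(ks.concept_for("K1")))
```

Keeping two dictionaries in step is what enforces the "bidirectional non-ambiguous" property: a lookup in either direction has exactly one answer.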
Links
Links are connections between concepts.
Typically a link integrates 3 concepts: a property concept, a source concept and a destination concept.
In the Assothink model, the links form the "passive jelly": a set of connections evolving slowly or not evolving.
The links are the critical components of an active brain.
Links have nothing to do with language.
Qualified and fuzzy links
Links may be qualified or fuzzy. Hypernymy, hyponymy, etc. are typical qualified links. However, the vast majority of links are fuzzy. In the case of a fuzzy link, the link description also includes a (scalar) permeability measure. So the formal model of a link integrates 3 concepts (identified through keySet references) and 1 scalar value in the 0.0 to 1.0 range.
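The link description above (3 concepts plus 1 scalar) might be represented as in the sketch below. The field names and the keys are hypothetical, and the default permeability of 1.0 for qualified links is an assumption of this example, not something the model states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Three keySet references plus a permeability in [0.0, 1.0]."""
    prop: str                  # key of the property concept
    source: str                # key of the source concept
    destination: str           # key of the destination concept
    permeability: float = 1.0  # assumption: qualified links default to 1.0

    def __post_init__(self):
        if not 0.0 <= self.permeability <= 1.0:
            raise ValueError("permeability must lie in the 0.0 to 1.0 range")


# Made-up keys: a qualified link and a fuzzy link between the same two concepts.
qualified = Link("K_hypernym", "K_cat", "K_animal")
fuzzy = Link("K_related", "K_cat", "K_animal", permeability=0.25)
print(qualified.permeability, fuzzy.permeability)
```

Making the record immutable (`frozen=True`) matches the "passive jelly" idea of connections that evolve slowly or not at all; a changed permeability would be a new link value rather than an in-place update.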
Languages
Language is probably an attribute of humans, but all known languages are obviously poorly evolved tools:
- poor in terms of data speed
- poor in terms of ambiguity
- poor in terms of descriptive power
However, language is the most used tool to communicate, and not only by the writer and the reader of this document.
In our formal model, the language is a complex set of 'language anchors'.
A language anchor is the meeting point of a concept with a human language.
A language anchor typically contains (for the involved concept):
- definition(s)
- reference words, aliases, forming a synset
- examples of use
Besides that, a language involves various components that are not a critical part of this data model:
- grammar and syntax rules
- variant word forms
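A language anchor as described above could be sketched as a small record. The `LanguageAnchor` fields, the `concepts_for_word` helper, the keys and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LanguageAnchor:
    """Meeting point of one concept with one human language (illustrative)."""
    concept_key: str                                       # keySet reference of the concept
    language: str                                          # e.g. "en", "fr"
    definitions: List[str] = field(default_factory=list)
    synset: List[str] = field(default_factory=list)        # reference words and aliases
    examples: List[str] = field(default_factory=list)      # examples of use


def concepts_for_word(anchors, word, language):
    """Return the keys of all concepts whose synset in `language` contains `word`."""
    return [a.concept_key for a in anchors
            if a.language == language and word in a.synset]


anchors = [
    LanguageAnchor("K1", "en", ["a small domesticated feline"],
                   ["cat", "house cat"], ["The cat sleeps."]),
    LanguageAnchor("K1", "fr", ["petit félin domestique"],
                   ["chat"], ["Le chat dort."]),
    LanguageAnchor("K2", "en", ["an online conversation"], ["chat"]),
]
print(concepts_for_word(anchors, "chat", "fr"))
print(concepts_for_word(anchors, "chat", "en"))
```

The same word resolves to different concepts depending on the language, while the concept key itself stays language independent.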
Active Jelly
The active jelly is a set of excitation states quickly evolving in time.
An excitation state is attached to each individual concept (several excitation states may be defined, with different time responses: short-term excitation, long-term excitation, and possibly a continuous spectral description).
The active jelly is directly connected to the concept layer and to the link layer.
The active jelly has nothing to do with either keySets or languages.
The active jelly is a model of what 'moves' in a human brain. It is also the basis of an artificial associative intelligence system. It is the most critical component of Assothink, which targets software engines able to generate ~10^8 excitation variations per second on standard sequential computers, using efficient software implementations of concepts and links. Typically, this kind of engine should be realized on parallel computers or parallel analog systems (similar to biological systems).
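To make the interplay between excitation states and links concrete, here is a toy update step. It is not the Assothink engine: the update rule (exponential decay plus excitation passed along each link in proportion to its permeability), the `decay` parameter and the keys are all assumptions of this sketch.

```python
def step(excitation, links, decay=0.5):
    """One update: decay every excitation, then pass a share along each link.

    excitation: dict mapping concept key -> excitation level
    links: iterable of (source, destination, permeability) tuples
    """
    nxt = {concept: level * decay for concept, level in excitation.items()}
    for source, destination, permeability in links:
        # The transfer is computed from the state before this step.
        incoming = excitation.get(source, 0.0) * permeability
        nxt[destination] = nxt.get(destination, 0.0) + incoming
    return nxt


links = [("K1", "K2", 0.5), ("K2", "K3", 1.0)]  # made-up keys and permeabilities
state0 = {"K1": 1.0}
state1 = step(state0, links)
state2 = step(state1, links)
print(state1)
print(state2)
```

Each call touches every link once, so the cost of one step grows with the number of links; this is the kind of inner loop whose throughput the figure above refers to.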