Representation / 3
Conceptualization and Ontological Analysis
Laurea Magistrale in Computer Science
Artificial Intelligence
These slides were prepared for teaching purposes. They contain original material owned by the Università degli Studi di Bari and/or figures owned by other authors, companies and organizations, whose reference is given. All or part of the material may be photocopied for personal or teaching use, but may not be distributed for commercial use. Any other use requires specific authorization from the Università degli Studi di Bari and from the other authors involved.
●
2 ways to represent categories in logics
● Unary predicates
– tomato(X) X is a tomato
● “reified” categories
– X ∈ tomatoes
●tomatoes (constant symbol) represents the set of all tomatoes
●
Categories may be themselves objects, and may be interlinked in a
UNIFIED HIERARCHICAL TAXONOMY
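The two representations above can be contrasted with a minimal Python sketch. The object names (t1, t2, c1), the `food` category and the `TAXONOMY` dictionary are hypothetical and only serve the illustration; they do not come from the slides.

```python
# Hypothetical toy domain: t1 and t2 are tomatoes, c1 is some other food item.
TOMATO_FACTS = {"t1", "t2"}

def tomato(x: str) -> bool:
    """Unary-predicate view: tomato(X) holds iff X is a tomato."""
    return x in TOMATO_FACTS

# Reified view: the category is itself an object (a constant symbol)
# that denotes the set of all tomatoes.
tomatoes = frozenset(TOMATO_FACTS)
food = frozenset({"t1", "t2", "c1"})

# Because categories are objects, they can be linked to each other
# in a taxonomy (category name -> parent category name).
TAXONOMY = {"tomatoes": "food"}

print(tomato("t1"), "t1" in tomatoes, tomatoes <= food)
```

In the reified view the category can appear as an argument of other relationships, which is what makes a unified hierarchical taxonomy expressible.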
Some problems
●
At what level to represent?
● Choice of the level of detail (GRAIN SIZE)
●
Are there object properties so basic that they are present in all domains? Which ones?
● Possibility of changing the conceptualization while maintaining coherence among the various “views”
●
Are there primitives to which all knowledge can be reduced? Is it possible to get new knowledge?
● Possibility of deriving new assertions (new knowledge)
Modeling for Conceptualization
●
Modeling explicit knowledge requires specific and dedicated activities
●
Positive outcome: knowledge which was previously implicit is now made explicit
●
The ontology defines the kind of things that exist in the application domain
Conceptualize and Represent Ontologies for Knowledge Representation
●
Ontology
● “is the study of existence of all kinds of entities – both abstract and concrete ones – that make up the world”
● “aims at providing a framework of distinctions that can be used to discriminate and classify things that exist and define words that describe them”
●
Used in
● Philosophy: area of Metaphysics that studies how the universe around us is actually made
● Computer Science: area of Artificial Intelligence that studies the methods to correctly represent the universe around us
Ontology in Philosophy
● “Each special science aims at truth, seeking to portray accurately some part of reality … No special science can arrogate to itself the task of rendering mutually consistent the various partial portraits: that task can alone belong to an overarching science of being, that is, to ontology.
But … the proper concern of ontology is not the portraits we construct of it, but reality itself”
– [Lowe, 2001]
● Difference between reality and its representation
– Can we know reality itself?
●Kant: No, only our thoughts, or ideas, about reality
●Plato, Aristotle: Yes, at least partly
Ontology in Philosophy
●
The theory of a priori distinctions
● applicable independently of the state of the world
between
● Particulars: The physical entities in the world
– (our perception thereof)
– Physical objects, events, regions in the space, quantity of matter, ...
● Universals: The meta-level categories used to model the world
– (to talk about the entities that must be included in our domain of discourse)
– Concepts, properties, qualities, state, relationships, roles, ...
Ontology in Philosophy
●
A general, or formal, or axiomatic, ontology, is in charge of
● Determining the conditions of possibility of an “object/entity” in general, and
● identifying the requirements fulfilled by each “object/entity”
●
Assuming the use of logic representations
● Formal Ontology = the formal, systematic and axiomatic development of the logics of all forms and ways of being
– i.e., the rigorous description of the forms of being (structural features) of objects
Ontologies in Computer Science
●
A wide area of research concerning the study of formal languages for representing
knowledge about the entities populating one or many domains of interest
● Uses
– Improve communication among persons and organizations
– Foster system interoperability
●Share modeling methods, paradigms, languages and software tools
– Support IT systems engineering
●Foster reusability/sharability: sharing of formal representations
●Improve search: used as meta-data to index database documents and information systems in general
●Express specifications: helps in identifying the requirements of an IT system
Ontologies in Computer Science
●
Selection of ontological categories
● First step in the design of:
– Databases (“domains”)
– Knowledge Bases (“types”)
– Object-Oriented systems (“classes”)
● Determines what can be represented in a (family of) application(s)
– Any incompleteness, distortion or restriction in the structure of categories limits the generality of all programs or databases using them
Ontologies in Computer Science
●
Areas interested in ontology representation
● Semantic Web, Natural Language Processing, Computer Vision, Biological Information Systems, ...
●
Many formalisms to represent and implement ontologies
– KIF, Description Logics, OWL, ...
● Their use in real-world problem solving is supported by specific editors
– Protégé, Rice, OilEd, ...
● and by systems for automatic reasoning
– RACER, FaCT++, KAON2, ...
Ontology
●
Some definitions
● Philosophy: “a systematic explanation of being”
● Neches: “…defines the basic terms and relations including the vocabulary of a topic area as well as the rules for combining terms and relations to define extensions to the vocabulary”
● Gruber, the most cited: “…an explicit specification of a conceptualization”
● Borst, slightly modified: “…a formal specification of a shared conceptualization”
● Guarino: “…a logical theory which gives an explicit, partial account of a conceptualization”
Relationships
●
Defining inter-relationships among categories helps us in structuring our conceptual system
●
Fundamental (ontological) relationships
● Hyponymy or inclusion (is-a, isa, is_a, ...) between names of entities
● Meronymy between entities
– Intended as whole and its part (part-of)
● Troponymy between verbs and processes
– Verb V1 is a troponym of V2 if V1 indicates a specific case of the more generic verb V2
●E.g., “falling” is a troponym of “moving”
Taxonomic Relationships
●
Inclusion relationships
● Very powerful
● Widely exploited in defining any kind of conceptual structure that tries to capture the human intuition suggesting the existence of “natural” categories of hyponyms
●
Taxonomic relationship (is-a-kind-of)
● A special type of hyponymy
● Vertically structures taxonomic hierarchies
– Heterarchies if multiple inheritance is allowed
Some common relationships in knowledge structures
●
is_a
● Allows us to move within the taxonomic hierarchy
– Cat is_a Pet
– Pet is_a Animal
– Animal is_a LivingBeing
● Used for expressing subset relationships
– Generalizations
– Specializations
● Transitive
– Persian is_a Cat + Cat is_a Pet = Persian is_a Pet
●
instance_of
● Specifically applies to instances belonging to classes
Some common relationships in knowledge structures
●
part_of
● Applies to objects made up of sets of components
– Paw part_of Cat
– Drum part_of Printer
●
All these relationships establish partial orderings within the domain
● Instead of storing explicitly all the relationships, only first-level ones are stored, and a mechanism is provided to generate the others at need
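The idea of storing only first-level relationships and generating the others at need can be sketched as follows. This is a minimal illustration using the is_a examples from the previous slides; the function names `ancestors` and `is_a` are hypothetical, and a plain depth-first traversal is assumed as the generation mechanism.

```python
from typing import Dict, Set

# Only the first-level (direct) is_a links are stored.
DIRECT_IS_A: Dict[str, Set[str]] = {
    "Persian": {"Cat"},
    "Cat": {"Pet"},
    "Pet": {"Animal"},
    "Animal": {"LivingBeing"},
}

def ancestors(concept: str, direct: Dict[str, Set[str]]) -> Set[str]:
    """Return every concept reachable from `concept` through stored links."""
    found: Set[str] = set()
    stack = list(direct.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(direct.get(parent, ()))
    return found

def is_a(sub: str, sup: str, direct: Dict[str, Set[str]] = DIRECT_IS_A) -> bool:
    """Derived (transitive) is_a, computed on demand rather than stored."""
    return sup in ancestors(sub, direct)

print(is_a("Persian", "Pet"))   # derived from Persian is_a Cat and Cat is_a Pet
```

The same traversal would work for part_of, since it relies only on transitivity.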
Ontologies
●
Knowledge organized into Taxonomic Hierarchies or Graphs
[Figure: example graph over the concepts Entity, Person, Object, Mechanic, Car and Engine, with edges labeled kind_of, is_a, has_part, part_of and repair]
[Figure: semantic network over the categories Animals, Birds, Mammals, Penguins, Cats and Bats, with the attributes Alive, Fly, Paws and Dir. Reprod., and the instances Opus, Pussy and Pat (attributes Name and Friend), connected by subset and instance_of links]
Knowledge expressed graphically with exceptions (penguins among birds, bats among mammals)
Translation in logics
● rel(Alive, Animals, T)
● rel(Fly, Animals, F)
● Birds ⊂ Animals
● Mammals ⊂ Animals
● rel(Dir.reprod., Mammals, T)
● rel(Fly, Birds, T)
● rel(Paws, Birds, 2)
● rel(Paws, Cats, 4)
● rel(Paws, Bats, 2)
● Penguins ⊂ Birds
● Cats ⊂ Mammals
● Bats ⊂ Mammals
● rel(Fly, Penguins, F)
● rel(Fly, Bats, T)
● Opus ∈ Penguins
● Pussy ∈ Cats
● Pat ∈ Bats
● name(Opus, “Opus”)
● name(Pussy, “Pussy”)
● friend(Opus, “Pussy”)
● friend(Pussy, “Opus”)
● name(Pat, “Pat”)
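One possible way to read these assertions is as default values that more specific categories may override. The sketch below encodes the facts listed above and looks a property up by climbing from the instance towards the root. The "most specific category wins" policy and the function name `lookup` are assumptions made for illustration; the slides only state that the network contains exceptions.

```python
from typing import Dict, Optional, Tuple, Union

Value = Union[bool, int]

SUBSET: Dict[str, str] = {
    "Birds": "Animals", "Mammals": "Animals",
    "Penguins": "Birds", "Cats": "Mammals", "Bats": "Mammals",
}
INSTANCE_OF: Dict[str, str] = {"Opus": "Penguins", "Pussy": "Cats", "Pat": "Bats"}

# rel(Property, Category, Value) facts.
REL: Dict[Tuple[str, str], Value] = {
    ("Alive", "Animals"): True, ("Fly", "Animals"): False,
    ("Dir.reprod.", "Mammals"): True,
    ("Fly", "Birds"): True, ("Paws", "Birds"): 2,
    ("Paws", "Cats"): 4, ("Paws", "Bats"): 2,
    ("Fly", "Penguins"): False, ("Fly", "Bats"): True,
}

def lookup(prop: str, individual: str) -> Optional[Value]:
    """Return the value found on the most specific category, or None."""
    category = INSTANCE_OF.get(individual)
    while category is not None:
        if (prop, category) in REL:
            return REL[(prop, category)]
        category = SUBSET.get(category)   # climb one level in the taxonomy
    return None

print(lookup("Fly", "Opus"), lookup("Fly", "Pat"), lookup("Paws", "Pussy"))
```

Under this policy Opus does not fly although birds do, and Pat flies although animals in general do not.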
Hierarchies of Ontologies
top-level ontology
domain ontology, task & problem-solving ontology
application ontology
Top-level Ontologies
●
Top-level foundational ontologies
● Result of a conceptual integration activity
● Simplify the design of domain-specific ontologies
● Improve quality and understandability by representing a rigorous context for comparisons, evaluations and choices
● Enforce reuse of ontological resources
Ontology in Knowledge-based Systems
●
The reference knowledge
● Shared to support the transmission (exchange) of meaning among tasks within a process and
● Defines a shared vocabulary
strict correlation between language and knowledge (as represented in the ontology itself)
– The existential quantifier notation in logics says that an object exists, but logics does not provide a vocabulary to describe what exists
●
Plays a relevant role for both
● Knowledge representation
● Knowledge acquisition
Lexical Ontologies
●
Define a given number of concepts that represent the meaning of words in a language
● Sometimes developed independently from a formal work on foundational ontologies
●
Tend to a “generalization of common sense”
●
Use of an ontology may improve reasoning and
retrieval activities, while its structure supports
the browsing activity
Ontology as Conceptualization
●
An ontology
● Is a formal conceptualization of the world
– Conceptualization
●Expressed by a set of rules representing the structure of a specific aspect of reality
– Ontological theory
●Includes formulas that can be considered as always being true, and thus can be shared by several agents independently of the particular state of things
● Specifies a set of constraints that declare what must necessarily be true in any possible world
– Any possible world must be compliant to the constraints
– Given an ontology, a legal description of the world is any possible world that satisfies the constraints
Ontologies in Computer Science
●
(Formal) Ontology
● An explicit and formal description of the
conceptualization of a reality in terms of concepts, properties of the concepts and semantic
relationships among them
● The theory of
– Formal distinction between the elements of a domain (independently of their context)
– The connections among the entities of the world and the categories representing them
● A formal, shared and explicit representation of a conceptualization of a domain of interest
– In more detail: an axiomatic first-order theory that can be expressed in a Description Logic
Formal Ontology
●
Formal because
● Rigorous and general
● Adopts a formal logics perspective
– i.e., handles the link between neutral “truths”
●
Handles the connections between “neutral objects” and reality
● Aim : characterizing “particulars” and “universals”
through properties and formal relationships
●
Need for formal tools (logic theories) to handle the fundamental elements/relationships of the ontology
● part_of, integrity, identity, dependency
Formal Ontology
●
Defines a set of meta-properties useful to analyze the behavior of entities
●
Allows one to analyze the constraints imposed on an information system by defining additional modeling principles
●
Defines a minimum set of top-level ontologies to drive conceptual modeling
●
Formal relationships allow one to express general constraints on the domain by inducing distinctions among entities within the domain structure
Formal Ontology
●
3 components
● A set of concepts
– “Classes”
● The semantic interconnections among them
– Conceptual relationships, or semantic attributes
● (optional) A logic level that allows new facts to be inferred from those encoded within the resource
– E.g., a set of axioms or micro-theories
Formal Ontology
●
A triple O = (C, R, A)
● C a set of concepts
● R a set of conceptual relationships, each defined over C × C
● A a set of axioms
– If A = ∅ the ontology is not axiomatized
●
Note: C and R induce a graph G = (V, E)
● V ≡ C
● E = { (c1, c2) ∈ C × C : ∃S ∈ R : (c1, c2) ∈ S }
●
and a labeling function
● l : C × C → 2^R s.t. l(c1, c2) = { S ∈ R : (c1, c2) ∈ S }
Ontology
●
Example
● O’ = (C’, R’, A’)
– C’ = {Entity, Object, Person, Mechanic, Car, Engine}
– R’ = { is_a, has_part, repairs }
●is_a = { (Object, Entity), (Person, Entity), (Mechanic, Person), (Car, Object), (Engine, Object) }
●has_part = { (Car, Engine) }
●repairs = { (Mechanic, Car) }
– A’ = { “∀a ∈ Car : ∃m ∈ Mechanic : repairs(m, a)” }
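The example ontology and the graph it induces can be written down directly. The following is a minimal sketch: the concepts and relationships are those of the example, while the function names `edges` and `label` are hypothetical, and the axiom set is left out because it plays no role in building the graph.

```python
from typing import Dict, FrozenSet, Set, Tuple

Pair = Tuple[str, str]

# C: the set of concepts of the example.
C: Set[str] = {"Entity", "Object", "Person", "Mechanic", "Car", "Engine"}

# R: each conceptual relationship is a set of pairs over C x C.
R: Dict[str, Set[Pair]] = {
    "is_a": {("Object", "Entity"), ("Person", "Entity"),
             ("Mechanic", "Person"), ("Car", "Object"), ("Engine", "Object")},
    "has_part": {("Car", "Engine")},
    "repairs": {("Mechanic", "Car")},
}

def edges(relations: Dict[str, Set[Pair]]) -> Set[Pair]:
    """E: the pairs that belong to at least one relationship in R."""
    return {pair for pairs in relations.values() for pair in pairs}

def label(c1: str, c2: str, relations: Dict[str, Set[Pair]]) -> FrozenSet[str]:
    """l(c1, c2): the names of the relationships that contain (c1, c2)."""
    return frozenset(name for name, pairs in relations.items() if (c1, c2) in pairs)

E = edges(R)
print(len(E), sorted(label("Car", "Engine", R)))
```

A pair of concepts that is not connected gets the empty label, which corresponds to the absence of an edge in G.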
Ontology
●
Simple example (cont.)
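[Figure: graph of O’, with is_a edges from Object and Person to Entity, from Mechanic to Person, and from Car and Engine to Object, a has_part edge from Car to Engine, and a repairs edge from Mechanic to Car]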
Ontological Language
●
Usually introduces
● Concepts (classes, entities)
● Properties of concepts (slots, attributes, roles), relationships between concepts (associations) and additional constraints
●
Can be:
● simple (concepts only),
● frame-based (with concepts and properties), or
● logic-based (Ontolingua, DAML+OIL, OWL)
●
May also be expressed using diagrams
● E.g., some consider the Entity-Relationship conceptual data model and UML class diagrams as ontological languages
Ontological Language
●
A reasoning mechanism may be defined
● Formal languages exist to support reasoning mechanisms with different aims, such as
– Ontology Design
●Consistency check of classes and derivation of implicit relationships
– Ontology Integration
●Assert relationships between different ontologies – computation of consistency in the hierarchy of integrated classes
– Ontology Use
●Determining whether a set of facts is consistent with respect to the ontology – determining whether specific objects belong to the classes in the ontology
DLs as a formalization of Semantic Nets
●
~1980s: drift of Semantic Nets towards logics
●
Process consists of
● Reformulation of constructs according to the criteria of Logics
● Elimination of constructs that are not suitable to such a reformulation
– Defaults
– Exceptions
KL-One
●
Introduces fundamental ideas of DL
● Concepts and Roles
● Restrictions on Values
● Numerical restrictions (1, NIL)
● Formal semantics
●
[Brachman-Schmolze 1985]
[Figure: KL-One network diagram, with the labels Concept, Role, is_a, Restriction on value and Numeric restriction]
From KL-One to Description Logics
●
Terminological Logics
● FL⁻ (Frame Language) [Brachman and Levesque, 1984]
– Tradeoff between expressivity of representation language and complexity of reasoning
● CLASSIC [Brachman 1991]
– Limited, Complete
●
Description Logics
● LOOM [MacGregor-Bates 1987], BACK [Nebel-von Luck, 1988]
– Expressive, Incomplete
● KRIS [Baader, Hollunder, 1991]
– Expressive, Complete
● FaCT, DLP, Racer
– Systems optimized for expressive logics
Description Logics
●
Can be seen as
● “Logic” evolutions of “network” KR languages
– E.g., frames and semantic nets
● Restrictions of First-Order Logics (FOL) to obtain better computational properties
●
In general, they have 2 peculiar features (not shared by most other formalisms)
● Unique Names Assumption not supported
– Different names may denote the same concept
● Open World Assumption
– Not knowing a fact does not necessarily mean it is false
●Closed World Assumption not supported
Description Languages
●
A description language is simpler than a First-Order Logic language
● Involves only
– atomic concepts,
●E.g., a common name
–E.g., ‘father’, ‘wife’, etc.
– roles and
●A role is a binary relationship
– names of objects
●An object name represents, indeed, a single object
Description Logics
●
Each DL is characterized by operators to build two kinds of terms
● Concepts
– Corresponding to unary relationships
– With operators for building complex concepts
●and, or , not, all, some, atleast, atmost, …
● Roles
– Corresponding to binary relationships
– and possibly operators
●
Individuals
● Used only in assertions
Web Ontology Language (OWL)
●
A markup language to explicitly represent meaning and semantics of terms through vocabularies and relationships among them
●
Several versions exist, very different from each other
● The OWL DL sublanguage may be considered equivalent to a description logic
A-BOX and T-BOX
●
T-BOX
● Defines the terminology of a domain
– Concerns the definition of complex concepts and roles starting from basic concepts and roles
● Represents intensional knowledge
– Defines subsumption relationships between concepts and allows them to be classified in an inheritance hierarchy
●Actually, a lattice with a top and a bottom
●
A-BOX
● Defines the extensional knowledge
– A set of specific and contingent facts concerning individuals
Language of the T-BOX
●
Terminological axioms T
● C ⊑ D inclusion of concepts: C^I ⊆ D^I
● R ⊑ S inclusion of roles: R^I ⊆ S^I
● C ≡ D equality of concepts: C^I = D^I
● R ≡ S equality of roles: R^I = S^I
●
I satisfies T iff it satisfies all elements in T
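The satisfaction conditions above reduce to set comparisons on the extensions assigned by an interpretation. The sketch below checks them for axioms between named symbols only. The tuple encoding of axioms, the function names, the role hasDescendant and the sample interpretation are all assumptions introduced for the illustration.

```python
from typing import Dict, List, Set, Tuple

Interpretation = Dict[str, Set]   # symbol name -> its extension under I
Axiom = Tuple[str, str, str]      # (kind, left symbol, right symbol)

def satisfies(interp: Interpretation, axiom: Axiom) -> bool:
    """Check one terminological axiom against the interpretation."""
    kind, left, right = axiom
    if kind == "sub":             # C ⊑ D (or R ⊑ S): subset of extensions
        return interp[left] <= interp[right]
    if kind == "eq":              # C ≡ D (or R ≡ S): equal extensions
        return interp[left] == interp[right]
    raise ValueError(f"unknown axiom kind: {kind}")

def satisfies_tbox(interp: Interpretation, tbox: List[Axiom]) -> bool:
    """I satisfies T iff it satisfies every axiom in T."""
    return all(satisfies(interp, axiom) for axiom in tbox)

# Hypothetical interpretation: concepts denote sets, roles denote sets of pairs.
INTERP: Interpretation = {
    "Woman": {"mary"},
    "Person": {"mary", "peter"},
    "hasChild": {("mary", "peter")},
    "hasDescendant": {("mary", "peter")},
}
TBOX: List[Axiom] = [("sub", "Woman", "Person"), ("sub", "hasChild", "hasDescendant")]

print(satisfies_tbox(INTERP, TBOX))
```

Concepts and roles are handled by the same comparison because both are interpreted as sets: of domain elements for concepts, of pairs for roles.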
Terminology (T-BOX)
●
Example
Terminology (T-BOX)
●
Definitions
● Equalities that introduce a symbol on the left-hand-side
– E.g.: Mother ≡ Woman ⨅ ∃hasChild.Person
●
Terminology
● Symbols appear on the left-hand-side at most once
●
Basic Symbols
● Appear only on the right-hand-side
●
Defined Symbols
● Appear also on the left-hand-side
●
We assume T to be acyclic
Terminology (T-BOX)
●
Example: an acyclic terminology
Expansion of T
●
Acyclic terminologies can be expanded
● By replacing defined symbols by their definitions
●
The process converges
●
The expansion T^e is unique
● Properties of T^e
– Any equality in T^e is of the form C ≡ D^e where D^e contains basic symbols only
– T^e contains the same basic and defined symbols as T
– T^e is equivalent to T
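The expansion process can be sketched as a recursive substitution over concept expressions. In this minimal illustration, expressions are nested tuples, the operator tags ("and", "or", "not", "some", "all") are an encoding chosen here, and the two-definition terminology is a hypothetical example. The terminology is assumed to be acyclic, otherwise the recursion would not terminate.

```python
from typing import Dict, Union

Concept = Union[str, tuple]   # a symbol name, or (operator, argument, ...)

def expand(concept: Concept, tbox: Dict[str, Concept]) -> Concept:
    """Replace every defined symbol in `concept` by its (expanded) definition."""
    if isinstance(concept, str):
        # Defined symbols are keys of the terminology; basic symbols are not.
        return expand(tbox[concept], tbox) if concept in tbox else concept
    op = concept[0]
    if op in ("and", "or"):
        return (op, expand(concept[1], tbox), expand(concept[2], tbox))
    if op == "not":
        return (op, expand(concept[1], tbox))
    if op in ("some", "all"):          # (op, role, filler): roles stay as they are
        return (op, concept[1], expand(concept[2], tbox))
    raise ValueError(f"unknown operator: {op}")

def expansion(tbox: Dict[str, Concept]) -> Dict[str, Concept]:
    """Build T^e: same defined symbols, right-hand sides with basic symbols only."""
    return {name: expand(definition, tbox) for name, definition in tbox.items()}

# Hypothetical acyclic terminology:
#   Woman  ≡ Person ⨅ Female
#   Mother ≡ Woman ⨅ ∃hasChild.Person
T = {
    "Woman": ("and", "Person", "Female"),
    "Mother": ("and", "Woman", ("some", "hasChild", "Person")),
}

print(expansion(T)["Mother"])
```

Each symbol appears on the left-hand side at most once, so a dictionary keyed by the defined symbol is enough to represent the terminology.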
Expansion of T
●
Example: expansion
Language of Assertions (A-BOX)
●
An A-BOX is a set of assertions of 2 kinds
● a:C assertions about concepts, a^I ∈ C^I
● (b, c):R assertions about roles, (b^I, c^I) ∈ R^I
●
a, b, c, d, … are meta-symbols for individuals
● I also provides an interpretation for the symbols of individuals
Assertions (A-BOX)
●
Example
– Mary:Mother Peter:Father
– (Mary, Peter):hasChild (Peter, Harry):hasChild
– (Mary, Paul):hasChild
●
Open World Assumption (OWA)
● Not everything is specified
●
Unique Name Assumption (UNA)
● Different symbols, different individuals
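The practical effect of the Open World Assumption can be shown on the role assertions of the example. The sketch below is a deliberate simplification: it only looks at explicitly stated assertions, ignores anything a reasoner could derive from the T-BOX, and uses `None` to stand for "unknown". The function names and the tuple encoding are hypothetical.

```python
from typing import Optional, Set, Tuple

RoleAssertion = Tuple[str, str, str]   # (role, first individual, second individual)

ABOX: Set[RoleAssertion] = {
    ("hasChild", "Mary", "Peter"),
    ("hasChild", "Peter", "Harry"),
    ("hasChild", "Mary", "Paul"),
}

def holds_closed_world(fact: RoleAssertion, abox: Set[RoleAssertion]) -> bool:
    """Closed world: whatever is not stated is taken to be false."""
    return fact in abox

def holds_open_world(fact: RoleAssertion, abox: Set[RoleAssertion]) -> Optional[bool]:
    """Open world: a stated fact is true, an unstated one is unknown (None)."""
    return True if fact in abox else None

query = ("hasChild", "Mary", "Harry")
print(holds_closed_world(query, ABOX), holds_open_world(query, ABOX))
```

Under the open world reading, the A-BOX does not allow one to conclude that Mary has exactly two children, since not everything is specified.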
DLs as Fragment of FOL
●
Assertions in Description Logics can be translated into FOL formulas
●
Through the definition of a translation function t(C, x)
● t(C, x) ↦ C(x)
that returns a FOL formula with x free variable
Translation from DL to FOL
t (C ⊑ D) ↦ ∀x . t (C, x) ⇒ t (D, x)
t (a:C) ↦ t (C, a)
t ((a, b):R) ↦ R(a, b)
t (⟙, x) ↦ true
t (⟘, x) ↦ false
t (A, x) ↦ A(x) (A atomic)
t (C ⨅ D, x) ↦ t (C, x) ⋀ t (D, x)
t (C ⨆ D, x) ↦ t (C, x) ⋁ t (D, x)
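The translation rules can be turned into a small recursive function that produces FOL formulas as strings. This is a minimal sketch covering only the cases listed above; the tuple encoding of concepts, the names "TOP" and "BOTTOM" for ⟙ and ⟘, and the helper function names are assumptions made for the illustration.

```python
from typing import Union

Concept = Union[str, tuple]   # atomic name, "TOP", "BOTTOM", or (op, left, right)

def t(concept: Concept, x: str) -> str:
    """Translate a concept into a FOL formula in which x is the free variable."""
    if concept == "TOP":
        return "true"
    if concept == "BOTTOM":
        return "false"
    if isinstance(concept, str):              # atomic concept A
        return f"{concept}({x})"
    op, left, right = concept
    if op == "and":                           # C ⨅ D
        return f"({t(left, x)} ∧ {t(right, x)})"
    if op == "or":                            # C ⨆ D
        return f"({t(left, x)} ∨ {t(right, x)})"
    raise ValueError(f"unsupported operator: {op}")

def t_inclusion(c: Concept, d: Concept) -> str:
    """C ⊑ D becomes a universally quantified implication."""
    return f"∀x. ({t(c, 'x')} ⇒ {t(d, 'x')})"

def t_concept_assertion(a: str, c: Concept) -> str:
    """a:C becomes t(C, a)."""
    return t(c, a)

def t_role_assertion(a: str, b: str, role: str) -> str:
    """(a, b):R becomes R(a, b)."""
    return f"{role}({a}, {b})"

print(t_inclusion("Mother", ("and", "Woman", "Parent")))
```

Quantified restrictions (all, some) are left out because the rules above do not cover them; adding them would require introducing fresh variables.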
Cyc (enCYClopedia)
●
A popular and quite exhaustive ontology
● Proprietary system
● Developed since 1985
– “So, the mattress in the road to AI is lack of knowledge, and the anti-mattress is knowledge. But how much does a program need to know to begin with? The annoying, inelegant, but apparently true answer is: a non-trivial fraction of consensus reality - the millions of things that we all know and that we assume everyone else knows”
(Guha & Lenat 90, p.4)
● Developed to overcome the limitations of small domains
● Aim: classifying all human knowledge
Cyc
●
Consists of a constitutional ontology and several domain-specific ontologies (called microtheories)
● Currently 100,000+ concepts, 1,000,000+ facts and axioms
– Still, not yet in its final form!
●
A subset (OpenCyc) released for free use
● Currently ~40,000 concepts and ~300,000 relationships among them
– http://www.opencyc.org
●
2 components
● Constraint language (Predicate Logics)
● CycL (Frame-based language)
Cyc
●
Categories
● Under the top level are all concepts used in facts and rules of Cyc’s KB
Cyc’s Top Level
●
Some concept descriptions (from Cyc’s documentation)
● #$Thing
– The universal set: the collection of everything!
●Each Cyc constant in the Knowledge Base is a member of such a collection
●Moreover, any collection in the Knowledge Base is a member of collection #$Thing
[Figure: Cyc’s top level, showing #$Thing, #$Individual, #$Collection, #$Situation, #$IntangibleIndividual, #$SetOrCollection, #$Intangible, #$TemporalThing and #$Relationship]
● #$Intangible
– The collection of things that are not physical – are not made of, nor encoded in, matter
●Each #$Collection is #$Intangible (albeit its instances are tangible) and such are also some #$Individual.
●Warning: do not confuse ‘tangibility’ with ‘perceivability’ – human beings may perceive light even if it is intangible
● #$Individual
– The collection of all things that are not sets or collections
●So, #$Individual includes, among other things, physical objects, temporal sub-abstractions of physical objects, numbers, relationships and groups
●An instance of #$Individual may have parts or structure (including parts which are discontinuous); but NO instance of
#$Individual may have elements or subsets
● #$IntangibleIndividual
– The collection of intangible individuals
●Its instances have no mass, volume, color, etc.
–E.g., hours, ideas, algorithms, integers, distances, etc.
●On the other hand, being a subset of #$Individual, this collection EXCLUDES sets and collections, which are elements of #$Intangible, but not of #$IntangibleIndividual
● #$TemporalThing
– The collection of things that have a particular temporal extension, things of which one might reasonably ask
“When?”
●It includes many things, such as actions, tangible objects, agreements, and abstract portions of time
●Some things are NOT instances of #$TemporalThing because they are abstract, atemporal
–E.g., a mathematical set, an integer, etc.
Cyc
●
A tremendously complex system, including both
● a part of knowledge representation and inference
● a full-fledged ontology
●
Advantages
● Size
● Inferential power
● Reasoning optimization
●
Disadvantages
● Too complex
● Ontological choices unclear
● Some failures (e.g., links with natural language)
Gr@phBRAIN
●http://193.204.187.73:8088/GraphBRAIN
Conceptualization Exercises
●
Conceptualize the following domains
● Computers (HW/SW) and other electronic devices
– From single electronic components to composite systems
● Publications
– Multimedia documents and their contents
● Food&Drinks
– With ingredients, nutrition facts, dietary restrictions, etc.
References
●
Robert Neches, Richard Fikes, Tim Finin, Thomas Gruber, Ramesh Patil, Ted Senator, and William R. Swartout: “Enabling Technology for Knowledge Sharing”, AI Magazine, Fall 1991
●