RDF Ontologies
Singer
plays
type
born
2
Person
An RDF ontology can be seen as a graph of entities
subclassOf
1935
RDF Ontologies
Singer
plays
type
born
3
Person
An RDF ontology can be seen as a graph of entities
subclassOf
classes
relations / properties
instances
literals
1935
RDF Ontologies
instances
relations / properties
classes
literals
Singer
plays
type
born
4
Person
An RDF ontology can be seen as a graph of entities
subclassOf
intelligent search, QA, machine translation...
Ontologies serve all kinds of purposes:
1935
Existing Ontologies
There are literally hundreds of ontologies on the Web
Singer
type
born
plays
5
Person
subclassOf
1935
]>
Overlapping Data
birthDate
1935
type
RockSinger
marriedTo
YAGO
ElvisPedia
born
Singer
type
6
Many ontologies contain similar or overlapping entities and facts.
plays
1935
Problem: Elvis is lonely
ElvisPedia
birthDate
type
Singer
marriedTo
type
YAGO
born
1935
RockSinger
7
plays
Who is the spouse
of the guitar player?
?
1935
marriedTo
Problem: Elvis is lonely
ElvisPedia
birthDate
type
Singer
marriedTo
type
YAGO
born
1935
RockSinger
8
plays
Who is the spouse
of the guitar player?
?
1935
marriedTo
Complementary
information
cannot be used
Solution: Unify Entities
ElvisPedia
birthDate
type
Singer
marriedTo
type
YAGO
born
1935
RockSinger
9
plays
1935
marriedTo
identical
Goal: Merging Ontologies
ElvisPedia
birthDate
type
Singer
marriedTo
type
YAGO
born
1935
RockSinger
10
plays
To merge two ontologies, we have to identify
1935
Goal: Merging Ontologies
ElvisPedia
birthDate
type
Singer
marriedTo
type
YAGO
born
1935
RockSinger
11
To merge two ontologies, we have to identify
• equivalent instances
plays
sameAs
1935
Goal: Merging Ontologies
subClassOf
ElvisPedia
birthDate
type
Singer
marriedTo
type
YAGO
born
1935
RockSinger
12
To merge two ontologies, we have to identify
• equivalent instances
• equivalent or subsuming classes
sameAs
plays
1935
Goal: Merging Ontologies
sameAs
subClassOf
ElvisPedia
birthDate
type
Singer
marriedTo
type
YAGO
1935
born
1935
RockSinger
13
To merge two ontologies, we have to identify
• equivalent instances
• equivalent or subsuming classes
• equivalent or subsuming relations
subPropertyOf
plays
Previous work
Previous work
• uses hard logical constraints, which may be inadequate
• requires parameter tuning
• has not been tried on large ontologies
birthDate
marriedTo
1935
RockSinger
type
14
type
Singer
born
1935
[Gracia 2009, Jean-Mary 2009, Isaac 2007,
Aumueller 2005, Wang 2008, Noessner 2010,
Sais 2007/2009, Arasu 2009, Volz 2009,
Bhattacharya 2007, Hogan 2007/2010,
Hu 2011, Li 2009, Udrea 2007, and more]
Previous work
Previous work
• uses hard logical constraints, which may be inadequate
• requires parameter tuning
• has not been tried on large ontologies
birthDate
marriedTo
1935
RockSinger
type
15
type
Singer
born
(1)
• has mostly focused on
(1) instance matching or
(2) schema alignment,
... but not both
(2)
(1)
(2)
1935
[Gracia 2009, Jean-Mary 2009, Isaac 2007,
Aumueller 2005, Wang 2008, Noessner 2010,
Sais 2007/2009, Arasu 2009, Volz 2009,
Bhattacharya 2007, Hogan 2007/2010,
Hu 2011, Li 2009, Udrea 2007, and more]
PARIS: Aligning Everything At Once
born
There is a synergy between equality of instances, properties and classes!
=> Compute all together!
birthDate
marriedTo
1935
type
RockSinger
type
Singer
16
PARIS
1935
Probabilistic Model
We chose a probabilistic model
marriedTo
1935
type
RockSinger
born
type
birthDate
Singer
17
the probability that
is a subclass of
the probability that
=
the probability that
is a sub-property of
1935
Probabilistic Model
18
...
All possible
alignments
between the
ontologies
...
Worlds:
Probabilistic Model
19
...
All possible
alignments
between the
ontologies
...
Probabilities:
...
Worlds:
Probabilistic Model
20
Events:
We care here mainly about equality and subsumption events.
Each event can be true or false in a particular world.
...
...
Probabilities:
...
Worlds:
Probabilistic Model
21
Marginals:
The marginal probability of an event
is given by the sum of the probabilities
of the worlds where the event holds.
...
...
Probabilities:
...
Worlds:
23
We do not
care about
these values.
We only care
about these
marginals!
(it is the product measure)
We are interested in marginals that fulfill certain properties.
For any set of marginals, there exists a probability distribution.
Probabilistic Model
Marginals:
...
...
Probabilities:
...
Worlds:
Literals
birthDate
RockSinger
type
1935
Singer
type
born
marriedTo
25
?
1936
The probability that two literals are equal
shall reflect the likelyhood that the two literals
are intended to refer to the same thing.
Literals
birthDate
RockSinger
type
1935
Singer
type
born
marriedTo
26
?
1936
The probability that two literals are equal
shall reflect the likelyhood that the two literals
are intended to refer to the same thing.
• for strings:
string distance
• for numbers:
numeric distance
• for other literals:
domain-specific
Equality of Instances
For instances, relations give a hint:
Elvis Presley
label
label
born
1935
born
1935
Elvis Presley
marriedTo
?
28
Equality of Instances
For instances, relations give a hint:
Elvis Presley
label
label
born
1935
born
1935
Elvis Presley
marriedTo
?
29
Not many people are
called Elvis =>
highly indicative
Equality of Instances
For instances, relations give a hint:
Not many people are
called Elvis =>
highly indicative
Elvis Presley
label
label
born
1935
born
1935
Elvis Presley
marriedTo
?
30
Many people are born
in 1935 =>
less indicative
Not many people are
called Elvis =>
highly indicative
Many people are born
in 1935 =>
less indicative
Elvis Presley
label
label
born
1935
born
1935
Elvis Presley
marriedTo
?
31
The local inverse functionality of a relation with an argument is
one over the number of incoming links:
Local Inverse Functionality
Elvis Presley
1935
marriedTo
label
1935
born
label
born
Elvis Presley
Only one person
is called Elvis
?
32
The local inverse functionality of a relation with an argument is
one over the number of incoming links:
Local Inverse Functionality
Elvis Presley
1935
marriedTo
label
1935
born
label
born
Elvis Presley
Only one person
is called Elvis
?
33
10 people are born
in 1935
The local inverse functionality of a relation with an argument is
one over the number of incoming links:
Local Inverse Functionality
Probability of Inverse Functionality
Elvis Presley
Elvis Presley
marriedTo
label
born
1935
1935
label
born
?
34
Many people share
their year of birth
Few people share
the same name
We define the probability of a relation being inverse functional
as the harmonic mean of the local inverse functionalities:
Equality of Instances
marriedTo
Elvis Presley
Elvis Presley
label
label
born
born
1935
1935
?
35
and
shall be matched if
Equality of Instances
marriedTo
Elvis Presley
Elvis Presley
label
label
born
born
1935
1935
?
36
Equality of Instances
marriedTo
Elvis Presley
Elvis Presley
label
label
born
born
1935
1935
?
37
Equality of Instances
marriedTo
Elvis Presley
Elvis Presley
label
label
born
born
1935
1935
?
38
This evaluates to 1 iff
• There is at least one
highly inverse functional r
• There is one shared
argument
Equality of Instances
• Literals: precomputed
• Others: recursive
label
marriedTo
1935
1935
born
born
label
Elvis Presley
Elvis Presley
?
39
Equality of Classes
Singer
type
RockSinger
If all instances of one class are instances of the other
then the former subsumes the latter
40
type
type
41
subclassOf
type
Singer
RockSinger
Equality of Classes
If all instances of one class are instances of the other
then the former subsumes the latter
type
subclassOf
42
type
Singer
RockSinger
Equality of Classes
If all instances of one class are instances of the other
then the former subsumes the latter
type
subclassOf
43
type
Singer
RockSinger
Equality of Classes
If all instances of one class are instances of the other
then the former subsumes the latter
type
subclassOf
44
type
Singer
RockSinger
Equality of Classes
If all instances of one class are instances of the other
then the former subsumes the latter
Equality of Relations
knows
marriedTo
knows
If every pair of one relation is a pair of another relation,
then the first is a sub-property of the second.
45
Equality of Relations
knows
marriedTo
knows
If every pair of one relation is a pair of another relation,
then the first is a sub-property of the second.
46
marriedTo
subpropertyOf
knows
Equality of Relations
knows
marriedTo
knows
If every pair of one relation is a pair of another relation,
then the first is a sub-property of the second.
47
knows
subpropertyOf
marriedTo
Can be solved analogously to the
subsumption of classes.
Algorithm
1. Fix equalities for literals
48
Literals:
Instances:
Relations:
1. Fix equalities for literals
2. Set equalities for relations to a small initial value
3. Iterate the estimations for relations and instances to convergence (*)
(*) We have no proof for convergence, but it seems to happen
50
Iterate
Algorithm
Literals:
Algorithm
Iterate
1. Fix equalities for literals
2. Set equalities for relations to a small initial value
3. Iterate the estimations for relations and instances to convergence (*)
4. Compute the estimations for classes
(*) We have no proof for convergence, but it seems to happen
51
final
step
Classes:
Instances:
Relations: