The Need to Move beyond Triples
Fabian Suchanek
(Vision paper)
Amazing! This talk is free
of the Corona virus!
(about the speaker, we don’t know...)
Cool knowledge‐based applications
Apple Siri: “When was Elvis born?” “1935”
IBM Watson: Discovered 6 kinase proteins that relate to cancer
Amazon Echo: “How long was the Thirty Years’ War?”
These applications feed on knowledge bases.
There are plenty of knowledge bases
NELL
TextRunner
Plus industrial projects at
Sponsored message: New version of YAGO at http://yago-knowledge.org
What’s in a knowledge base?
From YAGO
Essentially binary facts (“triples”) in the knowledge format “RDF”:
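As a rough illustration (this is neither YAGO data nor RDF syntax), binary facts can be pictured as subject-predicate-object tuples. The entity and relation names below are invented for the example; only the birth year is taken from the Siri example earlier in the talk.

```python
# A toy triple store: each fact is a (subject, predicate, object) tuple.
# "Elvis", "type", and "bornInYear" are illustrative names, not YAGO identifiers.
triples = {
    ("Elvis", "type", "Singer"),
    ("Elvis", "bornInYear", "1935"),
}

def objects(subject, predicate):
    """Return all objects o such that (subject, predicate, o) is a stored fact."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}

print(objects("Elvis", "bornInYear"))
```

Answering “When was Elvis born?” then amounts to a lookup with a fixed subject and predicate.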
What’s in the real world?
In February 1998, Andrew Wakefield published a paper in the medical
journal The Lancet, which reported on twelve children with
developmental disorders. The parents were said to have linked the start
of behavioral symptoms to vaccination. The resulting controversy
became the biggest science story of 2002. As a result, vaccination rates
dropped sharply. In 2011, the BMJ detailed how Wakefield had faked
some of the data behind the 1998 Lancet article.
Beliefs
Claims
Events
Reasons
Stories
Falsifications
...none of which is in a knowledge base!
The vision of this paper:
“The Need to Move Beyond Triples”
If we want tomorrow’s intelligent applications to be really intelligent,
we have to extend their knowledge bases by beliefs, claims, events,
reasons, stories, and falsifications.
1) We have to be able to extract complex knowledge from text (“IE”)
2) We have to be able to represent such knowledge and to reason on it
IE: What is possible already
Several cool approaches can extract non‐binary information:
- FRED
- K-Parser
- Document spanners
- ClausIE
- StuffIE
- OpenIE
- HighLife
- Classical slot fillers
Example: “Andrew Wakefield published in The Lancet in 1998.”
yields a Publication_event frame with the slots author, venue, and time.
(>50% of the vision paper
is discussion of related work)
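To make the non-binary example concrete, here is a minimal sketch of what such an extracted frame could look like as a data structure. The class name, the helper `to_triples`, and the event identifier `e1` are assumptions for illustration and do not reflect the output format of any of the listed systems.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PublicationEvent:
    """One n-ary fact: a frame with named slots instead of a single triple."""
    author: str
    venue: str
    time: int

event = PublicationEvent(author="Andrew Wakefield", venue="The Lancet", time=1998)

def to_triples(event_id, frame):
    """Flatten a frame into binary facts that all hang off one event node."""
    return {(event_id, slot, value) for slot, value in vars(frame).items()}

for triple in sorted(to_triples("e1", event), key=str):
    print(triple)
```

Flattening shows why a single triple is not enough here: the three slots only make sense together, so they have to share an event node.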
IE: What we need
“Wakefield published a paper that reported on children. Their parents
were said to have linked the start of behavioral symptoms to vaccination.
The resulting controversy caused vaccination rates to fall. ...”
[Figure: graph of linked frames for the text above. Node labels: Publication, RateChange, Claim, Link, Wakefield, paper, symptoms, children, parents, vaccination, vaccinationRate, -. Edge labels: author, pub., content, about, by, of, direction, caused.]
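A small sketch of the target structure, restricted to the part of the figure that can be read off its labels with some confidence: a Publication frame that caused a RateChange frame. The identifiers `f1` and `f2`, the dictionary encoding, and the helper `describe` are assumptions for this example; the Claim and Link frames are left out.

```python
# Frames are stored by identifier so that one frame can point to another.
# Frame types and slot names follow the figure labels; everything else is assumed.
frames = {
    "f1": {"type": "Publication", "author": "Wakefield", "pub.": "paper"},
    "f2": {"type": "RateChange", "of": "vaccinationRate", "direction": "-"},
}

# Relationships between events are facts whose arguments are frame identifiers.
links = [("f1", "caused", "f2")]

def describe(frame_id):
    """Render a frame as Type(slot=value, ...) for display."""
    frame = frames[frame_id]
    slots = ", ".join(f"{k}={v}" for k, v in frame.items() if k != "type")
    return f"{frame['type']}({slots})"

for source, relation, target in links:
    print(describe(source), relation, describe(target))
```

The point of the sketch is that the arguments of `caused` are whole frames rather than entities, which is what goes beyond a flat set of triples.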
You know a system
that can do (part of) it?
Please let me know!
Type here: ____________
Cross‐sentence analysis, advanced co‐reference resolution,
standardized types of frames, relationships between events,
negation, hypothetical stances, storylines, ...
Reasoning: What we have
[Figure: Publication (author: Wakefield, pub.: paper) caused RateChange (of: vaccinationRate, direction: -)]
As knowledge representation:
- Frames, JSON
- complex objects
- object-relational databases
great, but do not allow for reasoning
- “If X caused Y and Y caused Z, then X caused Z”
- “If X did not publish a paper, X is not a scientist”
- “If Mary believes what Paul says & Paul says X, then Mary believes X”
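The first rule above can be sketched as a tiny forward-chaining loop. This is only an illustration of what “reasoning” means here; a plain frame or JSON store has no such mechanism built in. The fact names are made up for the example.

```python
def transitive_closure(caused):
    """Apply 'X caused Y and Y caused Z => X caused Z' until nothing new appears."""
    closure = set(caused)
    while True:
        derived = {(x, z) for (x, y1) in closure for (y2, z) in closure if y1 == y2}
        if derived <= closure:
            return closure
        closure |= derived

# Hypothetical causal facts, named loosely after the Wakefield story.
facts = {("publication", "controversy"), ("controversy", "rateChange")}
print(("publication", "rateChange") in transitive_closure(facts))  # True
```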
Reasoning: What we have
[Figure: Publication (author: Wakefield, pub.: paper) caused RateChange (of: vaccinationRate, direction: -)]
For reasoning:
- RDFS, OWL DL, SHACL
- Description Logic
great, but do not allow for statements about statements
- “The paper says that vaccines cause autism”
- “Fact A caused Fact B”
Reasoning: What we have
[Figure: Publication (author: Wakefield, pub.: paper) caused RateChange (of: vaccinationRate, direction: -)]
Annotated Knowledge Representations:
- Fact identifiers
- RDF*
- Reification
cannot deal with hypothetical statements
- “Mary believes that vaccines cause autism”
cannot do reasoning
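The limitation with hypothetical statements can be pictured with a naive fact-identifier store. This is a simplified sketch under the assumption that every stored fact counts as asserted; it is not the exact semantics of RDF* or of RDF reification, and the identifiers `#1` and `#2` are invented.

```python
# Every fact gets an identifier, so other facts can use it as an argument.
facts = {
    "#1": ("vaccines", "cause", "autism"),
    "#2": ("Mary", "believes", "#1"),
}

def asserted_triples():
    """In this naive encoding, every stored fact counts as asserted."""
    return set(facts.values())

# The believed statement shows up as a plain fact of the knowledge base,
# although only Mary's belief in it was meant to be recorded.
print(("vaccines", "cause", "autism") in asserted_triples())  # True
```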
Reasoning: What we have
Big logic machinery:
- Context logics
- Modal logics
- Epistemic logics
cannot quantify over contexts
(or if they can, they are propositional logics or undecidable)
- “All clients believe that the company delivers a good service”
- “The loss of value on the stock market happened because the
public learned of a fraudulent activity by the company”
Formal argumentation has monolithic propositions.
Belief revision has monolithic agents.
Provenance and annotated logics cannot make claims about annotations.
Vagueness, fuzziness, and probability are orthogonal topics.
Reasoning: What we need
1) a very simple logic inside a context
2) a very simple logic about contexts
=> a moderately simple logic in combination
Candidates (?): First‐order logic without ?, OWL EL?, Datalog?, Horn Rules?
You have a great idea? Let me know!
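The two layers can be pictured with a toy example. This is not the logic the talk is asking for, which remains open; it is only a sketch in which contexts are named sets of facts and a single rule about contexts (“Mary believes what Paul says”) copies facts between them. The context names and the `adopts` relation are assumptions.

```python
# Each context is a named set of facts; nothing inside a context is asserted globally.
contexts = {
    "says(Paul)": {("vaccines", "cause", "autism")},
    "believes(Mary)": set(),
}

# Facts about contexts: the target context adopts everything in the source context.
adopts = [("believes(Mary)", "says(Paul)")]

def propagate(contexts, adopts):
    """Copy facts along 'adopts' links until no context changes."""
    result = {name: set(facts) for name, facts in contexts.items()}
    changed = True
    while changed:
        changed = False
        for target, source in adopts:
            new = result[source] - result[target]
            if new:
                result[target] |= new
                changed = True
    return result

closed = propagate(contexts, adopts)
print(closed["believes(Mary)"])
```

The claim ends up in Mary's belief context without ever being stated as a fact of the knowledge base itself.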
Applications
• Analysis of fake news / fact checking:
understand an article about a controversial topic, allow reasoning
(who said what when and why, what is the evidence, ...)
• Analysis of the e-reputation of a company:
extract controversy or beliefs with reasons and supporters,
for companies or their products
• Modeling of controversies:
detect a controversial topic on the Web (in blogs, forums, Twitter),
extract opinions, and model different views
Understanding the arguments of the other side
is a prerequisite for refuting them.
Applications
• Flagging of potentially fraudulent activity:
Detect claims that contradict knowledge, or violate rules.
• Modeling of processes:
Model sequences of actions, causal relationships, and suggestions.
• Smarter chatbots:
Allow dialogues that go beyond single-shot questions.
• Legal text understanding:
Analyze a law, a regulation, or a contract, and derive
what is permitted and what is obligatory for which party.
Our project “NoRDF”
Our project “NoRDF” aims to extract and model complex information
from natural language text. We are supported by the French National
Research Agency and 4 sponsors.
We are hiring
- PhD students
- postdocs
- engineers
https://suchanek.name/work/research/nordf/
Backup Slides
Reasoning: What we have
[Figure: Publication (author: Wakefield, pub.: paper) caused RateChange (of: vaccinationRate, direction: -)]
Annotated Reasoning:
- Provenance formalisms
- Annotated logics
cannot make statements about annotations
- “Mary believes that Fact A holds because of Fact B”
- “Fact A precedes Fact B”