CC-BY
Fabian M. Suchanek
Fact Extraction
74
Semantic IE
2
(pipeline figure: Source Selection and Preparation → Entity Recognition →
Entity Disambiguation → Entity Typing → Fact Extraction → KB construction;
running example: “singer Elvis”. You are here: Fact Extraction.)
3
Fact Extraction
•
Definitions
•
without background knowledge
•
by extraction patterns
•
by Large Language Models
•
by Natural Language Inference
•
with background knowledge
•
by the DIPRE Algorithm
•
by classification
•
General Considerations
•
Semantic Representations
Def: Open Information Extraction
4
Open Information Extraction
(Open IE) extracts a triple of a subject, verb,
and object from a sentence without canonicalizing these components.
It has been said that man is a rational animal. All my life I
have been searching for evidence that could support this.
[Bertrand Russell]
[Anefo]
Wikipedia: Russell
Def: Open Information Extraction
5
〈 man, is, rational animal〉
Open Information Extraction
(Open IE) extracts a triple of a subject, verb,
and object from a sentence without canonicalizing these components.
It has been said that man is a rational animal. All my life I
have been searching for evidence that could support this.
[Bertrand Russell]
[Anefo]
Wikipedia: Russell
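As a toy illustration of the definition (real Open IE systems use parsing and far richer extraction rules; this sketch and its verb list are purely hypothetical):

```python
import re

# Toy Open IE sketch: split a sentence into an uncanonicalized
# <subject, verb, object> triple around a small, hard-coded verb list.
TRIPLE = re.compile(
    r"^(?P<subj>.+?)\s+(?P<verb>is|was|are|were|wrote|built)\s+(?P<obj>.+?)[.!?]?$"
)

def naive_open_ie(sentence):
    """Return (subject, verb, object) as surface strings, or None."""
    m = TRIPLE.match(sentence.strip())
    return (m.group("subj"), m.group("verb"), m.group("obj")) if m else None

print(naive_open_ie("Man is a rational animal."))
```

Note that the components stay uncanonicalized surface strings, exactly as the definition requires.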
Open IE Uses
6
Example:
“Who built the pyramids?”
〈 ?, built, pyramids 〉
62 answers from 584 sentences
Egyptians (132)
Ancient Egypt (123)
aliens (44)
the people (38)
slaves (29)
Khufu (23)
the Pharaohs (17)
the men (16)
the kings (11)
the ones (9)
The question (8)
In many applications, Open IE
is fully sufficient (e.g., natural
language question answering).
Open IE Weaknesses
7
Example:
“Who built the pyramids?”
〈 ?, built, pyramids 〉
62 answers from 584 sentences
Egyptians (132)
Ancient Egypt (123)
aliens (44)
the people (38)
slaves (29)
Khufu (23)
the Pharaohs (17)
the men (16)
the kings (11)
the ones (9)
The question (8)
Open IE is less useful for tasks such as
• counting facts
• counting entities
• logical reasoning
• fact checking
• complex queries with joins
Def: Canonical knowledge base
8
62 answers from 584 sentences
Egyptians (132)
Ancient Egypt (123)
aliens (44)
the people (38)
slaves (29)
Khufu (23)
the Pharaohs (17)
the men (16)
the kings (11)
the ones (9)
The question (8)
<Ancient_Egypt, built, Giza_pyramids>
<Great_Pyramid_of_Giza, partOf, Giza_pyramids>
<Pyramid_of_Khafre, partOf, Giza_pyramids>
<Pyramid_of_Menkaure, partOf, Giza_pyramids>
canonicalization
In a
canonical knowledge base
, every entity and every relation exists exactly once,
and has a unique identifier.
9
Def: Fact Extraction
Fact extraction
is the extraction of canonicalized facts about entities from a corpus.
Bertrand Russell was a British philosopher, mathematician,
political activist, and Nobel Laureate. Russell co‐wrote the
“Principia Mathematica”.
[via TheFamousPeople]
10
Def: Fact Extraction
<Bertrand_Russell, type, philosopher>
<Bertrand_Russell, type, mathematician>
<Bertrand_Russell, type, political_activist>
<Bertrand_Russell, won, Nobel_Prize>
<Bertrand_Russell, authored, Principia_Mathematica>
canonicalized facts
with canonicalized relations
Fact extraction
is the extraction of canonicalized facts about entities from a corpus.
[via TheFamousPeople]
Bertrand Russell was a British philosopher, mathematician,
political activist, and Nobel Laureate. Russell co‐wrote the
“Principia Mathematica”.
11
Def: Relation Classification & Relation Extraction
Russell co‐wrote the “Principia Mathematica” ...
Fact extraction can be further broken down into
•
Relation classification
: determining the canonicalized relation between two given entities.
•
Relation extraction
: extracting facts with a canonicalized relation and uncanonicalized entities
Disambiguation
Russell co‐wrote the “Principia Mathematica” with his colleague...
NERC
Relation classification
<Russell, authored, Principia Mathematica>
<Bertrand_Russell, authored, Principia_Mathematica_(book)>
Relation
extraction
Fact
extraction
12
Def: Relation Classification & Relation Extraction
Russell co‐wrote the “Principia Mathematica” ...
Fact extraction can be further broken down into
•
Relation classification
: determining the canonicalized relation between two given entities.
•
Relation extraction
: extracting facts with a canonicalized relation and uncanonicalized entities
Disambiguation can happen before or after relation classification.
Relation classification
Russell co‐wrote the “Principia Mathematica” with his colleague...
NERC
Disambiguation
<Bertrand_Russell, authored, Principia_Mathematica_(book)>
Fact
extraction
<Bertrand_Russell> co‐wrote the <Principia_Mathematica_(book)>
“classical”
way of
doing it
Lab!
13
Fact Extraction
•
Definitions
•
without background knowledge
•
by extraction patterns
•
by Large Language Models
•
by Natural Language Inference
•
with background knowledge
•
by the DIPRE Algorithm
•
by classification
•
General Considerations
•
Semantic Representations
Def: Extraction Pattern
14
An
extraction pattern
for a binary relation r is a phrase that contains two
place‐holders X and Y, and that indicates that X and Y stand in relation r .
X is the author of Y .
X wrote Y .
In her book Y , X writes
X ’s observations in his book Y .
Extraction patterns for wrote(X , Y ):
Where do they come from?
-> manual work
(we’ll later see how to get them automatically)
Def: Pattern Application
Given a corpus and an extraction pattern,
pattern application
is the process of applying NERC,
finding the pattern in the corpus and extracting the corresponding relations.
15
X is the author of Y .
X wrote Y .
In her book Y , X writes
X ’s observations in his book Y .
Extraction patterns for wrote(X , Y ):
I value Russell’s observations in his book “Principia Mathematica”.
+
Corpus
Def: Pattern Application
Given a corpus and an extraction pattern,
pattern application
is the process of applying NERC,
finding the pattern in the corpus and extracting the corresponding relations.
16
X is the author of Y .
X wrote Y .
In her book Y , X writes
X ’s observations in his book Y .
Extraction patterns for wrote(X , Y ):
I value Russell’s observations in his book “Principia Mathematica”.
+
NERC’d
Corpus
Def: Pattern Application
Given a corpus and an extraction pattern,
pattern application
is the process of applying NERC,
finding the pattern in the corpus and extracting the corresponding relations.
17
X is the author of Y .
X wrote Y .
In her book Y , X writes
X ’s observations in his book Y .
Extraction patterns for wrote(X , Y ):
+
NERC’d
Corpus
=
<Russell, wrote, Principia Mathematica>
I value Russell’s observations in his book “Principia Mathematica”.
Pattern Application
Pattern application is a poor man’s way to extract relations.
18
X is the author of Y .
Extraction patterns for wrote(X , Y ):
Advantages:
• very simple
• no heavy machinery required
• can be explained and debugged
Disadvantages:
• requires manual work
• does not perform well
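The definitions above can be sketched in a few lines. This is a minimal, assumption-laden version: entities are taken to be pre-marked by NERC with angle brackets, and the helper names are hypothetical.

```python
import re

# Pattern application sketch: turn an extraction pattern with
# placeholders X and Y into a regex, where X and Y must match
# NERC-recognized entities (marked here as <...>).
def pattern_to_regex(pattern):
    escaped = re.escape(pattern)
    return re.compile(escaped.replace("X", r"<(?P<x>[^>]+)>")
                             .replace("Y", r"<(?P<y>[^>]+)>"))

def apply_pattern(pattern, relation, corpus):
    """Find the pattern in every sentence and emit the matching facts."""
    regex = pattern_to_regex(pattern)
    facts = []
    for sentence in corpus:
        for m in regex.finditer(sentence):
            facts.append((m.group("x"), relation, m.group("y")))
    return facts

corpus = ["I value <Russell>'s observations in his book <Principia Mathematica>."]
facts = apply_pattern("X's observations in his book Y", "wrote", corpus)
```

Applied to the slide’s example sentence, this yields the fact `(Russell, wrote, Principia Mathematica)`.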
19
Fact Extraction
•
Definitions
•
without background knowledge
•
by extraction patterns
•
by Large Language Models
•
by Natural Language Inference
•
with background knowledge
•
by the DIPRE Algorithm
•
by classification
•
General Considerations
•
Semantic Representations
20
Relation extraction by LLMs
Generative language models can perform relation extraction in a zero‐shot setting.
<Bertrand Russell, Work For, City College of New York>
Wikipedia: Russell
Bertrand Russell was hired by the City College of New York in 1940, but was removed from the
position because a bishop took umbrage with Russell’s writings on sexual liberty.
List the entities of the types [LOCATION, ORGANIZATION, PERSON]
and relations of types [Organization Based In, Work For, Located In, Live In, Criticize]
among the entities in the given text.
[Wadhwa: “Revisiting Relation Extraction in the era of Large Language Models”, ACL 2023]
Lab!
21
Different techniques are being tried out for relation extraction, including
•
prompt engineering to get the model to extract only facts
Bertrand Russell wrote in his book “Justice in War‐Time”:
No nation was ever so virtuous as each believes itself,
and none was ever so wicked as each believes the other.
List all authors with their books:
[Wadhwa: “Revisiting Relation Extraction in the era of Large Language Models”, ACL 2023]
[Quotes by Russell]
Bertrand Russell wrote “Justice in War‐Time”
Books by Bertrand Russell include “Justice in War‐Time”
Bertrand Russell: “Justice in War‐Time”
Need fact extraction
to get the triples!
=> find a prompt that
generates succinct answers
Relation extraction by LLMs
Lab!
22
Different techniques are being tried out for relation extraction, including
•
prompt engineering to get the model to extract only facts
Bertrand Russell wrote in his book “Justice in War‐Time”:
No nation was ever so virtuous as each believes itself,
and none was ever so wicked as each believes the other.
List all authors with their books
in JSON format
:
[Wadhwa: “Revisiting Relation Extraction in the era of Large Language Models”, ACL 2023]
[Quotes by Russell]
{ author: “Bertrand Russell”, book: “Justice in War Time” }
Relation extraction by LLMs
Lab!
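Even with a JSON-producing prompt, the model’s answer is still text; fact extraction needs a parsing step. A sketch, under the (hypothetical) assumption that the model emits one well-formed JSON object per line with "author" and "book" keys; real model output needs more robust handling:

```python
import json

# Parse a JSON-per-line LLM answer into triples.
# The key names and the fixed relation are illustrative assumptions.
def parse_llm_answer(answer, relation="wrote"):
    facts = []
    for line in answer.splitlines():
        line = line.strip()
        if line:
            record = json.loads(line)
            facts.append((record["author"], relation, record["book"]))
    return facts

answer = '{"author": "Bertrand Russell", "book": "Justice in War-Time"}'
facts = parse_llm_answer(answer)
```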
23
Different techniques are being tried out for relation extraction, including
•
prompt engineering to get the model to extract only facts
•
manually adding examples to cover all relations and entity types
“we constructed our prompt manually to contain at least one example of each entity and
each relation type”
“we create a prompt with only 20
exemplars capturing all entity and relation types.”
[Wadhwa: “Revisiting Relation Extraction in the era of Large Language Models”, ACL 2023]
Relation extraction by LLMs
Lab!
24
Text: Edward Marks, an official with the ITAR explained their position…
Triplets: [Edward Marks:PER, work_for, ITAR:ORG]
Explanation: Edward Marks is an official with the ITAR, therefore it can be concluded that he
works for ITAR.
Text: At the University of California, Bertrand Russell...
[Wadhwa: “Revisiting Relation Extraction in the era of Large Language Models”, ACL 2023]
Different techniques are being tried out for relation extraction, including
•
prompt engineering to get the model to extract only facts
•
manually adding examples to cover all relations and entity types
•
extracting entities with their NERC classes in parallel
Relation extraction by LLMs
Lab!
25
His book “A History of Western Philosophy” became a best-seller
and provided Russell with a steady income for the remainder of his life.
Extract the relation between Russell and “A history [...]”.
[Permitted answers: wasBornIn, married, wroteBook, ...]
Different techniques are being tried out for relation extraction, including
•
prompt engineering to get the model to extract only facts
•
manually adding examples to cover all relations and entity types
•
extracting entities with their NERC classes in parallel
•
using constrained decoding
Relation extraction by LLMs
Lab!
26
[Wadhwa: “Revisiting Relation Extraction in the era of Large Language Models”, ACL 2023]
Advantages:
•
Works out of the box without much technical knowledge
•
Does not need training data
Disadvantages
•
Needs heavy machinery
•
GPT-3 still invents a number of relations
•
It is not clear how LLMs perform on less common entity types and relations, such as
proteins, chemical compounds, electrical parts, fictional characters,...
killed, Killing, assassinates, assassination, Killed_By, Assassin, Shot_By
is_part_of, isPartOf,
Works_at, Works_for, WorkedFor, Worked_For,
Summer, Piano, Bank, Aircraft, Sex, ...
Relation extraction by LLMs
27
Fact Extraction
•
Definitions
•
without background knowledge
•
by extraction patterns
•
by Large Language Models
•
by Natural Language Inference
•
with background knowledge
•
by the DIPRE Algorithm
•
by classification
•
General Considerations
•
Semantic Representations
28
Natural Language Inference
(NLI, also: textual entailment, TE) is the task of determining
whether one sentence logically entails another one, contradicts it, or is neutral towards it.
Bertrand Russell wrote a book.
Bertrand Russell is an author.
Def: Natural Language Inference
Bertrand Russell wrote a book.
Bertrand Russell is British.
Bertrand Russell wrote a book.
Bertrand Russell cannot read and write.
=> Entailment
=> Neutral
=> Contradiction
NLI can be done
off‐the‐shelf by RoBERTa
and larger models
29
NLI can be used for relation extraction as follows:
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
(Below the proof, he wrote: The above proposition can be occasionally useful.)
Relation extraction by NLI
30
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
NERC
1. Perform NERC
NLI can be used for relation extraction as follows:
Relation extraction by NLI
31
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
NERC
1. Perform NERC
Bertrand Russell was born in “Principia Mathematica”
2. For each relation, build a sentence with the entities from the input:
Bertrand Russell wrote “Principia Mathematica”
...
NLI can be used for relation extraction as follows:
Relation extraction by NLI
32
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
NERC
Bertrand Russell was born in “Principia Mathematica”
2. For each relation, build a sentence with the entities from the input:
Bertrand Russell wrote “Principia Mathematica”
1. Perform NERC
...
3. Check NLI for each sentence
NLI
Contradict
NLI can be used for relation extraction as follows:
Relation extraction by NLI
33
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
NERC
Bertrand Russell was born in “Principia Mathematica”
2. For each relation, build a sentence with the entities from the input:
Bertrand Russell wrote “Principia Mathematica”
1. Perform NERC
...
NLI
Entailment
3. Check NLI for each sentence
NLI can be used for relation extraction as follows:
Relation extraction by NLI
34
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
NERC
Bertrand Russell was born in “Principia Mathematica”
2. For each relation, build a sentence with the entities from the input:
Bertrand Russell wrote “Principia Mathematica”
1. Perform NERC
...
NLI
Entailment
3. Check NLI for each sentence
4. For the entailed relation, output a fact
<Bertrand Russell, wrote, Principia Mathematica>
[Cabot: “REBEL: Relation Extraction By End-to-end Language generation”, EMNLP 2021]
NLI can be used for relation extraction as follows:
Relation extraction by NLI
35
In his book “Principia Mathematica”, Bertrand Russell proves that 1+1=2.
Bertrand Russell wrote “Principia Mathematica”
entails
<Bertrand Russell, wrote, Principia Mathematica>
Advantages:
• very simple approach
• does not need training data
• does not need a very powerful model
Disadvantages:
•
NLI does not always work as intended
•
NLI still requires a reasonably large model
•
NLI may not work well for domain‐specific
entities and relations
NLI can be used for relation extraction as follows:
Relation extraction by NLI
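The four steps above can be sketched as follows. The NLI model here is a toy stand-in based on cue words; in practice one would call an off-the-shelf model (e.g. a RoBERTa fine-tuned for NLI), and the relation templates are illustrative.

```python
# Relation extraction by NLI (steps 1-4 above), with a toy NLI model.
TEMPLATES = {
    "wrote": "{e1} wrote {e2}",
    "wasBornIn": "{e1} was born in {e2}",
}

def toy_nli(premise, hypothesis):
    """Toy stand-in for an NLI model: entailment iff the hypothesis'
    verb phrase has a cue word in the premise."""
    cues = {"wrote": {"wrote", "writes", "book", "author"},
            "was born in": {"born"}}
    premise_words = set(premise.lower().split())
    for phrase, cue_words in cues.items():
        if phrase in hypothesis:
            return "entailment" if cue_words & premise_words else "neutral"
    return "neutral"

def extract_by_nli(sentence, e1, e2, nli=toy_nli):
    """Build one hypothesis per relation, keep the entailed ones."""
    return [(e1, rel, e2)
            for rel, template in TEMPLATES.items()
            if nli(sentence, template.format(e1=e1, e2=e2)) == "entailment"]

sent = 'In his book "Principia Mathematica", Bertrand Russell proves that 1+1=2.'
facts = extract_by_nli(sent, "Bertrand Russell", "Principia Mathematica")
```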
36
Fact Extraction
•
Definitions
•
without background knowledge
•
by extraction patterns
•
by Large Language Models
•
by Natural Language Inference
•
with background knowledge
•
by the DIPRE Algorithm
•
by classification
•
General Considerations
•
Semantic Representations
Def: Pattern Deduction
Given a corpus, and given a KB,
pattern deduction
is the process of finding extraction patterns
that produce facts of the KB when applied to the corpus.
37
Corpus
Rowling
KB
wrote
“Harry Potter”
J. K. Rowling schrieb “Harry Potter”,
ein Buch welches Alt und Jung fasziniert.
(German: “J. K. Rowling wrote ‘Harry Potter’, a book that fascinates young and old.”)
“X schrieb Y ” is a pattern for wrote(X ,Y )
Def: Pattern Deduction
Given a corpus, and given a KB,
pattern deduction
is the process of finding extraction patterns
that produce facts of the KB when applied to the corpus.
38
Corpus
Rowling
KB
wrote
“Harry Potter”
“X schrieb Y ” is a pattern for wrote(X ,Y )
Def: Pattern Application
Given a corpus and an extraction pattern,
pattern application
is the process of finding
the pattern in the corpus and extracting the corresponding facts.
39
Bertrand Russell schrieb “Principia Mathematica”,
ein Werk über die Grundlagen der Mathematik.
Wikipedia: Principia Mathematica
Corpus
Rowling
KB
wrote
“Harry Potter”
Russell
“Principia Mathematica”
wrote
“X schrieb Y ” is a pattern for wrote(X ,Y )
40
+
Bertrand Russell schrieb “Principia Mathematica”,
ein Werk über die Grundlagen der Mathematik.
Wikipedia: Principia Mathematica
Def: Pattern Application
Given a corpus and an extraction pattern,
pattern application
is the process of finding
the pattern in the corpus and extracting the corresponding facts.
Corpus
Rowling
KB
wrote
“Harry Potter”
Def: Pattern iteration/DIPRE
Pattern iteration
(also: DIPRE) is the process of repeated
• pattern deduction
• pattern application
... to continuously augment the KB.
41
(iteration cycle: known facts → patterns → new facts → new patterns → ...)
Task: DIPRE
Michelle ist verheiratet mit Barack.
Merkel ist die Frau von Sauer.
Michelle ist die Frau von Barack.
Priscilla ist verheiratet mit Elvis.
42
marriedTo
Merkel
Sauer
KB
Task: DIPRE
Michelle ist verheiratet mit Barack.
Merkel ist die Frau von Sauer.
Michelle ist die Frau von Barack.
Priscilla ist verheiratet mit Elvis.
Priscilla küsst Elvis.
43
“Priscilla kisses Elvis” will induce a wrong pattern,
and will derail all following extractions! (Semantic Drift)
marriedTo
Merkel
Sauer
KB
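The task above can be solved by a short DIPRE sketch: deduce patterns from known facts, apply them, and iterate. Sentences are treated as whole-string patterns, a strong simplification of real DIPRE; the helper names are hypothetical.

```python
import re

# One DIPRE iteration: deduce patterns from known facts, apply them.
def deduce_patterns(corpus, known_facts):
    """A sentence containing both entities of a known fact becomes a
    pattern, with the entities replaced by placeholders X and Y."""
    patterns = set()
    for sentence in corpus:
        for x, y in known_facts:
            if x in sentence and y in sentence:
                patterns.add(sentence.replace(x, "X").replace(y, "Y"))
    return patterns

def apply_patterns(corpus, patterns):
    facts = set()
    for pattern in patterns:
        regex = re.compile(re.escape(pattern)
                           .replace("X", r"(?P<x>\w+)")
                           .replace("Y", r"(?P<y>\w+)"))
        for sentence in corpus:
            m = regex.fullmatch(sentence)
            if m:
                facts.add((m.group("x"), m.group("y")))
    return facts

corpus = ["Michelle ist verheiratet mit Barack.",
          "Merkel ist die Frau von Sauer.",
          "Michelle ist die Frau von Barack.",
          "Priscilla ist verheiratet mit Elvis."]
known = {("Merkel", "Sauer")}

# Round 1: deduces "X ist die Frau von Y." and finds (Michelle, Barack).
new_facts = apply_patterns(corpus, deduce_patterns(corpus, known)) | known
# Round 2: the new fact yields "X ist verheiratet mit Y.",
# which in turn yields (Priscilla, Elvis) -- the DIPRE iteration.
final = apply_patterns(corpus, deduce_patterns(corpus, new_facts)) | new_facts
```

A sentence like “Priscilla küsst Elvis.” would, once (Priscilla, Elvis) is known, induce the spurious pattern “X küsst Y.” in the next round, which illustrates semantic drift.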
44
Fact Extraction
•
Definitions
•
without background knowledge
•
by extraction patterns
•
by Large Language Models
•
by Natural Language Inference
•
with background knowledge
•
by the DIPRE Algorithm
•
by classification
•
General Considerations
•
Semantic Representations
45
Training set creation
If a belief is popular, that does not mean it’s true;
on the contrary, in view of the silliness of the
majority of mankind, a widely spread belief is
more likely to be foolish than sensible, writes the
British philosopher Bertrand Russell.
All of the following methods for relation extraction need a training set, i.e.,
a set of pairs of a sentence and extracted facts.
<Russell, nationality, British>
<Russell, occupation, philosopher>
Sentence
Extracted facts
46
Training set creation
... writes the British philosopher Bertrand Russell.
All of the following methods for relation extraction need a training set, i.e.,
a set of pairs of a sentence and extracted facts.
<Russell, nationality, British>
<Russell, occupation, philosopher>
Sentence
Extracted facts
Where does the training set come from?
-
created manually
-
created by any knowledge‐free method
(LLMs, NLI, pattern application)
-
created by distant supervision
47
Training set creation
All of the following methods for relation extraction need a training set, i.e.,
a set of pairs of a sentence and extracted facts.
<Russell, nationality, British>
<Russell, occupation, philosopher>
Extracted facts
We expect the ML to iron out errors,
and be more reliable than the methods
that generate the training data
... writes the British philosopher Bertrand Russell.
Sentence
Where does the training set come from?
-
created manually
-
created by any knowledge‐free method
(LLMs, NLI, pattern application)
-
created by distant supervision
Distant supervision is available for
training but not for testing
48
Def: Distant Supervision
Distant supervision
is the extraction of facts from a corpus with the help of a knowledge base,
under the assumption that every sentence that contains two entities that stand in a relation
in the knowledge base expresses that relation (as in DIPRE).
Russell
English
Britain
philosopher
speaks
occupation
nationality
KB
... writes the British philosopher Bertrand Russell.
49
Def: Distant Supervision
Distant supervision
is the extraction of facts from a corpus with the help of a knowledge base,
under the assumption that every sentence that contains two entities that stand in a relation
in the knowledge base expresses that relation (as in DIPRE).
<Russell, occupation, philosopher>
Russell
English
Britain
philosopher
speaks
occupation
nationality
KB
... writes the British philosopher Bertrand Russell.
50
Def: Distant Supervision
Distant supervision
is the extraction of facts from a corpus with the help of a knowledge base,
under the assumption that every sentence that contains two entities that stand in a relation
in the knowledge base expresses that relation (as in DIPRE).
NLI can remove wrong extractions.
that afternoon, the philosopher told Bertrand Russell that
Russell
English
Britain
philosopher
speaks
occupation
nationality
KB
51
Def: Distant Supervision
Distant supervision
is the extraction of facts from a corpus with the help of a knowledge base,
under the assumption that every sentence that contains two entities that stand in a relation
in the knowledge base expresses that relation (as in DIPRE).
NLI can remove wrong extractions.
that afternoon, the philosopher told Bertrand Russell that
Russell
English
Britain
philosopher
speaks
occupation
nationality
KB
x
“the philosopher told Russell”
does not entail
<Russell, occupation, philosopher>
[Cabot: “REBEL: Relation Extraction By End-to-end Language generation”, EMNLP 2021]
<Russell, occupation, philosopher>
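The distant-supervision assumption can be sketched directly: a sentence is labeled with every KB fact whose two entities both appear in it. The KB below is the toy KB from the slide; as the slide shows, the labels are noisy.

```python
# Distant supervision sketch: label a sentence with every KB fact
# whose subject and object both occur in it (the assumption above).
KB = [
    ("Russell", "speaks", "English"),
    ("Russell", "nationality", "Britain"),
    ("Russell", "occupation", "philosopher"),
]

def distant_labels(sentence, kb=KB):
    return [(s, r, o) for (s, r, o) in kb
            if s in sentence and o in sentence]

# "Britain" does not occur literally ("British" does), so only the
# occupation fact is used as a label for this sentence.
labels = distant_labels("... writes the British philosopher Bertrand Russell.")
```

Note that the sentence “the philosopher told Bertrand Russell” receives the same (wrong) label, which is exactly the noise that an NLI filter can remove.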
52
Fine‐tuned language models can solve relation extraction as a classification task.
Relation extraction by fine‐tuned models
<Russell, won, Nobel Prize>
Russell won the Nobel Prize in Literature
for his writings in which he champions
humanitarian ideals and freedom of thought.
Training data:
53
Fine‐tuned language models can solve relation extraction as a classification task.
Relation extraction by fine‐tuned models
<Russell, won, Nobel Prize>
Training data:
BERT
Russell won the Nobel Prize in Literature
for his writings in which he champions
humanitarian ideals and freedom of thought.
54
Fine‐tuned language models can solve relation extraction as a classification task.
Relation extraction by fine‐tuned models
<Russell, won, Nobel Prize>
Russell won the Nobel Prize in Literature
for his writings in which he champions
humanitarian ideals and freedom of thought.
Training data:
BERT
Lots of details to clarify:
- NERC text first
55
Fine‐tuned language models can solve relation extraction as a classification task.
Relation extraction by fine‐tuned models
<Russell, won, Nobel Prize>
Russell won the Nobel Prize in Literature
Training data:
BERT
Lots of details to clarify:
- NERC text first
- select input part of the text
56
Fine‐tuned language models can solve relation extraction as a classification task.
Relation extraction by fine‐tuned models
won
Training data:
BERT
Lots of details to clarify:
- NERC text first
- select input part of the text
- define training objective
Russell won the Nobel Prize in Literature
57
Fine‐tuned language models can solve relation extraction as a classification task.
Relation extraction by fine‐tuned models
Testing data:
BERT
Lots of details to clarify:
- NERC text first
- select input part of the text
- define training objective
- when applying the trained system:
create the output fact
Marie Curie won the Nobel Prize in Physics
for research on radiation phenomena.
<Marie_Curie, won, Nobel_Prize_in_Physics>
won
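One of the details to clarify is how the input text encodes the two entities. A common choice (among several in the literature; the marker tokens here are one illustrative convention, not *the* standard) is to wrap them in marker tokens and use the relation as the class label:

```python
# Build a classification instance for relation classification with a
# fine-tuned encoder: mark the two entities, label with the relation.
def make_instance(sentence, subj, obj, relation=None):
    marked = (sentence.replace(subj, f"[E1] {subj} [/E1]")
                      .replace(obj, f"[E2] {obj} [/E2]"))
    return {"text": marked, "label": relation}

inst = make_instance(
    "Russell won the Nobel Prize in Literature",
    "Russell", "Nobel Prize in Literature", relation="won")
```

At test time, the same marking is applied to the NERC’d sentence and the predicted label, together with the two entities, forms the output fact.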
58
Relation extraction by fine‐tuned models
[via everydaypower]
<subj>Bertrand Russell
<rel>authored<obj>Value of Philosophy
<rel>nationality<obj>British
<rel>occupation<obj>philosopher
Training data:
Philosophy — while diminishing our feeling
of certainty as to what things are, it greatly
increases our knowledge as to what
they may be, writes the British philosopher
Bertrand Russell in “The Value of Philosophy”.
Fine‐tuned language models can solve relation extraction as a seq2seq task.
59
Relation extraction by fine‐tuned models
[via everydaypower]
<subj>Bertrand Russell
<rel>authored<obj>Value of Philosophy
<rel>nationality<obj>British
<rel>occupation<obj>philosopher
Training data:
Philosophy — while diminishing our feeling
of certainty as to what things are, it greatly
increases our knowledge as to what
they may be, writes the British philosopher
Bertrand Russell in “The Value of Philosophy”.
Fine‐tuned language models can solve relation extraction as a seq2seq task.
[Cabot: “REBEL: Relation Extraction By End-to-end Language generation”, EMNLP 2021]
BART
Language models can
“translate” natural language
text into structured facts.
60
Advantages:
•
achieves state‐of‐the‐art results
•
simple approach
•
adaptable to different domains
Relation extraction by fine‐tuned models
Philosophy — while diminishing our feeling
of certainty as to what things are, it greatly
increases our knowledge as to what
they may be, writes the British philosopher
Bertrand Russell in “The Value of Philosophy”.
[via everydaypower]
Fine‐tuned language models can solve relation extraction as a seq2seq task or as classification.
Disadvantages:
•
needs training data
<subj>Bertrand Russell
<rel>authored<obj>Value of Philosophy
<rel>nationality<obj>British
<rel>occupation<obj>philosopher
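The seq2seq output above still has to be decoded back into triples. A sketch for the linearization shown on the slide (one subject followed by several relation/object pairs; the exact REBEL token inventory differs slightly):

```python
import re

# Decode a linearized seq2seq output of the form
# <subj> S <rel> R1 <obj> O1 <rel> R2 <obj> O2 ... into triples.
def decode(linearized):
    triples = []
    for subj_block in linearized.split("<subj>")[1:]:
        parts = re.split(r"<rel>|<obj>", subj_block)
        subj = parts[0].strip()
        for i in range(1, len(parts) - 1, 2):
            triples.append((subj, parts[i].strip(), parts[i + 1].strip()))
    return triples

out = ("<subj>Bertrand Russell"
       "<rel>authored<obj>Value of Philosophy"
       "<rel>nationality<obj>British"
       "<rel>occupation<obj>philosopher")
triples = decode(out)
```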
61
Fact Extraction
•
Definitions
•
without background knowledge
•
by extraction patterns
•
by Large Language Models
•
by Natural Language Inference
•
with background knowledge
•
by the DIPRE Algorithm
•
by classification
•
General Considerations
•
Semantic Representations
62
Type Checking
<Russell, hasGodFather, John_Stuart_Mill>
<Russell, memberOf, India_League>
<Russell, memberOf, United_Nations>
<Russell, diedInPlace, “1970”>
63
Def: Type Checking
<Russell, hasGodFather, John_Stuart_Mill>
<Russell, memberOf, India_League>
<Russell, memberOf, United_Nations>
<Russell, diedInPlace, “1970”>
Type Checking
a statement means checking whether its subject and
object conform to the domain and range of the relation, respectively.
Type Check
OK
OK
OK
not OK
Type checks can help remove falsely extracted statements, in particular from ambiguous
phrases such as “was born in place/year”, “wrote music/book/check”, etc.
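The definition can be sketched directly. The relation signatures and type assignments below are illustrative, not from a real KB:

```python
# Type-checking sketch: a fact passes iff the subject's type matches
# the relation's domain and the object's type matches its range.
SIGNATURES = {
    "hasGodFather": ("person", "person"),
    "memberOf": ("person", "organization"),
    "diedInPlace": ("person", "place"),
}
TYPES = {
    "Russell": "person",
    "John_Stuart_Mill": "person",
    "India_League": "organization",
    '"1970"': "year",
}

def type_check(subj, rel, obj):
    domain, range_ = SIGNATURES[rel]
    return TYPES.get(subj) == domain and TYPES.get(obj) == range_

facts = [("Russell", "hasGodFather", "John_Stuart_Mill"),
         ("Russell", "memberOf", "India_League"),
         ("Russell", "diedInPlace", '"1970"')]
kept = [f for f in facts if type_check(*f)]   # drops the year-as-place fact
```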
64
Extracting Attributes
<Russell, wasBornInYear, “1872”>
<Russell, wasBornOnDate, “1872-05-18”>
<Princ.Math, hasISBN, “978-345-...”>
An
attribute
is a functional relation whose range consists of literals. Attributes can be extracted by methods
similar to those for entity‐to‐entity relations. They can be type‐checked, e.g., by regular expressions.
Type Check
“\d{4}” (?)
“-?\d+-\d+-\d+” (?)
“978-[0-9-]+” (?)
Literals may require normalization:
18 May 1872
18th of May 1872
18/05/1872
05/18/1872
Literals may have units that have to be (1) extracted and (2) normalized:
1kg
1000g
0.157 st
1 kilogram
2.204 pounds
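A sketch of such normalization, handling only a couple of the surface forms above (the slash format is ambiguous between DD/MM and MM/DD; this sketch simply assumes DD/MM):

```python
import re

MONTHS = {"May": "05"}  # illustrative; a real table covers all months

def normalize_date(text):
    """Normalize a few date surface forms to ISO YYYY-MM-DD."""
    m = re.match(r"(\d{1,2})(?:th of|st of|nd of|rd of)? (\w+) (\d{4})$", text)
    if m and m.group(2) in MONTHS:
        return f"{m.group(3)}-{MONTHS[m.group(2)]}-{int(m.group(1)):02d}"
    m = re.match(r"(\d{2})/(\d{2})/(\d{4})$", text)  # assumed DD/MM/YYYY
    if m:
        return f"{m.group(3)}-{m.group(2)}-{m.group(1)}"
    return None

def normalize_weight(text):
    """Extract the unit and normalize the value to kilograms."""
    m = re.match(r"([\d.]+)\s*(kg|g|kilogram)s?$", text)
    if m:
        value, unit = float(m.group(1)), m.group(2)
        return value / 1000 if unit == "g" else value
    return None
```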
Dependency Parsing
65
Bertrand Russell, who was British, was refused at the College of New York.
“X was refused at Y ”
+
Pattern does not match!
Def: Dependency Parsing
66
Bertrand Russell, who ..., was refused at the College of New York...
“X was refused at Y ”
Dependency parses make patterns robust to subordinate clauses and other surface variations.
(dependency edges in the parse figure: subj, compl, loc, subOrd, part, nn)
A
dependency parse
of a sentence is a tree that reveals the syntactic structure of the sentence.
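A sketch of a dependency-level pattern. The parse edges are hand-written here (a real parser would produce them, with its own label inventory); the point is that the subordinate clause no longer sits between X and the verb:

```python
# Dependency-based pattern matching: the pattern "X was refused at Y"
# is read off the parse edges instead of the surface word order.
# Edge map: (head, label) -> dependent; labels follow the slide.
parse = {
    ("refused", "subj"): "Bertrand Russell",
    ("refused", "loc"): "College of New York",
    ("Bertrand Russell", "subOrd"): "who was British",
}

def match_refused_at(parse):
    x = parse.get(("refused", "subj"))
    y = parse.get(("refused", "loc"))
    return (x, "refusedAt", y) if x and y else None

fact = match_refused_at(parse)
```

The inserted clause “who was British” is attached to the subject and is simply irrelevant to the match.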
Example: Patterns in NELL
67
NELL
(Never Ending Language Learner) is an information extraction project at Carnegie Mellon
University. It uses DIPRE to learn patterns for relations and new facts.
Apple
produced
MacBook
NELL: MacBook
68
Fact Extraction
•
Definitions
•
without background knowledge
•
by extraction patterns
•
by Large Language Models
•
by Natural Language Inference
•
with background knowledge
•
by the DIPRE Algorithm
•
by classification
•
General Considerations
•
Semantic Representations
69
Def: Frame
A
frame
is a predefined type of event with its participants (“roles”).
Common frame collections, with annotated corpora, are:
- FrameNet
- PropBank
- VerbAtlas
Frame: “move slightly”
Arg0: causer of motion
Arg1: thing in motion
Arg2: distance moved
Arg3: start point
Arg4: end point
Arg5: direction
General roles for all frames:
LOC: location
CAU: cause
EXT: extent
TMP: time
...
Annotated corpus (simplified):
[Arg0 Revenue] edged [Arg5 up] [Arg2 3.4%] to [Arg4 $904 million]
from [Arg3 $874 million] [TMP in last year’s third quarter].
[PropBank]
70
Def: Slot Filling
Slot Filling
is the task of extracting a frame and values for its roles from a short text.
Frame: jailing event
Agent: British Police
Patient: Russell
Location: Brixton Prison
When he was 89 years old, Russell took part in an anti‐nuclear
demonstration in London. Russell was jailed for 7 days in
Brixton Prison after replying “No I won’t” to the request of the
British Police to pledge himself to good behavior.
71
Slot Filling Techniques
Slot filling is an old discipline with dozens of approaches, usually
- dependency‐parse based (mark head word of a slot filler)
- span‐based (mark begin and end of a slot filler in the text)
- with the help of an LLM
Slot filling classifies the text to a frame, and each item to a role:
... demonstration in London. Russell was jailed for 7 days in Brixton Prison
after replying “No I won’t” to the request of the British Police
to pledge himself to good behavior.
(“jailed” is the main verb, classified as a “jailing event”; the tokens of
“British Police” are classified as the beginning and end of the agent role.)
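A span-based slot filler can be sketched as follows. The span predictions are hard-coded here (a real classifier would predict the begin/end positions) and the sentence is simplified from the slide:

```python
# Span-based slot filling sketch: read the role fillers out of
# predicted (start, end, role) spans; end indices are exclusive.
tokens = ("Russell was jailed for 7 days in Brixton Prison "
          "by the British Police").split()
spans = [(0, 1, "Patient"), (7, 9, "Location"), (11, 13, "Agent")]

def fill_slots(tokens, spans, frame="jailing event"):
    filled = {"Frame": frame}
    for start, end, role in spans:
        filled[role] = " ".join(tokens[start:end])
    return filled

frame = fill_slots(tokens, spans)
```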
72
Def: Abstract Meaning Representation
An
Abstract Meaning Representation
(AMR) of a sentence is a semantic representation in the form
of a rooted acyclic directed graph, whose nodes are either words or predefined keywords/frames,
and whose edges are predefined roles (usually PropBank) and their inverses.
I would never die for my beliefs, because I could be wrong.
[Bertrand Russell]
(AMR graph, flattened from the figure. Nodes: DIE, CAUSE, BELIEVE, POSSIBLE,
I, wrong, ever, NEGATIVE. Edges: ARG0 (agent), ARG1 (object), POLARITY, TIME.
Capitalized: predefined concepts and roles.)
73
Def: Abstract Meaning Representation
AMR uses a standard frame vocabulary, and is thus relatively robust to synonyms and reformulations.
I would never die for my beliefs because it is possible that I am wrong.
74
AMR Parsing Techniques
There are large annotated datasets for AMR, and several AMR parsers have been proposed:
Two‐step parsers
first identify the concepts (by a sequence tagger), and then the
relations between these concepts (by classifying all possible links).
Graph‐transforming parsers
learn to transform the dependency graph to an AMR graph.
Seq2seq parsers
learn to transform the sentence to a linearized form of the graph.
Example parsers: JAMR, Spring
I would never die for my beliefs, because I could be wrong.
75
Def: Discourse Representation Structure
A
Discourse Representation Structure
(DRS) of a sentence is a semantic representation of boxes,
each of which contains instantiated frames.
I would never die for my beliefs, because I could be wrong.
(DRS boxes, flattened from the figure:
PRESUPPOSITION box: BELIEF, Creator: SPEAKER
main box, under NEGATION: DIE, Patient: SPEAKER, Time: TIME After: NOW,
Cause: the EXPLANATION box
EXPLANATION box: POSSIBILITY of WRONG, Attribute: SPEAKER)
Capitalized items are predefined, parsing courtesy of Zacchary Sadeddine
76
DRS Parsing Techniques
There are two main annotated corpora for DRS:
- the Groningen Meaning Bank (GMB):
10,000 automatically annotated documents, with some manual checks
- the Parallel Meaning Bank (PMB)
10,000 automatically annotated, and manually verified, sentences
Rule‐based parsers
use POS-tagging, NERC, disambiguation, role labelling, coreference
resolution etc. plus manual rules
Seq2seq parsers
learn to transform the sentence into a linearized DRS representation
box1: { BELIEF: { Creator: SPEAKER } }  box2: ...
Summary: Fact Extraction
77
•
Fact extraction
aims to find facts with canonicalized entities and relations.
It consists of NERC, disambiguation, and relation extraction/classification
•
Techniques for Fact Extraction use Large Language Models or Fine‐tuned language models
•
Semantic parsing
aims to extract entire frames (events with roles), e.g., Semantic Role Labeling, AMR, DRS
“If an opinion contrary to your own makes you angry,
that is a sign that you are subconsciously aware
of having no good reason for thinking as you do.”
— Bertrand Russell
GoodReads: Bertrand Russell