Information Extraction:
Who does it?
© Fabian M. Suchanek
Overview
• Academic projects (available under permissive licences for you to use)
  - WordNet
  - NELL
  - DBpedia
  - BabelNet
  - Wikidata
  - YAGO
• Industrial projects
  - Google
  - Microsoft
  - eBay
  - Amazon
  - Facebook
  - IBM
  - Apple
WordNet
WordNet is a large lexical knowledge base (KB) of the English language.
It contains information about nouns, verbs, adjectives and adverbs.
The project started in 1985 and was frozen in 2012.
WordNet/Person
WordNet has a taxonomy
[Figure: WordNet entry for "person", annotated with its superclass and the root of the whole taxonomy]
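The idea of a taxonomy with superclasses and a single root can be sketched in a few lines of Python. The `SUPERCLASS` table below is a hand-written toy excerpt, simplified to one superclass per class and not read from WordNet itself. The helper names `path_to_root` and `is_subclass_of` are hypothetical.

```python
# Toy excerpt of a class taxonomy: each class maps to its superclass.
# Assumption: hand-written for illustration, one superclass per class,
# intermediate classes omitted. This is not WordNet data.
SUPERCLASS = {
    "person": "organism",
    "organism": "living thing",
    "living thing": "object",
    "object": "physical entity",
    "physical entity": "entity",
}


def path_to_root(cls, superclass=SUPERCLASS):
    """Return the chain of classes from cls up to the root of the taxonomy."""
    path = [cls]
    seen = {cls}
    # The root is the first class that has no superclass entry.
    while path[-1] in superclass:
        parent = superclass[path[-1]]
        if parent in seen:
            raise ValueError(f"cycle detected at {parent!r}")
        path.append(parent)
        seen.add(parent)
    return path


def is_subclass_of(cls, ancestor, superclass=SUPERCLASS):
    """True if ancestor lies on the path from cls to the root (cls itself included)."""
    return ancestor in path_to_root(cls, superclass)


print(path_to_root("person"))
print(is_subclass_of("person", "entity"))
```

Following the superclass links from any class ends at the same root, here "entity", which is what makes the structure a taxonomy rather than a loose collection of classes.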
NELL / Read The Web
NELL (Never-Ending Language Learner) is an information extraction
project at Carnegie Mellon University. It couples several learners:
• Pattern Extraction
  - “Elvis married Priscilla.”
• Table Extraction
  - Elvis | Priscilla
  - Emmanuel | Brigitte
  - Donald | Melania
• Constraints
• Morphology
  - Names ending in “-a” are female names.
• Learned Rules
+ many other learners
http://rtw.ml.cmu.edu/rtw/
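A minimal sketch of what coupling several learners could look like, using the examples from this slide. Everything here is an assumption made for illustration: the function names are hypothetical, and the simple voting rule in `couple` stands in for NELL's actual coupling mechanism, which is far more elaborate.

```python
import re
from collections import Counter

# Hypothetical learner 1: one hand-written textual pattern, "X married Y".
MARRIED_PATTERN = re.compile(r"\b([A-Z][a-z]+) married ([A-Z][a-z]+)\b")


def pattern_learner(text):
    """Extract married(X, Y) facts from sentences matching the pattern."""
    return {("married", x, y) for x, y in MARRIED_PATTERN.findall(text)}


def table_learner(rows):
    """Hypothetical learner 2: read each two-column table row as a married couple."""
    return {("married", x, y) for x, y in rows}


def looks_female(name):
    """Hypothetical learner 3: the morphology heuristic, names in "-a" are female."""
    return name.endswith("a")


def couple(*fact_sets, min_votes=2):
    """Assumed coupling rule: keep a fact only if enough learners proposed it."""
    votes = Counter(fact for facts in fact_sets for fact in facts)
    return {fact for fact, n in votes.items() if n >= min_votes}


text = "Elvis married Priscilla. Donald married Melania."
table = [("Elvis", "Priscilla"), ("Emmanuel", "Brigitte")]

accepted = couple(pattern_learner(text), table_learner(table))
print(accepted)

# The morphology heuristic alone is noisy: it misses "Brigitte".
print([n for n in ["Priscilla", "Brigitte", "Melania"] if looks_female(n)])
```

Each learner on its own is unreliable. Requiring agreement between learners is one simple way to trade coverage for precision.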
Example: NELL about “MacBook”
CMU: Read The Web
DBpedia
DBpedia is a crowd-sourced community effort to extract structured
information from Wikipedia and make this information available
on the Web. [DBpedia.org]
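To give a feel for what extracting structured information from Wikipedia means, here is a minimal sketch that turns the key-value lines of an infobox into subject-property-value triples. The `INFOBOX` text is a hand-written, simplified example, and `infobox_to_triples` is a hypothetical helper. DBpedia's real extraction framework is considerably more involved.

```python
import re

# Assumption: a simplified, hand-written infobox in wikitext-like syntax.
INFOBOX = """{{Infobox person
| name        = Elvis Presley
| birth_date  = 1935-01-08
| birth_place = Tupelo, Mississippi
}}"""

# One "| key = value" line of the infobox.
KEY_VALUE = re.compile(r"\|\s*(\w+)\s*=\s*(.+)")


def infobox_to_triples(subject, wikitext):
    """Return (subject, property, value) triples for each key-value line."""
    triples = []
    for line in wikitext.splitlines():
        match = KEY_VALUE.match(line.strip())
        if match:
            triples.append((subject, match.group(1), match.group(2).strip()))
    return triples


for triple in infobox_to_triples("Elvis_Presley", INFOBOX):
    print(triple)
```

The result is a list of facts about one entity that can be stored and queried, instead of text that has to be read by a human.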
DBpedia: Elvis
Example: DBpedia about Elvis
BabelNet
Elvis in BabelNet
BabelNet is a multilingual lexicalized semantic network and ontology.