Knowledge Base Construction

Content

This class gives an overview of Information Extraction for Knowledge Base Construction: the process of deriving structured information (such as alive(Elvis)) from digital text (such as the sentence "Elvis is alive"). The lecture covers named entity recognition, entity disambiguation, instance extraction, fact extraction, and ontological information extraction. We will then see how the extracted data can be mined for correlations. We will also touch upon applications of Information Extraction, such as Google's Knowledge Graph and IBM's Watson question answering system, as well as academic projects such as YAGO, DBpedia, and NELL.
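
To make the "Elvis is alive" example concrete, here is a minimal sketch of turning a sentence into a structured fact. It is only an illustration and not the method taught in the lectures: the regular expression and the function names `extract_fact` and `format_fact` are hypothetical, and the pattern handles nothing beyond sentences of the form "Name is adjective".

```python
import re

# Hypothetical toy pattern: "<Name> is <adjective>" -> adjective(Name).
# Real systems rely on named entity recognition, disambiguation,
# and statistical models rather than a single hand-written pattern.
PATTERN = re.compile(r"^([A-Z][a-z]+) is ([a-z]+)\.?$")


def extract_fact(sentence):
    """Return a (predicate, argument) pair, or None if the pattern does not match."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    subject, adjective = match.groups()
    return (adjective, subject)


def format_fact(fact):
    """Render a (predicate, argument) pair in the notation predicate(argument)."""
    predicate, argument = fact
    return f"{predicate}({argument})"


print(format_fact(extract_fact("Elvis is alive")))  # alive(Elvis)
```

A pattern this rigid fails on almost any real sentence, which is one reason the topics listed above (entity recognition, disambiguation, fact extraction) are treated separately.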

Grading

Schedule

Day         Session 1 (Amphi Rubis)                               Session 2 (C128)
2017-11-21  Intro, Motivation, Knowledge Representation           Research topics
2017-11-28  Named Entity Recognition, Evaluation, Disambiguation  Lab 1: Disambiguation
2017-12-05  NERC, CRFs                                            Lab 2: NERC
2017-12-12  POS Tagging (includes HMMs), Instance Extraction      Lab 3: Instance Extraction with POS Tags
2017-12-19  Fact Extraction, IE by Reasoning, Markov Logic        Lab 4: Weighted MAX SAT (room C130)
2018-01-09  Semantic Web in practice, Decidability                Lab 5: Entity Mapping
2018-01-16  Data Security                                         Lab 6: Password cracking (Deadline during the lab!)
2018-01-30  13:30-15:00: Exam (B543)
2018-03-13  13:30-15:00: Re-Exam (B310)

The exams are “closed-book”. Paper is provided. Bring a pen and a brain; the rest is on us.

The schedule beyond the current date is tentative. The PDF slides are provided for convenience only; the SVG slides are the authoritative version.

Supplementary material for those interested: Rule Mining, Corpus, Character Encodings, Wrapper induction, Dependency Parsing.