Knowledge Bases
A knowledge base (KB) is a structured, computer-processable description of the world. A KB can be thought of as a graph, in which the nodes are entities and the edges are relations. KBs serve all kinds of purposes, such as natural language understanding in chatbots, intelligent assistance, or Web search.
Our research is guided by specific real-world problems on KBs. I work together with my colleagues and students in the DIG Team of Télécom Paris to formalize problems, to design principled models for solving them, and to develop real systems that implement these solutions.
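The graph view of a KB described above can be made concrete with a small sketch. It stores facts as (subject, relation, object) triples and indexes them as an adjacency map. The entity and relation names, as well as the helpers `build_graph` and `objects`, are hypothetical and chosen only for illustration; they are not taken from YAGO or any other real KB.

```python
from collections import defaultdict

# Hypothetical toy facts: each triple is one labelled edge of the graph.
TRIPLES = [
    ("Elvis_Presley", "type", "Singer"),
    ("Elvis_Presley", "bornIn", "Tupelo"),
    ("Tupelo", "locatedIn", "USA"),
    ("Singer", "subClassOf", "Person"),
]

def build_graph(triples):
    """Index triples as an adjacency map: subject -> list of (relation, object)."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph

def objects(graph, subject, relation):
    """Return all objects linked to `subject` by an edge labelled `relation`."""
    # .get avoids creating empty entries for unknown subjects.
    return [o for r, o in graph.get(subject, []) if r == relation]

graph = build_graph(TRIPLES)
print(objects(graph, "Elvis_Presley", "bornIn"))  # ['Tupelo']
```

Entities are the nodes and each relation name labels a directed edge, so a lookup such as "where was this entity born?" becomes a filtered walk over the outgoing edges of one node.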
Projects
Apart from that, we work on several projects around knowledge bases:
- Knowledge Base Construction: We work on extracting computer-readable information from Web sources. Our flagship project in this domain is YAGO, a large knowledge base constructed from Wikidata, schema.org, and other Web sources. But we also work on extracting commercial products from the Web and on repairing regular expressions.
- Completeness mining: Within the framework of an ANR grant, our goal is to automatically find where a knowledge base is missing information. Our flagship project here is the AMIE rule mining system, but we also work on determining the completeness of entities and the necessity of attributes. Here is an overview of our work on mining completeness in knowledge bases.
- We have developed several approaches to query knowledge bases efficiently: One approach is based on Bash commands. Another one allows querying the knowledge base through Web services.
- Finally, we also work on applications of knowledge bases, such as Combinatorial creativity (making computers creative) or Semantic Culturomics (mining trends in history and society).
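As a rough illustration of how a logical rule can point to places where a KB may be missing information, the following sketch applies one hand-written rule to a toy KB and computes its support and standard confidence. This is a simplified illustration under stated assumptions, not the AMIE algorithm: the facts, the rule, and the helper `predictions` are all hypothetical.

```python
# Hypothetical toy KB as a set of (subject, relation, object) facts.
FACTS = {
    ("Anna", "marriedTo", "Ben"),
    ("Ben", "livesIn", "Paris"),
    ("Anna", "livesIn", "Paris"),
    ("Carla", "marriedTo", "David"),
    ("David", "livesIn", "Lyon"),
    ("Eve", "marriedTo", "Frank"),
    ("Frank", "livesIn", "Rome"),
    ("Eve", "livesIn", "Milan"),
}

def predictions(facts):
    """Apply the rule  marriedTo(x, z) AND livesIn(z, y)  =>  livesIn(x, y)."""
    predicted = set()
    for x, r1, z in facts:
        if r1 != "marriedTo":
            continue
        for z2, r2, y in facts:
            if r2 == "livesIn" and z2 == z:
                predicted.add((x, "livesIn", y))
    return predicted

predicted = predictions(FACTS)
support = len(predicted & FACTS)        # predictions already present in the KB
confidence = support / len(predicted)   # share of predictions that are confirmed
missing_candidates = predicted - FACTS  # predictions the KB does not contain

print(support, round(confidence, 2))
print(sorted(missing_candidates))
```

Predictions that are absent from the KB are only candidates: the fact may be genuinely missing (as might be the case for Carla), or the rule may simply be wrong for that entity (Eve is recorded as living elsewhere).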
Older projects:
- Medical imaging: We worked on mapping brain activity to scientific terms (in collaboration with INRIA Saclay).
- DIVINA: A system that helps internet users make sure that their internet accounts are safe and secure.
- PARIS: PARIS is a project to learn mappings between knowledge bases.
- LEILA and SOFIE: These are projects that extract ontological information from natural language texts.
- Watermarking: This project developed methods to protect ontological knowledge against plagiarism.
Students / PostDocs
- Lihu Chen (PhD student, 2019-, co-advised with Gaël Varoquaux)
- Nedeljko Radulović (PhD student, 2018-, co-advised with Albert Bifet)
- Thomas Pellisier Tanon (PhD student, 2017-, co-advised with Antoine Amarilli)
- Jonathan Lajus (PhD student, 2016-)
- Julien Romero (PhD student, 2017-, co-advised with Nicoleta Preda)
Former Students / PostDocs
- Jérôme Dockès (PhD student, 2016-2019)
- Camille Bourgaux (Postdoc, 2017-2018; now at CNRS)
- Thomas Rebele (PhD student, 2015-2018; now at Orchestra Networks)
- Katerina Tzompanaki (Postdoc in 2016, since 2016 associate professor at the University of Cergy)
- Danai Symeonidou (Postdoc in 2015, since 2015 researcher at INRA)
- Luis Galárraga (PhD 2012-2016, since 2017 researcher at INRIA Rennes)