CC-BY
Fabian M. Suchanek
The Semantic Web
Semantic IE
You
are
here
2
Source Selection and Preparation
Entity Recognition
Entity Disambiguation
singer
Fact Extraction
KB
construction
Entity Typing
singer Elvis
Overview
• Motivation
• Knowledge Representation
• URIs
• Standard Vocabularies
• Linked Data
• SPARQL & OWL
• RDFa, JSON-LD & friends
• Applications
We can do information extraction — what now?
4
Airport
Location
Heathrow
London
blah
Sources of incompatibility
5
<airport>
<placeOrCity>
?
?
Airport Namen
City
Heathrow Airport
Londres
Airport
Location
Heathrow
London
<airport>
<placeOrCity>
[Images from Wikimedia Commons, except Oracle. Company logos for illustration only]
Where do we need interaction?
• Booking a flight
  Interaction between office computer, flight company, travel agency, shuttle services, hotel, ...
• Finding a restaurant
  Interaction between mobile device, map service, recommendation service, restaurant reservation
• Intelligent home
  Fridge knows my calendar, orders food if I am planning a dinner
• Intelligent cars
  Car knows my schedule, where and when to get gas, how not to hit other cars, what the legal regulations are
8
> more
Where do we need interaction?
• Adding data to a database
  From XML files, from other databases
• Merging data after company mergers
  Different terminology has to be bridged, accounts to be merged
• Merging data in research
  e.g. biochemical, genetic, pharmaceutical research data
9
Def: Semantic Web
Idea: We need an infrastructure that allows computers to “understand” their data.
This infrastructure shall
• allow machines to process data from others
• ensure interoperability between schemas, devices and organizations
• allow data to describe data
• allow machines to reason on the data
• allow machines to answer semantic queries
This is what the Semantic Web aims at.
The Semantic Web is an evolving extension of the World Wide Web, in which data is made available in one standardized semantic format.
Def: RDF
RDF (Resource Description Framework) is a knowledge representation based on
• entities
• classes
• binary relations
• labels
12
singer
1935
person
born
type
“Elvis”
label
subclassOf
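The four building blocks above can be sketched as a set of subject-predicate-object triples. This is a minimal illustration, not an RDF library: the entity name `Elvis` and the way the figure labels are wired together (Elvis is a singer, born in 1935, labeled “Elvis”; singer is a subclass of person) are assumptions read off the figure, and the helper `objects` is hypothetical.

```python
# A tiny in-memory graph: every fact is a (subject, predicate, object) triple.
# The concrete facts are assumed from the figure labels.
triples = {
    ("Elvis", "type", "singer"),        # entity -> class
    ("Elvis", "born", 1935),            # binary relation
    ("Elvis", "label", "Elvis"),        # label
    ("singer", "subclassOf", "person"), # class -> superclass
}


def objects(graph, subject, predicate):
    """Return all o such that (subject, predicate, o) is in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


print(objects(triples, "Elvis", "born"))
print(objects(triples, "singer", "subclassOf"))
```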
->knowledge-representation
>details
Knowledge Representation in the Semantic Web
<person>
<occupation>
13
Job
Elvis
Person
singer
Birth
1935
->knowledge-bases
singer
1935
born
type
Globally identifying entities
15
Elvis
Elvis
Elvis
Elvis
KB1
KB2
KB3
KB4
>details
Def: Namespace / Qualified Name
A namespace is a named set of (so‐called “local”) names. [Wikipedia/Namespace]
namespace: KB1 contains local names: Elvis, Priscilla, Lisa
namespace: KB2 contains local names: Elvis, Michael
A qualified name consists of a namespace name and a local name.
Examples: KB1:Elvis, KB1:Priscilla, KB2:Elvis
What if KBs have the same name?
17
Elvis
Elvis
Elvis
Elvis
ElvisKB
ElviPedia
ElvisKB
ElviPedia
Def: URI
A URI (Uniform Resource Identifier) is a string that follows the syntax
<scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ]
Examples:
• URLs: http://elvis.com/biography.html#Birth
• File identifiers: file:///c:/users/elvis/tripToMoon.txt
• FTP: ftp://elvis@nsa.gov
• MailTo: mailto:him@elvis.com?subject=Where%20are%20you
All URLs are URIs, but not all URIs are URLs (“dereferenceable”).
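The URI syntax can be explored with Python's standard `urllib.parse` module. The sketch below splits three of the example URIs into scheme, hierarchical part, query, and fragment; the variable names are illustrative only.

```python
from urllib.parse import urlsplit, parse_qs

examples = [
    "http://elvis.com/biography.html#Birth",
    "file:///c:/users/elvis/tripToMoon.txt",
    "mailto:him@elvis.com?subject=Where%20are%20you",
]

for uri in examples:
    parts = urlsplit(uri)
    # scheme, authority and path form the hierarchical part
    print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)

# The query part is percent-encoded; parse_qs decodes it.
mail = urlsplit(examples[2])
subject = parse_qs(mail.query)["subject"][0]
print(subject)
```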
>details
Each knowledge base and each entity has a URI
URI of knowledge base: http://elvis-alive.org/ , URI of entity: http://elvis-alive.org/Elvis
URI of knowledge base: http://elvipedia.com/ , URI of entity: http://elvipedia.com/Elvis
URI of knowledge base: http://elvis.org/kb/ , URI of entity: http://elvis.org/kb/Elvis
URI of knowledge base: http://yago-knowledge.org/ , URI of entity: http://yago-knowledge.org/Elvis
=> Every entity has a globally unique id
URIs are never ambiguous
A URI always refers to one entity, never to more entities.
20
http://elvis-alive.org/Elvis
x
A URI always refers to one entity, never to more entities.
21
http://yago-knowledge.org/Elvis
One entity can be referred to by several URIs.
URIs can be synonymous
http://elvis-alive.org/Elvis
x
Def: Namespace prefix, CURIE, base
A namespace prefix is an abbreviation for the first part of a URI.
A prefix with a local name yields a CURIE (also: QName).
@prefix dbp: <http://dbpedia.org/> .
dbp:Elvis
(a CURIE; it means <http://dbpedia.org/Elvis>)
A base URI is a URI relative to which URIs in the same document are interpreted.
@base <http://yago-knowledge.org/> .
<Elvis> = <http://yago-knowledge.org/Elvis>
(It is disputed whether the last character of the KB URI should be / or #. In any case, you need one of them.)
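Expanding a CURIE or a relative name to a full URI is plain string work. The following sketch assumes a hypothetical helper `expand` and a prefix table taken from the examples above; it is not a Turtle parser.

```python
from urllib.parse import urljoin

# Prefix table and base URI, taken from the examples on the slide.
PREFIXES = {
    "dbp": "http://dbpedia.org/",
    "y": "http://yago-knowledge.org/",
}
BASE = "http://yago-knowledge.org/"


def expand(term, prefixes=PREFIXES, base=BASE):
    """Expand 'prefix:local' or '<relative>' to a full URI string."""
    if term.startswith("<") and term.endswith(">"):
        # Relative names are resolved against the base; absolute URIs stay as they are.
        return urljoin(base, term[1:-1])
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    raise ValueError(f"cannot expand {term!r}")


print(expand("dbp:Elvis"))
print(expand("<Elvis>"))
```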
Def: Turtle
Turtle (Terse RDF Triple Language) is a particular syntax for writing RDF facts.
Turtle can declare namespace prefixes and a base as follows:
@prefix P: <URI> .
@base <URI> .
A simple Turtle fact has the form
URI|CURIE URI|CURIE URI|CURIE|literal .
Example:
@prefix y: <http://yago-knowledge.org/> .
y:Elvis y:loves y:Priscilla .
y:Priscilla y:loves <http://kb.org/cake> .
y:Elvis y:isCalled "The King" .
Turtle syntactic sugar
Turtle can abbreviate triples as follows:
Several objects for the same subject and predicate are separated by a comma:
y:Elvis y:likes y:Priscilla .
y:Elvis y:likes y:Lisa .
becomes
y:Elvis y:likes y:Priscilla, y:Lisa .
Several predicates for the same subject are separated by a semicolon:
y:Elvis y:likes y:Priscilla .
y:Elvis y:hates y:MikeStone .
becomes
y:Elvis
y:likes y:Priscilla ;
y:hates y:MikeStone .
Square brackets create an anonymous entity with the listed properties:
y:Elvis y:likes y:someone .
y:someone y:hates y:MikeStone .
becomes
y:Elvis
y:likes [
y:hates y:MikeStone
] .
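The comma and semicolon abbreviations amount to grouping triples by subject and then by predicate. The sketch below shows that grouping with a hypothetical function `to_turtle` that works on already-abbreviated CURIE strings; it does not handle literals, escaping, or anonymous entities.

```python
from collections import defaultdict


def to_turtle(triples):
    """Serialize (s, p, o) string triples, grouping objects with ',' and predicates with ';'."""
    grouped = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        grouped[s][p].append(o)
    lines = []
    for s, predicates in grouped.items():
        parts = [f"{p} {', '.join(objs)}" for p, objs in predicates.items()]
        lines.append(f"{s} " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines)


facts = [
    ("y:Elvis", "y:likes", "y:Priscilla"),
    ("y:Elvis", "y:likes", "y:Lisa"),
    ("y:Elvis", "y:hates", "y:MikeStone"),
]
print(to_turtle(facts))
```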
Literals with data types
Turtle allows attaching a datatype to a literal in the form
"literal"^^datatype
The datatype is given by a URI or CURIE.
It is common to use the XML datatypes:
xsd:boolean     true, false
xsd:decimal     Arbitrary-precision decimal numbers
xsd:integer     Arbitrary-size integer numbers
xsd:double      64-bit IEEE floating-point numbers incl. ±Inf, ±0, NaN
xsd:float       32-bit IEEE floating-point numbers incl. ±Inf, ±0, NaN
xsd:date        Dates (yyyy-mm-dd) with or without timezone
xsd:time        Times (hh:mm:ss.sss…) with or without timezone
xsd:dateTime    Date and time with or without timezone
...
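A typed literal can be turned into a native value by looking up a converter for its datatype. The sketch below covers a handful of the XML datatypes only; the names `CONVERTERS` and `parse_literal` are hypothetical, timezones are not handled, and the datatype is expected as a CURIE with the `xsd:` prefix.

```python
import re
from datetime import date
from decimal import Decimal

# Converters for a few XML datatypes (a partial, illustrative mapping).
CONVERTERS = {
    "xsd:boolean": lambda s: s in ("true", "1"),
    "xsd:integer": int,
    "xsd:decimal": Decimal,
    "xsd:double": float,
    "xsd:date": date.fromisoformat,  # yyyy-mm-dd, without timezone
}

LITERAL = re.compile(r'^"(?P<lexical>.*)"\^\^(?P<datatype>\S+)$')


def parse_literal(text):
    """Parse a string of the form "literal"^^datatype into a Python value."""
    match = LITERAL.match(text)
    if match is None:
        raise ValueError(f"not a typed literal: {text!r}")
    return CONVERTERS[match["datatype"]](match["lexical"])


print(parse_literal('"1935-01-08"^^xsd:date'))
print(parse_literal('"42"^^xsd:integer'))
```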
Cross‐referencing
A KB can make statements about entities defined in other KBs.
@prefix y: <http://yago-knowledge.org/> .
@prefix d: <http://dbpedia.org/> .
y:Priscilla y:loves d:MikeStone .
27
Standard vocabulary
A KB can define vocabulary that is used by other KBs.
y:Singer
• subclasses
• superclasses
• label
• ...
28
y:Singer
type
AlizéeKB
Alizée by Ronny Martin Junnilainen
Def: RDF Vocabulary
RDF is also a vocabulary (=KB) that defines basic notions of KB representation.
@prefix rdf: <http://www.w3.org/.../rdf/>
rdf:type, rdf:Property, rdf:Statement ...
see this KB
29
y:Singer
rdf:type
We can use notions from this KB:
Def: RDFS Vocabulary
RDFS is a vocabulary (=KB) that defines basic notions for class representation.
@prefix rdfs: <http://www.w3.org/.../rdfs/>
rdfs:label, rdfs:subClassOf,
rdfs:domain, rdfs:range,
rdfs:Class, rdfs:Resource
see this KB
30
y:Singer
rdfs:subClassOf
“entity”
y:Person
Sharing vocabularies
Shared vocabularies mean
• shared work in defining entities
• inter-operability of KBs
Some shared vocabularies have become standards on the Semantic Web.
They have a standard namespace prefix. However, nothing prescribes
the use of these vocabularies or prefixes.
@prefix rdf: <http://really.dumb.fellow.org/> .
rdf:TheKing rdf:type rdf:monarch .
31
More vocabularies
• Dublin Core (for describing documents): http://purl.org/dc/elements/1.1/
• Schema.org (for Web content): http://schema.org
• Creative Commons (types of licences): http://creativecommons.org/ns#
• Facebook Open Graph (for Web content): http://ogp.me/
• FOAF (Friend of a Friend; for contact information): http://xmlns.com/foaf/spec/
32
Schema.org
Schema.org is a vocabulary by Google, Yahoo & Microsoft for describing Web content.
33
see it
Def: Dereferenceable/Cool URI
A dereferenceable URI (also: Cool URI) is a URI that returns an RDF snippet if accessed on the Internet by an RDF client. [W3C/Cool URIs]
Example: accessing http://elvispedia.org/Elvis returns
@prefix e: <http://elvispedia.org/> .
e:Elvis e:sings e:aSong .
e:Elvis e:born e:Tupelo .
...
For this to work, the data has to be stored at the domain of the URI.
Try, e.g., wget http://dbpedia.org/resource/Elvis_Presley -O elvis.rdf --header="Accept: application/rdf+xml"
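The same content negotiation can be done from a program. The sketch below uses Python's standard `urllib.request`; the helper names `build_rdf_request` and `fetch_rdf` are hypothetical, and whether a given server actually answers with RDF depends on that server.

```python
from urllib.request import Request, urlopen


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Build an HTTP request that asks for RDF instead of HTML."""
    return Request(uri, headers={"Accept": media_type})


def fetch_rdf(uri, media_type="application/rdf+xml", timeout=10):
    """Dereference the URI and return the response body as text (needs network access)."""
    with urlopen(build_rdf_request(uri, media_type), timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_rdf_request("http://dbpedia.org/resource/Elvis_Presley")
print(request.full_url, request.get_header("Accept"))
# rdf_xml = fetch_rdf("http://dbpedia.org/resource/Elvis_Presley")  # performs the actual request
```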
Cool URIs can be traversed
A KB can refer to an entity of another KB:
@prefix e: <http://elvispedia.org/> .
@prefix d: <http://dbpedia.org/> .
e:Priscilla e:loves d:MikeStone .
...
Dereferencing http://dbpedia.org/MikeStone then returns:
@prefix d: <http://dbpedia.org/> .
@prefix rdf: <http://w3c.org/.../rdf> .
d:MikeStone rdf:type d:KarateClown .
d:MikeStone d:livesIn d:LosAngeles .
...
(The real URI of DBpedia is http://dbpedia.org/resource/)
The standard vocabularies (RDF, RDFS, schema.org, Creative Commons, etc.) all provide
dereferenceable URIs, as do many KBs.
37
KB1
KB2
KB3
KB4
Everybody can create KBs & URIs
birthDate
1935
type
RockSinger
married
1935-01-08
born
Singer
type
plays
YAGO
ElvisPedia
38
39
Distinct URIs => No use
Who is the spouse of the guitar player?
Def: Knowledge Base Alignment
KB alignment (also: KB mapping, KB linking) is the task of mapping the entities, classes, and relations of one KB to their counterparts in the other.
The links are expressed by owl:sameAs (entities), rdfs:subClassOf (classes), and rdfs:subPropertyOf (relations).
OWL and RDFS are standard vocabularies for the linking.
[Figure: the YAGO and ElvisPedia graphs (birthDate, born, type, RockSinger, Singer, married, plays), with links between their entities, classes, and relations]
Match classes, entities, & relations
"Elvis"
"Elvis"
name
label
41
There are numerous approaches for KB linking. We show here
F. Suchanek, S. Abiteboul, P. Senellart:
“PARIS: Probabilistic Alignment of Relations, Instances, and Schema”
VLDB 2012
... which is still the state of the art.
1. Match literals (either by identity or with a similarity function)
Identical literals are equivalent by definition.
2. Assume small equivalence of all relations
What about matching the entities?
What does it mean that both Elvises share the same name?
What does it mean if both Elvises share the same birth year?
Def: Local Functionality
The local functionality of a relation r and a subject s is one over the number of its objects, i.e., one over the number of y with r(s, y).
Example: Elvis was born in one year (1935) and sang 100 songs (“All shook up”, ...(98 more songs)..., “Let’s have a party”):
fun(Elvis, born)=1
fun(Elvis, sang)=0.01
F. Suchanek, S. Abiteboul, P. Senellart: “PARIS: Probabilistic Alignment of Relations, Instances, and Schema”
Def: Functionality
The functionality of a relation r is the harmonic mean of the local functionalities for all its subjects.
It is equivalent to the number of its subjects divided by the number of its facts:
fun(r) = #{x : ∃y: r(x, y)} / #{(x, y) : r(x, y)}
Example:
fun(hasBirthDate)=1 (exactly one object per subject)
fun(hasDeathDate)=1 (at most one object per subject)
fun(hasNationality)=0.9 (few objects per subject)
fun(hasFriend)=0.2 (several objects per subject)
F. Suchanek, S. Abiteboul, P. Senellart: “PARIS: Probabilistic Alignment of Relations, Instances, and Schema”
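Both definitions can be checked on a toy KB. The sketch below computes the local functionality, the functionality as subjects divided by facts, and the harmonic mean of the local functionalities, which should agree. The function names and the second singer (`Lisa` with one song) are made up for illustration.

```python
import math


def local_functionality(facts, relation, subject):
    """One over the number of distinct objects of the subject for the relation."""
    objects = {o for (s, r, o) in facts if r == relation and s == subject}
    return 1 / len(objects)


def functionality(facts, relation):
    """Number of distinct subjects divided by the number of distinct facts."""
    pairs = {(s, o) for (s, r, o) in facts if r == relation}
    subjects = {s for (s, _) in pairs}
    return len(subjects) / len(pairs)


def harmonic_mean_of_local(facts, relation):
    """Harmonic mean of the local functionalities over all subjects of the relation."""
    subjects = {s for (s, r, _) in facts if r == relation}
    return len(subjects) / sum(1 / local_functionality(facts, relation, s) for s in subjects)


# Toy KB: Elvis has one birth year and 100 songs; Lisa (hypothetical data) has one song.
facts = [("Elvis", "born", 1935), ("Lisa", "sang", "Lights Out")]
facts += [("Elvis", "sang", f"song {i}") for i in range(100)]

print(local_functionality(facts, "born", "Elvis"))
print(local_functionality(facts, "sang", "Elvis"))
print(functionality(facts, "sang"), harmonic_mean_of_local(facts, "sang"))
```

The inverse functionality can be obtained with the same functions by swapping subject and object in every fact.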
Def: Inverse Functionality
The inverse local functionality for an object y and a relation r is one over the number of x with r(x, y).
The inverse functionality of a relation r is defined analogously to the functionality.
Example (for the objects "Elvis" and 1935):
ifun(name)=0.9
ifun(born)=0.1
3. If subjects share a relation that is highly inverse functional, and the object is matched, then match the subjects.
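One way to turn this rule into a number is to treat every shared fact as independent evidence whose strength is the inverse functionality of the relation. The sketch below is a strong simplification in the spirit of PARIS, not the published method: it assumes that relations carry the same name in both KBs, that objects match only if they are identical, and it uses the inverse functionalities from the example above. The function `match_score` and the entity names are hypothetical.

```python
import math

# Inverse functionalities, taken from the example on the slide.
IFUN = {"name": 0.9, "born": 0.1}


def match_score(facts1, facts2, x1, x2, ifun=IFUN):
    """Score in [0, 1] that x1 (from KB 1) and x2 (from KB 2) denote the same entity."""
    prob_no_evidence_holds = 1.0
    for (s1, r1, o1) in facts1:
        for (s2, r2, o2) in facts2:
            # Shared fact: same relation, identical object, on the two candidates.
            if s1 == x1 and s2 == x2 and r1 == r2 and o1 == o2:
                prob_no_evidence_holds *= 1 - ifun.get(r1, 0.0)
    return 1 - prob_no_evidence_holds


kb1 = [("kb1:Elvis", "name", "Elvis"), ("kb1:Elvis", "born", "1935")]
kb2 = [("kb2:Elvis", "name", "Elvis"), ("kb2:Elvis", "born", "1935")]
kb3 = [("kb3:Priscilla", "name", "Priscilla"), ("kb3:Priscilla", "born", "1935")]

print(match_score(kb1, kb2, "kb1:Elvis", "kb2:Elvis"))      # shares name and birth year
print(match_score(kb1, kb3, "kb1:Elvis", "kb3:Priscilla"))  # shares only the birth year
```

Sharing a name counts for much more than sharing a birth year, because few entities share a name while many share a birth year.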
4. If relations share many pairs, increase their match
5. Iterate
6. Compute class subsumption (based on the overlap of entities)
Numerous other approaches exist (e.g. based on name similarity).
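A simple overlap-based subsumption score is the fraction of instances of one class that are also instances of the other. The sketch below assumes that the entities of both KBs have already been matched and therefore carry the same identifiers; the function name and the instance sets are invented for illustration.

```python
def subsumption_score(instances_sub, instances_super):
    """Fraction of the instances of the first class that are also in the second class."""
    if not instances_sub:
        return 0.0
    return len(instances_sub & instances_super) / len(instances_sub)


# Hypothetical instance sets after entity matching.
american_singer = {"Elvis", "Madonna"}
singer = {"Elvis", "Madonna", "Alizee"}

print(subsumption_score(american_singer, singer))  # AmericanSinger inside singer
print(subsumption_score(singer, american_singer))  # singer inside AmericanSinger
```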
Match classes, entities, & relations
singer
AmericanSinger
type
type
54
Def: Linked Open Data Project
The goal of W3C’s Linked Open Data Project is to publish and link open KBs.
The project links equivalent entities and equivalent relations across different KBs.
W3C Task Force
55
This arrow means:
equivalent entities between iServe
and DBpedia have been linked.
The Linked Open Data Project
lod-cloud.net
56
As of 2023: 1000 KBs
Existing KBs include
• US census data
• BBC music database
• Gene ontologies
• general knowledge (YAGO etc.)
• UK government data
• geographical data in abundance
• national library catalogs (USA, Germany etc.)
• publications (DBLP)
• commercial products
• all Pokemons
...and many more
SPARQL
58
SPARQL (the “SPARQL Protocol and RDF Query Language”) is the standard query language for RDF.
# Definition of namespace prefixes
PREFIX yago: <http://yago-knowledge.org/resource/>
PREFIX schema: <http://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Output variables (variables start with “?”)
SELECT ?pred ?obj WHERE {
  # One or more triple patterns
  yago:Elvis_Presley ?pred ?obj .
}
LIMIT 20
Other keywords include UNION, GRAPH (for named graphs), FROM (to specify a source KB), and OPTIONAL (to make triple patterns optional).
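What a single triple pattern does can be illustrated without a triple store. The sketch below matches one pattern against an in-memory list of triples and returns the variable bindings; the function `match_pattern` and the two facts about Elvis are made up, and real SPARQL engines support far more (joins, filters, the keywords listed above).

```python
from itertools import islice


def match_pattern(graph, pattern):
    """Yield one binding (dict from variable to value) per triple that matches the pattern."""
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable matches anything, but must be bound consistently.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


# Hypothetical facts, written as CURIE strings.
graph = [
    ("yago:Elvis_Presley", "rdf:type", "schema:Person"),
    ("yago:Elvis_Presley", "rdfs:label", "Elvis"),
    ("yago:Priscilla_Presley", "rdf:type", "schema:Person"),
]

# SELECT ?pred ?obj WHERE { yago:Elvis_Presley ?pred ?obj . } LIMIT 20
results = list(islice(match_pattern(graph, ("yago:Elvis_Presley", "?pred", "?obj")), 20))
for row in results:
    print(row["?pred"], row["?obj"])
```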
Running SPARQL queries
59
SPARQL queries can be run
• on a local triple store (e.g., Jena)
• programmatically in a (Java or Python) program
• on a SPARQL endpoint via API
https://query.wikidata.org/bigdata/namespace/wdq/sparql?query=SELECT%20?s%20?p%20?o%20WHERE%20{?s%20?p%20?o}%20LIMIT%2010
• on a SPARQL endpoint with a user interface
https://www.w3.org/wiki/SparqlEndpoints
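The API call is an ordinary HTTP GET with the query passed as a URL parameter. The sketch below only builds the URL with Python's standard library and does not send it; the helper `sparql_url` is hypothetical, and the endpoint address is the one shown above.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs

ENDPOINT = "https://query.wikidata.org/bigdata/namespace/wdq/sparql"


def sparql_url(query, endpoint=ENDPOINT):
    """Return the GET URL that submits the query to the endpoint."""
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    return endpoint + "?" + urlencode({"query": query}, quote_via=quote)


query = "SELECT ?s ?p ?o WHERE {?s ?p ?o} LIMIT 10"
url = sparql_url(query)
print(url)
```

The resulting URL can then be fetched with any HTTP client; the response format depends on the `Accept` header and on the endpoint.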
>owl
Try it out!
Def: Reasoning on RDFS data
RDFS comes with built‐in reasoning rules that allow deducing facts from existing facts.
The most important rules are:
• transitivity of rdfs:subClassOf
• transitivity of class membership (an instance of a class is also an instance of its superclasses)
• class membership by domain/range constraints
• sub‐properties
[Figure: example graph with the classes actor, person, entity, and parent, the entity Lisa, and the properties hasChild and knows, connected by rdf:type, rdfs:subClassOf, rdfs:domain, and rdfs:subPropertyOf]
[RDF Schema]
The facts that follow from these rules are implicitly deduced information.
RDFS can deduce only positive information! It cannot say that something should not be the case!
[RDF Schema]
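The four rules can be applied repeatedly until no new fact appears. The sketch below is a naive forward-chaining loop over string triples, not a complete RDFS reasoner; the function `rdfs_closure` is hypothetical, and the subject `Elvis` is an assumption, since the figure does not name the entity that has the child Lisa.

```python
def rdfs_closure(triples):
    """Apply the four RDFS rules from the slide until a fixpoint is reached."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                # transitivity of rdfs:subClassOf
                if p == "rdfs:subClassOf" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdfs:subClassOf", o2))
                # class membership is inherited along rdfs:subClassOf
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdf:type", o2))
                # class membership by domain and range constraints
                if p2 == "rdfs:domain" and p == s2:
                    new.add((s, "rdf:type", o2))
                if p2 == "rdfs:range" and p == s2:
                    new.add((o, "rdf:type", o2))
                # sub-properties
                if p2 == "rdfs:subPropertyOf" and p == s2:
                    new.add((s, o2, o))
        if new <= facts:
            return facts
        facts |= new


kb = {
    ("actor", "rdfs:subClassOf", "person"),
    ("person", "rdfs:subClassOf", "entity"),
    ("Elvis", "rdf:type", "actor"),          # assumed subject
    ("Elvis", "hasChild", "Lisa"),
    ("hasChild", "rdfs:domain", "parent"),
    ("hasChild", "rdfs:subPropertyOf", "knows"),
}
deduced = rdfs_closure(kb) - kb
for fact in sorted(deduced):
    print(fact)
```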
OWL
62
The Web Ontology Language (OWL) is a family of reasoning languages for KBs, which can also express contradictions.
Nobody can be both a person and a location.
OWL can be written in several syntaxes:
• Description Logics:
person ⊓ location ≡ ⊥
• Functional syntax:
DisjointClasses( :Person :Location)
• RDF syntax:
:person owl:disjointWith :location
• XML syntax:
<Ontology><Prefix ...><Declaration>...
[OWL Primer]
OWL Profiles
63
OWL comes in different flavors:
• OWL EL: for KBs with many properties and classes
• OWL QL: for KBs with many instances
• OWL RL: for rules
Different OWL profiles allow different types of statements. For example, OWL 2 QL supports the following axioms: [OWL 2 Profiles]
• subclass axioms (SubClassOf)
• class expression equivalence (EquivalentClasses)
• class expression disjointness (DisjointClasses)
• inverse object properties (InverseObjectProperties)
• ...
These restrictions are there to make reasoning decidable, and to limit the reasoning complexity.
Def: Specifying constraints with SHACL
The Shapes Constraint Language (SHACL) is a language for describing constraints on RDF data.
schema:Person rdf:type rdfs:Class, shacl:NodeShape ;
    shacl:property [                      # new entity, which describes a predicate
        shacl:path schema:worksFor ;      # predicate that is concerned
        shacl:node schema:Organization    # range of the predicate
    ] ;
    shacl:property [
        shacl:path schema:birthDate ;
        shacl:datatype xsd:date ;         # range is literal
        shacl:maxCount 1 ;                # maximum number of objects
        shacl:minCount 1                  # minimum number of objects
    ] .
Violations of SHACL constraints are flagged by a validator.
SHACL constraints are part of the KB, in RDF!
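What a validator does with the birthDate constraint can be sketched in a few lines. This is not a SHACL engine: the shape is hard-coded as a Python dictionary instead of being read from RDF, the `shacl:node` constraint is left out, and the names `PERSON_SHAPE` and `validate` as well as the sample facts are hypothetical.

```python
from datetime import date

# The birthDate constraints from the shape above, as a plain dictionary.
PERSON_SHAPE = {
    "schema:birthDate": {"min_count": 1, "max_count": 1, "datatype": date},
}


def validate(entity, triples, shape=PERSON_SHAPE):
    """Return a list of human-readable violations for the entity (empty if it conforms)."""
    violations = []
    for path, rule in shape.items():
        values = [o for (s, p, o) in triples if s == entity and p == path]
        if len(values) < rule["min_count"]:
            violations.append(f"{path}: fewer than {rule['min_count']} value(s)")
        if len(values) > rule["max_count"]:
            violations.append(f"{path}: more than {rule['max_count']} value(s)")
        for value in values:
            if not isinstance(value, rule["datatype"]):
                violations.append(f"{path}: {value!r} has the wrong datatype")
    return violations


data = [
    ("y:Elvis", "schema:birthDate", date(1935, 1, 8)),
    ("y:Lisa", "schema:birthDate", "sometime in 1968"),
]
print(validate("y:Elvis", data))      # conforms
print(validate("y:Lisa", data))       # wrong datatype
print(validate("y:Priscilla", data))  # birth date missing
```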
Specifying constraints with SHACL
The same constraints in graphical form:
[Figure: schema:Person has two shacl:property edges, each leading to an anonymous entity. The first anonymous entity has shacl:path schema:worksFor (the predicate) and shacl:node schema:Organization (the range of the predicate). The second has shacl:path schema:birthDate, shacl:datatype xsd:date, and shacl:maxCount 1.]
How do we get HTML pages to RDF?
67
?
?
?
>details
-> JSON-LD
Def: RDFa
RDFa is a syntax to annotate HTML pages with RDF.
RDFa Lite
68
>details
<div>
Martin Thunderbird<br>
Researcher in Rock’N’Roll Music of 1935-1977<br>
Memphis, Tennessee
</div>
Defining the vocabulary
All local names in an HTML node live in the namespace given by “vocab”.
<div vocab="http://schema.org/">
Martin Thunderbird<br>
Researcher in Rock’N’Roll Music of 1935-1977<br>
Memphis, Tennessee
</div>
69
>details
Defining the subject
All properties in the HTML node take as subject the entity given by
“resource”.
<div vocab="http://schema.org/"
resource="http://martin.org/me">
Martin Thunderbird<br>
Researcher in Rock’N’Roll Music of 1935-1977<br>
Memphis, Tennessee
</div>
70
>details
Defining a type
The type of the subject is given by “typeof”.
<div vocab="http://schema.org/"
resource="http://martin.org/me" typeof="Person">
Martin Thunderbird<br>
Researcher in Rock’N’Roll Music of 1935-1977<br>
Memphis, Tennessee
</div>
<http://martin.org/me> rdf:type <http://schema.org/Person> .
71
>details
Defining a fact with a literal object
A tag with “property” defines a fact between subject and that tag’s
text value.
<div vocab="http://schema.org/"
resource="http://martin.org/me" typeof="Person">
<span property="name">Martin</span><br>
Researcher in Rock’N’Roll Music of 1935-1977<br>
Memphis, Tennessee
</div>
<http://martin.org/me> <http://schema.org/name> "Martin" .
72
>details
Defining a fact with an entity object
A tag with “property” and “resource” defines a fact between subject
and URI.
<div vocab="http://schema.org/"
resource="http://martin.org/me" typeof="Person">
<span property="name">Martin Th</span><br>
<span property="homeLocation" resource=
"http://yago.org/Memphis">Memphis</span>
</div>
<http://martin.org/me> <http://schema.org/homeLocation>
<http://yago.org/Memphis> .
73
>details
Nested facts
A tag with “property” and “typeof” creates a new entity.
...
<span property="address" typeof="PostalAddress">
<span property="streetAddress">42 Elvis Rd</span>
<span property="postalCode">12345</span>
</span>
<http://martin.org/me> <http://schema.org/address> ADR .
ADR rdf:type <http://schema.org/PostalAddress> .
ADR <http://schema.org/streetAddress> "42 Elvis Rd" .
ADR <http://schema.org/postalCode> "12345" .
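The way the RDFa attributes combine into triples can be sketched with Python's standard `html.parser`. The class `RdfaLiteParser` below is a hypothetical, deliberately small illustration: it handles “vocab”, “resource”, “typeof”, and “property” on flat markup only, and it does not support nested entities or the full RDFa processing rules.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class RdfaLiteParser(HTMLParser):
    """Collect triples from flat RDFa Lite markup (no nested entities)."""

    def __init__(self):
        super().__init__()
        self.vocab = ""
        self.subject = None
        self.pending = None  # property whose text value is being collected
        self.text = []
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)  # attribute names arrive in lower case
        if "vocab" in a:
            self.vocab = a["vocab"]
        if "property" in a:
            prop = self.vocab + a["property"]
            if "resource" in a:
                # fact with an entity object
                self.triples.append((self.subject, prop, a["resource"]))
            else:
                # fact with a literal object: wait for the text of the tag
                self.pending, self.text = prop, []
        elif "resource" in a:
            self.subject = a["resource"]
            if "typeof" in a:
                self.triples.append((self.subject, RDF_TYPE, self.vocab + a["typeof"]))

    def handle_data(self, data):
        if self.pending:
            self.text.append(data)

    def handle_endtag(self, tag):
        if self.pending:
            self.triples.append((self.subject, self.pending, "".join(self.text).strip()))
            self.pending = None


HTML = """
<div vocab="http://schema.org/" resource="http://martin.org/me" typeof="Person">
  <span property="name">Martin</span><br>
  <span property="homeLocation" resource="http://yago.org/Memphis">Memphis</span>
</div>
"""

parser = RdfaLiteParser()
parser.feed(HTML)
for triple in parser.triples:
    print(triple)
```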
74
>details
Def: JSON-LD
JSON-LD is a JSON-based format for RDF facts. [json-ld.org, W3C specification]
{
"@context": {
"@vocab": "http://schema.org/",
"foaf": "http://xmlns.com/foaf/0.1/"
},
"@id": "http://martin.org",
"@type": "http://schema.org/Person",
"name": "Martin Thunderbird",
"homepage": "http://martin.org"
}
• "@context" defines the schema
• "@vocab" says that all properties and values are relative to schema.org
• "foaf" defines a prefix
• "@id" defines the URI of this resource
• "@type" defines the type of this resource
• "name" and "homepage" define facts about this resource
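How such a document maps to triples can be sketched with Python's `json` module. The function `to_triples` below is a hypothetical simplification, not a conforming JSON-LD processor: it assumes a flat document whose “@context” is a dictionary, and it treats every value as a plain literal.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

DOCUMENT = """
{
  "@context": {
    "@vocab": "http://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/"
  },
  "@id": "http://martin.org",
  "@type": "http://schema.org/Person",
  "name": "Martin Thunderbird",
  "homepage": "http://martin.org"
}
"""


def to_triples(doc):
    """Turn a flat JSON-LD document into (subject, predicate, object) triples."""
    context = doc.get("@context", {})
    vocab = context.get("@vocab", "")

    def expand(term):
        prefix, sep, local = term.partition(":")
        if sep and prefix in context:
            return context[prefix] + local  # prefixed name such as foaf:name
        if sep:
            return term                     # already a full URI
        return vocab + term                 # plain term, relative to @vocab

    subject = doc["@id"]
    triples = []
    for key, value in doc.items():
        if key == "@type":
            triples.append((subject, RDF_TYPE, expand(value)))
        elif not key.startswith("@"):
            triples.append((subject, expand(key), value))
    return triples


triples = to_triples(json.loads(DOCUMENT))
for triple in triples:
    print(triple)
```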
JSON-LD in HTML
JSON-LD can be embedded in HTML.
<script type="application/ld+json">
{
"@context": {
"@vocab": "http://schema.org/",
"foaf": "http://xmlns.com/foaf/0.1/"
},
"@id": "http://martin.org",
"@type": "http://schema.org/Person",
"name": "Martin Thunderbird",
"homepage": "http://martin.org"
}
</script>
Advantages:
• less messy than RDFa
• encouraged by Google
Disadvantages:
• danger of inconsistency between visible HTML and JSON-LD
Search engines scrape embedded RDF
JSON-LD embedded in Web page:
<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Product",
"name": "Apple iPhone X",
"description": "iPhone X is an overdue and winning evolution of the iPhone, but you’ll need to leave your comfort zone to make a jump into the face-recognizing future.",
"image": "https://cnet1.cbsistatic.com/img/ZQICw4aW2fNpbmN34CSTJrUgcQA=/270x203/2017/11/10/0eae3ae5-43cc-4dcb-ab69-3009d696f27e/iphone-x.jpg",
"brand": {
"@type": "Thing",
...
Search engines read licenses
Search engines read licenses
81
>more
Facebook Like Button uses RDFa
@prefix og: <http://ogp.me/ns#> .
<http://www.imdb.com/title/tt0167923/?ref=fnaltt2> og:description
"A 1973 concert by Elvis Presley taped in Honolulu, Hawaii";
og:sitename "IMDb";
og:title "Elvis: Aloha from Hawaii (1973)";
og:type "video.tv-show";
og:url "http://www.imdb.com/title/tt0167923/";
ns1:fbmlapp_id "115109575169727" .
83
>more
Web Data Commons
84
The Web Data Commons project extracts structured data from the Common Crawl
(the largest public Web corpus).
# Websites with annotations in different formats
Half of all crawled Web sites provide annotations
34m Web sites provide 100b triples about 20b entities
# Websites with entities by type
The Semantic Web
• The Semantic Web is a collection of standards to describe facts of a knowledge base in an unambiguous form.
• Linked Data is a collection of knowledge bases that are interlinked.
• JSON-LD allows annotating Web pages with triples, and many Web sites do it.
References
W3C: RDF
W3C: Semantic Web
W3C: RDFa lite
JSON-LD
Linked Data
W3C: RDFS
87