Wikibase Data Model & RaiseWikibase functions
The Wikibase Data Model is an ontology describing the structure of the data in Wikibase. A non-technical summary of the Wikibase model is available at DataModel/Primer. The initial conceptual specification for the Data Model was created by Markus Krötzsch and Denny Vrandečić, with minor contributions by Daniel Kinzler and Jeroen De Dauw. The Wikibase Data Model has been implemented by Jeroen De Dauw and Thiemo Kreuz as Wikimedia Germany employees for the Wikidata project.
RaiseWikibase provides functions that build the elements of the Wikibase Data Model:
from RaiseWikibase.datamodel import label, alias, description, snak, claim, entity
The functions entity, claim, snak, description, alias and label return template dictionaries, so all basic Python dictionary operations can be used on them. You can merge two dictionaries X and Y using X | Y (since Python 3.9), {**X, **Y} (since Python 3.5) or X.update(Y).
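The three merge idioms differ in one detail that matters when building labels: the first two return a new dictionary, while update changes X in place and returns None. A minimal sketch with plain dictionaries follows; the nested shape is taken from the Wikibase JSON format for labels and is an assumption about what label returns:

```python
# Plain dictionaries standing in for label('en', ...) and label('de', ...).
# The nested shape is an assumption based on the Wikibase JSON format.
X = {'en': {'language': 'en', 'value': 'organization'}}
Y = {'de': {'language': 'de', 'value': 'Organisation'}}

merged_pipe = X | Y          # new dict, Python 3.9+
merged_unpack = {**X, **Y}   # new dict, Python 3.5+

# update() mutates its receiver and returns None,
# so "labels = X.update(Y)" would leave labels set to None.
Z = dict(X)                  # shallow copy so that X stays untouched
result = Z.update(Y)

print(merged_pipe == merged_unpack == Z)
print(result)
```

If the same language key occurs in both dictionaries, all three idioms keep the value from Y.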
Take the Wikidata entity Q43229, whose English label is ‘organization’, as an example. You can create both English and German labels for the entity in a local Wikibase instance using RaiseWikibase:
labels = {**label('en', 'organization'), **label('de', 'Organisation')}
Multiple English and German aliases can also be easily created:
aliases = alias('en', ['organisation', 'org']) | alias('de', ['Org', 'Orga'])
Multilingual descriptions can be added:
descriptions = description('en', 'social entity (not necessarily commercial)')
descriptions.update(description('de', 'soziale Struktur mit einem gemeinsamen Ziel'))
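For orientation, the sketch below shows the shape these three dictionaries would have if the helper functions follow the Wikibase JSON format. That is an assumption here, not something taken from the RaiseWikibase source. The make_term and make_aliases helpers are hypothetical stand-ins so that the example runs without a Wikibase installation:

```python
import json

def make_term(language, value):
    # Hypothetical stand-in for label() and description().
    return {language: {'language': language, 'value': value}}

def make_aliases(language, values):
    # Hypothetical stand-in for alias(): one entry per alias string.
    return {language: [{'language': language, 'value': v} for v in values]}

labels = {**make_term('en', 'organization'), **make_term('de', 'Organisation')}
aliases = make_aliases('en', ['organisation', 'org']) | make_aliases('de', ['Org', 'Orga'])
descriptions = make_term('en', 'social entity (not necessarily commercial)')
descriptions.update(make_term('de', 'soziale Struktur mit einem gemeinsamen Ziel'))

# Each dictionary is keyed by language code.
print(json.dumps({'labels': labels, 'aliases': aliases,
                  'descriptions': descriptions}, indent=2, ensure_ascii=False))
```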
To add statements (claims), qualifiers and references, we need the snak function. A snak is defined by its property, datavalue, datatype and snaktype, which are passed as the prop, value, datatype and snaktype arguments. For example, if a Wikibase instance has a property with the ID P1, the label ‘Wikidata ID’ and the datatype external-id, we can create a mainsnak with that property and the value ‘Q43229’:
mainsnak = snak(datatype='external-id', value='Q43229', prop='P1', snaktype='value')
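In the Wikibase JSON format, a snak with snaktype 'value' carries a datavalue, while 'novalue' and 'somevalue' snaks omit it. The sketch below illustrates that layout with a hypothetical make_snak helper. It is not the RaiseWikibase implementation, and it only covers datatypes whose value is a plain string:

```python
def make_snak(datatype, value, prop, snaktype='value'):
    # Hypothetical stand-in that mirrors the argument names of snak() above.
    result = {'snaktype': snaktype, 'property': prop, 'datatype': datatype}
    if snaktype == 'value':
        # String-like datatypes such as 'external-id' store the raw string.
        result['datavalue'] = {'value': value, 'type': 'string'}
    return result

known = make_snak('external-id', 'Q43229', 'P1')
unknown = make_snak('external-id', None, 'P1', snaktype='somevalue')

print(known)
print('datavalue' in unknown)
```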
Qualifiers and references are lists of snaks. As an example, let’s reuse the same snak for both:
qualifiers = [snak(datatype='external-id', value='Q43229', prop='P1', snaktype='value')]
references = [snak(datatype='external-id', value='Q43229', prop='P1', snaktype='value')]
We now have a mainsnak, qualifiers and references. Let’s create a claim for an item:
claims = claim(prop='P1', mainsnak=mainsnak, qualifiers=qualifiers, references=references)
If you need a claim with multiple values for one property, there are two options. The first is to use the list method extend:
claims1 = claim(prop='P1', mainsnak=mainsnak1, qualifiers=qualifiers1, references=references1)
claims2 = claim(prop='P1', mainsnak=mainsnak2, qualifiers=qualifiers2, references=references2)
claims1['P1'].extend(claims2['P1'])
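The reason for extend rather than a dictionary merge is that both claims are stored under the same key 'P1', so claims1 | claims2 would keep only the second list. A minimal sketch with plain dictionaries, in which the statement contents are reduced to placeholder strings:

```python
# Placeholder claims dictionaries; real statements are nested dictionaries.
claims1 = {'P1': ['statement with value Q43229']}
claims2 = {'P1': ['statement with value Q5']}

# Merging replaces the whole list stored under 'P1'.
overwritten = claims1 | claims2

# extend() appends to the existing list in place, keeping both values.
claims1['P1'].extend(claims2['P1'])

print(len(overwritten['P1']))
print(len(claims1['P1']))
```

Because extend works in place, claims1 is the dictionary to pass on afterwards; claims2 is left unchanged.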
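The second option is to use the mainsnak and statement functions. Neither appears in the import line shown earlier, so they have to be imported as well. Be aware that the variable mainsnak assigned in the example above would shadow the function of the same name in the same session. With both functions available: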
snak1 = snak(datatype='external-id', value='Q43229', prop='P1', snaktype='value')
snak2 = snak(datatype='external-id', value='Q5', prop='P1', snaktype='value')
mainsnak1 = mainsnak(prop='P1', snak=snak1, qualifiers=[], references=[])
mainsnak2 = mainsnak(prop='P1', snak=snak2, qualifiers=[], references=[])
statements = statement(prop='P1', mainsnaks=[mainsnak1, mainsnak2])
Note that the claim and statement functions return the same template dictionaries, but their input parameters differ. The claim function is convenient when each property has a single value; multiple values per property are easier to create with the statement function.
All ingredients for the JSON representation of an item are now ready. The entity function assembles them:
item = entity(labels=labels, aliases=aliases, descriptions=descriptions, claims=claims, etype='item')
where claims=claims can be replaced by claims=statements.
When creating a property, its datatype must also be specified:
property = entity(labels=labels, aliases=aliases, descriptions=descriptions,
                  claims=claims, etype='property', datatype='string')
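In the Wikibase JSON format, the entity type and, for properties, the datatype sit at the top level of the entity. The sketch below shows that difference with a hypothetical make_entity helper. The ValueError check is an illustrative choice, not documented RaiseWikibase behaviour, and only labels are included for brevity:

```python
import json

def make_entity(labels, etype='item', datatype=None):
    # Hypothetical stand-in for entity(); not the RaiseWikibase implementation.
    result = {'type': etype, 'labels': labels}
    if etype == 'property':
        if datatype is None:
            raise ValueError('a property needs a datatype')
        result['datatype'] = datatype
    return result

labels = {'en': {'language': 'en', 'value': 'Wikidata ID'}}
item_json = json.dumps(make_entity(labels))
property_json = json.dumps(make_entity(labels, etype='property',
                                       datatype='external-id'))

print(property_json)
```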
Note that these functions only create the dictionaries for the corresponding elements of the Wikibase Data Model. Writing them to the database is done with the page and batch functions.