[Corpora-List] [ANN] New DBpedia Snapshot 2021-06

DBpedia pr-aksw at informatik.uni-leipzig.de
Fri Jul 23 13:26:12 CEST 2021

Apologies for cross-posting. The full release description, including further statistics, can be found at <https://www.dbpedia.org/blog/snapshot-2021-06-release/>.

We are pleased to announce the immediate availability of a new edition of the free and publicly accessible SPARQL Query Service Endpoint and Linked Data Pages for interacting with the new Snapshot Dataset.
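
As a rough illustration of how the SPARQL endpoint can be queried, the following Python sketch builds a SPARQL 1.1 Protocol GET request using only the standard library. The endpoint URL https://dbpedia.org/sparql, the example query, and the function names are assumptions made for this illustration and are not taken from the announcement; the release blog post remains the authoritative source for access details.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; the announcement itself does not state it.
ENDPOINT = "https://dbpedia.org/sparql"

# Hypothetical example query: fetch a few English labels of one resource.
EXAMPLE_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Leipzig> rdfs:label ?label .
  FILTER (lang(?label) = "en")
} LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL 1.1 Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return the result bindings (needs network access)."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

Calling run_query(EXAMPLE_QUERY) should return a list of bindings, each a dict keyed by variable name (here "label"), as defined by the SPARQL 1.1 Query Results JSON Format.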

What is the “DBpedia Snapshot” Release?

Historically, this release has been associated with many names: "DBpedia Core", "EN DBpedia", and, most confusingly, just "DBpedia". In fact, it is a combination of:

  * EN Wikipedia data: a small but very useful subset (~1 billion triples, or 14%) of the whole DBpedia extraction using the DBpedia Information Extraction Framework <https://github.com/dbpedia/extraction-framework> (DIEF), comprising structured information extracted from the English Wikipedia plus some enrichments from other Wikipedia language editions, notably multilingual abstracts in ar, ca, cs, de, el, eo, es, eu, fr, ga, id, it, ja, ko, nl, pl, pt, sv, uk, ru, zh.

  * Links: 62 million community-contributed cross-references and owl:sameAs links to other linked data sets on the Linked Open Data (LOD) Cloud, which make it possible to find and retrieve further information from the largest decentralized, change-sensitive knowledge graph on earth, which has formed around DBpedia since 2007.

  * Community extensions: community-contributed extensions such as additional ontologies and taxonomies.

Release Frequency & Schedule

Going forward, releases will be scheduled for the 15th of February, May, July, and October (with +/- 5 days tolerance), and are named using the same date convention as the Wikipedia Dumps that served as the basis for the release. An example of the release timeline is shown below:

June 6–8           Wikipedia dumps for June 1 become available on https://dumps.wikimedia.org/
June 8–20          Download and extraction with DIEF
June 20–July 10    Post-processing and quality-control period
July 10–20         Linked Data and SPARQL endpoint deployment
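
The schedule and naming convention described above can be sketched in Python as follows. The function names are hypothetical, and the YYYY-MM name format is inferred from this release's name (2021-06, based on the June 1 dumps) rather than stated as a rule in the announcement.

```python
from datetime import date, timedelta

RELEASE_MONTHS = (2, 5, 7, 10)   # February, May, July, October
TOLERANCE = timedelta(days=5)    # the +/- 5 days mentioned in the schedule


def release_windows(year):
    """Return (earliest, nominal, latest) dates for each planned release."""
    windows = []
    for month in RELEASE_MONTHS:
        nominal = date(year, month, 15)
        windows.append((nominal - TOLERANCE, nominal, nominal + TOLERANCE))
    return windows


def snapshot_name(dump_date):
    """Name a snapshot after its Wikipedia dump date (assumed YYYY-MM format)."""
    return f"{dump_date.year:04d}-{dump_date.month:02d}"


for earliest, nominal, latest in release_windows(2021):
    print(f"{nominal}: expected between {earliest} and {latest}")

print(snapshot_name(date(2021, 6, 1)))  # 2021-06
```

For 2021, the July window computed this way runs from July 10 to July 20, which matches the deployment period in the example timeline.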

Data Freshness

Given the timeline above, the EN Wikipedia data of the DBpedia Snapshot has a lag of 1 to 4 months.
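
To make the lag concrete, here is a small worked example in Python. It assumes that the snapshot is served from the start of the deployment window (July 10) until the next release is deployed, taken here as October 10, the earliest day of the October window. Both dates are illustrative assumptions derived from the schedule, not commitments from the announcement.

```python
from datetime import date


def data_age_days(dump_date, today):
    """Days elapsed between the Wikipedia dump date and a given day."""
    return (today - dump_date).days


dump = date(2021, 6, 1)            # dump date underlying this snapshot
first_deploy = date(2021, 7, 10)   # start of the deployment window
next_deploy = date(2021, 10, 10)   # assumed earliest day of the next release

print(data_age_days(dump, first_deploy))  # 39 days, a bit over 1 month
print(data_age_days(dump, next_deploy))   # 131 days, a bit over 4 months
```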

Further Information

Growth of DBpedia, a breakdown of links by domain, download instructions, and some tips on how to work effectively with DBpedia are published as part of this blog post: <https://www.dbpedia.org/blog/snapshot-2021-06-release/>
