Product Scheme Classifications

Continuing the work to promote the CPV as a linked dataset, we have finished the first beta release of new product scheme classifications (PSCs) as linked data in the context of e-procurement. The following diagram shows the ongoing work on the transformation of PSCs (the gray ones are not yet transformed):

The process of promoting all these PSCs (more information can be found in pscs-catalogue at thedatahub.org) has been carried out stepwise (similar to http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook):

  1. Select the PSCs to be transformed and download the data source (MS Excel in most cases)
  2. Model the information about a PSC item using existing vocabularies. If required, new concepts and relations can be defined, as in the CPV case. This step also covers URI design
  3. Transform the data using Google Refine
  4. Create the mappings between a PSC and the Product Ontology (custom Java-based reconciliator adapted to the descriptions of PSC items)
  5. Create the mappings between a PSC and the CPV 2008 (custom Java-based reconciliator between a source PSC and a target PSC)
  6. Validate mappings and links
  7. Add dataset descriptions using VoID vocabulary
  8. Store in Virtuoso and publish data with Pubby
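
Steps 4 and 5 rely on matching item descriptions between classifications. The custom Java-based reconciliator is not described in this post, so the following Python sketch is only an illustrative stand-in: it scores candidate matches with a simple token-overlap (Jaccard) measure. The function names, the threshold and the sample labels are all hypothetical.

```python
import re


def tokens(label):
    """Lower-case a description and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", label.lower()))


def jaccard(a, b):
    """Jaccard similarity between the token sets of two descriptions."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(source_label, targets, threshold=0.2):
    """Return (target_id, score) for the best-scoring target, or None.

    `targets` maps hypothetical target item ids to their descriptions.
    The threshold is an arbitrary illustrative cut-off, not a project value.
    """
    best = None
    for target_id, target_label in targets.items():
        score = jaccard(source_label, target_label)
        if score >= threshold and (best is None or score > best[1]):
            best = (target_id, score)
    return best


# Made-up ids and labels, for illustration only.
targets = {
    "a": "Computers and office equipment",
    "b": "Agricultural machinery",
}
match = best_match("Computer equipment and supplies", targets)
```

A score of this kind is one possible basis for weighting a mapping, although a real reconciliator would likely need stemming, stop-word handling and manual validation (step 6).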

A PSC item (?product) is defined by the following URI patterns and properties:

  • URI for datasets: http://purl.org/weso/pscs/{psc}/{year|version}/resource/ds
  • URI for resources: http://purl.org/weso/pscs/{psc}/{year|version}/resource/{id}
  • URI for classes and properties: http://purl.org/weso/pscs/{psc}/{year|version}/ontology/
  • rdf:type <pscs:PSCConcept> (rdf:type skos:Concept)
  • dcterms:identifier “id” (the id that is part of the URI)
  • skos:notation “raw id” (the real id that appears in the data source)
  • skos:prefLabel, gr:description and rdfs:label “description”
  • skos:inScheme <void:Dataset>, <skos:ConceptScheme>
  • skos:broaderTransitive/skos:narrowerTransitive <PSCConcept> (in some cases the broader item cannot be inferred from the codes; for those cases we have defined a custom property called “pscs:level”)
  • pscs:relatedMatch (mapping between ?product and items of ProductOntology). The next release will include a “confidence” value to establish the weight of each match.
  • skos:exactMatch <PSCConcept> (some PSCs already define mappings among themselves; we reuse this information)
  • skos:closeMatch <PSCConcept> (mapping between ?product and items of CPV 2008). The next release will include a “confidence” value to establish the weight of each match.
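
As a rough illustration of the URI patterns and properties listed above, the following Python sketch builds the URIs for one item and prints its description as N-Triples. The scheme name, version, identifiers and label are placeholders chosen for the example, not values taken from the published data. Because the namespace IRI behind the pscs: prefix is not given here, the sketch types the item only as skos:Concept and omits the mapping properties.

```python
BASE = "http://purl.org/weso/pscs"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCTERMS = "http://purl.org/dc/terms/"
GR = "http://purl.org/goodrelations/v1#"


def dataset_uri(psc, version):
    """URI of the dataset for one PSC release."""
    return f"{BASE}/{psc}/{version}/resource/ds"


def resource_uri(psc, version, item_id):
    """URI of a single PSC item."""
    return f"{BASE}/{psc}/{version}/resource/{item_id}"


def literal(value):
    """Serialise a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_item(psc, version, item_id, raw_id, description, broader_id=None):
    """Return N-Triples lines describing one PSC item."""
    subject = f"<{resource_uri(psc, version, item_id)}>"
    triples = [
        (f"<{RDF}type>", f"<{SKOS}Concept>"),
        (f"<{DCTERMS}identifier>", literal(item_id)),
        (f"<{SKOS}notation>", literal(raw_id)),
        (f"<{SKOS}prefLabel>", literal(description)),
        (f"<{GR}description>", literal(description)),
        (f"<{RDFS}label>", literal(description)),
        (f"<{SKOS}inScheme>", f"<{dataset_uri(psc, version)}>"),
    ]
    if broader_id is not None:
        broader = f"<{resource_uri(psc, version, broader_id)}>"
        triples.append((f"<{SKOS}broaderTransitive>", broader))
    return [f"{subject} {predicate} {obj} ." for predicate, obj in triples]


# Placeholder values, for illustration only.
lines = describe_item(
    "cpv", "2008", "30200000", "30200000-1",
    "Computer equipment and supplies", broader_id="30000000",
)
print("\n".join(lines))
```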

The whole linkset of PSCs can be found at http://purl.org/weso/pscs/ and we have also extracted some statistics (PSC void:Dataset, IRI graph and triples):

The definitions have been made using the vocabularies:

The whole linkset uses links to other datasets (151,102):

  • GoodRelations and Product Ontology products and descriptions

In order to create all this data we have used different tools:

Collaborators:

Acknowledgements:
This work is part of the MOLDEAS system developed by the WESO Research Group within the 10ders Information Services partnership project, partially funded by the Spanish Ministry of Industry, Tourism and Trade under code TSI-020100-2010-919 and by the European Regional Development Fund (ERDF) according to the National Plan of Scientific Research, Development and Technological Innovation 2008-2011, led by Gateway Strategic Consultancy Services and developed in cooperation with Exis-TI.

Note:

The initial version of CPV as linked data is still available in order to ensure backward compatibility.

TO DO List

  • Example queries
  • Confidence value in mappings
  • Check broken links
  • Link to other datasets, fix names (case sensitive)
  • Reconcile all products and services with DBpedia resources
  • Update public procurement notices with the new URIs
