Old-Fashioned Common Procurement Vocabulary 2008 and 2003

The Common Procurement Vocabulary (CPV) establishes a single classification system for public procurement aimed at standardising the references used by contracting authorities and entities to describe the subject of procurement contracts.

The CPV consists of a main vocabulary for defining the subject of a contract and a supplementary vocabulary for adding further qualitative information. The main vocabulary is based on a tree structure comprising codes of up to nine digits (an 8-digit code plus a check digit), each associated with a wording that describes the type of supplies, works or services forming the subject of the contract.

  • The first two digits identify the divisions (XX000000-Y);
  • The first three digits identify the groups (XXX00000-Y);
  • The first four digits identify the classes (XXXX0000-Y);
  • The first five digits identify the categories (XXXXX000-Y).

Each of the last three digits gives a greater degree of precision within each category. A ninth digit serves to verify the previous digits.
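
The level structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cpv_levels` is hypothetical, the input `12345678-9` is a made-up code rather than a real CPV entry, and the check digit is extracted but not verified, because the verification rule is not given here.

```python
import re

# Eight digits, a hyphen and one check digit, e.g. "XXXXXXXX-Y".
CPV_PATTERN = re.compile(r"(\d{8})-(\d)")


def cpv_levels(code: str) -> dict:
    """Split a main-vocabulary code into its hierarchy levels.

    Ancestor codes are returned without a check digit, since that digit
    cannot be derived from the description in this text.
    """
    match = CPV_PATTERN.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a CPV main-vocabulary code: {code!r}")
    digits, check_digit = match.groups()

    def pad(prefix_length: int) -> str:
        # Keep the leading digits and zero-fill the rest up to 8 digits.
        return digits[:prefix_length] + "0" * (8 - prefix_length)

    return {
        "division": pad(2),   # XX000000
        "group": pad(3),      # XXX00000
        "class": pad(4),      # XXXX0000
        "category": pad(5),   # XXXXX000
        "code": digits,       # full 8-digit code
        "check_digit": check_digit,
    }


if __name__ == "__main__":
    # Illustrative input only, not a real CPV entry.
    for level, value in cpv_levels("12345678-9").items():
        print(f"{level}: {value}")
```

Returning the ancestors as zero-padded strings keeps them in the same form as the codes themselves, so they can be used directly as lookup keys once the matching check digits are taken from the published code list.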

The supplementary vocabulary may be used to expand the description of the subject of a contract. The items are made up of an alphanumeric code with a corresponding wording allowing further details to be added regarding the specific nature or destination of the goods to be purchased.

The alphanumeric code is made up of:

  • a first level comprising a letter corresponding to a section;
  • a second level comprising four digits, the first three of which denote a subdivision and the last one being for verification purposes.
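
A supplementary code with the two levels listed above could be split as in the following sketch. The names `SupplementaryCode` and `parse_supplementary` are hypothetical, the sample value `A123-4` is invented for illustration, and the optional hyphen before the verification digit is an assumption borrowed from the main vocabulary notation.

```python
import re
from typing import NamedTuple


class SupplementaryCode(NamedTuple):
    section: str       # first level: one letter
    subdivision: str   # second level: first three digits
    check_digit: str   # second level: last digit, for verification


# One letter, three digits, then the verification digit.
# Assumption: a hyphen may separate the verification digit, as in the main vocabulary.
SUPPLEMENTARY_PATTERN = re.compile(r"([A-Z])(\d{3})-?(\d)")


def parse_supplementary(code: str) -> SupplementaryCode:
    """Split a supplementary-vocabulary code into its parts (no verification)."""
    match = SUPPLEMENTARY_PATTERN.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not a supplementary code: {code!r}")
    return SupplementaryCode(*match.groups())


if __name__ == "__main__":
    # Illustrative input only, not a real supplementary entry.
    print(parse_supplementary("A123-4"))
```

As with the main vocabulary, the verification digit is only extracted, not checked.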

The dataset comprises the CPV 2008 and CPV 2003 codes and the mappings between them. All of this information is publicly available through the WESO SPARQL endpoint (5-star linked data) and a Pubby frontend. The data and definitions are structured as follows:

The definitions have been made using the vocabularies:

The whole dataset uses links to other datasets (28,839):

  • GoodRelations and Product Ontology products and descriptions

In order to create all this data we have used different tools:


This work is part of the MOLDEAS system developed by the WESO Research Group within the 10ders Information Services partnership project, which is partially funded by the Spanish Ministry of Industry, Tourism and Trade (code TSI-020100-2010-919) and the European Regional Development Fund (ERDF) under the National Plan of Scientific Research, Development and Technological Innovation 2008-2011. The project is led by Gateway Strategic Consultancy Services and developed in cooperation with Exis-TI.

TO DO List

  • Check broken links
  • Review the design of URIs
  • Create Named graphs to group different divisions/groups/classes/categories
  • Link to other datasets
  • Reconcile all products and services with DBpedia resources
  • Develop a GUI based on Exhibit, SNORQL, etc.
  • Send this dataset and statistics to the Linked Data Cloud
  • Update public procurement notices with the new URIs

    • Dear Dominique,

      Thank you very much. This is just the first part of a research project and one of the expected outputs of my PhD. We will release new datasets related to e-procurement between now and the end of the year in order to boost the use of LOD in this domain.

      Enriching the CPV and other schemes is not trivial because of the problems that appear when entity reconciliation is performed. I have developed a simple program based on Lucene that reconciles entities by analyzing the descriptions of the CPV codes and then searching the main datasets, such as DBpedia, but the initial results were not good enough, so this part needs to be improved. One of the initial approaches was to align “CPV terms” with WordNet in order to disambiguate the meaning of a description and link terms through WordNet synsets, etc., but it is just a proposal and this improvement will be carried out in a second stage.

      Anyway, if you have any further questions or suggestions, feel free to contact me!

      Thank you very much again!

      Best regards,

  1. Chema, it seems there is a failure after the redirection > "156.35.31.156". Thanks and congratulations for this project. Junta de Andalucía is going to use it.