Travelling to Lima and visiting UPC

Last week, I was invited by Prof. Carlos Raymundo of UPC to give a talk at the University and to take part in some of the courses taught there. I am very thankful to the UPC team for the excellent organization: they created a fantastic environment for collaborating with students and the other people attending the conference. Here is my presentation (in Spanish):

Researching Semantic Web-Overview from Jose María Alvarez

I also took advantage of my visit to this great country to look around Lima and take some pictures. You can find the album on Flickr.

I hope we can establish new collaborations and work together on initiatives related to the semantic web, linked data and research/innovation in general.

I will come back soon!

1st International Congress of Systems Engineering and Computation-Perú

I am really excited to have been invited to speak at the 1st International Congress of Systems Engineering and Computation at Universidad Peruana de Ciencias Aplicadas. I was invited by Prof. Dr. Carlos Raymundo through my colleague Prof. Hernán Sagastegui Chigne. I will lecture on Wednesday and Thursday of next week (7th and 8th of November), and the talk will be about “Researching in Semantic Web Technologies”. You can see the full schedule here.

My intention is to provide a good introduction to the Semantic Web and Linked Data initiatives, as well as an in-depth review of existing work. I would also like to use part of the time for a think-tank session in which the audience can give opinions on specific open issues and discuss launching some actions, such as a hackathon or similar, in the coming weeks, but it is just a thought…

Stay tuned!

Cloud Computing and Semantics

Over the last few weeks I have reviewed some of the existing work that tries to combine semantics and cloud computing to improve some of the key processes in a cloud environment. QoS and resource provisioning are two of the main processes expected to benefit from intelligent decision support systems that dynamically match client requirements to cloud resources. Depending on the type of cloud (SaaS, PaaS or IaaS), formal models and knowledge bases can help to make decisions in different ways: prediction of resources, adjustment of “pay-as-you-go” pricing, etc. I would like to leave here a list of relevant papers and other resources that I consider essential for understanding the underlying problems, the technology, and the current efforts and approaches to tackle them.

I will continue updating this post and the references, but I think it is a good starting point for checking the related work in this area. I have also collected some papers related to Map/Reduce, SPARQL and more in the ROCAS project wiki.

Best,

WebIndex Launch

Today is the official launch of the Web Index. Over the last few months I have collaborated, through my activities in the WESO Research Group, with the Web Foundation to publish its statistical data following the Linked Data principles. I think we have published an appropriate version of this data, and I hope to continue this fruitful collaboration with my new colleagues in the coming months.

You can find more information about the Web Index as Linked Data at http://data.webfoundation.org/.

If you have any comments or suggestions, please feel free to contact me at any time.

Best,

Re-reasoning starting…

Over the last few days I have read a lot of papers on different topics such as FOL, inference, artificial intelligence, large-scale reasoning, rule engines, etc. I have collected all these references in the ROCAS wiki, with the objective of saving all the relevant work that can help me to finally develop our semantic reasoner for large datasets.

This afternoon I found a paper entitled “Making Web-Scale Semantic Reasoning More Service-Oriented: The Large Knowledge Collider” that presents the whole architecture of the well-known LarKC project. In some sense the ROCAS project was inspired by this European project, but with a restricted scope and different objectives. This paper has two main points of interest for me:

One of the authors is Zhisheng Huang, who helped me in 2004 to use his Distributed Logic Programming system to animate humanoids while I was developing the final project for my Bachelor's degree.
From a research point of view, the authors present a compilation of the work done during the LarKC project that should be relevant to ROCAS. It is not a research paper but a good summary of the project.

This is a short post, but I want to highlight that the world is small enough to meet the same people (in this case researchers) again and again! It is incredible! 🙂

Finally, I would also like to report my first progress in the development. I have deployed a Hadoop job that performs the classical graph algorithm “Breadth-First Search”. This is one of the approaches I am considering for performing reasoning tasks… The other approaches can be summarized as follows:

1. Distribute the Jena rule engine (reusing source code)
2. Develop from scratch the typical backward-chaining engine using unification and resolution
3. Mix of 1 and 2 to avoid parsing rules, matching triples, etc.
4. Build a graph (rules in backward chaining can be shown as an AND/OR tree) and try to infer new facts using unification and search.
…
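To make the Breadth-First Search idea more concrete, here is a minimal sketch in plain Python of the usual iterative map/reduce formulation. It is not the Hadoop job mentioned above: the function names (`bfs_map`, `bfs_reduce`, `bfs`) and the in-memory "shuffle" are assumptions made only for illustration. Each round, every reached node proposes its distance plus one to its neighbours, and the reducer keeps the smallest proposal.

```python
from collections import defaultdict

INF = float("inf")  # marks nodes that have not been reached yet


def bfs_map(node, dist, neighbours):
    """Map step: re-emit the node's own data and propose dist + 1 to neighbours."""
    yield node, ("graph", neighbours)   # keep the adjacency list alive
    yield node, ("dist", dist)          # keep the current distance
    if dist != INF:
        for n in neighbours:
            yield n, ("dist", dist + 1)


def bfs_reduce(node, values):
    """Reduce step: recover the adjacency list and keep the smallest distance."""
    neighbours, best = [], INF
    for kind, payload in values:
        if kind == "graph":
            neighbours = payload
        else:
            best = min(best, payload)
    return best, neighbours


def bfs(graph, source):
    """Run map/reduce rounds until no distance changes (hypothetical driver)."""
    state = {n: (0 if n == source else INF, list(adj)) for n, adj in graph.items()}
    changed = True
    while changed:
        shuffled = defaultdict(list)    # stands in for the shuffle phase
        for node, (dist, adj) in state.items():
            for key, value in bfs_map(node, dist, adj):
                shuffled[key].append(value)
        new_state = {n: bfs_reduce(n, vals) for n, vals in shuffled.items()}
        changed = new_state != state
        state = new_state
    return {n: d for n, (d, _) in state.items()}


if __name__ == "__main__":
    toy = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": []}
    print(bfs(toy, "a"))
```

In a real cluster each round would be one job, and the loop would stop when a counter reports that no distance was updated.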

Let’s rock it!

HPC-Europa2 visit

In February I applied to the HPC-Europa2 Transnational Access programme, and I finally got a grant that gives me the opportunity to use the SARA infrastructure to test some algorithms. Thanks to Professor Maarten de Rijke I could select the University of Amsterdam as host, so… now I am here for 6 weeks at the University of Amsterdam, in the Institute for Informatics and more specifically in the Intelligent Systems Lab. I am very excited about this opportunity and I will do my best to get a good version of our reasoning prototype. I would also like to start a fruitful collaboration between the people in this lab and our research group through publications, projects or whatever.

I would also like to thank all the administrative staff of the UvA for their time and consideration. When I arrived last Thursday, within 15 minutes I had a visiting card, a desktop, a Wi-Fi connection and a great view…

I will keep you informed!

CFP: Data Mining on Linked Data workshop with Linked Data Mining Challenge

To be held during the 20th Int. Symposium on Methodologies of Intelligent Systems, ISMIS 2012, 4-7 December 2012 Macao (http://www.fst.umac.mo/wic2012/ISMIS/)

Official CFP

Workshop Scope

Over the past 3 years, the Semantic Web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges in the area of intelligent information management: the exploitation of the Web as a platform for data and information integration in addition to document search. Although numerous workshops have already emerged at the intersection of Data Mining and Linked Data (e.g. Know@LOD at ESWC) and even Challenges have been organized (USEWODs at WWW, http://data.semanticweb.org/usewod/2012/challenge.html), the particular setting chosen (with a highly topical Government-related dataset) will make it possible to explore new methods of exploiting Linked Data with state-of-the-art mining tools.

Workshop Topic

The workshop consists of an Open Track and of a Challenge Track.
The Open Track will expect submission of regular research papers, describing novel approaches to applying Data Mining techniques on the Linked Data sources.

Participation in the Challenge Track will require the participants to download a real-world RDF dataset from the domain of Public Contract Procurement, and accomplish at least one of the four pre-defined tasks on it using their own or publicly available data mining tool. To get access to the data, participants have to register to the Challenge Track at http://keg.vse.cz/ismis2012. Partial mapping to external datasets will also be available, which will allow for extraction of further potential features from the Linked Open Data cloud. Task 1 will amount to unrestricted discovery of interesting nuggets in the (augmented) dataset. Task 2 will be similar but the category of interesting hypotheses will be partially specified. Task 3 will concern prediction of one of the features natively present in training data (but only added to the evaluation dataset after the result submission). Task 4 will concern prediction of a feature manually added to a sample of the data by a team of domain experts.
Participants will submit textual reports (Challenge Track papers) and, for Tasks 3 and 4, also the classification results.

Submissions

Both the research papers (submitted to the Open Track) and the Challenge Track papers should follow the Springer formatting style. The templates for Word and LaTeX are available at the workshop website http://keg.vse.cz/ismis2012 and can also be found at http://www.springer.com/authors/book+authors?SGWID=0-154102-12-417900-0. The length of a submission should not exceed 10 pages. All papers will be made available on the workshop web pages, and there will be post-conference proceedings for selected workshop papers.

Papers (and results for Tasks 3 and 4) should be submitted using EasyChair: http://www.easychair.org/conferences/?conf=ismis2012dmold

Important Dates

Data ready for download: June 20, 2012
Workshop paper and result data submissions: August 10, 2012
Notification of Workshop paper acceptance: August 25, 2012
Workshop: December 4, 2012

CFP: Focussed Topic Issue on “New trends on E-Procurement applying Semantic Technologies”

Call for Papers: “Special issue on New trends on E-Procurement applying Semantic Technologies”

Computers in Industry. An International, Application Oriented Research Journal (Impact Factor: 1.620)

Overview

The aim of this special issue is to collect innovative and high-quality research and industrial contributions regarding E-Procurement processes that can fulfill the needs of this new realm. This special issue aims at exploring the recent advances in the application of Semantic Technologies in the E-Procurement sector soliciting original scientific contributions in the form of theoretical, experimental and real research and case studies.

Important dates and Timeline

15th of July 2012, abstracts due (send directly to the guest editors)
1st of September 2012, invitations sent to authors to submit full papers
28th of February 2013, full papers due (submit only to the Elsevier Editorial System)
1st of May 2013, first round of reviews
1st of July 2013, revised papers due (also submitted to the EES)
1st of September 2013, second round of reviews
1st of November 2013, final papers due (also in EES)

If you need additional information, please contact the guest editors:

Jose María Alvarez Rodríguez (Assistant Professor, University of Oviedo), Department of Computer Science, Faculty of Science, University of Oviedo, C/Calvo Sotelo, S/N, 33003, Oviedo, Asturias, Spain, E-mail: josem.alvarez@weso.es
José Emilio Labra Gayo (Associate Professor, University of Oviedo), Department of Computer Science, Faculty of Science, University of Oviedo, C/Calvo Sotelo, S/N, 33003, Oviedo, Asturias, Spain, E-mail: labra@uniovi.es, Web: http://www.di.uniovi.es/~labra
Patricia Ordoñez de Pablos (Associate Professor, University of Oviedo), Department of Business Management, School of Economics and Business, University of Oviedo, Campus del Cristo – Avda. del Cristo, s/n 33006 Oviedo, Asturias, Spain, E-mail: patriop@uniovi.es

My Linked Data Lifecycle

Linked Data Lifecycle by Jose María Alvarez Rodríguez is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Old-Fashioned Common Procurement Vocabulary 2008 and 2003

The Common Procurement Vocabulary (CPV) establishes a single classification system for public procurement aimed at standardising the references used by contracting authorities and entities to describe the subject of procurement contracts.

The CPV consists of a main vocabulary for defining the subject of a contract, and a supplementary vocabulary for adding further qualitative information. The main vocabulary is based on a tree structure comprising codes of up to 9 digits (an 8 digit code plus a check digit) associated with a wording that describes the type of supplies, works or services forming the subject of the contract.

The first two digits identify the divisions (XX000000-Y);
The first three digits identify the groups (XXX00000-Y);
The first four digits identify the classes (XXXX0000-Y);
The first five digits identify the categories (XXXXX000-Y);

Each of the last three digits gives a greater degree of precision within each category. A ninth digit serves to verify the previous digits.
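As a small illustration of the tree structure described above, the following Python sketch derives the division, group, class and category codes from an 8-digit CPV code by zero-padding its first two to five digits. The function name `cpv_levels` is hypothetical, and the optional check digit is only stripped, not verified, because the verification rule is not given here.

```python
import re

# 8 digits, optionally followed by "-" and the check digit (not validated here).
CPV_PATTERN = re.compile(r"^(\d{8})(?:-(\d))?$")


def cpv_levels(code):
    """Return the ancestor codes of a CPV code, one per level of the tree."""
    match = CPV_PATTERN.match(code)
    if match is None:
        raise ValueError(f"not a CPV main-vocabulary code: {code!r}")
    digits = match.group(1)
    return {
        "division": digits[:2] + "000000",  # XX000000
        "group": digits[:3] + "00000",      # XXX00000
        "class": digits[:4] + "0000",       # XXXX0000
        "category": digits[:5] + "000",     # XXXXX000
    }


if __name__ == "__main__":
    for level, value in cpv_levels("03111100").items():
        print(level, value)
```

For the code 03111100 this yields 03000000, 03100000, 03110000 and 03111000, which are the division, group, class and category used as examples in the dataset description.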

The supplementary vocabulary may be used to expand the description of the subject of a contract. The items are made up of an alphanumeric code with a corresponding wording allowing further details to be added regarding the specific nature or destination of the goods to be purchased.

The alphanumeric code is made up of:

a first level comprising a letter corresponding to a section;
a second level comprising four digits, the first three of which denote a subdivision and the last one being for verification purposes.

(Information available at: http://simap.europa.eu/codes-and-nomenclatures/codes-cpv/codes-cpv_en.htm)

The dataset comprises the CPV 2008 and CPV 2003 codes and the mappings between them. All this information is publicly available via the WESO SPARQL endpoint (5 star linked data) and a Pubby frontend. The structure of the data and definitions is as follows:

CPV 2008. Graph IRI: http://purl.org/weso/cpv/2008. Total: 556,335 triples.
- Scheme: http://purl.org/weso/cpv/2008/scheme
- Dump file (Turtle) (25 MB)
- Division: http://purl.org/weso/cpv/2008/03000000
- Group: http://purl.org/weso/cpv/2008/03100000
- Class: http://purl.org/weso/cpv/2008/03110000
- Category: http://purl.org/weso/cpv/2008/03111000 | http://purl.org/weso/cpv/2008/03111100
- Mapping example (subject, predicate, object):
http://purl.org/weso/cpv/2008/03111100
http://purl.org/weso/cpv/definitions/codeIn2003
http://purl.org/weso/cpv/2003/01113100

CPV 2003. Graph IRI: http://purl.org/weso/cpv/2003. Total: 191,430 triples.
- Example: http://purl.org/weso/cpv/2003/01113100
- Scheme: http://purl.org/weso/cpv/2003/scheme
- Dump file (Turtle) (7.8 MB)
CPV Definitions. Graph IRI: http://purl.org/weso/cpv/definitions. Triples: 43
- Dump file (Turtle) (7.4 KB)
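The IRI pattern and the 2008-to-2003 mapping listed above can be sketched in a few lines of Python. This only builds strings from the IRIs shown in this post; the helper names `cpv_iri` and `mapping_triple` are assumptions for illustration, and the output is a single statement in N-Triples syntax.

```python
BASE = "http://purl.org/weso/cpv"  # namespace taken from the graph IRIs above


def cpv_iri(version, code):
    """Build the IRI of a CPV code in the 2003 or 2008 graph."""
    return f"{BASE}/{version}/{code}"


def mapping_triple(code_2008, code_2003):
    """Serialise one codeIn2003 mapping as an N-Triples statement."""
    subject = cpv_iri(2008, code_2008)
    predicate = f"{BASE}/definitions/codeIn2003"
    obj = cpv_iri(2003, code_2003)
    return f"<{subject}> <{predicate}> <{obj}> ."


if __name__ == "__main__":
    print(mapping_triple("03111100", "01113100"))
```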

The definitions have been made using the vocabularies:

The whole dataset uses links to other datasets (28,839):

GoodRelations and Product Ontology products and descriptions

In order to create all this data we have used different tools:

Google Refine and the RDF extension (to produce data)
Pubby (to publish data)
OpenLink Virtuoso (to store data)

Collaborators:

José Emilio Labra (Main Researcher of WESO Research Group at the University of Oviedo)
The first version of the CPV dataset was developed in 2007 together with my colleagues at CTIC, Luis Polo and Emilio Rubiera.

Acknowledgements:

This work is part of the MOLDEAS system developed by the WESO Research Group within the 10ders Information Services partnership project, partially funded by the Spanish Ministry of Industry, Tourism and Trade with code TSI-020100-2010-919 and the European Regional Development Fund (ERDF) according to the National Plan of Scientific Research, Development and Technological Innovation 2008-2011, led by Gateway Strategic Consultancy Services and developed in cooperation with Exis-TI.

TO DO List

Check broken links
Review the design of URIs
Create Named graphs to group different divisions/groups/classes/categories
Link to other datasets
Reconcile all products and services with the DBpedia resources
Develop a GUI based on Exhibit, SNORQL, etc.
Send this dataset and statistics to the Linked Data Cloud
Update public procurement notices with the new URIs

Research papers

Some selected publications are listed below (the BibTeX file is also available here):