1st International Congress of Systems Engineering and Computation-Perú

I am really excited about my invitation to speak at the 1st International Congress of Systems Engineering and Computation at the Universidad Peruana de Ciencias Aplicadas. I was invited by Prof. Dr. Carlos Raymundo through my colleague Prof. Hernán Sagastegui Chigne. I will lecture on Wednesday and Thursday of next week (7th and 8th of November), and the talk will be about “Researching in Semantic Web Technologies”. You can see the full schedule here.

My intention is to provide a good introduction to the Semantic Web and Linked Data initiatives, as well as an in-depth review of existing work. I would also like to use part of the time to organize a think-tank session in which the audience can share opinions about specific open issues and discuss launching some actions, such as a hackathon or similar, in the coming weeks, but it is just a thought…

Stay tuned!

WebIndex Launch

Today is the official launch of the Web Index. Over the last few months I have collaborated, through my activities in the WESO Research Group, with the Web Foundation to publish its statistical data following the Linked Data principles. I think we have published an appropriate version of this data and I hope to continue this fruitful collaboration with my new colleagues in the coming months.

You can find more information about the Web Index as Linked Data at http://data.webfoundation.org/.

If you have any comments or suggestions, please feel free to contact me at any time.

Best,

CFP: Data Mining on Linked Data workshop with Linked Data Mining Challenge

To be held during the 20th Int. Symposium on Methodologies of Intelligent Systems, ISMIS 2012, 4-7 December 2012 Macao (http://www.fst.umac.mo/wic2012/ISMIS/)

Official CFP

Workshop Scope

Over the past 3 years, the Semantic Web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges in the area of intelligent information management: the exploitation of the Web as a platform for data and information integration in addition to document search. Although numerous workshops have already emerged at the intersection of Data Mining and Linked Data (e.g. Know@LOD at ESWC) and even Challenges have been organized (USEWODs at WWW, http://data.semanticweb.org/usewod/2012/challenge.html), the particular setting chosen (with a highly topical Government-related dataset) will allow participants to explore new methods of exploiting Linked Data with state-of-the-art mining tools.

Workshop Topic

The workshop consists of an Open Track and of a Challenge Track.
The Open Track will expect submissions of regular research papers describing novel approaches to applying Data Mining techniques to Linked Data sources.

Participation in the Challenge Track will require the participants to download a real-world RDF dataset from the domain of Public Contract Procurement, and accomplish at least one of the four pre-defined tasks on it using their own or a publicly available data mining tool. To get access to the data, participants have to register for the Challenge Track at http://keg.vse.cz/ismis2012. A partial mapping to external datasets will also be available, which will allow for extraction of further potential features from the Linked Open Data cloud. Task 1 will amount to unrestricted discovery of interesting nuggets in the (augmented) dataset. Task 2 will be similar, but the category of interesting hypotheses will be partially specified. Task 3 will concern prediction of one of the features natively present in the training data (but only added to the evaluation dataset after the result submission). Task 4 will concern prediction of a feature manually added to a sample of the data by a team of domain experts.
Participants will submit textual reports (Challenge Track papers) and, for Tasks 3 and 4, also the classification results.

Submissions

Both the research papers (submitted to the Open Track) and the Challenge Track papers should follow the Springer formatting style. The templates for Word and LaTeX are available on the workshop website, http://keg.vse.cz/ismis2012, and can also be found at http://www.springer.com/authors/book+authors?SGWID=0-154102-12-417900-0. Submissions should not exceed 10 pages. All papers will be made available on the workshop web pages, and post-conference proceedings will be published for selected workshop papers.

Papers (and results for Tasks 3 and 4) should be submitted using EasyChair: http://www.easychair.org/conferences/?conf=ismis2012dmold

Important Dates

Data ready for download: June 20, 2012
Workshop paper and result data submissions: August 10, 2012
Notification of Workshop paper acceptance: August 25, 2012
Workshop: December 4, 2012

CFP: Focussed Topic Issue on “New trends on E-Procurement applying Semantic Technologies”

Call for Papers: “Special issue on New trends on E-Procurement applying Semantic Technologies”

Computers in Industry. An International, Application Oriented Research Journal (Impact Factor: 1.620)

Overview

The aim of this special issue is to collect innovative and high-quality research and industrial contributions regarding E-Procurement processes that can fulfill the needs of this new realm. This special issue aims at exploring the recent advances in the application of Semantic Technologies in the E-Procurement sector soliciting original scientific contributions in the form of theoretical, experimental and real research and case studies.

Important dates and Timeline

15th of July 2012, abstracts due (send directly to the guest editors)
1st of September 2012, invitations sent to authors to submit full papers
28th of February 2013, full papers due (submit only to the Elsevier Editorial System)
1st of May 2013, first round of reviews
1st of July 2013, revised papers due (also submitted to the EES)
1st of September 2013, second round of reviews
1st of November 2013, final papers due (also in EES)

If you need additional information, please contact the guest editors:

Jose María Alvarez Rodríguez (Assistant Professor, University of Oviedo), Department of Computer Science, Faculty of Science, University of Oviedo, C/Calvo Sotelo, S/N, 33003, Oviedo, Asturias, Spain, E-mail: josem.alvarez@weso.es
José Emilio Labra Gayo (Associate Professor, University of Oviedo), Department of Computer Science, Faculty of Science, University of Oviedo, C/Calvo Sotelo, S/N, 33003, Oviedo, Asturias, Spain, E-mail: labra@uniovi.es
http://www.di.uniovi.es/~labra
Patricia Ordoñez de Pablos (Associate Professor, University of Oviedo), Department of Business Management, School of Economics and Business, University of Oviedo, Campus del Cristo – Avda. del Cristo, s/n 33006 Oviedo, Asturias, Spain, E-mail: patriop@uniovi.es

My Linked Data Lifecycle

Linked Data Lifecycle by Jose María Alvarez Rodríguez is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Compiling Related Work about Linked Data Quality

One of the cornerstones for boosting the use of Linked Data is ensuring the quality of data according to criteria such as timeliness, correctness, etc. The intrinsic features of this initiative provide a framework for the distributed publication of data and resources (linking together data sources on the web). Due to this open approach, some mechanisms should be added to check whether data is well linked or whether it is merely an attempt to link together some part of the web. Most cases of linking data rely on an automatic process to discover and create links between resources (e.g. Silk Framework). This implies that the process is, in some respects, ambiguous, so a human decision is required. In the case of the data itself, quality may vary because information providers have different levels of knowledge, objectives, etc. Thus information and data are released in order to accomplish a specific task, and their quality should be assessed against different criteria depending on the specific domain.

For instance, if a data provider is releasing information about payments, is it possible to check which decimal separator is used, 10,000 or 10.000? Is this information homogeneous across all resources in the dataset? If a literal value should be “Oviedo”, what happens if the actual value is “Obiedo”? How can we detect and fix these situations?
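The two questions above can be sketched as simple checks. This is only a minimal illustration in Python, not part of any of the works discussed in this post: the function names are hypothetical, the amounts are assumed to be already extracted as plain strings, and the list of known spellings is assumed to come from some trusted source. The first check only reports whether a dataset mixes separator conventions; it cannot decide what a single value such as 10.000 means, so a human decision is still needed.

```python
from collections import Counter
from difflib import get_close_matches


def separator_profile(amounts):
    """Count which separator characters each amount literal contains."""
    profile = Counter()
    for value in amounts:
        has_comma = "," in value
        has_dot = "." in value
        if has_comma and has_dot:
            profile["both"] += 1
        elif has_comma:
            profile["comma"] += 1
        elif has_dot:
            profile["dot"] += 1
        else:
            profile["none"] += 1
    return profile


def is_homogeneous(amounts):
    """True when at most one separator convention appears in the dataset."""
    profile = separator_profile(amounts)
    used = [kind for kind in ("comma", "dot", "both") if profile[kind] > 0]
    return len(used) <= 1


def suspicious_literals(values, known, cutoff=0.8):
    """Map each value missing from `known` to its closest known spelling, if any.

    `known` is an assumed list of trusted spellings; `cutoff` is an arbitrary
    similarity threshold for difflib and would need tuning on real data.
    """
    suggestions = {}
    for value in values:
        if value in known:
            continue
        match = get_close_matches(value, known, n=1, cutoff=cutoff)
        if match:
            suggestions[value] = match[0]
    return suggestions
```

With these definitions, is_homogeneous(["10,000", "10.000"]) is False, which flags the dataset for review, and suspicious_literals(["Oviedo", "Obiedo"], ["Oviedo", "Gijón"]) suggests “Oviedo” for “Obiedo”.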

These cases have motivated some related work:

The PhD thesis of Christian Bizer, which proposes a template language and a framework (WIQA) to detect whether a triple fulfills the requirements to be accepted in a dataset. (2007)
The LODQ vocabulary, an RDF model to express criteria about 15 kinds of metrics formulated by Glenn McDonald on a mailing list. A processor for this vocabulary is still missing. (2011)
A paper entitled “Linked Data Quality Assessment through Network Analysis” by Christian Gueret, in which some metrics are provided to check the quality of links. This work is part of the LATC project. (2011)
The COLD (Consuming Linked Data) workshop, which is also a good starting point to review problems and approaches for dealing with the requirements of implementing linked data applications.
…that are collected in the aforementioned works.

One might think that this problem is new, but the truth is that it is inherited from traditional databases. One of the questions that arises is whether existing approaches can be applied to assess quality in the linked data realm… but this will be evaluated in upcoming posts.

This first post is just a short introduction to linked data quality research and approaches. In the coming weeks, we will try to review these works in depth and propose a solution (LODQAM).

Thank you very much!

Excellent regards,