MapReduce Design patterns

For the last two weeks I have been preparing an introduction to the Map/Reduce algorithm. I have found a lot of excellent new references (see my previous post) and I have read some books that I did not know. As a result I have made a compilation (just a summary that I will keep updating) that can serve as a roadmap of what MapReduce is and what you can do with this programming model. As soon as I review my presentation and the examples I will upload them.
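To give a flavour of the programming model, here is a minimal word-count sketch in plain Python. It is not taken from my presentation and it does not use Hadoop or any other framework; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are just illustrative names I chose for the three classic steps.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word of one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["map reduce map", "reduce patterns"])
```

In a real cluster the map and reduce calls run in parallel on different machines and the shuffle is done by the framework; here everything runs in one process just to show the data flow.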

Week #10 and #11 in Thessaloniki

Hi! I have been a little bit lazy about writing in the blog, but it is now time to recover good practices. I am going to summarize my tasks during the last two weeks:

Reading (pi = personal interest)

Writing, reviewing and researching

  • I am managing a Special Issue in the Journal of Computers in Industry, Elsevier.
  • I finished the review of a book for Manning Publications.
  • I have been included as Technical Development Editor in Manning Publications.
  • I have been included as PC member in the workshop proposal “Data Mining on Linked Data (DMoLD’13) workshop with Linked Data Mining Challenge” thanks to my colleagues at the University of Economics in Prague.
  • I am reviewing a paper for the journal “Expert Systems with Applications” (IF: 2.203).
  • I am reviewing and finishing the paper with my colleague Alejandro Montes about his final Master's project.

Meetings

  • I have had a meeting with my SEERC colleagues to talk about next actions.
  • I have had a meeting with Michalis Vafoupolus to prepare the Linked Data Cup paper.
  • I have had two meetings with Lum about his Bachelor Degree Project. I am acting as a kind of supervisor to address the problem of sentiment analysis using RapidMiner, LingPipe, Alchemy API and a custom solution.

Coding and Tools

  • In my leisure time I have made a tool for unifying company names called CORFU, using Python, NLTK and the APIs of Google Places, LinkedIn and Google Suggestions. It also includes other algorithms based on string similarity, etc.
  • I have developed a simple sentiment analyzer using Alchemy API.
  • I have adapted some examples of MapReduce patterns.
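As an illustration of the string-similarity idea behind unifying company names, here is a minimal sketch using only the Python standard library. It is not the CORFU algorithm and it does not call any of the APIs mentioned above; the suffix list, the `0.85` threshold and the function names are assumptions made for the example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "corporation",
                  "co", "sa", "gmbh", "plc", "limited"}


def normalize(name):
    """Lowercase, drop punctuation and remove legal suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def similarity(a, b):
    """Similarity in [0, 1] between two normalized company names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def unify(names, threshold=0.85):
    """Greedily group names whose similarity to a representative
    reaches the threshold; returns (representative, members) pairs."""
    clusters = []
    for name in names:
        for representative, members in clusters:
            if similarity(representative, name) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((name, [name]))
    return clusters


clusters = unify(["Oracle Corp.", "ORACLE Corporation", "Microsoft Inc"])
```

The greedy grouping depends on the order of the input, so a real tool would need something more robust, but it shows how normalization plus a similarity threshold already merges the obvious variants.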

Other things

  • I continue my fight to learn Greek…I have to study a bit more!
This is all I can remember…perhaps I am missing something…!

Week #9 in Thessaloniki

Just a few comments for this week (to be completed)….

Reading (pi = personal interest)

Writing, reviewing and researching

  • I am finishing the book chapter about publishing statistical data in RDF
  • I am managing a Special Issue in the Journal of Computers in Industry, Elsevier
  • I just realized that Labra added me to the Acknowledgements of his work on “Multilingual Open Data Patterns”. I am very proud of that! (To be honest, I just collaborated in the first presentation with some links and especially through some comments regarding SKOS-XL.) I also suggest reading the paper, in which each of the patterns is explained and discussed with excellent examples.

Meetings

  • I have had a meeting with my SEERC colleagues to present my prototype and plan next actions in QoS, etc.
  • I have had a meeting with Michalis Vafoupolus to prepare the Linked Data Cup paper.

Coding and Tools

  • I have implemented a real-time architecture using the Lambda approach and following some hints from Pere Ferrera. It is not the same algorithm: I am just taking the approach to tackle the problem, not the source code. Next steps include using RDF as views for the batch and real-time layers through SPARQL federated queries (for instance FedX). The example just takes a Twitter stream using the Twitter4J API and counts words, presenting the results in an HTML page. Documentation is available here and also the source code (under development).
  • I have linked the public procurement notices from the UK, USA and Australia to the CPV.
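The idea of the Lambda approach for the word counter can be sketched in a few lines of Python. This is not my prototype (which works on a Twitter stream); it is a toy, single-process version in which the class name `LambdaWordCount` and its methods are made up for the example: a batch view recomputed from the whole master dataset, a real-time view updated per message, and a query that merges both.

```python
from collections import Counter


class LambdaWordCount:
    """Toy Lambda architecture: batch view + real-time view."""

    def __init__(self):
        self.master = []                # append-only master dataset
        self.batch_view = Counter()     # rebuilt from scratch by the batch layer
        self.realtime_view = Counter()  # updated incrementally by the speed layer

    def ingest(self, text):
        """Store the raw message and update the real-time view."""
        self.master.append(text)
        self.realtime_view.update(text.lower().split())

    def run_batch(self):
        """Recompute the batch view over all data and reset the
        real-time view, since the batch layer has caught up."""
        self.batch_view = Counter()
        for text in self.master:
            self.batch_view.update(text.lower().split())
        self.realtime_view = Counter()

    def query(self, word):
        """Serving layer: merge both views at query time."""
        return self.batch_view[word] + self.realtime_view[word]


arch = LambdaWordCount()
arch.ingest("hello world")
arch.ingest("hello storm")
arch.run_batch()
arch.ingest("hello again")
total = arch.query("hello")
```

In a real deployment the batch recomputation runs while new data keeps arriving, so the real-time view can only drop the part already covered by the batch view; the sketch ignores that detail.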

Other things

Week #7 in Thessaloniki

Just a few comments for this week (to be completed)….

Reading (pi = personal interest)

Writing, reviewing and researching

  • I am reviewing a paper for a Special Issue of a JCR journal
  • I am finishing the book chapter about publishing statistical data in RDF
  • I have also made the first review of WESOMENDER (we have to work hard to get a good contribution but the expectations are high)
  • I have been invited to be part of the PC of the Special Session “Engineering Tool Integration for Industrial Automation System Development (ETAS 2013)” in conjunction with IECON2013
  • I have joined the research group “Comercio Electronico en Colombia – GICOECOL” thanks to Luz Andrea RODRIGUEZ ROJAS, with whom I will collaborate to empower the use of Open Data in e-Health.

Meetings

Coding and Tools

I have implemented a real-time word counter of Twitter statuses using different techniques:

  • The classical Observer design pattern
  • The Storm framework: I have reused some examples to implement my own spouts and bolts
  • The Trident framework on top of Storm: I have also reused some examples of the storm-starter project, customizing the code to get a better understanding
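The first of the techniques above, the classical Observer pattern, can be sketched as follows. This is not my original code (which reads real Twitter statuses); it is a small Python version in which `StatusStream`, `WordCounter` and `on_status` are hypothetical names, and the statuses are plain strings published by hand.

```python
from collections import Counter


class StatusStream:
    """Subject: pushes every new status to the subscribed observers."""

    def __init__(self):
        self._observers = []

    def subscribe(self, observer):
        self._observers.append(observer)

    def publish(self, status):
        for observer in self._observers:
            observer.on_status(status)


class WordCounter:
    """Observer: keeps a running count of the words seen so far."""

    def __init__(self):
        self.counts = Counter()

    def on_status(self, status):
        self.counts.update(status.lower().split())


stream = StatusStream()
counter = WordCounter()
stream.subscribe(counter)
stream.publish("Storm is fast")
stream.publish("storm bolts")
```

Roughly speaking, in Storm the stream plays the role of a spout and the counter the role of a bolt, with the framework taking care of distributing the work.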

Other things

This week I have started the 3-month Greek course and I am very happy because I can now understand some words and read a little bit :) Besides, my classmates come from a lot of countries: Bulgaria, Germany, Bosnia, France, Serbia, New Zealand, Italy, Moldova and Russia. It is a GREAT experience.

 

Week #5 in Thessaloniki

Last week I focused on two main tasks: my presentation at the City College and the submission of a paper. Following the same structure as in previous weeks, I leave some links to the activities I am carrying out:

Reading

Writing and reviewing

  • I have continued with the structure and first contents of two papers and one special issue proposal.
  • I have managed all the abstracts for the COMIND Special Issue.
  • I have submitted a paper to “Computers in Human Behavior”.
  • I have given the presentation mentioned in the next bullet to the Department of Computer Science at City College.
  • I have reviewed my previous presentation about MOLDEAS; the new one is supposed to be more didactic, more like an “Intro” to Linked Open Data.


Coding and Tools

I have not made any relevant progress in development tasks.

 

Week #3 in Thessaloniki

This week I will be updating this post because I am reading a lot of papers and I need a way to track them. Following the same structure as in previous weeks, I leave some links to the activities I am carrying out:

Reading

I have focused on some interesting subjects: Statistics (Bayesian networks), Data Streams, Feedback Control Loops, Autonomous Computing and e-Learning systems (this is just for personal interest). I have started, and finished, the following list of papers and books:

Writing and reviewing

I would like to leave the link to an article about “How to review a paper”, an excellent guide to evaluate your reviews and keep in mind your responsibilities as a reviewer.

Coding and Tools

I have not made any relevant progress in development tasks, but I have been refreshing my know-how on R.

Teaching

I have finished the evaluation of the students in Health Information Systems and I am very proud of the marks and the work carried out by the students during the last months. I have some links to their work building mashups, but I prefer not to leave the links here due to privacy issues.

Week #2 in Thessaloniki

Hi all,

This week has gone by very fast and I have done a lot of things that I leave below:

Reading

My main research concern is how to address the automatic processing of data from a lot of sensors (applications, cloud management platforms, etc.), i.e. how to monitor resources and which variables should be taken into account. In this sense I have read some papers from my colleagues at SEERC and other authors:

The main outcome of this work has been a small presentation about how to process Big Data applying the Lambda architecture, more specifically adding semantics to this process. It is just a proposal with some first thoughts, but I will do my best to debug and design the whole process.

Writing

I have made some progress in the article about the experience of publishing the “Webindex” as Linked Data and I have also planned the potential articles for this year and their contents.

Coding

Sometimes you feel very motivated to test new tools and frameworks and I am now in this phase:

Teaching

I have finished the evaluation of the Health Information Systems in Nursing and Physiotherapy course at the University of Oviedo. The students have done very good work applying Web 2.0 concepts to build mashups in the Health sector. I am very proud of all of them.

Administrative Stuff

This is where I spent most of the time but, thanks to my colleague Fotis, I finally got (almost) all the required documentation:

I believe something is missing but, anyway, it is just a summary…

Week #1 in Thessaloniki

As you may know, I am starting a new stage in a new country and institution. I am now a Marie Curie Experienced Researcher (Postdoc) working at SEERC in Thessaloniki, more specifically in the RELATE-ITN FP7 project. My research will address topics such as stream reasoning, cloud computing, big data, etc. to create a system for monitoring QoS in cloud computing environments and service oriented architectures. As far as I know, the objective is to get information about applications on the cloud and verify that the current status of different variables is aligned to SLAs, so it is necessary to continuously gather data from applications, promote it to an existing knowledge base and check restrictions through reasoning processes in order to finally make decisions such as new provisioning, etc.

This first week I was adapting to my new office and I would like to thank all the administrative staff and colleagues from SEERC for their warm welcome! I am very motivated. My work during this first week was focused on reading papers, testing tools and coding some prototypes. Below I leave a summary of my activities:

Reading (specs, research papers and books)

Writing

  • I am finishing a technical report about our work in the Webindex project
  • I am starting to write some papers on different topics that I will announce as soon as they are finished

Development and testing

This is more or less what I have been doing this first week. I think I have improved and refreshed part of my know-how and I have also designed a first version of “A semantic-based lambda architecture for QoS Management in Cloud Computing and Service Oriented Architectures” that I will present on Tuesday.
Let’s rock it!