Week #3 in Thessaloniki

This week I will keep updating this post, because I am reading a lot of papers and I need a way to track them. Following the same structure as in previous weeks, I leave some links to the activities I am carrying out:

Reading

I have focused on some interesting subjects: statistics (Bayesian networks), data streams, feedback control loops, autonomic computing, and e-learning systems (the last one purely out of personal interest). I have started, and finished, the following list of papers and books:

Writing and reviewing

I would like to leave a link to the article "How to review a paper", an excellent guide for evaluating your reviews and keeping in mind your responsibilities as a reviewer.

Coding and Tools

I have not made any relevant progress on development tasks, but I have been refreshing my know-how in R.

Teaching

I have finished evaluating the students in Health Information Systems, and I am very proud of the marks and of the work carried out by the students over the last months. I have some links to the mashups they built, but I prefer not to post them here due to privacy concerns.

R & Big Data Intro

I am very committed to enhancing my know-how in delivering solutions that deal with Big Data in a high-performance fashion. I am continuously looking for tools, algorithms, recipes (e.g. the Data Science e-book), papers, and technologies that enable this kind of processing, because it is not only considered relevant for the coming years: it is already a reality!

Last week I went back to using R and Rcmdr to analyze my PhD experiments and extract statistics from them using the Wilcoxon test. I first started with R three years ago, when I developed a simple graphical interface in Visual Studio to input data and send operations to the R interpreter; the motivation for that work was to help a colleague with his final degree project, and the experience was very rewarding.
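For readers who have not used it, this is roughly what that kind of analysis looks like in plain R. The data below is entirely made up for illustration (my actual experiment data is not shown here); the test itself is the standard paired Wilcoxon signed-rank test from R's stats package:

```r
# Hypothetical paired measurements (e.g. response times in seconds)
# from two configurations of the same system. These numbers are
# invented for the example, not taken from my experiments.
before <- c(12.1, 9.8, 14.3, 11.0, 10.5, 13.2, 9.9, 12.7)
after  <- c(10.4, 9.1, 12.8, 10.9,  9.7, 11.6, 9.3, 11.3)

# Paired Wilcoxon signed-rank test: a non-parametric alternative to
# the paired t-test, useful when normality cannot be assumed.
result <- wilcox.test(before, after, paired = TRUE)
print(result$p.value)
```

Rcmdr exposes the same test through its menus, but running `wilcox.test` directly makes it easy to script over many experiment files.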

What is the relation between Big Data and R?

The explanation is simple: a key enabler for providing added-value services is managing and learning from historical logs, so by putting an excellent statistics suite together with the Big Data realm it becomes possible to meet the requirements of a great variety of services from domains like NLP, recommendation, business intelligence, etc. For instance, there are approaches that combine R with Hadoop, such as RICARDO or Parallel R, and new companies are emerging that offer R-based services to process Big Data, like Revolution Analytics.
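To give a flavour of the idea without depending on any of those frameworks, here is a conceptual sketch in base R of the map-and-reduce pattern that Hadoop-style systems distribute across machines. Everything here runs locally on a toy word count; the chunk contents are invented for the example:

```r
# Input split into chunks, as a distributed file system would do.
chunks <- list("big data needs r",
               "r needs big memory",
               "data data data")

# "Map" step: each chunk independently emits its words.
# In a real cluster, each lapply call would run on a different node.
mapped <- unlist(lapply(chunks, function(ch) strsplit(ch, " ")[[1]]))

# "Reduce" step: aggregate the emitted values per key (word).
counts <- table(mapped)
print(counts[["data"]])  # 4
```

The R-on-Hadoop projects mentioned above essentially let you express the map and reduce functions in R while the framework takes care of distributing the chunks and shuffling the keys.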

This post was a short introduction to R as a tool for exploiting Big Data. If you are interested in these kinds of approaches, please take a look at the following presentation by Lee Edfelsen:

Keep on researching and learning!