At the beginning of my journey, I was mainly concerned with continuously learning new things, following one MOOC after another without spending enough time to "digest" the material and apply it to real problems. It was not a wise approach, and after a while I started to realize that it is simply not enough.
The best way to really master a topic is to find a balance between theory and continuous hands-on practice. In other words, get your hands dirty working on pet projects, replicating experiments, or, even better, capstone projects and Kaggle competitions.
| Project | Description | Links |
|---|---|---|
| Experimenting with 'ggplot2' (July - October 2017) | Building more knowledge around the 'ggplot2' package and how to use it to create powerful visualizations and custom graphical elements. Learnings and findings summarized in a set of blog posts (see Links). [Technology stack: R & R ecosystem] | Basic Plotting, Essential Concepts, Guidelines for good plots, How to work with maps, Customize with 'grid', Customize with 'ggplot2' |
| Extending ggplot2: create a new geom (June 2017) | Build a custom geom for ggplot2 that can be used to add the wind radii for a single storm observation to a map. These data are available for Atlantic basin tropical storms since 1988 through the Extended Best Track dataset. [Technology stack: R & R ecosystem] | More Info..., Code & Data |
| ML - KNN Algorithm (April 2017) | Using the KNN (K-Nearest Neighbors) algorithm to address a regression problem: predicting house values in the Seattle area. [Technology stack: Jupyter Notebook & Python ecosystem] | KNN - Regression problem, Code & Data |
| NLP - Naive Bayes Classifier (December 2016) | Using a Naive Bayes classifier to perform text classification: classifying spam vs. ham SMS messages using the SMS Spam Collection v. 1 dataset. [Technology stack: R & R ecosystem] | Naive Bayes |
| NLP - Exploring the `tidytext` package (December 2016) | Using the 'tidytext' package on different datasets (e.g. some books from the Project Gutenberg collection) to extract useful insights from text and transform it into data that can be used for further analysis. [Technology stack: R & R ecosystem] | Basic Usage, Sentiment Analysis |
| NLP - Exploring the `tm` package (November 2016) | Using the 'tm' package on the SMS Spam Collection v. 1 dataset to extract useful insights from text and transform it into data that can be used for further analysis. [Technology stack: R & R ecosystem] | Basic Usage |
| Capstone Project (June 2016) | The Capstone Project of the "Data Science Specialization", created by JHU in collaboration with SwiftKey. The goal is to create a text prediction application: when someone types "I went to the", the application should present three options for what the next word might be, for example gym, store, restaurant. The language model should be created using the HC corpora. [Technology stack: R & R ecosystem] | Artifacts, Lesson Learned, Code |
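The core idea behind the KNN regression project above (predicting house values in the Seattle area) can be sketched in a few lines of plain Python: predict the target for a query point as the mean target of its k nearest training points. The feature choice and data here are made up purely for illustration; the actual project used the Python data-science stack on a real housing dataset.

```python
from math import sqrt

def knn_regress(train, query, k=3):
    """Predict a value for `query` as the mean target of its k nearest
    training points, using Euclidean distance on the feature vectors."""
    by_distance = sorted(
        train,
        key=lambda xy: sqrt(sum((a - b) ** 2 for a, b in zip(xy[0], query))),
    )
    return sum(y for _, y in by_distance[:k]) / k

# Toy data: (features, house value in $k); features might be (sqft/1000, bedrooms)
train = [((1.0, 2), 300.0), ((1.5, 3), 400.0), ((2.0, 3), 450.0), ((3.0, 4), 600.0)]
print(knn_regress(train, (1.6, 3), k=3))  # mean of the 3 closest house values
```

In practice one would standardize the features first (otherwise the feature with the largest scale dominates the distance) and pick k by cross-validation.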
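The spam vs. ham project was done in R, but the Naive Bayes mechanics it relies on are easy to sketch from scratch in Python: count word frequencies per class, then score a new message with class priors plus Laplace-smoothed word log-likelihoods. The tiny "corpus" below is invented for illustration, not taken from the SMS Spam Collection.

```python
from collections import Counter
from math import log

def train_nb(docs):
    """docs: list of (token_list, label). Returns class counts, per-class
    word counts, and the overall vocabulary."""
    labels = Counter(label for _, label in docs)
    words = {label: Counter() for label in labels}
    for tokens, label in docs:
        words[label].update(tokens)
    vocab = {w for counts in words.values() for w in counts}
    return labels, words, vocab

def classify(tokens, labels, words, vocab):
    """Pick the label maximizing log P(label) + sum of log P(word|label)."""
    total_docs = sum(labels.values())
    best, best_score = None, float("-inf")
    for label, n_docs in labels.items():
        n_words = sum(words[label].values())
        score = log(n_docs / total_docs)
        for w in tokens:
            # Laplace (add-one) smoothing so unseen words don't zero the score
            score += log((words[label][w] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

docs = [
    ("win free prize now".split(), "spam"),
    ("free cash win win".split(), "spam"),
    ("are we meeting for lunch".split(), "ham"),
    ("see you at lunch tomorrow".split(), "ham"),
]
model = train_nb(docs)
print(classify("win a free prize".split(), *model))  # → spam
print(classify("lunch tomorrow".split(), *model))    # → ham
```

The "naive" part is the assumption that words are conditionally independent given the class; it is wrong for natural language, yet the classifier still works remarkably well for spam filtering.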
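The capstone's text prediction task boils down to an n-gram language model: count which word follows each (n-1)-word context, and offer the most frequent continuations. The real project was built in R on the HC corpora; the toy corpus and function names below are purely illustrative.

```python
from collections import Counter, defaultdict

def build_model(corpus, n=3):
    """Count which word follows each (n-1)-gram in the corpus."""
    model = defaultdict(Counter)
    tokens = corpus.lower().split()
    for i in range(len(tokens) - (n - 1)):
        context = tuple(tokens[i:i + n - 1])
        model[context][tokens[i + n - 1]] += 1
    return model

def predict(model, text, n=3, top=3):
    """Return the `top` most likely next words after the last (n-1) words."""
    context = tuple(text.lower().split()[-(n - 1):])
    return [word for word, _ in model[context].most_common(top)]

corpus = ("i went to the gym . i went to the store . "
          "i went to the restaurant . then to the gym again")
model = build_model(corpus, n=3)
print(predict(model, "i went to the"))  # top-3 candidates after "to the"
```

A production-quality model would add backoff to shorter contexts (e.g. Katz backoff or stupid backoff) so that an unseen trigram context can still yield a prediction from bigram or unigram counts.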