VistSOS Data Visualization Framework

Welcome! I am Felipe Poveda, a Computer Science student at Politecnico di Milano. My mentors for this project are Massimiliano Cannata and Milan Antonovic. This is the wiki of VistSOS: the istSOS Data Visualization Framework project.

Introduction
Visualization is the human activity of representing phenomena of different natures by means of images. Data visualization, therefore, is the activity of representing data obtained from many different sources, with the ultimate goal of helping users understand the patterns of information behind them. The objective of this project is to create a customizable framework that allows users to define and publish data visualization charts linked with istSOS. This objective poses two main challenges: one of a technical nature and the other associated with the "effective" application of design.

Proposal summary
In order to extend the features offered by the istSOS server, I propose the use of an open source JavaScript library for the implementation of VistSOS: the istSOS Data Visualization Framework. The proposed solution shall allow users of the istSOS ecosystem to configure, customize and include a wide set of chart, bar, grid and map visualization types in their websites. The framework will offer complete support for connecting the visualization widgets to multiple data-exchange formats such as CSV, TSV and JSON. Considering the real-time nature of the data sources, the framework will support the generation of real-time visualization widgets specially crafted to display data of this kind promptly. The framework shall be implemented with robustness, reliability and customization as its core development principles. Recommended software development practices, such as the use of version control systems and continuous testing, will be followed.

Implementation details

 * There are several candidate libraries for the implementation of the framework, many of them based on JavaScript. Considering the goal and requirements of the project, one suitable candidate is the D3.js library, which offers an important set of data visualization types, both interactive and non-interactive.


 * Besides the fact that D3.js is released under an open license (BSD-3), two important points in its favor are its well-documented API and its ability to render real-time data. For a detailed comparison of charting frameworks please refer to:.


 * In order to improve compatibility with the majority of browsers, standardized languages like HTML and CSS should be used properly. The proposed library, D3.js, generates graphs in SVG format, which is supported by the majority of browsers because they include an SVG rendering engine. The manipulation of the browser’s DOM performed by D3.js is also tested against Firefox, Chrome (Chromium), Safari (WebKit), Opera and IE9.


 * Support for multiple data-interchange formats like CSV, TSV or JSON should be implemented together with any pre-processing routine needed to normalize the incoming data. However, many charting frameworks support these formats by default, and D3.js is no exception.
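
The normalization step mentioned above can be illustrated with a minimal Python sketch. It is not part of the framework: the function names `parse_records` and `normalize`, the column names `time` and `value`, and the assumption that JSON input is an array of flat objects are all hypothetical, chosen only to show how three formats can be reduced to one common shape before charting.

```python
import csv
import io
import json


def parse_records(text, fmt):
    """Parse CSV, TSV or JSON text into a list of dicts (one per row)."""
    fmt = fmt.lower()
    if fmt in ("csv", "tsv"):
        delimiter = "," if fmt == "csv" else "\t"
        reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
        return [dict(row) for row in reader]
    if fmt == "json":
        data = json.loads(text)
        # Assumption: the JSON payload is an array of flat objects.
        if not isinstance(data, list):
            raise ValueError("expected a JSON array of objects")
        return data
    raise ValueError(f"unsupported format: {fmt}")


def normalize(records, time_field="time", value_field="value"):
    """Coerce the value column to float so every format yields the same shape."""
    return [
        {"time": r[time_field], "value": float(r[value_field])}
        for r in records
    ]
```

With this shape, a chart only ever sees a list of `{"time": ..., "value": ...}` records, regardless of which format the data arrived in.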

Repository
https://github.com/felipe07/vistsos

How to test the code

 * Get the code from the repository.
 * Install and configure a Web Server (I use Apache).
 * Copy the preliminary vega-lite HTML files (these files are just tests using vega-lite) into the corresponding web document folder.
 * Request the file from a Web browser.
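
For a quick local check without installing Apache, Python's built-in `http.server` module could serve the same folder. The following is only a sketch of that alternative: the helper name `make_server` and the default port are assumptions, and it is not the setup used for the project.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory, port=8000):
    """Create a local HTTP server that serves the files found in `directory`."""
    # The handler is bound to the given directory instead of the current one.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Listen on the loopback interface only; this is meant for local testing.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server("path/to/html/files").serve_forever()` would then make the test pages available at `http://127.0.0.1:8000/` until the process is stopped.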

Report #1 - Exploring visualization libraries

 * During this week I started the evaluation of 3 JavaScript visualization libraries: Rickshaw, Cubism and Vega-lite. This evaluation consists of creating bar, line and scatterplot charts using the aforementioned libraries and JSON data coming from istSOS.


 * I wrote a Python script to insert rainfall and temperature sensor measurements coming from the agency ARPA Lombardia. This script processes the CSV files in order to adapt them to the CSV format required by istSOS.
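
The kind of CSV adjustment described above could look roughly like the sketch below. It is not the actual script: the input column names `timestamp` and `value`, the input date format, the UTC time zone and the header URNs in the output are all assumptions made for illustration, and they should be checked against the real source files and the istSOS documentation.

```python
import csv
import io
from datetime import datetime, timezone

# Assumption: the target file starts with a header of observed-property URNs.
TIME_URN = "urn:ogc:def:parameter:x-istsos:1.0:time:iso8601"


def convert(source_text, property_urn, time_format="%Y/%m/%d %H:%M"):
    """Rewrite a two-column CSV so that timestamps are ISO 8601 strings."""
    reader = csv.DictReader(io.StringIO(source_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([TIME_URN, property_urn])
    for row in reader:
        stamp = datetime.strptime(row["timestamp"], time_format)
        # Assumption: source timestamps are in UTC.
        stamp = stamp.replace(tzinfo=timezone.utc)
        writer.writerow([stamp.isoformat(), row["value"]])
    return out.getvalue()
```

The measured values are copied through unchanged; only the header and the timestamp column are rewritten.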

Rickshaw
Rickshaw is a graphical toolkit for the creation of interactive visualizations. It was created by extending D3.js to support the definition of a large number of graphical characteristics.

Rickshaw seems to follow the same philosophy as vega-lite (Wilkinson’s grammar of graphics) because it uses a declarative approach to define the charts. However, Rickshaw does not include all the elements that vega-lite uses to map datasets into statistical graphical representations, and therefore operations like data transformations or statistical summarization are not available.

Supported charts

 * Rickshaw supports several types of charts: line, bar, area, stack and real-time. It is possible to extend any of the available charts with custom-made characteristics, or even to create a new kind of chart with JavaScript code. Rickshaw uses the same approach as D3.js to configure the different properties of the charts. This approach is based on the W3C Selectors API (https://www.w3.org/TR/selectors-api/), which offers a declarative way of matching DOM nodes against patterns and simplifies the chart definition process.

Customizability
Although Rickshaw is a well-known visualization library, it is not updated as frequently as vega-lite, and it therefore lacks a steady group of developers who could further improve this interesting library.
 * With Rickshaw it’s possible to define interactive real-time visualizations and also to control all the graphical elements. This can be achieved by modifying the chart properties or by extending the available set of charts with JavaScript code.

Cubism
Cubism is a JavaScript library for visualizing time series. Cubism makes use of horizon charts (http://vis.berkeley.edu/papers/horizon/), an effective and understandable way of visualizing real-time series that also reduces the consumption of vertical space, making better use of the screen. The effectiveness of this approach is measured in terms of how well users perceive real-time data and are able to work with it. Cubism can be connected to many data sources, such as CSV or JSON files or real-time streams of data.
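
To illustrate how a horizon chart saves vertical space, the sketch below splits a value into stacked bands of equal height that are then drawn on top of each other in progressively darker colors. It only shows the general idea and is not Cubism's code; the function name `horizon_bands` and its parameters are hypothetical.

```python
def horizon_bands(value, band_height, n_bands):
    """Return (sign, heights), where heights[i] is the filled part of band i.

    All bands are drawn overlaid in the same strip, so the chart only needs
    `band_height` of vertical space instead of `n_bands * band_height`.
    Negative values keep their magnitude and are marked with sign -1, so
    they can be drawn mirrored or in a different color.
    """
    magnitude = abs(value)
    heights = []
    for i in range(n_bands):
        # Portion of the magnitude that falls inside band i, clamped to the band.
        filled = min(max(magnitude - i * band_height, 0.0), band_height)
        heights.append(filled)
    return (-1 if value < 0 else 1), heights
```

For example, with a band height of 10 and three bands, a value of 25 fills the first two bands completely and half of the third.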

Supported charts

 * By default there is only one chart supported by this library: the horizon chart. Therefore, the definition of new charts requires JavaScript coding.

Customizability

 * It is possible to customize horizon charts by changing graphical elements like colors, scales, size or data formats.

Unfortunately, cubism is not an active project (https://github.com/square/cubism/graphs/contributors).

Vega-lite
Vega-lite is a JavaScript library designed to offer maximum customizability through the idea of a grammar of graphics (The Grammar of Graphics, Wilkinson). It works by mapping a data set onto properties of graphical marks, which are ultimately rendered as visualizations in the browser. This approach is declarative and offers a high level of customization during the definition of the charts.

Supported charts

 * Vega-lite supports a wide range of static charts: bar, line, grouped bar, trellis, area, bubble and stack. This variety of charts is a big advantage, along with the configuration capabilities that vega-lite offers, which allow the user to modify any aspect of the visualization. Reading data from CSV or JSON files is easy, and the possibility of filtering incoming data before it is mapped to a statistical graphical representation is a key feature.


 * Vega-lite doesn’t support interactive visualizations by itself. Instead, it is necessary to declare the chart using the parent project Vega, which offers an even bigger set of customizable charts, including interactive ones.
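
The declarative mapping from data fields to graphical marks can be illustrated by building a chart specification as plain data. The sketch below assembles such a specification as a Python dictionary that can be serialized to JSON; it assumes the current Vega-Lite JSON syntax (v5 schema), which may differ from the version evaluated in this report, and the helper name `bar_chart_spec`, the file path and the field names are hypothetical.

```python
import json


def bar_chart_spec(data_url, time_field, value_field, min_value=None):
    """Build a bar chart specification: one bar per timestamp."""
    spec = {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"url": data_url},
        "mark": "bar",
        "encoding": {
            # Each encoding channel maps a data field to a visual property.
            "x": {"field": time_field, "type": "temporal"},
            "y": {"field": value_field, "type": "quantitative"},
        },
    }
    if min_value is not None:
        # Filter rows before they are mapped to marks.
        spec["transform"] = [
            {"filter": f"datum.{value_field} >= {min_value}"}
        ]
    return spec


def to_json(spec):
    """Serialize the specification so it can be embedded in an HTML page."""
    return json.dumps(spec, indent=2)
```

Changing the chart type or a visual property is then a matter of editing a value in the specification (for example `"mark": "line"`) rather than writing drawing code.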

Customizability
Vega-lite is an active project, judging by the number of commits made to it (https://github.com/vega/vega-lite/graphs/contributors) and by how recent they are.
 * Vega-lite is highly customizable, allowing the user to modify any graphical element of the visualization. Customizing an element of the visualization is simple because it only requires declaring properties, instead of writing a code routine or even making selections (as D3.js does).

Schedule
There is an initial schedule for the project, which should be redefined with the tutors during the next weeks. Provisionally, I would like to share the activities and milestones I have defined, in chronological order (also subject to change during the coming weeks):


 * Project planning with tutor:
    * Timeline adjustments
    * Definition of charts to be implemented (divided in 3 groups)
    * Definition of the technical limitations of the framework


 * First prototype implementation:
    * Real-time data unsupported
    * Non-interactive widgets
    * Minimum customizable options: Color, size and position.
    * Implementation of 1st group of charts
    * Hard-coded data


 * Prototype evaluation


 * Addition of feature:
    * Processing of CSV data
    * Processing of TSV data


 * Prototype evaluation


 * Addition of feature:
    * Processing of JSON data


 * Prototype evaluation


 * Second prototype implementation:
    * Implementation of 2nd group of charts
    * Interactive widgets
    * Customizable options: Color, size, position and scale


 * Prototype evaluation


 * Third prototype implementation:
    * Implementation of the 3rd group of charts
    * Real-time data supported


 * Prototype evaluation


 * Prototype evaluation


 * Final product presentation

What new functionality this project brings

 * The framework will provide a wide set of interactive and non-interactive data visualization charts, grids, plots and maps able to encode data coming from a network of sensors, external databases or files.


 * The framework will allow the user to customize the color, scale, size or position of any available widget.


 * Real-time data visualizations shall be available.


 * The framework will support the processing of incoming data represented as CSV, TSV and JSON.

Who will use results of this project
Organizations and individuals interested in visualizing the real-time series provided by istSOS.

Student's Biography
I was born in Bogotá, Colombia. I have a bachelor's degree in Systems Engineering from the District University of Bogotá and I am currently doing an M.Sc. in Computer Science at Politecnico di Milano. I previously worked in software development at several companies. I learned from many kind and humble people ways to improve my coding skills, to the point that I am now working on this exciting project. I'm interested in interdisciplinary projects.