VistSOS Data Visualization Framework

Welcome! I am Felipe Poveda, a Computer Science student at Politecnico di Milano. My mentors for this project are Massimiliano Cannata and Milan Antonovic. This is the wiki of VistSOS: the istSOS Data Visualization Framework project.

Introduction
Visualization is the human activity of representing phenomena of different kinds by means of images. Data visualization, therefore, is the activity of representing data obtained from many different sources, with the ultimate goal of helping users understand the patterns behind the information. The objective of this project is to create a customizable framework that allows users to define and publish data visualization charts linked with istSOS. This objective poses two main challenges: one of a technical nature and the other more associated with the "effective" application of design.

Proposal summary
In order to extend the features offered by the istSOS server, I propose the use of an open source JavaScript library for the implementation of VistSOS: the istSOS Data Visualization Framework. The proposed solution shall allow users of the istSOS ecosystem to configure, customize and include a wide set of chart, bar, grid and map visualization types in their websites. The framework will offer complete support for connecting the visualization widgets with multiple data-exchange formats like CSV, TSV and JSON. Considering the real-time nature of the data sources, the framework will support the generation of real-time visualization widgets specially crafted to display this kind of data promptly. The framework shall be implemented with robustness, reliability and customization as its core development principles. Recommended software development practices, such as the use of version control systems and continuous testing, will be followed.

Implementation details

 * There are several candidate libraries for the implementation of the framework, many of them based on JavaScript. Considering the goal and requirements of the project, one suitable candidate is the D3.js library, which offers an important set of data visualization types, both interactive and non-interactive.


 * Besides the fact that D3.js is released under an open license (BSD-3), two important points in its favor are its well-documented API and its ability to render real-time data. For a detailed comparison of charting frameworks please refer to:.


 * In order to improve compatibility with the majority of browsers, standardized languages like HTML and CSS should be properly used. The proposed library, D3.js, supports the generation of graphs in SVG format, which is supported by the majority of browsers because they include an SVG rendering engine. The DOM manipulation performed by D3.js is also tested against Firefox, Chrome (Chromium), Safari (WebKit), Opera and IE9.


 * Support for multiple data-interchange formats like CSV, TSV or JSON should be implemented together with any pre-processing routine needed to normalize the incoming data. However, many charting frameworks support these formats by default, and D3.js is no exception.
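
The pre-processing step mentioned in the last point could look roughly like the sketch below. It is only an illustration written in plain JavaScript, not the code used by the framework: the function names and the column names "time" and "value" are assumptions, and the delimiter parser is deliberately simplified (header row required, no quoted fields).

```javascript
// Parse delimiter-separated text (CSV or TSV) into an array of row objects.
// Simplified: assumes a header row and no quoted fields containing the delimiter.
function parseDelimited(text, delimiter) {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(delimiter).map((name) => name.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(delimiter);
    const row = {};
    header.forEach((name, i) => {
      row[name] = (cells[i] || '').trim();
    });
    return row;
  });
}

// Normalize any supported format into [{ time: Date, value: number }, ...].
// The field names "time" and "value" are hypothetical.
function normalizeObservations(input, format) {
  let rows;
  if (format === 'csv') {
    rows = parseDelimited(input, ',');
  } else if (format === 'tsv') {
    rows = parseDelimited(input, '\t');
  } else if (format === 'json') {
    rows = JSON.parse(input);
  } else {
    throw new Error('Unsupported format: ' + format);
  }
  return rows.map((row) => ({
    time: new Date(row.time),
    value: Number(row.value),
  }));
}
```

With a routine like this, the chart code only ever sees one record shape, regardless of the format the data arrived in.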

Structure of the framework
The first version of the diagram representing the framework structure can be accessed from the Framework Structure link.

The following aspects will change in the coming days:


 * Instead of requesting observations directly from the istSOS REST API, the framework will use the JavaScript API developed by the team.
 * It is possible that, instead of an iframe, the framework will generate a reusable Web Component containing the chart.

Repository
https://github.com/felipe07/vistsos

How to test the code
Inside the head section include:



Inside the body of the HTML document include a declaration of a chart with some parameters, e.g.

 

Note the id of the div and the parameter 'divId' of the istsos-chart element should match.
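
As a rough illustration of the note above, the following sketch generates the pair of elements from a set of options. It is not taken from the repository: the helper names are hypothetical, all option values are placeholders, and I am assuming that each configuration option is written as an attribute of the istsos-chart element.

```javascript
// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build the markup for one chart: a target div plus an istsos-chart element
// whose divId matches the id of the div.
function chartMarkup(divId, options) {
  const attributes = Object.keys(options)
    .map((name) => `${name}="${escapeAttribute(options[name])}"`)
    .join(' ');
  return `<div id="${escapeAttribute(divId)}"></div>\n` +
    `<istsos-chart divId="${escapeAttribute(divId)}" ${attributes}></istsos-chart>`;
}

// All values below are placeholders, not a real istSOS configuration.
const markup = chartMarkup('chart1', {
  type: 'line',
  server: 'http://localhost/istsos',
  service: 'demo',
  offering: 'temporary',
  procedure: 'T_LUGANO',
  property: 'air:temperature',
  from: '2017-06-01T00:00:00',
  until: '2017-06-08T00:00:00',
});
console.log(markup);
```

Generating both elements from the same divId value avoids the mismatch that the note warns about.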

The property 'type' can take one of the following values:


 * line
 * bar
 * punch-card
 * scatterplot
 * overview-detail
 * multivariable
 * trellis (https://en.wikipedia.org/wiki/Small_multiple)

VistSOS supports the following list of configuration options:


 * server: The server address to access istSOS.
 * service: A service configured in istSOS.
 * offering: The offering associated with the procedure.
 * procedure: A procedure configured in istSOS.
 * property: A list of observed properties to visualize.
 * from: Initial date time.
 * until: Final date time.
 * color: Color of the mark (line, bar, circle, etc.) used to visualize the data. Not applicable to multivariate charts.
 * color2: Second color of the mark. So far, applicable only to overview-detail charts.
 * strokeWidth: Line thickness. So far, applicable to line and multivariate charts.
 * aggregate: An aggregation operation applied to the data, e.g. count, average, stdev, sum, variance. Currently supported by punch-card charts. For more information see the Vega-Lite aggregate documentation: https://vega.github.io/vega-lite/docs/aggregate.html.
 * timeUnit: A time unit or combination of time units to apply to the chart, e.g. date, year, month, hours, minutes, monthdate, monthday, monthdatehour, etc. For more information see the Vega-Lite time unit documentation: https://vega.github.io/vega-lite/docs/timeunit.html.
 * timeUnit2: Same as timeUnit but applied to the second axis. Currently only applicable to punch-card charts.
 * timeFormat: A date time format string, e.g. %y/%m/%d (two-digit year, month number, day of the month). For a detailed list of time format options see the d3-time-format documentation: https://github.com/d3/d3-time-format/blob/master/README.md#locale_format.
 * rowTimeUnit: The time unit (year, month, date, hours, etc.) or combination of time units applied to each row of a trellis chart. Each row is represented as a separate plot visualizing a different subset of the dataset, e.g. each row represents a year.
 * xTimeUnit: The time unit (year, month, date, hours, etc.) or combination of time units applied to the X axis of a trellis chart.
 * yTimeUnit: The time unit (year, month, date, hours, etc.) or combination of time units applied to the Y axis of a trellis chart.
 * bin: If this parameter is equal to the string "true", the trellis chart will create a number of bins, aggregating the data by measurement. To use a different aggregation, set this parameter to "false" and specify a supported aggregation operation with the parameter aggregate.
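
The applicability rules described in the list above can be summarized in a small check. This is only a sketch and not part of VistSOS: the function and constant names are hypothetical, and I am assuming that "multivariate" in the option descriptions refers to the chart type "multivariable".

```javascript
// Chart types listed on this page.
const CHART_TYPES = [
  'line', 'bar', 'punch-card', 'scatterplot',
  'overview-detail', 'multivariable', 'trellis',
];

// Options that the list above restricts to particular chart types.
// Assumption: "multivariate" in the text means the "multivariable" type.
const RESTRICTED_OPTIONS = {
  color2: ['overview-detail'],
  strokeWidth: ['line', 'multivariable'],
  timeUnit2: ['punch-card'],
  rowTimeUnit: ['trellis'],
  xTimeUnit: ['trellis'],
  yTimeUnit: ['trellis'],
  bin: ['trellis'],
};

// Return a list of human-readable problems; an empty list means no conflicts.
function checkOptions(options) {
  const problems = [];
  if (!CHART_TYPES.includes(options.type)) {
    problems.push(`Unknown chart type: ${options.type}`);
    return problems;
  }
  for (const [name, allowedTypes] of Object.entries(RESTRICTED_OPTIONS)) {
    if (name in options && !allowedTypes.includes(options.type)) {
      problems.push(`${name} is not applicable to ${options.type} charts`);
    }
  }
  // color is documented as not applicable to multivariate charts.
  if ('color' in options && options.type === 'multivariable') {
    problems.push('color is not applicable to multivariable charts');
  }
  return problems;
}

console.log(checkOptions({ type: 'bar', color2: '#ff0000' }));
```

A check of this kind could run before the chart specification is generated, so that options with no effect are reported instead of being silently ignored.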

Report #1 (May 29)
1. What did you get done this week?


 * During this week I started the evaluation of two JavaScript visualization libraries: Rickshaw and Vega-Lite. This evaluation consists of creating bar, line, and scatterplot charts using these libraries and JSON data coming from istSOS.


 * I wrote a Python script to insert rainfall and temperature sensor measurements coming from the agency ARPA Lombardia. This script processes the CSV files in order to adjust them to the CSV format required by istSOS.

2. What do you plan on doing next week?


 * Continue the evaluation of the JavaScript libraries, adding interactive charts. Evaluate the JavaScript library Cubism with bar, line and scatterplot charts.


 * Create a first version of the evaluation report (including factors such as performance, number of charts supported and customizability).

3. Are you blocked on anything?


 * Not really, but during the evaluation of the Rickshaw library the customization of the x axis didn't work properly when the chart had more than 2 weeks of data. By "not working properly" I mean that the relation between the x axis and the points on the y axis can be confusing for the end user.

Report #2 (June 5)
1. What did you get done this week?


 * Based on the creation of statistical charts using data from istSOS, I evaluated three JavaScript libraries for data visualization: Vega-Lite, Rickshaw and Cubism. This report is available on the wiki of the project[1].

2. What do you plan on doing next week?


 * Prepare a showcase of static and dynamic Vega-Lite charts and share these results with the team.

3. Are you blocked on anything?


 * No.

Report #3 (June 12)
1. What did you get done this week?


 * During this week I refactored the code. This activity consisted of separating the chart specification from the data. I created a first version of a chart showcase that I will share with the team in order to show the available features of Vega and Vega-Lite.

2. What do you plan on doing next week?


 * I have to add more dynamic charts to the showcase. I will begin to code a usable prototype of the framework.

3. Are you blocked on anything?


 * No.

Report #4 (June 19)
1. What did you get done this week?


 * I completed a first version of a functional framework prototype (no validations and a minimal set of configuration options). This prototype allows a user to configure a minimum set of options to generate a chart that can be embedded as an iframe inside an external HTML document.

2. What do you plan on doing next week?


 * I have to include more configuration options in the prototype, like color, scale and others.
 * I have to add validations to the Chart Designer.
 * I have to add support for selecting timezones in the Chart Designer.
 * I have to connect the prototype with the istSOS Web API project in order to use the getObservations method.

3. Are you blocked on anything?


 * No.

Report #5 (June 26)
1. What did you get done this week?


 * I included validations for the fields of the Chart Designer.
 * I included the time zone offset as part of the getObservations request.
 * I included a color picker for the marks of the chart (e.g. the color of the line if a line chart is selected).

2. What do you plan on doing next week?


 * I have to include more configuration options in the prototype, like a second color, scale and others.
 * I have to connect the prototype with the istSOS Web API project in order to use the getObservations method.
 * I have to migrate to Bower for dependency management.
 * I have to implement a multiseries chart with multiple Y axes.

3. Are you blocked on anything?


 * No.

Report #6 (July 3)
1. What did you get done this week?


 * I implemented an HTML Import Web Component to manage the generation of the chart.
 * I implemented a Custom HTML Element to be the embedded part of the chart in the client website.
 * I connected VistSOS with the istSOS Javascript Core API in order to get observations.
 * I migrated the functionality that generates the chart in order to make it compatible with the developed Web Components.
 * Some dependencies were migrated to Bower.

2. What do you plan on doing next week?


 * I have to finish the implementation of the multiseries chart with multiple Y axes.
 * Evaluate and, if possible, implement a full histogram divided in parts and a cumulative histogram.
 * Include more configuration options in the prototype, like a second color, scale and others.

3. Are you blocked on anything?


 * No.

Report #7 (July 10)
1. What did you get done this week?


 * I finished the implementation of a multiseries chart with multiple y axes.
 * I included the following parameters as configuration options for the charts: stroke width (for line charts), timezone offset for the date filters and second color for the overview-detail chart.
 * Now the unit of measurement of the observed properties is used as the label for the axis.

2. What do you plan on doing next week?


 * Evaluate and, if possible, implement a full histogram divided in parts and a cumulative histogram.
 * Create a work-around for the incompatibility of the WebComponentsJS library with Mozilla Firefox and Internet Explorer.

3. Are you blocked on anything?


 * No.

Web Components
Since VistSOS should let the user configure a set of parameters that define the final specification of the chart, a suitable technique is needed to embed the result. The first attempt was based on iframes, which work well but have the following drawbacks:


 * It's a heavy element that can potentially decrease the performance of the client's website (the one using the embedded chart).
 * It's not declarative and therefore not aligned with the goal of the framework.
 * It does not offer many customization options.

On the other hand, Web Components offer a declarative, efficient and very flexible way of defining embeddable components. For this project the following features of Web Components will be used:


 * HTML Imports.
 * Custom Elements.
 * Templates.

Browser support for Web Components does not yet cover all available browsers; however, most of them support a wide set of the available features. For more information read: https://blog.revillweb.com/web-component-challenges-a09ebc598d65#.us8rn324w
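
To make the idea of a Custom Element more concrete, here is a minimal sketch. It is not the VistSOS implementation: it uses the current Custom Elements API (customElements.define), which may differ from the API version the project relies on, the helper readOptions and the class name are hypothetical, and the rendering step is replaced by a placeholder.

```javascript
// Attribute names read from the element; a subset of the configuration
// options documented on this page.
const OPTION_NAMES = [
  'type', 'divId', 'server', 'service', 'offering',
  'procedure', 'property', 'from', 'until',
];

// Collect the configured options from any object exposing getAttribute().
function readOptions(element) {
  const options = {};
  for (const name of OPTION_NAMES) {
    const value = element.getAttribute(name);
    if (value !== null) {
      options[name] = value;
    }
  }
  return options;
}

// Register the element only where the Custom Elements API exists (browsers).
if (typeof customElements !== 'undefined') {
  class IstsosChartSketch extends HTMLElement {
    // Called by the browser when the element is attached to the document.
    connectedCallback() {
      const options = readOptions(this);
      const target = document.getElementById(options.divId);
      // A real implementation would request observations and render a chart;
      // this placeholder only shows where that would happen.
      if (target) {
        target.textContent = `Chart placeholder: ${options.type}`;
      }
    }
  }
  customElements.define('istsos-chart', IstsosChartSketch);
}
```

The appeal of this approach is that the client page only declares an element with attributes, and all chart logic stays inside the component.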

Rickshaw
Rickshaw is a graphical toolkit for the creation of interactive visualizations. It was created by extending D3.js to support the definition of a large number of graphical characteristics.

Although Rickshaw seems to follow the same philosophy as Vega-Lite (Wilkinson's grammar of graphics) because it uses a declarative approach to define the charts, it does not include all the elements that Vega-Lite uses to map datasets into statistical graphical representations. Therefore, operations like data transformations or statistical summarization are not available.

Supported charts

 * Rickshaw supports several types of charts: line, bar, area, stack and real-time. It is possible to extend any of the available charts with custom-made characteristics or even create a new kind of chart with JavaScript code. Rickshaw uses the same approach as D3.js to configure the different properties of the charts. This approach is based on the W3C Selectors API (https://www.w3.org/TR/selectors-api/), which offers a declarative way of matching DOM nodes with patterns, simplifying the chart definition process.

Customizability
Although Rickshaw is a well-known visualization library, it is not updated as frequently as Vega-Lite and therefore lacks a steady group of developers who could further improve this interesting library.
 * With Rickshaw it's possible to define interactive real-time visualizations and also control all the graphical elements. This can be achieved by modifying the chart properties or by extending the available set of charts with JavaScript code.

Cubism
Cubism is a JavaScript library for visualizing time series. Cubism makes use of horizon charts (http://vis.berkeley.edu/papers/horizon/), an effective and understandable way of visualizing real-time series that also reduces the consumption of vertical space, making better use of the screen. The effectiveness of this approach is measured in terms of how well users perceive real-time data and are able to work with it. It can be connected to many data providers, such as CSV or JSON files or real-time streams of data.

Supported charts

 * By default there is only one chart supported by this library: the horizon chart. Therefore, the definition of new charts requires writing JavaScript code.

Customizability

 * It is possible to customize horizon charts by changing graphical elements like colors, scales, size or data formats.

Unfortunately, Cubism is not an active project (https://github.com/square/cubism/graphs/contributors).

Vega-lite
Vega-lite is a JavaScript library designed to offer maximum customizability through the idea of a grammar of graphics (The Grammar of Graphics, Wilkinson). It works by mapping a data set to properties of graphical marks that are ultimately rendered as visualizations in the browser. This approach is declarative and offers a high degree of customization during the definition of the charts.

Supported charts

 * Vega-lite supports a wide range of static charts: bar, line, grouped bar, trellis, area, bubble and stack. The availability of this variety of charts is a big advantage, along with the configuration capabilities that Vega-lite offers, which allow the user to modify any aspect of the visualization. Reading data from CSV or JSON files is easy, and the possibility of filtering incoming data before it is mapped to a statistical graphical representation is a key feature.


 * Vega-lite doesn't support interactive visualizations by itself; instead, it is necessary to declare the chart using the parent project Vega, which offers an even bigger set of customizable charts, including interactive ones.

Customizability
Vega-lite is an active project, considering the number of commits made to the project (https://github.com/vega/vega-lite/graphs/contributors) and how recent they are.
 * Vega-lite is highly customizable, allowing the user to modify any graphical element of the visualization. Customizing an element of the visualization is simple because it requires declaring properties instead of writing a code routine or even doing selections (as D3.js does).
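
The declarative style described in this section can be illustrated with a small specification. The sketch below is an assumption-laden example rather than a chart taken from the project: the function name and the field names "time" and "value" are hypothetical, and the transform is written in the array form used by current Vega-Lite versions, which may differ from the version evaluated here.

```javascript
// Build a Vega-Lite specification for a line chart of observations.
// "values" is an array of records such as { time: '2017-06-01', value: 12.3 }.
function lineChartSpec(values, color) {
  return {
    data: { values: values },
    // Filter incoming data: drop missing measurements before mapping to marks.
    transform: [{ filter: 'datum.value != null' }],
    mark: 'line',
    encoding: {
      x: { field: 'time', type: 'temporal' },
      y: { field: 'value', type: 'quantitative' },
      // A constant color for the mark, declared as a property.
      color: { value: color },
    },
  };
}

const spec = lineChartSpec(
  [
    { time: '2017-06-01T00:00:00Z', value: 12.3 },
    { time: '2017-06-01T00:10:00Z', value: null },
    { time: '2017-06-01T00:20:00Z', value: 12.9 },
  ],
  '#1f77b4'
);
console.log(JSON.stringify(spec, null, 2));
```

Changing the chart to a bar chart or a different color only requires changing a property of the object, which is the main contrast with the selection-based style of D3.js. In a browser, a specification like this could be rendered with a Vega-Lite renderer such as vega-embed.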

Schedule
There is an initial schedule for the project, which should be redefined with the tutors during the coming weeks. Provisionally, I would like to share the activities and milestones I have defined, in chronological order (also subject to change during the coming weeks):


 * Project planning with tutor:
   * Timeline adjustments
   * Definition of charts to be implemented (divided in 3 groups)
   * Definition of the technical limitations of the framework


 * First prototype implementation:
   * Real-time data unsupported
   * Non-interactive widgets
   * Minimum customizable options: Color, size and position.
   * Implementation of 1st group of charts
   * Hard-coded data


 * Prototype evaluation


 * Addition of feature:
   * Processing of CSV data
   * Processing of TSV data


 * Prototype evaluation


 * Addition of feature:
   * Processing of JSON data


 * Prototype evaluation


 * Second prototype implementation:
   * Implementation of 2nd group of charts
   * Interactive widgets
   * Customizable options: Color, size, position and scale


 * Prototype evaluation


 * Third prototype implementation:
   * Implementation of the 3rd group of charts
   * Real-time data supported


 * Prototype evaluation


 * Prototype evaluation


 * Final product presentation

What new functionality this project brings

 * The framework will provide a wide set of interactive and non-interactive data visualization charts, grids, plots and maps able to encode data coming from a network of sensors, external databases or files.


 * The framework will allow the user to customize the color, scale, size or position of any available widget.


 * Real-time data visualizations shall be available.


 * The framework will support the processing of incoming data represented as CSV, TSV and JSON.

Who will use results of this project
Organizations and individuals interested in the visualization of real-time series provided by istSOS.

Student's Biography
I was born in Bogotá, Colombia. I have a bachelor's degree in Systems Engineering from the District University of Bogotá and I'm currently doing an M.Sc. in Computer Science at Politecnico di Milano. I previously worked in software development at several companies. I learned from many kind and humble people ways to improve my coding skills, to the point that now I am working on this exciting project. I'm interested in interdisciplinary projects.