SMASH: A Cloud-Based Architecture for Big Data Processing and Visualization of Traffic Data

Abstract
Big data has recently become a popular research topic and has brought a range of new challenges that must be tackled to meet commercial and research demands. The transport arena is one example that stands to benefit greatly from big data capabilities, since it must process vast quantities of data created in real time. Tackling these big data issues requires capabilities not typically found in common Cloud platforms. These include a distributed file system for capturing and storing data, a high-performance computing engine able to process such large quantities of data, a reliable database system able to optimize the indexing and querying of the data, and geospatial capabilities to visualize the resulting analyzed data. In this paper we present SMASH, a generic and highly scalable Cloud-based architecture, together with its implementation, that meets these demands. We focus here specifically on using the SMASH software stack to process large-scale traffic data for Adelaide and Victoria, although we note that the solution can be applied to other big data processing areas. We provide performance results for SMASH and compare it with other big data solutions that have been developed.
