
ST-Toolkit, a Framework for Trajectory Data Warehousing



Presentation of the short paper "ST-Toolkit: a Framework for Trajectory Data Warehousing", published at AGILE 2011, Utrecht, 20th April 2011.

Published in: Technology, Business


  1. 1. ST-Toolkit: a Framework for Trajectory Data Warehousing. AGILE 2011, Utrecht, 20/04/2011. Authors: Simone Campora, Jose Fernandes De Macedo, Laura Spinsanti
  2. 2. Overview
       • Agenda
           • Introduction
           • Generic Data Warehouse Schema for Trajectories
           • Generic Data Warehouse Architecture
           • Some experiments
           • Conclusions
  3. 3. Why Trajectory Data Warehousing: The motivation behind Trajectory Data Warehouses (TrDWs) is to transform the raw trajectories of moving objects into valuable information that can be exploited for decision-making in ubiquitous applications, such as location-based services and traffic control management, in an OLAP (or STOLAP) fashion.
  4. 4. Why Trajectory Data Warehousing: Problems in Trajectory Management
       • Rapid access to a huge archive of data (e.g. our dataset counts about 2 million records in one week alone)
       • Knowledge discovery on trajectories (extracting interesting patterns from raw coordinates)
       • Knowledge presentation (how to deliver the extracted information?)
       • Semantic integration (how to integrate semantic support data?)
  5. 5. Related Work
       • Spatio-Temporal DBMS: Secondo (Güting et al.), Hermes (Pelekis et al.)
       • Trajectory Data Warehouses: Università di Venezia (Orlando et al.)
  6. 6. Why Trajectory Data Warehousing: Solution Comparison
  7. 7. Our Contribution is developed along two main axes:
       • Theoretical
           • by providing a generic TrDW schema proposal that is robust, intuitive and fits the most common use cases
           • by providing a unified, non-fragmented overview of the main topics of trajectory data warehousing
       • Architectural
           • by deploying a modular, cross-database, cross-platform middleware to support spatio-temporal data warehouse modeling
  8. 8. Trajectory Extraction
  9. 9. Generic Data Warehouse Schema
       • Solution built around “Episodes”
       • Independent External Semantics
       • Trajectory-based Pre-Grouping
       • (MultiDimER notation)
  10. 10. Data Warehouse Design Issues. Designing a Data Warehouse solution can be tricky because of:
       • Lack of standard interfaces: every commercial or academic solution implements a different approach to instantiating multidimensional models in databases
       • Longer learning curves
       • Difficulties in migrating to different architectures (RDBMS, Distributed FS, …)
       • Difficulties in replicating the same TDW on the same architecture
  11. 11. Generic Data Warehouse Architecture
  12. 12. Example: Data Warehouse Design
       • First Step: data are streamed from raw datasets into primary memory
       • Second Step: Java objects are buffered and instantiated, then sent asynchronously to the RDBMS
       • Third Step: Java objects are persisted into the RDBMS and properly indexed
       • Fourth Step: the multidimensional model is instantiated from RDBMS data sources + DW metadata definitions
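The first three steps above can be sketched as a producer/consumer pair. This is only a minimal illustration under stated assumptions, not ST-Toolkit code: `GpsRecord`, `LoadingPipelineSketch`, the comma-separated input format and the in-memory list standing in for the RDBMS are all hypothetical. The fourth step (instantiating the multidimensional model) is not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical record type; the toolkit's real classes are not shown in the slides.
final class GpsRecord {
    final long objectId;
    final double lat;
    final double lon;
    final long timestamp;

    GpsRecord(long objectId, double lat, double lon, long timestamp) {
        this.objectId = objectId;
        this.lat = lat;
        this.lon = lon;
        this.timestamp = timestamp;
    }
}

final class LoadingPipelineSketch {
    // Sentinel that tells the writer thread the stream has ended.
    private static final GpsRecord END = new GpsRecord(-1, 0, 0, 0);

    // Parses raw lines (assumed format: objectId,lat,lon,timestamp), buffers the
    // resulting objects, and lets a writer thread "persist" them in batches.
    // The returned list of batches is an in-memory stand-in for the RDBMS.
    static List<List<GpsRecord>> load(List<String> rawLines, int batchSize) {
        BlockingQueue<GpsRecord> buffer = new LinkedBlockingQueue<>(1024);
        List<List<GpsRecord>> persisted = new ArrayList<>();

        // Steps 2-3: drain the buffer asynchronously and flush full batches.
        Thread writer = new Thread(() -> {
            List<GpsRecord> batch = new ArrayList<>();
            try {
                for (GpsRecord r = buffer.take(); r != END; r = buffer.take()) {
                    batch.add(r);
                    if (batch.size() == batchSize) {
                        persisted.add(batch);
                        batch = new ArrayList<>();
                    }
                }
                if (!batch.isEmpty()) {
                    persisted.add(batch); // flush the last, partial batch
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.setDaemon(true);
        writer.start();

        // Step 1: stream the raw data into memory, one object per line.
        try {
            for (String line : rawLines) {
                String[] f = line.split(",");
                buffer.put(new GpsRecord(Long.parseLong(f[0]), Double.parseDouble(f[1]),
                        Double.parseDouble(f[2]), Long.parseLong(f[3])));
            }
            buffer.put(END);
            writer.join(); // after join, everything the writer stored is visible here
        } catch (InterruptedException e) {
            writer.interrupt();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("loading interrupted", e);
        }
        return persisted;
    }

    public static void main(String[] args) {
        List<String> raw = new ArrayList<>();
        raw.add("1,45.46,9.19,1000");
        raw.add("1,45.47,9.20,1060");
        raw.add("2,45.50,9.10,1000");
        System.out.println("batches persisted: " + load(raw, 2).size());
    }
}
```

A bounded queue keeps memory use under control when the reader is faster than the database; in a real deployment each flushed batch would presumably map to a batched insert.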
  13. 13. Some Experiments
       • The Milano Dataset
       • Our experiments aim to test:
           • SOLAP Queries
           • STOLAP Queries
           • “Presence” Custom Specified Measure Validation
       What is the role of semantics in query complexity?

       Features        Value
       Records         2075213
       Trajectories    83134
       Stops           464584
       Moves           1527495
       POIs            39776
  14. 14. SOLAP Query
       • SOLAP Query With Semantics: Retrieve all the stops that occur within a range θ of events of type “Restaurant” and that are located in Milan.
       • SOLAP Query Without Semantics: Retrieve all the stops that occur within a range θ of regions where there is a high concentration Ω of trajectories at lunch time (12:00-13:00) or dinner time (19:00-20:00) and that are located in Milan.

         int theta = 10;  // distance threshold θ
         SpatialFilter filter = new DistanceFilter(
             eventDimension.getProperty("Event Shape"),
             new Point(45.28, 9.12), theta);
         OlapQuery query = new OlapQuery();
         query.addSelection(stopMeasure, OlapQuery.COLUMNS);   // measure on columns
         query.addSelection(eventDimension, OlapQuery.ROWS);   // dimension on rows
         query.addFilter(filter);
         query.setCube(stCube);
         query.execute();
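To make the idea of a distance filter concrete, here is a small self-contained sketch of the test such a filter might apply to each stop. It is not the toolkit's `DistanceFilter`: the class and method names are hypothetical, the haversine great-circle formula is a swapped-in technique, and treating θ as kilometres is an assumption (the slide gives `theta = 10` without a unit).

```java
import java.util.ArrayList;
import java.util.List;

final class DistanceFilterSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    // Great-circle distance between two (lat, lon) points in degrees, via haversine.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // Keeps only the stops (each a {lat, lon} pair) within thetaKm of the reference point.
    static List<double[]> within(List<double[]> stops, double lat, double lon, double thetaKm) {
        List<double[]> kept = new ArrayList<>();
        for (double[] s : stops) {
            if (haversineKm(s[0], s[1], lat, lon) <= thetaKm) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<double[]> stops = new ArrayList<>();
        stops.add(new double[] {45.28, 9.12}); // the reference point itself
        stops.add(new double[] {45.30, 9.12}); // a short distance north
        stops.add(new double[] {46.28, 9.12}); // one degree of latitude north
        System.out.println("stops kept: " + within(stops, 45.28, 9.12, 10).size());
    }
}
```

In a warehouse setting this predicate would normally be pushed down to the spatial index of the database rather than evaluated row by row in Java.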
  15. 15. STOLAP Query
       • STOLAP Query With Semantics: Give the number of visits of a moving entity to events of type “Restaurant”, where its trajectory started within a range θ of a residential area (a residential area being a record of the Event dimension).
       • STOLAP Query Without Semantics: Give the number of visits of a moving entity that occur within a range θ of regions where there is a high concentration Ω of trajectories at lunch time (12:00-13:00) or dinner time (19:00-20:00), where its trajectory started near a residential area (defined as an area where a number ω of trajectories start).

         SpatialFilter filter = new DistanceFilter(
             eventDimension.getProperty("Event Shape"),
             stopMeasure, 1);
         OlapQuery query = new OlapQuery();
         query.addSelection(presenceMeasure, OlapQuery.COLUMNS);
         query.addSelection(eventDimension, OlapQuery.ROWS);
         query.addFilter(filter);
         query.addCondition("[Event].[Food Shop]");
         query.addCondition("[Trajectory].[Trajectory Group].[Number of Trajectories > 10]");
         query.setCube(stCube);
         query.execute();

       N_VISITS    OBJET_ID
       64640       89754
       56055       78796
       52015       70702
       49995       76930
       47470       79088
       46460       82085
  16. 16. Presence Measure Validation. Presence Measure problem: how to aggregate the number of trajectories within a hierarchical, fully geometric dimension while avoiding the double-counting problem?
  17. 17. Some Experiments. Presence Measure: understanding the problem [figure; values shown: 1 1 1 1 0 1 1]
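The double-counting problem can be illustrated with a toy roll-up. In this sketch (all names and data are hypothetical, not taken from the Milano dataset) a trajectory that crosses two child areas is counted once in each; summing the child counts then overstates the parent's count, while counting distinct trajectory ids does not.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class DoubleCountingSketch {
    // Naive roll-up: add the per-child counts. A trajectory that crosses
    // several children contributes once per child, so it is counted repeatedly.
    static int naiveParentCount(Map<String, Set<Integer>> trajectoriesByChild) {
        int total = 0;
        for (Set<Integer> ids : trajectoriesByChild.values()) {
            total += ids.size();
        }
        return total;
    }

    // Distinct roll-up: take the union of trajectory ids over all children.
    static int distinctParentCount(Map<String, Set<Integer>> trajectoriesByChild) {
        Set<Integer> all = new HashSet<>();
        for (Set<Integer> ids : trajectoriesByChild.values()) {
            all.addAll(ids);
        }
        return all.size();
    }

    // Toy data: trajectory 2 crosses both child areas.
    static Map<String, Set<Integer>> example() {
        Map<String, Set<Integer>> byChild = new LinkedHashMap<>();
        byChild.put("Area A", new HashSet<>(Arrays.asList(1, 2)));
        byChild.put("Area B", new HashSet<>(Arrays.asList(2, 3)));
        return byChild;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> byChild = example();
        System.out.println("naive sum: " + naiveParentCount(byChild));
        System.out.println("distinct:  " + distinctParentCount(byChild));
    }
}
```

Distinct counting needs the trajectory ids at every level, which is why a pre-aggregated cube cannot simply sum stored child totals for this measure.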
  18. 18. Presence Measure Validation
       Solution: define an aggregation algorithm that can use spatial operators. Our application can inject custom SQL expressions for spatial aggregates:

         String sqlExpression = "case when get_trj_space_area_intersections(trdw_episode_facts.geom) > 0 "
             + "then ceil(1/get_trj_space_area_intersections(trdw_episode_facts.geom)) else 0 end ";
         Measure presence = new VirtualMeasure("Trj Presence Measure", factTable, "presence", sqlExpression);
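Read literally, the CASE expression above yields 1 for a fact whose geometry intersects at least one area and 0 otherwise, provided `1/n` is evaluated as a real-number division. The slides do not state what type `get_trj_space_area_intersections` returns or which DBMS is targeted, so the sketch below mirrors the arithmetic in plain Java under both readings; the class and method names are hypothetical. In engines where dividing two integers truncates (PostgreSQL, for example), `1/n` would be 0 for every n ≥ 2.

```java
final class PresenceExpressionSketch {
    // Assumes 1/n is a real-number division: any n > 0 gives ceil(1.0/n) = 1.
    static int presenceRealDivision(int intersections) {
        return intersections > 0 ? (int) Math.ceil(1.0 / intersections) : 0;
    }

    // Assumes 1/n is a truncating integer division: 1/n is 0 for n >= 2,
    // so only n = 1 yields a presence of 1.
    static int presenceIntegerDivision(int intersections) {
        return intersections > 0 ? (int) Math.ceil(1 / intersections) : 0;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 3; n++) {
            System.out.println("n=" + n
                    + " real=" + presenceRealDivision(n)
                    + " integer=" + presenceIntegerDivision(n));
        }
    }
}
```

If the integer reading applies on the target database, writing the division as `1.0/…` in the injected expression would be one way to force the real-number behaviour.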
  19. 19. Presence Measure Validation: results on a subset of 260 trajectories
       Municipality level:
         Milano – Arese: 2
         Milano – Assago: 2
         Milano – Bollate: 1
         Milano – Bresso: 2
         Milano – Buccinasco: 2
         Milano – Cesano Boscone: 6
         Milano – Cormano: 2
         Milano – Corsico: 2
         Milano – Cusano Milanino: 2
         Milano – Gaggiano: 2
         Milano – Locate di Triulzi: 2
         Milano – Milano: 186
         Milano – Novate: 2
         Milano – Opera: 2
         Milano – Pero: 2
         Milano – Peschiera Borromeo: 2
         Milano – Rho: 14
         Milano – Rozzano: 2
         Milano – San Donato Milanese: 1
         Milano – San Giuliano Milanese: 6
         Milano – Segrate: 2
         Milano – Settimo Milanese: 8
         Milano – Trezzano Rosa: 4
         Milano – Zibido San Giacomo: 2
         Monza and Brianza – Mezzano: 2
       Province level:
         Milano: 258
         Monza and Brianza: 2
       Region level:
         Lombardia: 260
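As a quick arithmetic check of the figures on this slide, the snippet below re-adds the municipality counts and compares them with the province and region totals. It uses only the numbers listed above, in the order given; the class name is hypothetical and nothing here reproduces the toolkit's own aggregation.

```java
final class RollUpCheckSketch {
    // Municipality counts for the province of Milano, in slide order (Arese ... Zibido San Giacomo).
    static final int[] MILANO = {
        2, 2, 1, 2, 2, 6, 2, 2, 2, 2, 2, 186,
        2, 2, 2, 2, 14, 2, 1, 6, 2, 8, 4, 2
    };
    // Municipality counts for the province of Monza and Brianza (Mezzano).
    static final int[] MONZA_AND_BRIANZA = {2};

    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int milano = sum(MILANO);
        int monza = sum(MONZA_AND_BRIANZA);
        System.out.println("Milano: " + milano);
        System.out.println("Monza and Brianza: " + monza);
        System.out.println("Lombardia: " + (milano + monza));
    }
}
```

The municipality counts add up to the reported province totals (258 and 2), and those add up to the regional total of 260, which matches the size of the trajectory subset.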
  20. 20. Conclusions. Summarizing, we are proposing:
       • a cross-database, cross-platform generic middleware for spatio-temporal DW
       • a modular architecture that can be enriched with user-defined aggregation functions
       • a proposal for independent integration of semantics for trajectories
       • the first (known) implementation of a semantically enriched Trajectory Data Warehouse
  21. 21. Thanks for your attention. Any questions? Suggestions? Comments? For more information: