
Search Computing Overview

Search Computing keynote @ CAISE 2010

Published in: Technology

  1. Search Computing. Stefano Ceri, keynote talk at CAISE, Hammamet, June 9, 2010. Joint work with: Adnan Abid, Mamoun Abu Helu, Davide Barbieri, Daniele Braga, Marco Brambilla, Alessandro Bozzon, Alessandro Campi, Sofia Ceppi, Francesco Corcoglioniti, Emanuele Della Valle, Davide Eynard, Piero Fraternali, Nicola Gatti, Giorgio Ghisalberghi, Michael Grossniklaus, Davide Martinenghi, Marco Masseroli, Maristella Matera, Chiara Pasini, Elena Pellizzotti, Stefania Ronchi, Marco Tagliasacchi, Luca Tettamanti, Salvatore Vadacca, Riccardo Volonterio, Serge Zagorac
  2. Genesis of Search Computing
     • My “Gong Show” challenge at the 2003 Lowell Workshop: “Find an ethnic restaurant in a nice place close to Milano”.
     • Logically a composition of domains:
       - Restaurants (ethnic)
       - Geo-locations (nice place close to Milano)
     • Composing maps with “geo-located” information is now solved by all search engines, but in general no system is capable of composing arbitrary semantic domains.
     Database Management Prof. Stefano Ceri
  3. Motivating Examples
     • “Who are the strongest candidates in Europe for competing on software ideas?”
     • “Who is the best doctor who can cure insomnia in a close-by hospital?”
     • “Where can I attend an interesting scientific conference in my field and at the same time relax on a beautiful beach nearby?”
  4. Their Common Aspect
     • Multi-domain queries
     • Individual answers are on the Web
     • A knowledgeable user would do the query step-by-step:
       - Search database conferences, get their city
       - Check that the city average temperature is warm enough
       - Search low-cost flights via a broker for that city
       - Search luxury hotels via another broker
     • We want a system for supporting this search process:
       - Build several “solutions” which already integrate all dimensions
       - Rank “solutions” according to a global rank function and output results in rank order
       - Support user-friendly query definition and result browsing
       - Add search domains while the search proceeds
       - Possibly change the relative weight of each ranking
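The ranking requirement on this slide can be illustrated with a minimal sketch. It assumes, purely for illustration, that the global rank function is a weighted sum of per-domain scores; the slide does not say which function SeCo uses, and the names `global_score`, `rank_solutions`, and the sample scores are hypothetical.

```python
def global_score(scores, weights):
    """Weighted sum of per-domain scores (an assumed form of the global rank function)."""
    return sum(weights[domain] * scores[domain] for domain in weights)


def rank_solutions(solutions, weights):
    """Return the composite solutions in descending global-rank order."""
    return sorted(solutions,
                  key=lambda s: global_score(s["scores"], weights),
                  reverse=True)


# Hypothetical composite solutions: each already integrates all four domains.
solutions = [
    {"name": "A", "scores": {"conference": 0.9, "weather": 0.4, "flight": 0.5, "hotel": 0.6}},
    {"name": "B", "scores": {"conference": 0.6, "weather": 0.9, "flight": 0.8, "hotel": 0.7}},
]

equal_weights = {"conference": 1, "weather": 1, "flight": 1, "hotel": 1}
conference_first = {"conference": 5, "weather": 1, "flight": 1, "hotel": 1}

by_equal = [s["name"] for s in rank_solutions(solutions, equal_weights)]
by_conference = [s["name"] for s in rank_solutions(solutions, conference_first)]
```

Changing the weights reorders the same set of solutions, which is the effect the last bullet describes.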
  5. OVERALL FRAMEWORK
  6. Search Computing architecture: overall view
     [Architecture diagram. Main query flow: Final User → Front End (High-Level Query) → Query Analysis (Sub-queries) → Query To Domain Mapper (Low-level queries) → Query Planner (Concrete Query Plan) → Query Engine (operators OP 1, OP 2, ... OP N over the WS-Framework). Result Transformation turns Merged Results into the Results shown by the Front End. Components have caches and are linked to the Domain and Service repositories by <Uses> relations.]
  7. Search Computing architecture: overall view
     [Same architecture diagram, annotated with a query analysis example.]
     • High level query: “Where can I attend a DB scientific conference close to a beautiful beach reachable with cheap flights?”
     • Sub query 1: “Where can I attend a DB scientific conference?”
     • Sub query 2: “place close to a beautiful beach?”
     • Sub query 3: “place reachable with cheap flight?”
  8. Search Computing architecture: overall view
     [Same architecture diagram, annotated with the low-level queries produced by the Query To Domain Mapper.]
     • Low level query 1: ConfSearch(“DB”,placeX,dateY)
     • Low level query 2: TourSearch(“Beach”,PlaceX)
     • Low level query 3: Flight(“cost<200”,PlaceX,DateY)
  9. Search Computing architecture: overall view
     [Same architecture diagram, annotated with the query plan, the service invocations and operator executions, and the results.]
     Presented results:
     • ESWC - Crete - Olympic
     • CAISE - Hammamet - Alitalia
     • TOOLS - Malaga - EasyJet
  10. Search Computing architecture: incremental prototyping
      [Architecture diagram with an Admin Interface, annotated with the first prototype.]
      Prototype 1: Core behaviour of the system
      • Engine-based execution of queries
      • Domain repository
      • Service repository
      • Coarse result presentation
  11. Search Computing architecture: incremental prototyping
      [Architecture diagram with an Admin Interface, annotated with the first two prototypes.]
      Prototype 1: Core behaviour of the system
      • Engine-based execution of queries
      • Domain repository
      • Service repository
      • Coarse result presentation
      Prototype 2: Planning
      • Automatic optimized query planning
  12. Search Computing architecture: incremental prototyping
      [Architecture diagram with an Admin Interface, annotated with all four prototypes.]
      Prototype 1: Core behaviour of the system
      • Engine-based execution of queries
      • Domain repository
      • Service repository
      • Coarse result presentation
      Prototype 2: Planning
      • Automatic optimized query planning
      Prototype 3: Mapping and presentation
      • Mapping to domains
      • Presentation of results
      Prototype 4: High level queries
  13. CAISE FOCUS on: Service Registration
      Service Marts:
      • Conceptual representation of resources as entities and connections
      • Logical representation of signatures
      • Physical representation as service implementations
  14. CAISE FOCUS on: Front-end
      Liquid Query
      • Client-side framework for configuration and automatic rendering of query and result interfaces
      • User interaction primitives that support exploratory search
      [Architecture diagram with the Front End highlighted.]
  15. CAISE FOCUS on: Development Process
      [Diagram of roles and artifacts by phase.]
      • Development time: the Service Developer implements Search Services; the SeCo Expert deploys the SeCo platform
      • Service publishing time: the Service Publisher handles wrapping, materialization / normalization, and registration of Service Marts, producing the Service Mart Repository
      • Configuration time: the Expert User uses the Service Mart Repository and defines the Liquid Query Template, producing the Liquid Query User Interface Specification
      • Execution time: the Final User submits a Liquid Query and manipulates the Liquid Result
      • Support environment: tools supporting Service Registration, Query Design, and Performance Monitoring
  16. SERVICE REGISTRATION
  17. Service Registration in SeCo
      Objective: providing a framework for registering services as first-class citizens within SeCo
      => Service Marts
      • High-level abstractions of “real world entities” that provide a simple interface to users and hide implementation details
      • Inspired by Data Marts, a data modeling pattern used in data warehousing
      • Each Service Mart can have multiple modalities of data access and can be mapped to multiple service implementations, possibly offered by different providers
      => Connection Patterns
      • High-level abstractions of “real world relationships” that provide a simple interface to users and hide implementation details
      • Built by means of attributes that share the same domains
  18. Service Marts - Conceptual Level
      Every SM definition includes a name and a collection of the exposed attributes, i.e. the attributes of the real world object described by the SM:
      Movie(Title, Director, Year, Language, Genres(Genre), Actors(Name, Sex))
      • Atomic, single-valued, typed attributes
      • Repeating groups (multi-valued, typed attributes)
      • Each “repeating group” is a non-empty set of typed sub-attributes that collectively define a property of the service mart
      The model choices are:
      • To support structural complexity with only one level of nesting (rather than arbitrary levels of nesting)
      • To avoid explicit descriptions of relationships (using repeating groups for M:N relationships)
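As a rough illustration of the conceptual level, the Movie service mart can be written as a typed record with one level of nesting. This is only a sketch: the class names, the Python types chosen for each attribute, and the sample record are assumptions, not part of SeCo.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Genre:
    """One entry of the Genres repeating group."""
    genre: str


@dataclass(frozen=True)
class Actor:
    """One entry of the Actors repeating group."""
    name: str
    sex: str


@dataclass
class Movie:
    # Atomic, single-valued, typed attributes (the types are assumed).
    title: str
    director: str
    year: int
    language: str
    # Repeating groups: multi-valued, with exactly one level of nesting.
    genres: List[Genre] = field(default_factory=list)
    actors: List[Actor] = field(default_factory=list)


# Sample record used only to exercise the structure.
ben_hur = Movie(
    title="Ben Hur",
    director="William Wyler",
    year=1959,
    language="English",
    genres=[Genre("Drama")],
    actors=[Actor("Charlton Heston", "M")],
)
```

The M:N relationship between movies and actors is carried by the `actors` repeating group rather than by a separate relationship entity, matching the second model choice above.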
  19. Service Marts - Logical Level
      At this level, each SM is associated with one or more Access Patterns, e.g.:
      Movie1(TitleO, DirectorO, ScoreRO, YearO, LanguageI, Genres.GenreO, Actors.NameO, Actors.SexO, Genres.GenreI)
      Movie2(TitleI, DirectorO, YearO, LanguageO, Genres.GenreO, Actors.NameO, Actors.SexO)
      • Access patterns contain adorned attributes, i.e. attributes tagged with one of the following (the tag is written as a suffix, so TitleO is Title with adornment O):
        - I, if they are input attributes
        - O, if they are output attributes
        - R, if they are attributes used for ranking; they may or may not be visible in output
      • Movie1 gives access to movies by Language and Genre (e.g., “action movies in English”), and results are ranked by Score (a new attribute).
      • Movie2 gives access to movies by Title (e.g. “Ben Hur”). We expect few (zero, one, or more) results, which are not ranked.
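The adornments can be made concrete with a small sketch that stores each access pattern as a mapping from attribute name to its tags. The dictionary encoding and the helper names are hypothetical, and Genres.Genre is encoded as an input of Movie1, following the statement that Movie1 accesses movies by Language and Genre.

```python
# Access patterns as {attribute: adornment}; "RO" means ranking and output.
MOVIE1 = {
    "Title": "O", "Director": "O", "Score": "RO", "Year": "O",
    "Language": "I", "Actors.Name": "O", "Actors.Sex": "O",
    "Genres.Genre": "I",
}
MOVIE2 = {
    "Title": "I", "Director": "O", "Year": "O", "Language": "O",
    "Genres.Genre": "O", "Actors.Name": "O", "Actors.Sex": "O",
}


def attributes_with(access_pattern, tag):
    """Attributes whose adornment contains the given tag (I, O, or R)."""
    return [name for name, adornment in access_pattern.items() if tag in adornment]


def can_invoke(access_pattern, bindings):
    """An access pattern is callable only when every input attribute is bound."""
    return all(name in bindings for name in attributes_with(access_pattern, "I"))


movie1_inputs = attributes_with(MOVIE1, "I")
movie1_ranking = attributes_with(MOVIE1, "R")
```

With this encoding, a query that only knows a title can use Movie2 but not Movie1, which is the kind of feasibility check the adornments make possible.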
  20. Service Marts - Physical Level
      At this level, every Access Pattern can match different Service Implementations, having:
      • Physical URI to be called
      • Physical properties which are specific to the implementation
      • Mapping between logical and physical parameters
      IMDBMovie1(MovieTitleO, DirectorO, StarsRO, YearO, LanguageI, Genres.GenreI, Actors.NameO, Actors.GenderO)
      IMDBMovie: AP: Movie1; URI: http://...; TTL=6000, chunksize=10, cacheable=true, exposed=false, ...
      Logical to physical mapping: Title → MovieTitle, Director → Director, Score → Stars, Year → Year, Language → Lang, ...
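A minimal sketch of the logical-to-physical mapping, using the parameter names and properties shown on the slide. The `ServiceImplementation` class and its two methods are hypothetical, and the URI is kept elided as in the original.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ServiceImplementation:
    name: str
    access_pattern: str
    uri: str
    properties: Dict[str, object]
    param_map: Dict[str, str]  # logical attribute -> physical parameter

    def to_physical(self, logical_args):
        """Rename logical input attributes to the parameters the service expects."""
        return {self.param_map[k]: v for k, v in logical_args.items()}

    def to_logical(self, physical_row):
        """Rename a physical result row back to logical attribute names."""
        inverse = {phys: logical for logical, phys in self.param_map.items()}
        return {inverse[k]: v for k, v in physical_row.items() if k in inverse}


imdb_movie = ServiceImplementation(
    name="IMDBMovie",
    access_pattern="Movie1",
    uri="http://...",  # elided on the slide
    properties={"TTL": 6000, "chunksize": 10, "cacheable": True, "exposed": False},
    param_map={"Title": "MovieTitle", "Director": "Director", "Score": "Stars",
               "Year": "Year", "Language": "Lang"},
)

physical_call = imdb_movie.to_physical({"Language": "English"})
```

Keeping the mapping in data rather than in code lets several implementations of the same access pattern differ only in their `param_map` and properties.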
  21. External and Selector Attributes
      • External attributes, for supporting access and ranking
        SM Movie(Title, Director, Year, Language, …)
        AP Movie 1: TitleO | DirectorO | YearO | … | ScoreRO | GenreI
        AP Movie n: TitleO | DirectorO | YearO | … | TitleI
        [diagram label: External attributes]
      • Selector attributes, for supporting choices among service implementations
        SM Movie(Title, Director, Year, Language, …)
        Selector: Language
        SI Movie Implementation 1 ... SI Movie Implementation n
  22. Connection Patterns
      Connections between marts only exist in terms of attributes that share the same domains, on different levels of abstraction:
      • Conceptually, by an undirected edge with a name: PlayingMovie(Movie,Theatre), between Movie and Theatre
      • Logically, by an edge (possibly directed) with name and join condition: PlayingMovie(Movie,Theatre): (Title=Movie.Title), between Movie4 and Theatre2
  23. Connection Patterns - Logical Level
      Directed edge: information is “piped” from one access pattern to another, along connection attributes which are in output in the first service and in input in the second service -> PIPE JOIN
      Movie1: Title | Director | Score | Year | Language | A.Name | A.Sex | G.Genre
      Theatre1: Name | Address | M.Start | M.Title
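The pipe join can be sketched as a nested iteration in which each result of the first service supplies the input of the second. The two service functions and their data are hypothetical stand-ins for Movie1 and Theatre1; real services would be remote calls returning results chunk by chunk.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

Item = Dict[str, object]


def pipe_join(first: Callable[..., Iterable[Item]],
              second: Callable[..., Iterable[Item]],
              connect: Callable[[Item], Dict[str, object]],
              **first_inputs) -> Iterator[Tuple[Item, Item]]:
    """Call `first`, then feed each output item into `second` via `connect`."""
    for left in first(**first_inputs):
        for right in second(**connect(left)):
            yield left, right


def movie1(language, genre):
    # Stand-in for the Movie1 access pattern (inputs: Language, Genre).
    catalog = [
        {"Title": "Movie A", "Score": 9.0, "Language": "English", "Genre": "action"},
        {"Title": "Movie B", "Score": 7.5, "Language": "English", "Genre": "action"},
        {"Title": "Movie C", "Score": 8.0, "Language": "Italian", "Genre": "drama"},
    ]
    return [m for m in catalog if m["Language"] == language and m["Genre"] == genre]


def theatre1(title):
    # Stand-in for the Theatre1 access pattern (input: M.Title).
    shows = [
        {"Name": "Theatre X", "M.Title": "Movie A"},
        {"Name": "Theatre Y", "M.Title": "Movie B"},
        {"Name": "Theatre Z", "M.Title": "Movie A"},
    ]
    return [t for t in shows if t["M.Title"] == title]


pairs = [(m["Title"], t["Name"])
         for m, t in pipe_join(movie1, theatre1,
                               lambda m: {"title": m["Title"]},
                               language="English", genre="action")]
```

Because the second service is only called with titles produced by the first, the order of the composed results follows the ranking of the first service.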
  24. Connection Patterns - Logical Level
      Undirected edge: results are produced by both access patterns in output and then joined -> PARALLEL JOIN
      Movie1: Title | Director | Score | … | … | G.Genre
      Theatre1: Name | Address | M.Start | M.Title
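A parallel join, by contrast, invokes both services independently and matches their outputs afterwards. The sketch below uses a thread pool to issue the two calls concurrently; the function name and the sample data are assumptions, and a real engine would fetch and match results chunk by chunk rather than all at once.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_join(call_left, call_right, match):
    """Invoke both services concurrently, then keep the pairs that satisfy `match`."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        left_future = pool.submit(call_left)
        right_future = pool.submit(call_right)
        left_items = left_future.result()
        right_items = right_future.result()
    return [(l, r) for l in left_items for r in right_items if match(l, r)]


# Hypothetical outputs of the two access patterns.
movies = [{"Title": "Movie A"}, {"Title": "Movie B"}]
theatres = [
    {"Name": "Theatre X", "M.Title": "Movie B"},
    {"Name": "Theatre Y", "M.Title": "Movie C"},
]

joined = parallel_join(lambda: movies,
                       lambda: theatres,
                       lambda m, t: m["Title"] == t["M.Title"])
```

Here neither service depends on the other's output, so the join condition is evaluated only after both have answered.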
  25. Join of two Services, Pipe Version, NY City
      [Diagram, search only in NY.]
      • Service Marts: Movie, Theatre
      • Access patterns: Movie1, Movie2, Theatre1
      • Service Interfaces: IMDB1, IMDB2, Hyperrev1, Google1, NYLocalSearch
  26. JOIN OF TWO SEARCH SERVICES
  27. JOIN of Web Services
      • Input: items resulting from TWO web service calls, possibly ranked
      • Output: composed items resulting from the concatenation of matching items, presented in a “global ranking order”
      • Matching condition using:
        - value equality
        - partial set matching
        - term matching within a vocabulary
        - ...
      • Services are known, their matching function is predefined: this is not service discovery!
  28. Join
      [Figure: Service X returns items bx1 ... bx5, Service Y returns items by1 ... by5; r1, r2, r3 are join results.]
  29. Matching items
  30. Choice of the join strategies
      • The join search space
        - Different explorations for different join methods, under different assumptions and with different guarantees
      • Any exploration trajectory for this space is a join strategy
      [Figure labels: Chunksize, Chunk, tij, Candidate join result.]
  31. Nested Loop - Rectangular
  32. Merge scan - Triangular
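The figures for these two slides are not in the transcript, so the following is only one plausible reading of their titles: the join search space is a grid of tiles, where tile (i, j) pairs chunk i of one service with chunk j of the other, and a strategy is an order of visiting tiles. The function names and the exact visiting orders are assumptions.

```python
def nested_loop_tiles(chunks_x, chunks_y):
    """Rectangular exploration: for each chunk of X, scan every chunk of Y."""
    return [(i, j) for i in range(chunks_x) for j in range(chunks_y)]


def merge_scan_tiles(diagonals):
    """Triangular exploration: visit tiles by increasing i + j, one diagonal at a time."""
    return [(i, d - i) for d in range(diagonals) for i in range(d + 1)]


rectangular = nested_loop_tiles(2, 3)
triangular = merge_scan_tiles(3)
```

Under this reading, the triangular order reaches tiles that combine early (highly ranked) chunks of both services before it touches late chunks of either one.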
  33. Parallel and Pipe Joins
      • Parallel join of two search services S1 and S2 [diagram labels: (1,2)n; C1; period: 150 ms; stop: 10; excess: (1,1)]
      • Pipe join of two search services S1 and S2 [diagram labels: (1,10)5(0,1)n; size: 20; period: 500 ms; stop: 1]
  34. SUPPORT OF “SIMILARITY JOINS”
  35. Supporting value similarity
      • The concept of “nearness” is implemented differently depending on the context, such as:
        - Lexical near (similar strings)
        - Spatial near (between addresses/geo locations)
        - Temporal near (between dates/times)
        - Economic near (between costs)
      • Context is defined according to the attributes involved
      => Semantics of nearness built bottom-up, starting from the physical layer (available services) up to the conceptual one.
  36. Similarity comes from Shared Domains
      The attribute “address” is shared by the 4 entities. Its semantic type, describing a location, enables “nearness” connections between each pair of entities (i.e. addresses can be compared for “nearness” within the same city, country, …).
      [Diagram: restaurant, apartment, hotel, and theatre each have an Address attribute; the addresses are connected by Spatial Near.]
  37. Supporting Nearness within Services
      Several physical services natively support ranking by distance (e.g. GoogleMovies).
      • E.g.: GoogleMovies receives the user address as input, and returns theatres ranked by distance, each one with its address as output. UserAddress and Distance are external attributes.
      GoogleMovies(UserAddressI, DistanceR | NameO, AddressO, Movie.TitleI, Movie.StartTimeO)
      GoogleMovies: AP: Theatre1; URI: http://...; TTL=6000, chunksize=10, cacheable=true, provides=Spatial Near
      Logical to physical mapping: UserAddress → IAddr, Name → Name, Address → OAddr, M.Title → MovieTit, M.StartTime → MovieTime, ...
  38. “Nearness” Support within Services
      [Diagram: the Theatre and Restaurant service marts are connected by Spatial Near.]
      Restaurant2: Address | Name | Cuisine | Price
      Theatre1: UserAddress | Name | Address | M.Title | M.StartTime | Distance
      GoogleMovies: AP: Theatre1; URI: http://...; TTL=600, chunksize=10, cache=1, provides=Spatial Near
      Logical to physical mapping: UserAddr → Addr, Name → Name, Address → Addr, M.Title → MovieTit, ...
  39. Nearness Services within the Execution Engine
      Ad-hoc services providing the notion of distance at the physical level require two domain values as input and produce their distance as output.
      • Two input attributes specify two values of the domain
      • One output attribute specifies the distance in given units
      SpatialNear System: URI: http://...; TTL=600, chunksize=1, cacheable=1, ...
      Input1, Input2: Coordinates
      Output: Distance (Km)
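A nearness service of this shape (two coordinates in, one distance in km out) can be sketched with the haversine great-circle formula. The slide does not say how SpatialNear computes distance, so the formula, the function name, and the Earth radius constant are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def spatial_near(input1, input2):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, input1)
    lat2, lon2 = map(math.radians, input2)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of latitude along a meridian, used as a sanity check.
one_degree_km = spatial_near((0.0, 0.0), (1.0, 0.0))
```

An engine could apply such a function to every candidate pair of addresses (after geocoding) and keep the pairs whose distance falls under a threshold.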
  40. Supporting Nearness within the Execution Engine
      [Diagram: the Theatre and Restaurant service marts are connected by Spatial Near.]
      Restaurant2: Address | Name | Cuisine | Price
      Theatre1: Address | Name | M.Title | M.StartTime
      Spatial Near: Addr1 | Addr2 | Distance
      SpatialNear System: URI: http://...; TTL=600, chunksize=1, cacheable=1, ...
      Input1, Input2: Coordinates
      Output: Distance (Km)
  41. Join of three Services at the three Levels in NY
      [Diagram, search only in NY.]
      • Service Marts: Movie, Theatre, Restaurant (Theatre and Restaurant connected by Spatial Near)
      • Access patterns: Movie1, Movie2, Theatre1, Rest1, Rest2, plus an AP providing spatial near
      • Service Interfaces: IMDB1, IMDB2, Hyperrev1, Google1, NYLocalSearch, Yahoo1, Yahoo2
  42. Three Levels with Connection Semantics
      Level      | Services                                                               | Connections
      Conceptual | Service Mart                                                           | Name (with associated semantics)
                 | Bindings between SM and AP attributes, plus definition of extra attributes
      Logical    | Access Pattern                                                         | Join attributes, directed vs undirected edge (with nearness service APs added as needed)
                 | Bindings between AP attributes and SI parameters
      Physical   | Service Interface (with associated semantics and with system services) | Nearness Services
  43. Resource graph
      Specialized way for describing search service based knowledge available on the web [ER model, ontology, class diagram?]
      [Graph nodes: News, Restaurant, Exhibition, Piece, Concert, Artist, Photo, Hotel, Movie, Metro Station, Theatre, Landmark, ShoppingCenter, ...]
  44. APPLICATION DEVELOPMENT PROCESS
  45. SeCo development process
      • Main roles:
        - Service developer
        - Service publisher
        - Expert user
        - SeCo expert
      • Dichotomy:
        - Top-down vs. bottom-up
        - Run time vs. design time
      [Process diagram.]
      • Search Service Development (Service developer): implement search service
      • Service Adaptation and Registration (Service publisher): wrap or materialize service; register service mart and interface (Service Mart model)
      • Application Configuration (Expert user): design Liquid Query Template (Liquid Query model); manual optimization needed? (Y/N)
      • Query Plan Refinement (SeCo expert): Panta Rhei plan refinement (Query Plan model)
  46. The service registration process
      [Flowchart.]
      • Steps: Service Description → SM Identification → Some SM retrieved? (YES / NO) → SM CREATION, or Modification of the SM structure? (YES: SM UPDATE and Associated SI Update (new connections); NO) → SM MAPPING → AP CREATION → Service Physical Description → END
      • Strategies marked on the flowchart: Bottom-up Strategy, Hybrid Strategy, Top-down Strategy
  47. The SM Creation process, with semantic hints
      SM CREATION steps:
      • SM name and attributes schema definition (type conventions):
        Movie(Title, Director, Score, Year, Genres(Genre), Openings(Country, Date), Actors(Name))
      • SM and attributes semantical description (WN synsets, and tags?):
        - Movie: S: (n) movie, film, picture, moving picture, moving-picture show, motion picture, motion-picture show, picture show, pic, flick (a form of entertainment that enacts a story by sound and a sequence of images giving the illusion of continuous movement) "they went to a movie every Saturday night"
        - Director: S: (n) film director, director (the person who directs the making of a film)
      • Automatic recommendation of connectable SMs (SM1 ... SMn), e.g. Theatres
      • Connection patterns (CP) definition:
        - Defined CP: Shows(Movie, Theatre): [(Title=Title)]
        - Possible CP: Title (String) → Textual_near; Language → Textual near; Year (Date) → Temporal_near; …
      • Composition operators association: Textual_near, Spatial_near, Temporal_near
  48. The SM Mapping procedure
      SM MAPPING
      • Original SM: Movie(Title, Director, Score, Year, …)
      • Attribute description: Director: String; Director: S: (the person who directs the making of a film)
      • Mapping function f: Director (String)
      • SI ImdbMovie: Title | Director | Score | Year | …
      • Attribute kinds in the SI: Selector attributes, Auxiliary attributes, Corresponding SM attributes (i.e. query attributes)
  49. SeCo Tools
      • Online tool suite that covers the whole development process
      • Mashup-based
      • Built using state-of-the-art technologies:
        1. MVC on the client: Javascript MVC
        2. UI organization and panels: Yahoo! User Interfaces
        3. Diagram drawing and editing: WireIt
  50. Service Mart Registration
  51. Mapping editor
  52. Query Registration Interface
  53. Query Registration Editor, Logical Connections
  54. LIQUID QUERY INTERFACE
  55. Liquid Query
      “A new paradigm allowing users to formulate and get responses to multi-domain queries through an exploratory information seeking approach, based upon structured information sources exposed as software services…”
      • Composite answers obtained by aggregating search results from various domains
      • Highlight the contribution of each search service
      • Join of results based on the structural information afforded by the search service interfaces
      • Refine the user query
      • Re-shape the result list
  56. Liquid query definition
      It consists of subsetting and parametrizing the resource graph...
      [Graph nodes: News, Restaurant, Exhibition, Piece, Concert, Artist, Photo, Hotel, Movie, Metro Station, Theatre, Landmark, ShoppingCenter, ... Legend: inputs, outputs, GR = global ranking.]
  57. Liquid query definition
      ... and then characterizing the user interaction
      [Graph subset: News, Restaurant, Exhibition, Concert, Artist, Photo, Hotel, Metro Station; interaction label: Expand.]
      Plus:
      • Parametrization of global ranking
      • Data visualization options
      • ... and so on
  58. Query Submission
      [Screenshot: Concert query conditions, Hotels query conditions.]
  59. Query Execution & Result Presentation
  60. SECO ENGINE
  61. Overview
      • The tool is aimed at developers and lets them compose, plan and run a SeCo query
      • Four panels, one for each query processing phase: Query composition, Logical planning, Physical planning, Query execution
      Splashscreen!
  62. Query composition (1)
      Service interface browser
      • lists registered service interfaces
      • input and output parameters are listed
      Selected service’s statistics
      • collected service statistics are displayed
      • statistics may be edited for testing purposes
  63. Query composition (2)
      User-entered datalog-like query
      • joins implicitly encoded by datalog vars
      • $vars encode query inputs provided at runtime
      Query optimisation parameters
      • control the behaviour of the planner
      • trigger the planning process
  64. Logical planning
  65. Physical planning
  66. Query execution (1)
      Execution session management
      • a session corresponds to a single query execution, where multiple user commands may be issued
      • query input parameters are specified at session initialisation
      Execution status
      • displays the current session status
      • displays the status of the execution commands issued so far
      Execution commands forms
      • a more-all command requests more query results
      • a more-one command requests more results by extracting more data from a specific service invoked by the query
  67. Query execution (2)
      Query results
      • displays ranked results, as soon as computed
      Execution time line
      • displays activation of execution units (e.g. service calls)
      • useful to fine tune the engine and the join strategies
  68. Query execution (3)
      Service calls log
      • displays service calls at the chunk granularity
      • shows response times, statistics, cache behaviour
  69. DEMO http://demo.search-computing.eu
  70. SUMMARY OF SECO RESULTS
  71. Results after 18 months
      • Concepts
        - Service marts, rank join methods, Panta Rhei, liquid query
      • Research results
        - Springer LNCS: Search Computing Challenges and Directions
        - Many publications (with VLDB, WWW), many ongoing submissions
        - Filing of US Patent (top-k method, random & sequential services)
      • Prototypes
        - Execution environment, focus on liquid query and on integration
        - Design support environment, focus on mashups
      • Dissemination
        - Fifteen keynote talks, twelve articles in the Italian press
        - SeCo Web site, SeCo blog, Facebook, LinkedIn, Twitter communities
        - Search Computing Graduate Course at PoliMi
      • Temporary research positions (1 PhD, 5 post-MS, 3 post-doc)
  72. Publications
      SeCo
      - D. Braga, A. Campi, S. Ceri, A. Raffio. Joining the results of heterogeneous search engines. Information Systems, Vol. 33, Issues 7-8 (November-December 2008), pages 658-680.
      - D. Braga, S. Ceri, F. Daniel, D. Martinenghi. Optimization of Multi-Domain Queries on the Web. VLDB 2008: 562-573, Auckland, New Zealand, August 2008.
      - D. Braga, S. Ceri, F. Daniel, D. Martinenghi. Mashing Up Search Services. IEEE Internet Computing 12(5): 16-23 (2008).
      - D. Braga, D. Calvanese, A. Campi, S. Ceri, F. Daniel, D. Martinenghi, P. Merialdo, R. Torlone. NGS: a framework for multi-domain query answering. ICDE Workshops 2008: 254-261.
      - S. Ceri. Search Computing. Invited paper, 25th International Conference on Data Engineering, Shanghai, March 29 - April 2, 2009.
      - D. Barbieri, A. Bozzon, D. Braga, M. Brambilla, A. Campi, S. Ceri, E. Della Valle, P. Fraternali, M. Grossniklaus, D. Martinenghi, S. Ronchi, M. Tagliasacchi. Data-driven optimization of search service composition for answering multi-domain queries. (USETIM 2009) workshop at VLDB 2009, Lyon, France, August 24-28, 2009.
      - M. Brambilla, S. Ceri. Engineering Search Computing Applications: Vision and Challenges. The 7th joint meeting of the European Software Engineering Conference (ESEC) and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), Amsterdam, The Netherlands, August 24-28 2009.
      - S. Ceri. Search Computing. The 2009 IEEE/WIC/ACM International Conference on Web Intelligence, Milan, Italy, September 15-18 2009.
      - S. Ceppi and N. Gatti. An Automated Mechanism Design Approach for Sponsored Search Auctions with Federated Search Engines. In Proceedings of the 12th Workshop on Agent-Mediated Electronic Commerce (AMEC) in the 9th International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Toronto, Canada, May 10 2010.
      - D. Martinenghi, M. Tagliasacchi, and S. Ceri. Top-k pipe-join. International Workshop on Ranking in Databases, Long Beach, USA, March 2010.
      - A. Bozzon, M. Brambilla, S. Ceri, P. Fraternali. Liquid Query: Multi-Domain Exploratory Search on the Web. WWW 2010 - 19th International World Wide Web Conference, Raleigh, North Carolina, April 26-30 2010.
      - A. Campi, S. Ceri, A. Maesani, S. Ronchi. Designing Service Marts for Engineering Search Computing Applications. The Tenth International Conference on Web Engineering, ICWE 2010, Vienna, Austria, July 5-9 2010.
      Related
      - M. Brambilla, S. Ceri, I. Celino, D. Cerizza, E. Della Valle, F. M. Facca, A. Turati, C. Tziviskou. Experiences in the Design of Semantic Services Using Web Engineering Methods and Tools. Journal on Data Semantics 2008.
      - A. Raffio, D. Braga, S. Ceri, P. Papotti, M. Hernandez. Clip: a Visual Language for Explicit Schema Mappings. International Conference on Data Engineering (ICDE), April 2008.
      - D. Braga, D. Calvanese, A. Campi, S. Ceri, F. Daniel, D. Martinenghi, P. Merialdo, R. Torlone. A New Generation Search Engine Supporting Cross Domain Queries. Italian Symposium on Advanced Database Systems (SEBD), June 2008.
      - D. Braga, D. Calvanese, A. Campi, S. Ceri, F. Daniel, D. Martinenghi, P. Merialdo, R. Torlone. NGS: a Framework for Multi-Domain Query Answering. IIMAS, International Conference on Data Engineering Workshops (ICDE), April 2008.
      - A. Raffio, D. Braga, S. Ceri, P. Papotti, M. Hernandez. Clip: a Tool for Mapping Hierarchical Schemas. ACM SIGMOD/PODS Conference, Demo Session, June 2008.
      - A. Bozzon, M. Brambilla, P. Fraternali. Conceptual Modeling of Multimedia Search Applications Using Rich Process Models. ICWE 2009, Springer LNCS, vol. 5648, ISBN 978-3-642-02817-5.
      - E. Della Valle, S. Ceri, D. F. Barbieri, D. Braga, A. Campi. A First Step Towards Stream Reasoning. Future Internet Symposium (FIS) 2008, pp. 72-81.
      - A. Bozzon, M. Brambilla, F. M. Facca, G. Toffetti Carughi. A Conceptual Modeling Approach to Business Service Mashup Development. IEEE International Conference on Web Services, ICWS 2009, Los Angeles. IEEE Press, July 2009, pp. 751-758.
      - P. Fraternali, M. Brambilla, A. Bozzon. Model-Driven Design of Audiovisual Indexing Processes for Search-Based Applications. Content-Based Multimedia Indexing, 2009, CBMI '09, IEEE Press, ISBN: 978-1-4244-4265-2, pp. 120-125.
      - D. F. Barbieri, D. Braga, S. Ceri, E. Della Valle and M. Grossniklaus. C-SPARQL: SPARQL for Continuous Querying. Proceedings of WWW 2009, 18th International World Wide Web Conference (Poster), Madrid, Spain, April 2009.
      - D. F. Barbieri, D. Braga, S. Ceri, E. Della Valle and M. Grossniklaus. Continuous Queries and Real-time Analysis of Social Semantic Data with C-SPARQL. In Proceedings of SDoW 2009, 2nd ISWC Workshop on Social Data on the Web, Washington, DC, USA, October 2009.
      - D. F. Barbieri, D. Braga, S. Ceri, E. Della Valle and M. Grossniklaus. C-SPARQL: A Continuous Query Language for RDF Data Streams. International Journal of Semantic Computing (IJSC), 2010, World Scientific Publishing.
      - D. F. Barbieri, D. Braga, S. Ceri and M. Grossniklaus. An Execution Environment for C-SPARQL Queries. In Proceedings of EDBT 2010, 13th International Conference on Extending Database Technology, Lausanne, Switzerland, March 2010.
  73. Web Site & Blog
      • Web Site
      • TechWatch Blog
      Blog stats: ~900 absolute unique visitors in the last two months
  74. Accesses to Web Site & Blog
      Visits: 20% USA, 18% Italy, 6% UK, 4% India, 4% Canada
      • Provenance
      • Sources
  75. Search Computing First Workshop, June 17-19, 2009
  76. Search Computing Challenges and Directions (LNCS, vol. 5950, Ceri-Brambilla eds.)
      • Part 1: Vision
        - Ceri: Search computing
        - Baeza-Yates: Next generation search
        - Weikum: Search for knowledge
      • Part 2: Technology Watch
        - Della Valle-Buganza-Gatti: The search engine industry
        - Casati-Daniel-Soi: Mashup technologies
        - Baumgartner-Campi-Gottlob-Herzog: Web data extraction
        - Hedeler-Belhajjame-Campi-Embury-Fernandez-Paton: Dataspaces
        - Bozzon-Fraternali: Multimedia and multimodal information retrieval
      • Part 3: Issues in Search Computing
        - Campi-Ceri-Gottlob-Ronchi: Service marts
        - Braga-Campi-Grossniklaus: Join methods and query optimization
        - Ilyas-Martinenghi-Tagliasacchi: Rank aggregation
        - Braga-Grossniklaus-Ceri: Panta Rhei, a query execution environment
        - Brambilla-Ceri-Fraternali-Manolescu: Liquid queries and liquid results
        - Brambilla-Ceri: Software engineering of search computing applications
        - Masseroli-Paton-Spasic: Search computing and the life sciences
  77. Second Workshop: Design Principles
      • Consolidate several ongoing research chapters touching the various aspects of the project
      • Develop connections to other research projects so as to share knowledge, and possibly build cooperations based on mutual complementarity
      • Set internal deadlines for project evolution
        - Being ready for the workshop
        - Dump organisational responsibility to session chairs
      • Try a more discussion-oriented format
        - Our view
        - Guests’ views
        - Panel/discussion (sometimes driven, sometimes not)
      • Produce proceedings as Springer LNCS, each session contributing a short part
  78. Second SeCo Workshop, Last Week
  79. Second Workshop: Sessions
      • Pre-Workshop (Milano, May 25)
        - Search as a Process
        - Business Models
      • Workshop (Como, May 26-28)
        - Semantic Resource Framework
        - Wrapping Technology and Ontological Annotation
        - Design Tools and Mashup Languages
        - Search Computing and Research Evaluation
        - Query Processing
        - Rank Join
        - Search Computing for BioMedical Applications
        - User-Centered Approach to Search Computing Applications
      • Post-Workshop (Milano, May 31)
        - Visual Interfaces for Complex Search
  80. Looking forward
      • Establish stronger co-operation with other projects
        - Both for technology and applications
      • Strengthen SeCo “core research”
        - Cover the process lifecycle with methods & tools
        - Improve result visualization and user interaction
        - Use semantics in service registration and query processing
        - Turn Panta Rhei into a full Service Base Management System (SBMS) with new rank join methods, proximity, uncertainty…
      • Strengthen the prototypes
        - Fully develop the registration environment
        - Extend the execution environment, make it scalable over clouds
        - Extend the liquid interface, cover mobile interfaces
      • Put a “killer” application online (usable!)
      • Explore exploitation options