BI, Business Intelligence, and Graphs

Transcript

  • 1. Neo Technology, Inc Confidential. BI, Business Intelligence, and Graphs. Philip Rathle, Sr Dir Products, philip@neotechnology.com, http://twitter.com/prathle
  • 2. Graphs and Thought
  • 3. The Brain. Structure: neurons connected by synapses. Processing: signals relayed between neurons through synapses.
  • 4. Human Thinking: Structured; Unstructured (Creative). Both forms involve processing connections.
  • 5. Graph Applications in Business
  • 6. Early Adopters of Graph Tech
  • 7. Evolution of Web Search (Survival of the Fittest). Pre-1999: WWW indexing; discrete data. 1999-2012: Google invents PageRank; connected data (simple). 2012-?: Google Knowledge Graph, Facebook Graph Search; connected data (rich).
  • 8. Evolution of Online Recruiting (Survival of the Fittest). 2010-11: resume searching & scoring; aggregated data. 2011-12: social job search; connected data.
  • 9. Consumer Web Giants Depend on Five Graphs. Gartner’s “5 Graphs”: Social Graph, Interest Graph, Payment Graph, Intent Graph, Mobile Graph. Ref: http://www.gartner.com/id=2081316
  • 10. Graph Buzz!
  • 11. Neo4j Adoption Snapshot*. Core industries & use cases.
    Industries: Web / ISV; Finance & Insurance; Communications; Logistics; Life Sciences; Media & Publishing; Education, Not-for-Profit; Government, Aerospace, Gaming, Other.
    Use cases: Network Management; MDM; Social; Geo; Authorization & Access Control; Content Management; Recommendations; Fraud Detection, Other.
    Accenture.
    * Select commercial customers (community users not included)
  • 12. The Graph Technology Ecosystem
  • 13. Key Graph Analytic Technologies.
    Data Storage & Processing: Graph Databases; Graph Compute Engines.
    Programming: Graph-Centric APIs & Languages; Graph Algorithms.
    Tools: Visualization Tools & Libraries; Other.
  • 14. Typical Graph BI Environment. Diagram labels: Application; Other Databases; ETL; Neo4j Cluster (data storage & business rules execution); Reporting; Graph Dashboards & Ad-hoc Analysis; Graph Visualization (end user ad-hoc visual navigation & discovery); Bulk Analytic Infrastructure, e.g. Graph Compute Engine (graph mining & aggregation); ETL; Data Scientist Ad-Hoc Analysis.
  • 15. What is a Graph Database? A graph database is an online (“real-time”) database management system with CRUD methods that expose a graph data model. Two important properties:
      • Native graph processing, including index-free adjacency [1] to facilitate traversals
      • Native graph storage engine, i.e. written from the ground up to be optimized for managing graph data
    [1] See Rodriguez, M.A., Neubauer, P., “The Graph Traversal Pattern,” 2010 (http://arxiv.org/abs/1004.1001)
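To make index-free adjacency concrete, here is a minimal Python sketch. It is not how Neo4j is implemented; `Node`, `TinyGraph`, and `reachable` are hypothetical names invented for illustration. The point it shows is that each node holds direct references to its neighbours, so a traversal follows references hop by hop instead of consulting a global index at every step.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A node that stores direct references to its neighbours."""
    name: str
    outgoing: list[Node] = field(default_factory=list)


class TinyGraph:
    """Hypothetical in-memory graph exposing CRUD-style methods."""

    def __init__(self) -> None:
        # The dict is only used to find a starting node by name.
        self.nodes: dict[str, Node] = {}

    def create_node(self, name: str) -> Node:
        node = Node(name)
        self.nodes[name] = node
        return node

    def read_node(self, name: str) -> Node:
        return self.nodes[name]

    def connect(self, src: Node, dst: Node) -> None:
        # Update: the edge lives on the source node itself.
        src.outgoing.append(dst)

    def delete_node(self, name: str) -> None:
        doomed = self.nodes.pop(name)
        for node in self.nodes.values():
            node.outgoing = [n for n in node.outgoing if n is not doomed]


def reachable(start: Node) -> set[str]:
    """Depth-first traversal that only follows neighbour references."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node.name in seen:
            continue
        seen.add(node.name)
        stack.extend(node.outgoing)
    return seen
```

In this sketch the name lookup happens once, to find the starting node; every later hop costs only a reference dereference, which is the property the slide attributes to native graph processing.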
  • 16. Overview of Popular Graph Data Models
      • Property Graph. Description: a “directed, labeled, attributed, multi-graph” [1] which exposes three building blocks: nodes, typed relationships, and key-value properties on both nodes and relationships. Vendors: Neo4j, OrientDB, InfiniteGraph, Dex
      • RDF Triples. Description: URI-centered subject-predicate-object triples as pioneered by the semantic web movement [2]. Vendors: AllegroGraph, Sesame
      • HyperGraph. Description: a generalized graph where a relationship can connect an arbitrary number of nodes (compared to the more common binary graph models) [3]. Vendors: HyperGraphDB, TrinityDB
    [1] Rodriguez, M.A., Neubauer, P., “Constructions from Dots and Lines,” 2010, http://arxiv.org/abs/1006.2361
    [2] W3C, “The Resource Description Framework (RDF),” 2004, http://www.w3.org/RDF/
    [3] Wikipedia, http://en.wikipedia.org/wiki/Hypergraph
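The three models can be contrasted by writing the same small fact in each. The Python sketch below is illustrative only: the class names, the `example.org` URIs, and the people and companies are all made up, and none of it reflects any vendor's storage format.

```python
from dataclasses import dataclass, field


@dataclass
class PGNode:
    """Property graph node: an identity plus key-value properties."""
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class PGRelationship:
    """Property graph relationship: typed, directed, with its own properties."""
    rel_type: str
    start: int
    end: int
    properties: dict = field(default_factory=dict)


# Property graph: "Alice works at Acme as an engineer" (hypothetical data).
alice = PGNode(1, {"name": "Alice"})
acme = PGNode(2, {"name": "Acme"})
works_at = PGRelationship("WORKS_AT", alice.node_id, acme.node_id,
                          {"role": "engineer"})

# RDF: the same fact as subject-predicate-object triples with made-up URIs.
# A plain triple has no slot for the relationship's "role" property.
EX = "http://example.org/"
triples = [
    (EX + "alice", EX + "name", "Alice"),
    (EX + "acme", EX + "name", "Acme"),
    (EX + "alice", EX + "worksAt", EX + "acme"),
]

# Hypergraph: one relationship connecting three nodes at once.
hyperedge = ("ATTENDED_MEETING", frozenset({1, 2, 3}))
```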
  • 17. Graph Compute Engine. Processing platforms that enable graph global computational algorithms to be run against large data sets. Diagram labels: System(s) of Record; Data extraction, transformation, and load; Graph Compute Engine; Graph Mining Engine (Working Storage); In-Memory Processing.
  • 18. Graph Global Queries. What is the max/min/avg. number of connections per node? (aka “Degree Distribution”)
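A graph global query touches every node rather than starting from one. As a minimal sketch of the degree question above, assuming an undirected edge list held in memory (the function name `degree_stats` and the sample data are hypothetical):

```python
from collections import Counter


def degree_stats(edges, nodes=()):
    """Return max/min/avg degree and the degree distribution.

    `edges` is an iterable of (a, b) pairs treated as undirected.
    `nodes` may list extra nodes that have no edges at all.
    Raises ValueError if the graph has no nodes.
    """
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    values = list(degree.values())
    return {
        "max": max(values),
        "min": min(values),
        "avg": sum(values) / len(values),
        # Maps a degree to the number of nodes having that degree.
        "distribution": dict(Counter(values)),
    }


sample_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
stats = degree_stats(sample_edges)
```

On real data sets of the size the slides describe, this kind of full scan is what a graph compute engine would run in bulk; the sketch only shows the shape of the computation.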
  • 19. What Can You Do with a Graph Database? Example: Facebook Graph Search
  • 20. For the Facebook Graph Question: What sushi restaurants in NYC do my friends like?
  • 21. What the Graph Looks Like: What sushi restaurants in NYC do my friends like?
  • 22. What the Cypher Query Looks Like: What sushi restaurants in NYC do my friends like?
    START me=node:person(name = "Philip"),
          location=node:location(location = "New York"),
          cuisine=node:cuisine(cuisine = "Sushi")
    MATCH (me)-[:IS_FRIEND_OF]->(friend)-[:LIKES]->(restaurant)-[:LOCATED_IN]->(location),
          (restaurant)-[:SERVES]->(cuisine)
    RETURN restaurant
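The pattern in the Cypher query above can be read as a nested traversal. The Python sketch below mimics it over a tiny in-memory edge table; the friends, restaurants, and the helper names `out` and `restaurants_friends_like` are invented for illustration and say nothing about how Neo4j executes the query.

```python
# Adjacency keyed by (node, relationship type); all data is made up.
EDGES = {
    ("Philip", "IS_FRIEND_OF"): ["Ana", "Ben"],
    ("Ana", "LIKES"): ["Sushi Zen", "Pasta Bar"],
    ("Ben", "LIKES"): ["Sushi Zen", "Tokyo Roll"],
    ("Sushi Zen", "LOCATED_IN"): ["New York"],
    ("Pasta Bar", "LOCATED_IN"): ["New York"],
    ("Tokyo Roll", "LOCATED_IN"): ["San Francisco"],
    ("Sushi Zen", "SERVES"): ["Sushi"],
    ("Pasta Bar", "SERVES"): ["Italian"],
    ("Tokyo Roll", "SERVES"): ["Sushi"],
}


def out(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return EDGES.get((node, rel_type), [])


def restaurants_friends_like(me, location, cuisine):
    """One result per matching path, like MATCH ... RETURN without DISTINCT."""
    results = []
    for friend in out(me, "IS_FRIEND_OF"):
        for restaurant in out(friend, "LIKES"):
            in_location = location in out(restaurant, "LOCATED_IN")
            serves_cuisine = cuisine in out(restaurant, "SERVES")
            if in_location and serves_cuisine:
                results.append(restaurant)
    return results


matches = restaurants_friends_like("Philip", "New York", "Sushi")
```

Because two friends like the same restaurant here, it appears once per matching path; wrapping the result in `set(...)` corresponds to asking for distinct restaurants.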
  • 23. What the Search Looks Like: What sushi restaurants in NYC do my friends like?
  • 24. What Other Graph Searches Look Like: What drugs will bind to protein X and not interact with drug Y?
  • 25. Graph Dashboards
  • 26. Social Network Analysis
  • 27. Fraud Detection & Money Laundering
  • 28. Service Assurance & Network Failure Analysis
  • 29. Industry Example: 5 Graphs of Communications
  • 30. Graphs in Communications. #1: The Network Graph. Uses: Cell Signal Analysis; “What if” Downtime Analysis (Service-to-Infrastructure Mapping); Network Inventory & Cost Accounting. Diagram labels: Service, Router, Switch, Fiber Link, and Ocean Floor Cable nodes connected by DEPENDS_ON and LINKED relationships.
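A “what if” downtime analysis over DEPENDS_ON relationships amounts to asking which elements transitively depend on a failed component. The following is a minimal sketch under simplifying assumptions: the topology is invented (the slide's diagram is not recoverable from the transcript), every dependency is treated as hard with no redundancy, and `impacted_by` is a hypothetical helper name.

```python
from collections import defaultdict, deque

# (dependent, dependency) pairs; an invented topology for illustration.
DEPENDS_ON = [
    ("Service", "Router A"),
    ("Router A", "Switch 1"),
    ("Switch 1", "Fiber Link 1"),
    ("Router B", "Switch 2"),
    ("Switch 2", "Fiber Link 2"),
    ("Fiber Link 1", "Ocean Floor Cable"),
    ("Fiber Link 2", "Ocean Floor Cable"),
]


def impacted_by(failed, depends_on):
    """Return everything that directly or indirectly depends on `failed`."""
    # Reverse the edges: dependency -> list of things depending on it.
    dependents = defaultdict(list)
    for dependent, dependency in depends_on:
        dependents[dependency].append(dependent)

    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


switch_outage = impacted_by("Switch 1", DEPENDS_ON)
cable_outage = impacted_by("Ocean Floor Cable", DEPENDS_ON)
```

A production model would also need to account for redundant paths, for example by marking a service as down only when all of its alternative dependencies are down.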
  • 31. Graphs in Communications. #2: The Social Graph. Mobile apps, collaboration, social recommendations, and more...
  • 32. Graphs in Communications. #3: The Call Graph. Plan & feature recommendations; assess churn risk.
  • 33. Graphs in Communications. #4: Master Data Graph. Organizational hierarchy management; resource authorization. Ref: http://www.slideshare.net/verheughe/how-nosql-paid-off-for-telenor
  • 34. Graphs in Communications. #5: The Help Desk Graph. Online recommendations for case avoidance.
  • 35. Emergent Graph in Other Industries (Actual Neo4j Graphs): Entitlements & Identity Management; Network Asset Management; Network Cell Analysis; Geo Routing (Public Transport); BioInformatics; Insurance Risk Analysis
  • 36. Emergent Graph in Other Industries (Actual Neo4j Graphs): Web Browsing; Portfolio Analytics; Mobile Social Application; Gene Sequencing
  • 37. Selected Case Studies
  • 38. Industry: Web/ISV, Communications. Use case: Network Management. Global (U.S., France).
    Background
      • World’s largest provider of IT infrastructure, software & services
      • HP’s Unified Correlation Analyzer (UCA) application is a key application inside HP’s OSS Assurance portfolio
      • Carrier-class resource & service management, problem determination, root cause & service impact analysis
      • Helps communications operators manage large, complex and fast changing networks
    Business problem
      • Use network topology information to identify the root causes of problems on the network
      • Simplify alarm handling by human operators
      • Automate handling of certain types of alarms
      • Help operators respond rapidly to network issues
      • Filter/group/eliminate redundant Network Management System alarms by event correlation
    Solution & Benefits
      • Accelerated product development time
      • Extremely fast querying of network topology
      • Graph representation a perfect domain fit
      • 24x7 carrier-grade reliability with Neo4j HA clustering
      • Met objective in under 6 months
  • 39. Industry: Logistics. Use case: Parcel Routing.
    Background
      • One of the world’s largest logistics carriers
      • Projected to outgrow capacity of old system
      • New parcel routing system
      • Single source of truth for entire network
      • B2C & B2B parcel tracking
      • Real-time routing: up to 5M parcels per day
    Business problem
      • 24x7 availability, year round
      • Peak loads of 2500+ parcels per second
      • Complex and diverse software stack
      • Need predictable performance & linear scalability
      • Daily changes to logistics network: route from any point, to any point
    Solution & Benefits
      • Neo4j provides the ideal domain fit: a logistics network is a graph
      • Extreme availability & performance with Neo4j clustering
      • Hugely simplified queries, vs. relational for complex routing
      • Flexible data model can reflect real-world data variance much better than relational
      • “Whiteboard friendly” model easy to understand
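The slide does not say which routing algorithm the carrier uses. As a generic illustration of why "route from any point, to any point" maps naturally onto a graph, here is a textbook Dijkstra shortest-path sketch in Python. The network, the costs, and the name `cheapest_route` are all hypothetical.

```python
import heapq


def cheapest_route(network, start, goal):
    """Dijkstra's algorithm over a dict-of-dicts weighted graph.

    Returns (total_cost, path) or None if `goal` is unreachable.
    Edge weights are assumed to be non-negative.
    """
    queue = [(0, start, [start])]
    settled = {}  # node -> lowest cost at which it was expanded
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled and settled[node] <= cost:
            continue
        settled[node] = cost
        for neighbour, weight in network.get(node, {}).items():
            heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None


# Invented logistics network: costs could stand for hours or distance.
NETWORK = {
    "Depot": {"Hub A": 4, "Hub B": 2},
    "Hub B": {"Hub A": 1, "Customer": 7},
    "Hub A": {"Customer": 3},
}

route = cheapest_route(NETWORK, "Depot", "Customer")
```

Because the network is just data, the daily changes the slide mentions would amount to adding or removing entries in the graph, with no change to the routing code.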
  • 40. Industry: Online Job Search. Use case: Social / Recommendations. Sausalito, CA.
    Background
      • Online jobs and career community, providing anonymized inside information to job seekers
    Business problem
      • Wanted to leverage known fact that most jobs are found through personal & professional connections
      • Needed to rely on an existing source of social network data. Facebook was the ideal choice.
      • End users needed to get instant gratification
      • Aiming to have the best job search service, in a very competitive market
    Solution & Benefits
      • First-to-market with a product that let users find jobs through their network of Facebook friends
      • Job recommendations served real-time from Neo4j
      • Individual Facebook graphs imported real-time into Neo4j
      • Glassdoor now stores > 50% of the entire Facebook social graph
      • Neo4j cluster has grown seamlessly, with new instances being brought online as graph size and load have increased
    Diagram labels: Person and Company nodes; KNOWS and WORKS_AT relationships
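The slide's diagram uses KNOWS and WORKS_AT relationships between Person and Company nodes. The transcript does not describe the actual recommendation logic, so the Python sketch below is only one plausible reading: walk KNOWS up to a fixed number of hops and collect the companies where those contacts work. All names and data are invented.

```python
# Hypothetical data mirroring the Person/Company diagram labels.
KNOWS = {"me": ["ann", "bob"], "ann": ["cat"], "bob": []}
WORKS_AT = {"ann": "Acme", "bob": "Globex", "cat": "Initech"}


def companies_via_network(person, knows, works_at, max_hops=2):
    """Map company -> list of (contact, hops) reachable within max_hops."""
    found = {}
    seen = {person}
    frontier = [person]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for current in frontier:
            for contact in knows.get(current, []):
                if contact in seen:
                    continue
                seen.add(contact)
                next_frontier.append(contact)
                company = works_at.get(contact)
                if company is not None:
                    found.setdefault(company, []).append((contact, hop))
        frontier = next_frontier
    return found


leads = companies_via_network("me", KNOWS, WORKS_AT)
```

Job openings at the returned companies could then be ranked by hop count, so that openings reachable through direct friends come first.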
  • 41. Cisco.com. Industry: Communications. Use case: Recommendations. San Jose, CA.
    Background
      • Cisco.com serves customer and business customers with Support Services
      • Needed real-time recommendations, to encourage use of online knowledge base
      • Cisco had been successfully using Neo4j for its internal master data management solution.
      • Identified a strong fit for online recommendations
    Business problem
      • Call center volumes needed to be lowered by improving the efficacy of online self service
      • Leverage large amounts of knowledge stored in service cases, solutions, articles, forums, etc.
      • Problem resolution times, as well as support costs, needed to be lowered
    Solution & Benefits
      • Cases, solutions, articles, etc. continuously scraped for cross-reference links, and represented in Neo4j
      • Real-time reading recommendations via Neo4j
      • Neo4j Enterprise with HA cluster
      • The result: customers obtain help faster, with decreased reliance on customer support
    Diagram labels: Support Case; Knowledge Base Article; Solution; Message
  • 42. Interactive Television Programming. Industry: Communications. Use case: Social gaming. Frankfurt, Germany.
    Background
      • Europe’s largest communications company
      • Provider of mobile & land telephone lines to consumers and businesses, as well as internet services, television, and other services
    Business problem
      • The Fanorakel application allows fans to have an interactive experience while watching sports
      • Fans can vote for referee decisions and interact with other fans watching the game
      • Highly connected dataset with real-time updates
      • Queries need to be served real-time on rapidly changing data
      • One technical challenge is to handle the very high spikes of activity during popular games
    Solution & Benefits
      • Interactive, social offering gives fans a way to experience the game more closely
      • Increased customer stickiness for Deutsche Telekom
      • A completely new channel for reaching customers with information, promotions, and ads
      • Clear competitive advantage
  • 43. Reasons for Choosing a Graph Database
      1. Order-of-magnitude improvements in query performance for complex, connected data
      2. Drastically accelerated application development cycles
      3. Maintainability and extensibility of the data model
      4. Maturity and reliability of the product
  • 44. Questions?
  • 45. Thank you! To learn more: Cédric Fauvet, your contact in France. E-mail: Cedric.Fauvet@neotechnology.com. Twitter: @Neo4jFr. French-speaking community: meetup.com/graphdb-france