The Value of Data
• “[finding] Japanese restaurants in New York City liked by people from Japan” – a challenging Facebook query.
• “The value of any piece of information is only known when you can connect it with something else that arrives at a future point in time,” – CIA’s CTO Ira "Gus" Hunt
Connecting Data
[Diagram: a Person and a Building joined by candidate relationships. Labels: ?, Work, Live, RR, Visit, Eat, Shop.]
Relationships are everywhere
• CRM, Sales & Marketing
• Network Mgmt, Telecom
• Intelligence (Government & Business)
• Finance
• Healthcare
• Research: Genomics
• Social Networks
• PLM (Product Lifecycle Mgmt)
[Diagram: these application areas arranged around IG.]
Graph Data
• Data structure for representing complicated relationships between entities.
• Wide applications in…
– Bioinformatics.
– Social networks.
– Network management… etc.
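To make "data structure for representing relationships" concrete, here is a minimal sketch of a labeled, directed graph kept as an adjacency list. The class name, method names, and sample data are illustrative assumptions, not part of any product API.

```python
from collections import defaultdict


class Graph:
    """Tiny directed graph: vertices plus labeled edges (illustrative only)."""

    def __init__(self):
        # vertex id -> list of (edge label, target vertex id)
        self._out = defaultdict(list)

    def add_edge(self, source, label, target):
        self._out[source].append((label, target))
        self._out[target]  # touching the key registers the target vertex

    def neighbors(self, vertex, label=None):
        """Return targets one hop away, optionally filtered by edge label."""
        return [t for (l, t) in self._out[vertex] if label is None or l == label]


# Hypothetical sample data in the spirit of the restaurant query.
g = Graph()
g.add_edge("alice", "LIVES_IN", "new_york")
g.add_edge("alice", "LIKES", "sushi_bar")
g.add_edge("sushi_bar", "LOCATED_IN", "new_york")
liked = g.neighbors("alice", "LIKES")
```

The point of the structure is that following a relationship is a direct lookup from a vertex, rather than a join across tables.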
The Graph Specialist!
• Everyone specializes
– Doctors, Lawyers, Bankers, Developers
• Why was data so normalized for so long?
• NoSQL is all about the data specialist
• Specializing in…
– Distribution / deployment
– Physical data storage
– Logical data model
– Query mechanism
What Problem?
• Ingest
• Update
• Navigate
Massive graph data requires efficient and intelligent tools to analyze and understand it.
Scaling Writes
• Big/Fast data demands write performance
• Most NoSQL solutions allow you to scale writes by…
– Partitioning the data
– Understanding your consistency requirements
– Allowing you to defer conflicts
[Section: Ingest]
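As a rough illustration of "partitioning the data", the sketch below routes each write to one of several partitions using a stable hash of the key. The names `partition_for` and `PartitionedStore` are hypothetical, and hash partitioning is only one possible scheme; the slides do not say which one any particular product uses.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class PartitionedStore:
    """In-memory stand-in for a set of independent write partitions."""

    def __init__(self, num_partitions: int):
        self.partitions = [dict() for _ in range(num_partitions)]

    def write(self, key: str, value) -> int:
        index = partition_for(key, len(self.partitions))
        self.partitions[index][key] = value  # only one partition is touched
        return index

    def read(self, key: str):
        index = partition_for(key, len(self.partitions))
        return self.partitions[index].get(key)


store = PartitionedStore(num_partitions=4)
for vertex_id in ("v1", "v2", "v3", "v4", "v5"):
    store.write(vertex_id, {"id": vertex_id})
```

Because each write touches a single partition, writers to different partitions do not contend with one another, which is what lets write throughput grow with the partition count.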
High Performance Edge Ingest
[Diagram: the IG Core/API writes incoming edges, such as E(1->2), E(3->1), E(2->3), E(2->1), and E(3->2), into Pipeline Containers; a Pipeline Agent moves them into the Target Containers (C1, C2, C3, E12, E23).]
Trade-offs
• Excellent for efficient use of page cache
• Able to maintain full database consistency
• Achieves highest ingest rate in distributed environments
• Almost always has highest “perceived” rate
• Trading off:
– Eventual consistency in graph (connections)
– Updates are still atomic, isolated and durable but phased
– External agent performs graph building
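The trade-off above can be sketched in a few lines: edge writes become cheap appends to a pipeline, and a separate agent later groups them and builds the navigable graph. This is a toy model under stated assumptions; `PipelinedGraph`, `add_edge`, and `run_agent` are hypothetical names, and the real pipeline is not described in enough detail in the slides to reproduce.

```python
from collections import defaultdict


class PipelinedGraph:
    def __init__(self):
        self.vertices = set()
        self.pipeline = []                 # pending edges, in arrival order
        self.adjacency = defaultdict(set)  # built graph: source -> targets

    def add_vertex(self, vertex):
        self.vertices.add(vertex)

    def add_edge(self, source, target):
        """Ingest path: a cheap append; the edge is not yet navigable."""
        self.pipeline.append((source, target))

    def run_agent(self):
        """Agent path: group pending edges by source so each adjacency set
        is updated once per batch, then clear the pipeline."""
        batches = defaultdict(list)
        for source, target in self.pipeline:
            batches[source].append(target)
        for source, targets in batches.items():
            self.adjacency[source].update(targets)
        applied = len(self.pipeline)
        self.pipeline.clear()
        return applied


g = PipelinedGraph()
for v in (1, 2, 3):
    g.add_vertex(v)
for edge in [(1, 2), (3, 1), (2, 3), (2, 1), (2, 3)]:
    g.add_edge(*edge)

before = set(g.adjacency[2])  # empty: edges are still in the pipeline
applied = g.run_agent()       # 5 pending edges applied
after = set(g.adjacency[2])   # {1, 3}
```

Between `add_edge` and `run_agent` the vertices exist but their connections are not yet visible, which is the "eventual consistency in graph (connections)" the slide refers to.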
Scaling Reads and Query
[Diagram: Application(s) call a Distributed API, which fans out to Partition 1, Partition 2, Partition 3, … Partition n, each with its own Processor.]
Partitioning and read replicas… easy, right?
Why are Graphs Different?
[Diagram: Application(s), a Distributed API, and Partition 1, Partition 2, Partition 3, … Partition n, each with its own Processor.]
[Section: Navigate]
Distributed Navigation
• Detect local hops and perform in-memory traversal
• Hand the partial path off to distributed processing to continue the navigation
• Intelligently cache frequently accessed remote data
• Route tasks to other hosts when it is optimal
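A minimal, single-process sketch of the first two bullets: hops that stay inside a partition are followed directly, and a hop that crosses partitions ships the partial path to the owning partition's task queue. The vertex layout (`OWNER`, `EDGES`) and the `navigate` function are assumptions made for illustration; caching and task routing are not modeled.

```python
from collections import deque

# Hypothetical layout: which partition owns each vertex, and the edges.
OWNER = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}
EDGES = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}


def navigate(start, goal):
    """Find a path from start to goal, counting cross-partition handoffs."""
    # One task queue per partition; each task is a partial path.
    queues = {p: deque() for p in sorted(set(OWNER.values()))}
    queues[OWNER[start]].append([start])
    visited = {start}
    handoffs = 0
    while any(queues.values()):
        for partition, queue in queues.items():
            while queue:
                path = queue.popleft()
                tail = path[-1]
                if tail == goal:
                    return path, handoffs
                for nxt in EDGES[tail]:
                    if nxt in visited:
                        continue
                    visited.add(nxt)
                    target = OWNER[nxt]
                    if target != partition:
                        handoffs += 1  # partial path shipped to another partition
                    queues[target].append(path + [nxt])
    return None, handoffs


path, handoffs = navigate("a", "e")
```

Each partition drains its own queue before control moves on, so runs of local hops are handled without any cross-partition traffic; only the boundary crossings are counted as handoffs.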
Schema – It’s not your enemy! (at least not all the time...)
• Schema vs Schema-less
– Database religion
– InfiniteGraph supports schema, but does not restrict connection types between vertices
– We take advantage of the schema when pruning.
– …Working on hybrid support.
GraphViews
Leveraging Schema in the Graph
[Diagram: schema types Patient, Prescription, Drug, Ingredient, Outcome, Complaint, Visit, Allergy, Physician.]
Schema Enables Views
• GraphViews are extremely powerful
• Allow Big Data to appear small!
• Connection inference can lead to exponential gains in query performance
• Views are reusable between queries
• Built into the native kernel
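One way to picture how a schema-based view makes a large graph "appear small": the traversal consults each vertex's type and never expands types the view excludes. The sketch below is not the InfiniteGraph GraphView API; `TYPE_OF`, `EDGES`, and `reachable` are hypothetical, and only the type names come from the slides.

```python
# Hypothetical typed graph using vertex types named in the deck.
TYPE_OF = {
    "pat1": "Patient", "visit1": "Visit", "doc1": "Physician",
    "rx1": "Prescription", "drug1": "Drug", "ing1": "Ingredient",
    "allergy1": "Allergy",
}
EDGES = {
    "pat1": ["visit1", "rx1", "allergy1"],
    "visit1": ["doc1"],
    "rx1": ["drug1"],
    "drug1": ["ing1"],
    "allergy1": ["ing1"],
    "doc1": [],
    "ing1": [],
}


def reachable(start, excluded_types=frozenset()):
    """Depth-first traversal; the 'view' is the set of vertex types to skip."""
    seen = {start}
    stack = [start]
    while stack:
        vertex = stack.pop()
        for nxt in EDGES[vertex]:
            if nxt in seen or TYPE_OF[nxt] in excluded_types:
                continue  # pruned by the view before it is ever expanded
            seen.add(nxt)
            stack.append(nxt)
    return seen


everything = reachable("pat1")
medication_view = reachable("pat1", excluded_types={"Visit", "Physician"})
```

The view is an ordinary value, so the same set of excluded types can be passed to any number of queries, which mirrors the "views are reusable between queries" bullet.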
Why InfiniteGraph™?
• Objectivity/DB is a proven foundation
– Building distributed databases since 1993
– A complete database management system: concurrency, transactions, cache, schema, query, indexing
• It’s a Graph Specialist!
– Simple but powerful API tailored for data navigation.
– Easy-to-configure distribution model
Advanced Configured Placement
• Physically co-locate “closely related” data
• Driven through a declarative placement model
• Dramatically speeds “local” reads
[Diagram: Patient Data Page(s) hold MrCitizen and two Visit objects; one set of Facility Data Page(s) holds SanJoseFacility with DrJones and DrSmith, another holds Sunnyvale with DrBlake and DrQuinn. Labels in the diagram: Has, With, At, Located, PrimaryPhysician.]
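A declarative placement model can be thought of as a table of rules that says which objects share a page. The sketch below assumes a very simple rule form (each type is stored with an "anchor" type, keyed by the anchor object's id); `PLACEMENT_RULES`, `page_for`, and `store` are hypothetical, as are the names Visit1 and Visit2.

```python
from collections import defaultdict

# Declarative rules (hypothetical): each type is stored on the page group of
# the named anchor type, keyed by the anchor object's id.
PLACEMENT_RULES = {
    "Patient": "Patient",
    "Visit": "Patient",        # visits live next to their patient
    "Facility": "Facility",
    "Physician": "Facility",   # physicians live next to their facility
}

pages = defaultdict(list)  # (page group, anchor id) -> objects on that page


def page_for(obj_type, anchor_id):
    """Resolve the page an object belongs on from the rules alone."""
    return (PLACEMENT_RULES[obj_type], anchor_id)


def store(name, obj_type, anchor_id):
    pages[page_for(obj_type, anchor_id)].append(name)


store("MrCitizen", "Patient", "MrCitizen")
store("Visit1", "Visit", "MrCitizen")
store("Visit2", "Visit", "MrCitizen")
store("SanJoseFacility", "Facility", "SanJoseFacility")
store("DrJones", "Physician", "SanJoseFacility")
store("DrSmith", "Physician", "SanJoseFacility")
store("Sunnyvale", "Facility", "Sunnyvale")
store("DrBlake", "Physician", "Sunnyvale")
```

Reading a patient together with that patient's visits then touches a single page group, which is the "local read" the slide says is sped up. Changing where data lands means editing the rule table, not the code that stores objects.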
Fully Distributed Data Model
[Diagram: an AddVertex() call goes through a stack of IG Core/API, the Distributed Object and Relationship Persistence Layer, and Customizable Placement, onto HostA, HostB, HostC, … HostX across Zone 1 and Zone 2.]