Using Graph Databases in Real-Time to
Solve Resource Authorization at Telenor

Graph Connect San Francisco – 4 Oct 2013
by...
Telenor Norway
Subsidiary of the Telenor Group
2 billions USD in mobile revenues 2012

Sebastian Verheughe
Lead Developer ...
Disclaimer
The presentation is not identical to the implementation
due to security reasons but shows how we have
modeled a...
A very

aspect of
our business
Telenor Norway Middleware Services
Channel

Channel

used by 42 channels
calls 35 sub-systems
10,000 code classes
500 requ...
Our Problem
20 minutes to calculate all accessible resources
1500 lines of SQL to implement the authorization logic
“solve...
Why a Graph Database?
Access

Parent Company

User

Which resources
does the user
have access to?

Part of Company

Sales
...
Tailored Read Model
The Model makes read queries
as simple and efficient as possible.
First find your questions
then model...
High Level Architecture
Clients

Classic MW
Services

other
sources

tx log
RDBMS

check
access

Resource
Authorization

M...
Conditional Rules
ACCESS is given with the following include parameters:
access to subsidiaries and access to content
Only...
Different Access Needs
Access Subsidiaries & Content

Super Admin

Access Content

Umbrella Admin
Access S&C

Admin
Graph Algorithm
Prerequisite: The user node
1. Follow all ACCESS relationships and
read the access parameters on the relat...
Solution Value
1. Performance optimized from minutes to seconds.
2. Simplicity of writing and understanding business rules...
Authorization Complexity
• Not a collection of isolated customer trees *
• Not all users of a customer have equal access
•...
How we Started with Neo4j
1. Searched the internet for articles about graph
database and different solutions.
2. Downloade...
Lessons Learned
• Choose a solution/technology that fits your problem
• New way of thinking – build competence in org.
• P...
Alternative In-Memory RDBMS
Option 1: Use existing database
- Performance issues due to shared data / suboptimal
structure...
Different Graph Structures
get all accessible subscriptions
2 000

1 000

1 700 ms

Company X: 147 000

750 ms

Company Y:...
Different Graph Structures
check access to single subscription
2 000

1 000

1 ms

Company X: 147 000

1 ms

Company Y: 52...
Production Performance
retrieve all accessible resources
RDBMS Disk

RDBMS (mem cached)

Graph In-Heap

Company X

12 min
...
Technical Details
Production Details
Graph Size
heap)

27 million nodes (pre-warmed in
~1x properties, ~2x relationships

Traffic Volume

~1...
Production Has Access Query

Time (ms)

Time (ms)
Production All Queries

Time (ms)
Garbage collection
Implementing the Algorithm
Lets look at the Neo4j Traversal Framework
Iterable<Node> getAccessibleResources(…) {

Evaluato...
Implementing the Algorithm
Evaluator is a simple filter, e.g. for Node
type
class MyEvaluator implements Evaluator {
publi...
Implementing the Algorithm
The custom Expander contains business
rules!
class ResAuthExpander implements PathExpander<Path...
Implementing the Algorithm
Generates the valid relationships to traverse.
public getExpander(boolean accToSub, boolean acc...
U-Turn Strategy
4.
Access

User

Does the user
have access to
subscription X?

3.

5.

Up to find path quickly
Down to che...
The Zigzag Problem
What if we also have reversed access to the subscription
payer?

User

Op

IT

E
d

Jo

Subscriptions

...
The Many-to-Many Problem
The nodes Op & IT may be connected through many
subscriptions

Does the user
have access to
depar...
Deployment View
• Two equal instances of Neo4j embedded in Tomcat
• Access through Java API due to need for custom logic
•...
Dual Model Cost
There are some drawbacks with dual models
also
• Not possible to simply join the ACL with resource
tables ...
Unexplored Areas
Combining Access Control List & Graph
• Best of both worlds (simple logic, fast lookup)
Algorithm
–
–
–
–...
Was is worth it?

Yes!
The user experience is important in
Telenor
Questions?
Web References
• Telenor Norway
• The Project - How NOSQL Paid off for
Telenor
• JavaWorld - Graphs for Security
Upcoming SlideShare
Loading in …5
×

Using Graph Databases in Real-time to Solve Resource Authorization at Telenor - Sebastian Verheughe @ GraphConnect SF 2013

1,981 views
1,771 views

Published on

Learn how Telenor uses Neo4j to protect data in business critical services running in production. Sebastian will discuss lessons learned both with technology and our experience after running it in production for half a year, backing many of our mission critical services.

The most recently updated version of the slides are available here: http://www.slideshare.net/verheughe/graph-connect-sf-4-oct-2013

Published in: Technology
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,981
On SlideShare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
34
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide
  • Why: access to secret numbers, access to modify/delete subscriptions, possibility to send/receive messages
  • We use Neo4j for our business critical services, both customer/product services , but also operational services.A channel is client type, e.g. the web solution for corporate customer, or helpdesk solution, or app, and may consist of many clients
  • The project business case was based on a future point in time where we could not any more onboard any large corporate customers
  • Drawing on the white board the required logic made us understand that a graph database might be a good solution
  • Take the hit on write, and make read easy! (for us, read performance is the problem – not write performance)Also, don’t blindly copy tables/foreign keys into nodes and relationships – drop what’s not needed and remember that relationships may have properties in graph
  • RDBMS is still mastering the data as it is used in many different use-cases where that is beneficial.
  • TIME LEFT: 30 minutes (10 used)
  • The last part is important to us. It was really hard to extract the resource authorization out of the relational database, but not we can much more easy replace the current implementation with another one in the future if neccesary.
  • Production logs does not contain user data, so just one big organization was sampled to get production data for a specific customer
  • TIME LEFT: 20 minutes (20 used)Graphperformance based on test environment, see charts in the technical section for production numbers not specific for a unique customer
  • We only have detailed logs for a short while back – so we cannot review all data since production.First production two years ago with limited traffic, full production since spring 2013
  • We always continue, since we also have our custom expander. This way, we have a clean separation of concern in our code.We also have more advanced filters peaking around the node before it decides to include or exclude the node
  • This is the most important part of the code, the one place where we now are able to write down the business logic in a simple and natural way.Note that we only have ONE class containing the business rules independently of which use-case we are running.
  • The relationships and directions that are allowed to traverse given the different switch parameters.
  • This is possible since we have a tree graph. Demonstrates the importance of understanding how a graph works, because than you may greatly improve performance by smart traversal strategies.
  • TIME LEFT: 10 minutes (30 used)
  • Extra knowledge, such as which subscription you are
  • Using Graph Databases in Real-time to Solve Resource Authorization at Telenor - Sebastian Verheughe @ GraphConnect SF 2013

    1. 1. Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor Graph Connect San Francisco – 4 Oct 2013 by Sebastian Verheughe
    2. 2. Telenor Norway Subsidiary of the Telenor Group 2 billions USD in mobile revenues 2012 Sebastian Verheughe Lead Developer for Neo4j solution Coding Architect
    3. 3. Disclaimer The presentation is not identical to the implementation due to security reasons but shows how we have modeled and solved the problem in general. However, all presented data (numbers & charts) are real, unfiltered and extracted from the production logs
    4. 4. A very aspect of our business
    5. 5. Telenor Norway Middleware Services Channel Channel used by 42 channels calls 35 sub-systems 10,000 code classes 500 requests/second 20,000 orders/day Backend Backend Channel Channel MOBILE MW Channel Channel Providing business logic and data for all channels in the mobile value chain BUSINESS LOGIC & DATA Backend Handles users with access to X00,000 resources Backend Backend Backend
    6. 6. Our Problem 20 minutes to calculate all accessible resources 1500 lines of SQL to implement the authorization logic “solved” by caching data going stale and the solution did not scale…
    7. 7. Why a Graph Database? Access Parent Company User Which resources does the user have access to? Part of Company Sales Finance Production HR Sub The questions we wanted answered required traversal of tree structures. Tablet Subscription Owner Sub Tablet Uses Subscription Phone
    8. 8. Tailored Read Model The Model makes read queries as simple and efficient as possible. First find your questions then model your graph graph model = relational model
    9. 9. High Level Architecture Clients Classic MW Services other sources tx log RDBMS check access Resource Authorization Message Queue Neo4j
    10. 10. Conditional Rules ACCESS is given with the following include parameters: access to subsidiaries and access to content Only find children of PARENT COMPANY given access to subsidiaries is allowed User Only look at PART OF COMPANY given access to content is allowed Only look at SUBSCRIPTION OWNER given access to content is allowed
    11. 11. Different Access Needs Access Subsidiaries & Content Super Admin Access Content Umbrella Admin Access S&C Admin
    12. 12. Graph Algorithm Prerequisite: The user node 1. Follow all ACCESS relationships and read the access parameters on the relationship 2. Follow all PARENT COMPANY relationships given access to subsidiaries is allowed 3. Follow all PART OF COMPANY relationships given access to content is allowed 4. Follow all SUBSCRIPTION OWNER relationships given access to content is allowed
    13. 13. Solution Value 1. Performance optimized from minutes to seconds. 2. Simplicity of writing and understanding business rules for the query traversal. 3. Scalability by performance allowing us to onboard more corporate customers (project business case) Autonomous Service with it’s own life-cycle and data repository.
    14. 14. Authorization Complexity • Not a collection of isolated customer trees * • Not all users of a customer have equal access • Not a fixed schema, form or size for all customers • Real-time updated with customer & product data The data form a highly connected living graph * Covered later in Technical Details
    15. 15. How we Started with Neo4j 1. Searched the internet for articles about graph database and different solutions. 2. Downloaded and quickly prototyped the solution we liked that matched our requirements (Neo4j). 3. Workshop with Neo4j and our project developers to quickly gain competence and ensure design QA. 4. Solution QA with Neo4j before production and help with performance issues / tuning.
    16. 16. Lessons Learned • Choose a solution/technology that fits your problem • New way of thinking – build competence in org. • Profile your java code to make it really fast • Don’t put everything into the graph (functional creep) • Need to know how traversal works (e.g. shortest path) • Benchmark the graph to evaluate your traversal speed
    17. 17. Alternative In-Memory RDBMS Option 1: Use existing database - Performance issues due to shared data / suboptimal structure - Complexity since SQL not designed for traversal Option 2: Separate database + Might reach same performance as graph db + Familiar technology - Complexity since SQL not designed for traversal Decided to go with our instinct Graph Database
    18. 18. Different Graph Structures get all accessible subscriptions 2 000 1 000 1 700 ms Company X: 147 000 750 ms Company Y: 52 000 1300 ms Company Z: 95 000 Data from test – repeated prod sampling gave ~2.4 sec for 215,000 subscriptions
    19. 19. Different Graph Structures check access to single subscription 2 000 1 000 1 ms Company X: 147 000 1 ms Company Y: 52 000 1 ms Company Z: 95 000
    20. 20. Production Performance retrieve all accessible resources RDBMS Disk RDBMS (mem cached) Graph In-Heap Company X 12 min 18 sec < 2 sec Company Y 22 min 58 sec < 2 sec Company Z 3 min 15 sec < 2 sec Check single resource access 1 ms No operational problems in production
    21. 21. Technical Details
    22. 22. Production Details Graph Size heap) 27 million nodes (pre-warmed in ~1x properties, ~2x relationships Traffic Volume ~1000 req/min during biz hours ~ 40K daily real-time updates Performance ms Avg: 1 ms, 99% < 4 ms, 99.9% < 9 JVM warmed) Sun 6, 20 GB Heap (~15 GB pre-
    23. 23. Production Has Access Query Time (ms) Time (ms)
    24. 24. Production All Queries Time (ms) Garbage collection
    25. 25. Implementing the Algorithm Lets look at the Neo4j Traversal Framework Iterable<Node> getAccessibleResources(…) { Evaluator myEvaluator = … Expander myExpander = … return Traversal.description() .evaluator(myEvaluator) .expander(myExpander) .traverse(startNode).nodes(); }
    26. 26. Implementing the Algorithm Evaluator is a simple filter, e.g. for Node type class MyEvaluator implements Evaluator { public Evaluation evaluate(Path path) { if <I am interested in this node> return Evaluation.INCLUDE_AND_CONTINUE; else return Evaluation.EXCLUDE_AND_CONTINUE; } }
    27. 27. Implementing the Algorithm The custom Expander contains business rules! class ResAuthExpander implements PathExpander<PathExpander> { … public … expand(Path path, BranchState<…> state) { if (path.lastRelationship rel == ACCESS) accToSub = rel.getProperty(ACCESS_TO_SUBSIDIARIES); accToCont = rel.getProperty(ACCESS_TO_CONTENT); state.set( getExpander(accToSub, accToCont) ); } return state.get().expand(…) } Single expander class to control business
    28. 28. Implementing the Algorithm Generates the valid relationships to traverse. public getExpander(boolean accToSub, boolean accToCont) { PathExpander exp = StandardExpander.DEFAULT.add(ACCESS,…); if (accToSub) exp.add(PARENT_COMPANY,…) if (accToCont) exp.add(PART_OF_COMPANY,…).add(SUBSCRIPTION_OWNER,…); return exp; } }
    29. 29. U-Turn Strategy 4. Access User Does the user have access to subscription X? 3. 5. Up to find path quickly Down to check access 6. 2. 7. 1. 8. X Subscription Reversing the traversal increases performance from n/2 to 2d where n and d are tree size and depth (we went from 1s to
    30. 30. The Zigzag Problem What if we also have reversed access to the subscription payer? User Op IT E d Jo Subscriptions Solvable by adding state to the traversal (or check path)
    31. 31. The Many-to-Many Problem The nodes Op & IT may be connected through many subscriptions Does the user have access to department Op? Op IT Access User Subscription Traversal becomes time consuming (e.g. M2M market) However, we only needed to implement the rule for direct access to sub.
    32. 32. Deployment View • Two equal instances of Neo4j embedded in Tomcat • Access through Java API due to need for custom logic • Using Neo4j 1.8 without HA (did not like ZooKeeper) Resource Authorization Neo4j tx log RDBMS Message Queue Resource Authorization Neo4j
    33. 33. Dual Model Cost There are some drawbacks with dual models also • Not possible to simply join the ACL with resource tables in the relational database - queries needed redesign • The complexity added by code and infrastructure necessary to manage an additional model. • Not ordinary competence (in Norway at least)
    34. 34. Unexplored Areas Combining Access Control List & Graph • Best of both worlds (simple logic, fast lookup) Algorithm – – – – Find all affected users when the graph is updated Invalidate users access control list Calculate all accessible resources for each user Store result in users access control list Could then skip the U-turn and many-to-many problem.
    35. 35. Was is worth it? Yes! The user experience is important in Telenor
    36. 36. Questions?
    37. 37. Web References • Telenor Norway • The Project - How NOSQL Paid off for Telenor • JavaWorld - Graphs for Security

    ×