Optimization Algorithms & Graph Structures: Tools for Better Decision Making
The first topic presents a mathematical approach to making optimal inventory decisions. The second topic focuses on how graph databases can be used when a decision process relies on data relationships.
Inventory management: a new look at a common problem
Inventory management is a problem that every retailer must tackle. It is usually solved using common statistical tools such as the Poisson and Normal distributions. However, a large part of most inventories is poorly handled by these tools because of the nature of the items: many see very little customer demand, perhaps one unit a week or even less, yet must be stocked nonetheless for a variety of reasons. This creates challenges, but also opportunities for more flexible algorithmic tools. We will give an overview of the different models involved and the rationale behind them, starting from probabilistic forecasting as an input to inventory control policy optimization with Markov Decision Processes. Even though this can get very technical, we promise to keep the presentation light, accessible and free from equations!
Graph databases: When data relationships really matter
Graph databases have gained popularity in recent years as a powerful technology for understanding relationships between data records. We have explored some popular graph databases on the market, such as Neo4j and JanusGraph running on top of Cassandra and HBase, to determine their usability in a production-ready cloud environment. In this talk, we will share our findings and lessons learned. We will also show a concrete example of using a graph database to address a specific business problem.
10. SLOW-MOVING ITEMS
Up to 75%* of retail products are slow moving
Slow movers are products with very low sales, maybe one unit per week or even less.
They are important!
*https://smartech.gatech.edu/bitstream/handle/1853/13469/Managing%20Slow%20Moving%20Perishables%20in%20the%20Grocery%20Industry.pdf
18. INVENTORY OPTIMIZATION
(s, S) Policy
When the inventory contains s or fewer items, an order is made to reach level S.
But we want the best (s, S) policy!
It must minimize costs and meet a service level.
[Figure: inventory level over Time, moving between the reorder point s and the order-up-to level S]
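The (s, S) rule can be sketched as a small simulation. This is a minimal illustration, not the speakers' implementation: the function name, the instantaneous replenishment, the lost-sales handling, and the reading of the demand figure as 0 / 1 / 2 units with probabilities 77% / 20% / 3% are all assumptions.

```python
import random


def simulate_s_S(s, S, demand_values, demand_probs, periods, seed=0):
    """Simulate an (s, S) policy for one item at one location.

    Assumptions (not from the slides): orders arrive instantly and
    unmet demand is lost rather than backordered.
    """
    rng = random.Random(seed)
    inventory = S
    orders = 0
    lost_sales = 0
    for _ in range(periods):
        # Review step: at s or fewer items, order up to level S.
        if inventory <= s:
            orders += 1
            inventory = S
        # Draw this period's demand from the discrete distribution.
        demand = rng.choices(demand_values, weights=demand_probs)[0]
        served = min(demand, inventory)
        lost_sales += demand - served
        inventory -= served
    return {"orders": orders, "lost_sales": lost_sales, "ending_inventory": inventory}


# One year of weekly periods for a hypothetical slow mover.
result = simulate_s_S(s=0, S=2, demand_values=[0, 1, 2],
                      demand_probs=[0.77, 0.20, 0.03], periods=52)
print(result)
```

Running the simulation for several candidate (s, S) pairs and comparing order counts and lost sales gives a rough feel for the cost versus service-level trade-off that the optimization has to resolve.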
21. TRADITIONAL INVENTORY OPTIMIZATION
Pipeline: DEMAND FORECAST, then INVENTORY OPTIMIZATION, then FULFILLMENT
Uncertainty profiles selection
• Demand distribution profile: Normal (mean, variance) or Poisson (mean)
• Stationary (long term) demand distribution
• Lead time distribution profile: Normal (mean, variance) or Poisson (mean)
Optimization output: Inventory Policy (s, S)
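As an illustration of the traditional route, here is a minimal sketch that assumes demand over the lead time follows a Poisson distribution and picks the smallest reorder point s whose no-stockout probability reaches a target service level. The function names, the mean of 0.26 units and the 95% target are hypothetical, and this textbook-style service-level definition is not necessarily the one used in the talk.

```python
import math


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson random variable with the given mean."""
    return sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k + 1))


def reorder_point(mean_lead_time_demand, service_level):
    """Smallest s such that P(lead-time demand <= s) >= service_level."""
    if not 0.0 < service_level < 1.0:
        raise ValueError("service_level must be strictly between 0 and 1")
    s = 0
    while poisson_cdf(s, mean_lead_time_demand) < service_level:
        s += 1
    return s


# Hypothetical slow mover: 0.26 units expected over the lead time.
print(reorder_point(0.26, 0.95))
```

The whole policy here hinges on a single fitted parameter, which is convenient for fast movers but leaves little room to describe the lumpy demand of slow movers.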
23. Markov Decision Process vs traditional inventory optimization
Pipeline: DEMAND FORECAST, then INVENTORY OPTIMIZATION, then FULFILLMENT
Traditional: uncertainty profiles selection, Normal (mean, variance) or Poisson (mean), followed by optimization (EOQ, ROP) into an (s, S) inventory policy
MARKOV DECISION PROCESS: a DISCRETE DEMAND DISTRIBUTION (Empirical Distribution OR Distributional Forecast) drives the decisions over PERIOD 1, PERIOD 2 and PERIOD 3
[Figure: decision tree over three periods with inventory levels -1 to 3, order actions +0 to +3, and demand branches 2 / 1 / 0 at 3% / 20% / 77% or 1 / 0 at 50% / 50%; histogram of probability against qty with ticks 0, 4, 5, 6, 7, 10, 25 and bar labels 35%, 30%, 5%, 5%, 10%, 10%, 5%]
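An empirical discrete demand distribution of the kind named on this slide can be built directly from observed demand counts. The sketch below is only illustrative: the function name and the ten-week sales history are hypothetical.

```python
from collections import Counter


def empirical_distribution(observed_demands):
    """Map each observed demand quantity to its relative frequency."""
    counts = Counter(observed_demands)
    total = len(observed_demands)
    return {qty: counts[qty] / total for qty in sorted(counts)}


# Hypothetical weekly demand history for one slow-moving item.
history = [0, 0, 0, 1, 0, 2, 0, 0, 1, 0]
dist = empirical_distribution(history)
print(dist)
```

Unlike a Normal or Poisson fit, the result keeps whatever shape the data has, including a large spike at zero.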
24. MARKOV DECISION PROCESS
• Named after Andrey Andreyevich Markov
• Based on
• the probability of transitioning from one state to another
• the possible actions
• the cost associated with each state and action
• What is the best action in each state?
Mapping to the inventory problem
• STATES: INVENTORY QUANTITIES
• ACTIONS: ORDER QUANTITIES
• TRANSITIONS: DEMAND DISTRIBUTION
• REWARDS: INVENTORY COSTS
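The four ingredients above can be put together in a short backward-induction sketch over a finite horizon. It reads the demand figure from the slides as 0 / 1 / 2 units with probabilities 77% / 20% / 3%; everything else (the cost values, the maximum inventory of 4, instantaneous delivery, lost sales instead of backorders, and all names) is an assumption made for illustration, not the speakers' model.

```python
# Transitions: discrete demand distribution (as read from the slides).
DEMAND = {0: 0.77, 1: 0.20, 2: 0.03}
# States: inventory quantities 0..MAX_INVENTORY (assumed cap).
MAX_INVENTORY = 4
# Actions: order quantities.
ORDER_QUANTITIES = [0, 1, 2, 3]
# Rewards, expressed as costs to minimize (hypothetical values).
FIXED_ORDER_COST = 10.0
HOLDING_COST = 1.0
STOCKOUT_COST = 50.0


def solve(periods=3):
    """Return (expected cost per starting state, per-period policy)."""
    states = range(MAX_INVENTORY + 1)
    value = {s: 0.0 for s in states}  # no cost after the last period
    policy = []
    for _ in range(periods):  # walk backwards from the last period
        new_value, decision = {}, {}
        for s in states:
            best_cost, best_q = None, None
            for q in ORDER_QUANTITIES:
                stock = s + q  # assumed: the order arrives immediately
                if stock > MAX_INVENTORY:
                    continue
                cost = FIXED_ORDER_COST if q > 0 else 0.0
                for d, p in DEMAND.items():
                    left = max(stock - d, 0)
                    short = max(d - stock, 0)
                    cost += p * (HOLDING_COST * left
                                 + STOCKOUT_COST * short
                                 + value[left])
                if best_cost is None or cost < best_cost:
                    best_cost, best_q = cost, q
            new_value[s], decision[s] = best_cost, best_q
        value = new_value
        policy.insert(0, decision)  # policy[0] is the first period
    return value, policy


value, policy = solve()
for period, decision in enumerate(policy, start=1):
    print(f"Period {period}: order quantity by inventory level -> {decision}")
```

The policy is a table of order quantities per inventory level and per period, so the best action may differ from one period to the next instead of following one fixed (s, S) pair.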
25. MARKOV DECISION PROCESS
INDIVIDUAL ITEM AT A SINGLE LOCATION OVER 3 PERIODS
[Figure: decision tree over Period 1, Period 2 and Period 3. STATES are INVENTORY QUANTITIES (-1 to 4), ACTIONS are ORDER QUANTITIES (+ 0 to + 3), TRANSITIONS follow the DEMAND DISTRIBUTION (2 / 1 / 0 at 3% / 20% / 77%) and REWARDS are INVENTORY COSTS (labels $60, $60, $60, $10)]
26. MARKOV DECISION PROCESS
INDIVIDUAL ITEM AT A SINGLE LOCATION OVER 3 PERIODS
[Figure: the same decision tree over Period 1, Period 2 and Period 3, with ORDER QUANTITIES + 0 to + 3 shown for two periods and INVENTORY COSTS labelled $60, $60, $60, $10 and $1050, $1050, $1050, $1000]
27. MARKOV DECISION PROCESS
INDIVIDUAL ITEM AT A SINGLE LOCATION OVER 3 PERIODS
[Figure: the same decision tree over Period 1, Period 2 and Period 3, with two DEMAND DISTRIBUTION branches (2 / 1 / 0 at 3% / 20% / 77%, and 1 / 0 at 50% / 50%), ORDER QUANTITIES + 0 to + 3 and INVENTORY COSTS labelled $60, $60, $60, $10 in each of two periods]
28. MARKOV DECISION PROCESS
INDIVIDUAL ITEM AT A SINGLE LOCATION OVER 3 PERIODS
RESULT: DYNAMIC POLICY
[Figure: the solved decision tree over Period 1, Period 2 and Period 3. Selected ORDER QUANTITIES with their INVENTORY COSTS, in the order they appear: + 1 ($60), + 0 ($10), + 0 ($20), + 3 ($60), + 2 ($50), + 0 ($20), + 3 ($60), + 0 ($30). Demand branches are 2 / 1 / 0 at 3% / 20% / 77% and 1 / 0 at 50% / 50%; resulting state probabilities are labelled 3%, 32%, 47%, 18% over inventory levels -1 to 4]
33. CONCLUSION
Inventory optimization is about balancing cost and service level
Slow movers are a large part of most inventories
They must be handled differently from fast movers
• Through better forecasts
• Through specialized optimization algorithms
34. Tools for Better
Decision Making
Graph Databases
Data Science | Design | Technology
(Source: http://docs.janusgraph.org/latest/getting-started.html)
40. Can Relational DBs Do It?
• Identify patterns of relationships between records
• Join relevant tables together
• Number of joins can quickly increase
• Wildcard search cannot be done using SQL queries
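The growth in joins can be seen with a small sketch using Python's built-in sqlite3 module. The `knows` table, its rows and the helper name are hypothetical; the point is only that each extra hop in a relationship chain adds one more self-join to the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE knows (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO knows VALUES (?, ?)",
    [("alice", "bob"), ("bob", "carol"), ("carol", "dave")],
)


def names_at_hops(conn, start, hops):
    """Names exactly `hops` relationships away: one self-join per extra hop."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    joins = "".join(
        f" JOIN knows k{i} ON k{i - 1}.dst = k{i}.src" for i in range(1, hops)
    )
    sql = f"SELECT k{hops - 1}.dst FROM knows k0{joins} WHERE k0.src = ?"
    return [row[0] for row in conn.execute(sql, (start,))]


for hops in (1, 2, 3):
    print(hops, names_at_hops(conn, "alice", hops))
```

The query text has to be rebuilt for every path length, which is one reason relationship-heavy questions become awkward in a relational schema.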
42. What is a Graph Database?
• A database management system with CRUD operations working on a graph data model
• Part of the NoSQL family
• Graph data model: composed of vertices, edges and attributes
• Vertices (nodes) represent entities
• Edges represent associations between vertices
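A graph data model of this kind can be mimicked in a few lines of plain Python. This is only a sketch of the idea, not how any graph database stores data: the vertex names, edge labels, attributes and function names are all hypothetical.

```python
from collections import deque

# Vertices represent entities; each carries a dictionary of attributes.
vertices = {
    "alice": {"label": "person"},
    "bob": {"label": "person"},
    "acme": {"label": "company"},
}

# Edges represent associations: (source, label, destination, attributes).
edges = [
    ("alice", "knows", "bob", {"since": 2015}),
    ("bob", "works_at", "acme", {"role": "analyst"}),
]


def out_neighbors(vertex_id):
    """Vertices directly reachable by following one outgoing edge."""
    return [dst for src, _label, dst, _attrs in edges if src == vertex_id]


def reachable(start, max_hops):
    """Breadth-first traversal up to max_hops edges away from start."""
    seen = {start}
    frontier = deque([(start, 0)])
    found = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for nxt in out_neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                frontier.append((nxt, depth + 1))
    return found


print(reachable("alice", 2))
```

The traversal depth is a plain parameter here, whereas in the relational version each additional hop changes the shape of the query.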
44. Popular Graph Databases
• Neo4j: most popular graph database
• JanusGraph: graph framework with a variety of storage (BigTable, HBase and Cassandra) and indexing (Elasticsearch, Solr and Lucene) backends
• ArangoDB: multi-model (graph, document and key-value) database
47. What We Have Tried
• Configuration 1: Single-node Neo4j DB
• Configuration 2: JanusGraph with a single-node Cassandra backend
• Configuration 3: JanusGraph with a 3-node HBase backend
• DB and Java application running on the same Kubernetes cluster
48. Results: Neo4j vs JanusGraph
Resilience
• Neo4j: HA cluster with a master-slave replication setup. Traffic can be directed to a slave as a failover plan. Available only in Enterprise Edition.
• JanusGraph: Both Cassandra and HBase provide a replication mechanism. Traffic can be directed to a second JanusGraph instance.
Horizontal Autoscaling
• Additional nodes can be added at runtime to an HA cluster
• Both Kubernetes and GKE support horizontal autoscaling
• Disks cannot be dynamically provisioned on Kubernetes
Querying the Database
• Neo4j: Cypher query language; drivers
• JanusGraph: Gremlin query language; Java driver
49. Lessons Learned
Based on our constraints and experiments
• Powerful for analyzing relationships
• Cannot be used as a main data store
• Adds more complexity to code (transaction management)
• Cluster management requires admin knowledge
• HBase requires knowledge of the Hadoop ecosystem
• Current stable Kubernetes supports only CPU autoscaling
51. Future Work
• BigTable as a backend
• Voice commands
• ArangoDB
52. Data Science | Design | Technology
• Meetup group created in April 2017
• Now 952 members (Oct 24th)
• One meetup per month
• This is your meetup! Propose topics you would like to present
53. Once we reach … members, one lucky participant of our meetups will win a prize!
Invite your friends to join the DSDT group