
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure


Atul Gore at Redis Day NYC 2019

Published in: Technology

  2. PRESENTED BY
     Agenda:
     1. Introduction - about Altizon, Datonis & MInt platforms.
     2. Using Redis in our platform - how/where we use Redis.
     3. Data plumbing: requirements & design - the requirements & motivation behind using Redis for our data plumbing infra.
     4. Demo
  3. About Altizon & its offerings
     • Started in 2013. HQ in Pune, India.
     • 3 main offerings in the IoT market:
       – Datonis Edge Gateway.
       – Datonis platform - a generic IoT platform.
       – MInt (Manufacturing Intelligence) platform - specifically for the manufacturing domain.
     • Customers in:
       – Manufacturing: metal, auto, tyre and chemical industries.
       – Wind and solar.
       – Utilities, smart townships.
  4. Datonis IoT Platform & Edge
     • Datonis is a purpose-built IoT PaaS.
     • Routinely processes 100 M events/day. Tested for 2 B events/day.
     • Built entirely on open-source tech: Redis, Kafka, MongoDB, Kura, etc.
     • Written in RoR, Java, Python & React JS.
     • Edge: out-of-the-box connectivity for prevalent industrial protocols like OPC, Modbus, Profinet. Pluggable custom protocols.
     • SDKs in Java, C, Python, Ruby for building your own gateways.
  5. High Level Architecture
     [Architecture diagram: edge sources (OPC / Modbus / custom, MES) feed the Datonis Event Ingestion Service in the cloud; the Instream Analytics/Rule Engine and Aggregation Service process events into the Datonis timeseries store, exposed via the Datonis REST API & Query Service.]
  6. Redis in Datonis: Object Cache
     • You guessed it! We use it as an object cache :)
     • Event ingestion response time is cardinal for us. We cache:
       – Thing metadata and data model.
       – Licensing information (since we are multi-tenant).
       – Throttling state for edge devices & API calls made by clients.
     • Speeds up event ingestion by up to 20x compared to DB calls for the same data.
     • Currently implemented using a cache-aside (with lazy load) strategy, via after-save hooks in our Mongoid models.
     • A read-through/write-through mechanism using a DAO layer is in the works. It will be used as an OM library by all our data processing components.
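The API/device throttling mentioned above is a natural fit for Redis keys with a TTL (INCR a per-client counter, expire it at the end of the window). The slides do not show the implementation, so here is a minimal in-memory sketch of the same fixed-window idea; the class, names and limits are all hypothetical, not Datonis code.

```java
import java.util.HashMap;
import java.util.Map;

final class FixedWindowLimiter {
    // In-memory stand-in for the Redis INCR + EXPIRE throttling pattern:
    // one counter per (client, time-window) pair.
    private final Map<String, Integer> counters = new HashMap<>();
    private final int limit;      // max calls allowed per window
    private final long windowMs;  // window length in milliseconds

    FixedWindowLimiter(int limit, long windowMs) {
        this.limit = limit;
        this.windowMs = windowMs;
    }

    // Returns true if the call is within the limit for the current window.
    boolean allow(String clientId, long nowMs) {
        String key = clientId + ":" + (nowMs / windowMs);
        int count = counters.merge(key, 1, Integer::sum);
        return count <= limit;
    }
}
```

With Redis, the counter key would carry a TTL so stale windows expire on their own instead of accumulating in a map.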
  7. Redis in Datonis: Instream Analysis & Rule Engine
     • Storing data for instream analytics (sorted sets and queues):
       – Tumbling windows - metering. E.g. units consumed by an energy meter since the start of the day/month.
       – Sliding windows - CBM. E.g. the 10-minute moving average of temperature and pressure of an industrial boiler.
       – Last K events. E.g. the moving average of the last 20 temperature readings.
     • Kafka Streams based microservices keep computing these windows on streaming data & accumulating the results in Redis.
     • The Datonis rule engine uses this to drive all the condition-based monitoring and alerting use cases.
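The "last K events" case above can be kept in Redis as a capped list (LPUSH + LTRIM) or a sorted set; the window arithmetic itself is simple. A minimal sketch of the moving-average computation, with a plain array standing in for the Redis-held readings (hypothetical helper, not Datonis code):

```java
final class Windows {
    // Moving average over the last k readings
    // (or over all of them, if fewer than k have arrived yet).
    static double lastKAverage(double[] readings, int k) {
        int n = Math.min(k, readings.length);
        double sum = 0.0;
        for (int i = readings.length - n; i < readings.length; i++) {
            sum += readings[i];
        }
        return sum / n;
    }
}
```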
  8. Data Plumbing Infra - The Need
     • Enterprise customers are getting tech savvy and building their own data warehouses/data lakes that combine data from their IT & OT systems.
     • Some customers are building analytics/ML platforms on top of Datonis for very domain-specific use cases, e.g. energy, utilities.
     • Datonis must ingest, filter and aggregate data from various edge data sources across multiple sites and geos.
     • Data will often need to be enriched with master data, especially in a manufacturing context, e.g. production plans, shifts, part codes, expected cycle times for a job, etc.
     • Data must be delivered to the target platform continuously, based on an agreed schedule.
     • The target system is generally on AWS/Azure/a private datacenter.
  9. Data Plumbing Infra - Functional Requirements
     • Deliver with minimum latency.
     • Aggregate and deliver based on different time-window and push-frequency requirements. E.g.:
       – Minutely/hourly aggregated data.
       – Aggregated at 1-minute intervals but pushed every 10 minutes.
       – At the end of an 8-hour shift.
     • Handle late arrivals due to edge connectivity loss.
     • Push data only for a subset of things/assets, or based on their state, i.e. push only if they are in an operational state.
  10. Data Plumbing Infra - Architectural Requirements
     • Distributed and fault-tolerant execution.
     • Extremely reliable scheduling.
     • HA state store.
     • At-least-once delivery semantics. Data completeness is important; it's okay to deliver it more than once in case of errors/failures.
     • Throttling/back pressure to avoid blowing up during extreme load situations, infra outages or communication failures to target systems.
     • Lightweight in terms of hardware resources.
  11. IoT Data - Understanding the Time Dimension
     IoT data is always stored with reference to a timeline. Every event has at least 3 time values:
     • Timestamp - the time when an event is generated by the thing on the edge side.
     • Created at - the time when the event is added to the cloud timeseries store.
     • Updated at - the time when it was updated in the cloud timeseries store.
  12. Data Plumbing Infra - Design Approach
     Let's run with an example to understand this better:
     • We want to push data for a customer to an AWS Kinesis stream.
     • There are 100 things in the account, periodically streaming temperature and pressure readings.
     • We want to push data aggregations (sum, min, max, count, avg) every 5 minutes.
     • This config is cached in Redis against the tenant ID.
     • In an ideal scenario, we'll push one bucket of aggregated data per thing to AWS every 5 minutes.
     • In case of late-arriving data, the number of buckets might change.
  13. Data Plumbing Infra - High Level Architecture
     [Architecture diagram: adapter instances #1..#N read from the Datonis timeseries store in the cloud, coordinate state via a Redis master with two slaves, and deliver to data sinks: ERP systems (REST API), Azure Event Hub, AWS IoT, JDBC.]
  14. Data Plumbing Infra - Scheduler Thread
     We need to schedule a push every 5 minutes.
     • A sorted set containing 'milliseconds after epoch' is kept against every tenant configuration.
     • A scheduler thread wakes up every second, looks up the sorted set for entries < currentTime and takes a lock for further processing.

```java
Long startTimeMs = System.currentTimeMillis();
Set<Tuple> tenantConfigs = jedis.zrangeByScoreWithScores(
        Constants.TENANT_NEXT_SCHEDULED_PUSH_TS, 0, startTimeMs.doubleValue());

// Get a lock (SET key value NX PX timeout)
String retVal = jedis.set(Constants.PUSH_MASTER_LOCK,
        (String) properties.get(Constants.INSTANCE_ID), "NX", "PX", lockTimeoutMs);
if (retVal == null || !retVal.equals("OK")) {
    continue;
}
```
  15. Data Plumbing Infra - Scheduler Thread
     • Assume the current time is 10:00:15 and the bucket size is 5 minutes, so we query for the earliest timestamp recorded by Datonis between 9:55:15 and 10:00:15.
     • If this timestamp is >= 9:55:00, then data transmission from edge to cloud is in sync and we only need to push 1 bucket of data per thing, i.e. 9:55:00 to 10:00:00.
     • But if the earliest timestamp returned is, say, 9:54:30, then data transmission from the edge is slightly delayed, and in this case we need to push 2 buckets, i.e. 9:50:00 to 9:55:00 and 9:55:00 to 10:00:00.
     • Notice that the time is assumed to be 10:00:15. This is deliberate: the 15 seconds is a configurable offset that allows data from the edge to reach Datonis and settle down in the TS store after processing. Hence the scheduler stores the sorted-set values with this offset.
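The bucket arithmetic on this slide can be sketched as follows. This is a hypothetical helper (not the actual `getBucketStartTimestamp` implementation), with times in milliseconds since epoch and the configurable offset ignored for clarity:

```java
final class Buckets {
    // Floor a timestamp to the start of its bucket.
    static long bucketStart(long tsMs, long bucketSizeMs) {
        return (tsMs / bucketSizeMs) * bucketSizeMs;
    }

    // Number of buckets to push when the earliest event seen in the query
    // window has timestamp earliestMs and the current bucket ends at endMs.
    static long bucketsToPush(long earliestMs, long endMs, long bucketSizeMs) {
        return (endMs - bucketStart(earliestMs, bucketSizeMs)) / bucketSizeMs;
    }
}
```

Using the slide's example with 5-minute (300,000 ms) buckets: an earliest timestamp of 9:55:10 against a bucket ending at 10:00:00 yields 1 bucket, while 9:54:30 yields 2.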
  16. Data Plumbing Infra - Scheduler Thread
     For a tenant, determine the start and end, and query the 'created_at' dimension in the Datonis TS store to get the earliest recorded 'timestamp' in that range.

```java
// tenant is an element in the tenantConfigs returned by the sorted-set query above.
Long toTimeMs = new Double(tenant.getScore()).longValue();

// tenantConfig is the cached tenant configuration.
Long bucketSizeMs = tenantConfig.getDataPushFrequencyMinutes() * 60000;
Long fromTimeMs = getBucketStartTimestamp(toTimeMs - bucketSizeMs, bucketSizeMs) + pushMasterOffsetMs;

// Update toTimeMs to the nearest bucket value < current time, in case any run was missed.
toTimeMs = getBucketStartTimestamp(startTimeMs, bucketSizeMs) + pushMasterOffsetMs;

// Query the TS store to get the earliest timestamp for things in this time range.
JSONArray things = DatonisTSStore.get_earliest_timestamp(fromTimeMs, toTimeMs);
```
  17. Data Plumbing Infra - Scheduler Thread
     • The scheduler now adds all the things for which data needs to be pushed to a 'scheduled work items' queue.
     • It will then move on to scheduling the next round of data push for the tenant.

```java
String[] thingArr = new String[things.size()];
int i = 0;
Iterator<JSONObject> it = things.iterator();
while (it.hasNext()) {
    JSONObject thing =;
    thingArr[i++] = thing.toJSONString();
}
jedis.lpush(Constants.SCHEDULED_WORK_ITEMS_QUEUE, thingArr);

Long nextRunTime = toTimeMs + bucketSizeMs;
// Update the tenant's next scheduled time.
jedis.zadd(Constants.TENANT_NEXT_SCHEDULED_PUSH_TS, nextRunTime.doubleValue(), tenant.getElement());
```
  18. Data Plumbing Infra - Worker Thread
     • Multiple worker threads block on the availability of work items on the 'scheduled' queue.
     • Once an item is available, it is atomically moved to an 'in-progress' queue that is created per worker thread.

```java
String queueName = Constants.RUNNING_WORK_ITEMS_QUEUE_PREFIX
        + properties.get(Constants.INSTANCE_ID) + Thread.currentThread().getName();
// Block until a work item is available, atomically moving it to this worker's queue.
String item = jedis.brpoplpush(Constants.SCHEDULED_WORK_ITEMS_QUEUE, queueName, 0);

JSONObject thing = (JSONObject) parser.parse(item);
String thing_id = (String) thing.get("id");
Long from = (Long) thing.get("from");
Long to = (Long) thing.get("to");
JSONArray data = DatonisTSStore.query(thing_id, from, to);

// Code to transmit data to AWS Kinesis.
// After successful transmission, remove the item from the in-progress queue.
jedis.lpop(queueName);
```
  19. Data Plumbing Infra - Worker Thread
     • The 'in-progress' work item is then processed by the worker thread.
     • Data for the relevant time buckets is fetched from the Datonis timeseries store and pushed to the Kinesis stream.
     • After successful completion, the item is deleted from the 'in-progress' queue and the worker goes back to blocking on the 'scheduled' queue for new work items.
     • Workers can process work items in parallel on many nodes; their turnaround time typically depends on the response times of the Datonis TS store and the target system.
     • Workers are programmed to back off incrementally, up to 30 seconds, when there are network I/O issues, and then move on to the next work item in the queue.
     • But then what happens to the failed work item?
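The slide says workers "back off incrementally up to 30 seconds" but does not show the schedule. One plausible shape is a doubling delay with a 30-second cap; the doubling itself is an assumption, only the cap comes from the slide:

```java
final class Backoff {
    static final long MAX_BACKOFF_MS = 30_000L; // cap from the slide

    // Doubling backoff: 1s, 2s, 4s, ... capped at 30s.
    // (The doubling schedule is an assumption, not the actual implementation.)
    static long backoffMs(int attempt) {
        long delay = 1_000L << Math.min(attempt, 14); // clamp the shift to avoid overflow
        return Math.min(delay, MAX_BACKOFF_MS);
    }
}
```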
  20. Data Plumbing Infra - Monitor Thread
     • A monitor thread keeps checking for failed work items every few seconds.
     • Failed work items can be detected using the queue size of the worker threads. If a worker thread's queue size is > 1, it has one or more failed work items because of an exception.
     • Such work item(s) are moved back to the scheduled queue for reprocessing.

```java
for (int i = 0; i < numWorkers; i++) {
    String redisQueueName = Constants.RUNNING_WORK_ITEMS_QUEUE_PREFIX
            + properties.get(Constants.INSTANCE_ID) + "-"
            + Constants.WORKER_THREAD_NAME_PREFIX + "-" + i;
    Long queueLength = jedis.llen(redisQueueName);
    if (queueLength > 1) {
        // Atomically move the failed items back to the scheduled queue,
        // leaving the item currently being processed in place.
        for (long j = 0; j < queueLength - 1; j++) {
            jedis.rpoplpush(redisQueueName, Constants.SCHEDULED_WORK_ITEMS_QUEUE);
        }
    }
}
```
  21. Recap of Redis features used in Datonis
     • As a general-purpose object cache - vital for speeding up our ingestion API.
     • Keys with TTL - throttling for clients, lock timeouts.
     • Conditional setting and deleting of keys - mutex between nodes.
     • Sorted sets and range queries - window computes, scheduling.
     • Queues and atomic swap operations - guaranteed work-item execution.
     • Sentinel master-slave setup - HA state store.
  22. Thank you!