Is the elephant in the room



Published in: Technology, Business
  • A little bit of background. One of the Siemens employees attended my talk on Aadhaar in 2010 and wanted a repeat of the same for their employees. This one therefore is a mashup of technology trends that uses Aadhaar and Flipkart examples for illustration. This was the keynote address at Siemens TECh Days - Aug 2012, Bangalore.



  1. Is the Elephant in the room?
     Regunath B, Twitter: @RegunathB
  2. Quick read 1.8 million words?
     The story is about a battle between great kings and sons, with the principal characters being Arjuna, Pandu, Bhishma, Bharata, Karna, Duryodhana, Yudhishthira etc.
     Source: The Gramener blog for visualizations – Analysis of the entire text contained in the Mahabharatha (
  3. Insights from Social Media
     Source: ttwick Billionaires page (Bill Gates Twitter Social Media profile) (
  4. Insights from Social Media
     Source: Impact page of Satyamevjayate (
  5. What is Big Data?
     ● Big Data challenges and opportunities arise when information in an enterprise demonstrates the following characteristics:
       – Volume
         ● Transaction data from enterprise systems
           – For example: financial transactions, orders
       – Variety
         ● Structured and unstructured data
           – For example: customer contact, social media, biometrics
       – Velocity
         ● High information arrival rates
           – For example: application events, tagging, rating of content
     ● Big Data opportunities arise when the enterprise is able to derive Value from the data characteristics defined above
  6. Food for thought... on theorems and laws
     ● Do hardware and technology trends affect your technology selection?
       – CPU, RAM and disk size double every 18-24 months [Moore's law]
       – Disk seek time remains nearly constant, at around 5% speed-up per year
     ● Data seek vs. data transfer
       – Software that leverages one of the above, or a combination: B+ tree index, LSM tree index, "Fractal tree"
     ● CAP theorem effect
       – Ability to achieve only 2 of 3 properties of shared-data systems: data Consistency, system Availability and tolerance to network Partitions
     ● Bandwidth is the most scarce commodity in a data center
  7. Aadhaar Patterns & Technologies
     • Principles
       • POJO-based application implementation
       • Light-weight, custom application container
       • HTTP gateway for APIs
     • Compute Patterns
       • Data locality
       • Distribute compute (within an OS process and across)
     • Compute Architectures
       • SEDA – Staged Event Driven Architecture
       • Master-Worker(s) compute grid
     • Data Access types
       • High throughput streaming: bio-dedupe, analytics
       • High volume, moderate latency: workflow, UID records
       • High volume, low latency: auth, demo-dedupe, search – eAadhaar, KYC
  8. Aadhaar Architecture
     • Real-time monitoring using Events
     • Work distribution using SEDA & Messaging
     • Ability to scale within JVM and across
     • Recovery through check-pointing
     • Sync HTTP-based Auth gateway
     • Protocol Buffers & XML payloads
     • Sharded clusters
     • Near real-time data delivery to warehouse
     • Nightly data-sets used to build dashboards, data marts and reports
  9. Putting data to work at Aadhaar
  10. Deployment Monitoring
  11. Big Data at Flipkart
      ● Website traffic
        – Millions of page hits per day: product catalogs, item availability, promotions, search
        – Millions of active sessions and shopping carts
        – Latencies measured in low single-digit milliseconds
      ● Growing list of categories (Books, Mobiles, Toys, Personal, Home, Baby, Digital music...)
        – Electronic inventory: MP3, eBooks, movies
      ● New business models, newer channels
      ● Understanding users, user profiles, social media, experience
        – Terabytes of logs containing browsing behavior, data from multiple engagement channels
        – Recommendations based on millions of possible item matches and relevance algorithms
  12. Is the Elephant in the room?
      From Wikipedia: "Elephant in the room" is an English metaphorical idiom for an obvious truth that is being ignored or goes unaddressed.
      Big Data opportunities and challenges are real and present. It is the Elephant in the room.
  13. Some takeaways from experience
      ● Make everything API based
      ● Everything fails (hardware, software, network, storage)
        – System must recover, retry transactions, and sort of self-heal
      ● Security and privacy should not be an afterthought
      ● Scalability does not come from one product
        – Watch out for solution and technology stereotyping
      ● Open scale out is the only way to go
        – Heterogeneous, multi-vendor, commodity compute, growing in linear fashion. Nothing else can adapt!
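
The "everything fails ... retry transactions" takeaway on slide 13 can be illustrated with a small retry helper. This is only a sketch under assumptions: the deck does not say how Aadhaar or Flipkart implement retries, and `TransientError`, `retry` and all parameter names are hypothetical. It uses exponential backoff with jitter, a common way to avoid many clients retrying in lockstep.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def retry(operation, attempts=5, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Only TransientError is retried; any other exception propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

A retried transaction should be idempotent, so that running it twice after a half-completed attempt does not apply its effect twice.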
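
The SEDA (Staged Event Driven Architecture) pattern named on slides 7 and 8 can be sketched as a chain of stages, each with its own bounded queue and worker threads. This is a minimal illustration, not the Aadhaar implementation (which the deck describes as JVM-based); `Stage`, `run_pipeline` and the three example stages are hypothetical names chosen for this sketch.

```python
import queue
import threading

STOP = object()  # sentinel telling a stage's workers to shut down


class Stage:
    """One SEDA stage: a bounded inbox drained by a small pool of worker threads."""

    def __init__(self, name, handler, workers=2, out=None):
        self.name = name
        self.handler = handler
        self.inbox = queue.Queue(maxsize=100)  # bounded queue gives back-pressure
        self.out = out  # next stage, or None for the last stage
        self.threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]

    def start(self):
        for t in self.threads:
            t.start()

    def _run(self):
        while True:
            event = self.inbox.get()
            if event is STOP:
                break
            result = self.handler(event)
            if self.out is not None:
                self.out.inbox.put(result)  # hand the event to the next stage

    def stop(self):
        # One sentinel per worker, queued behind any pending events.
        for _ in self.threads:
            self.inbox.put(STOP)
        for t in self.threads:
            t.join()


def run_pipeline(events):
    """Push events through parse -> enrich -> collect and return the results."""
    results = []
    lock = threading.Lock()

    def collect(event):
        with lock:
            results.append(event)

    sink = Stage("collect", collect, workers=1)
    enrich = Stage("enrich", lambda e: {**e, "length": len(e["text"])}, out=sink)
    parse = Stage("parse", lambda text: {"text": text.strip()}, out=enrich)

    stages = [parse, enrich, sink]
    for stage in stages:
        stage.start()
    for event in events:
        parse.inbox.put(event)
    for stage in stages:
        stage.stop()  # stop upstream first so each stage drains before the next
    return results
```

Because each stage has several workers, results can arrive out of order; callers that need ordering would have to carry a sequence number in the event. A handler that raises would end its worker thread here, so a fuller version would catch and route errors.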