Is the elephant in the room
Published in: Technology, Business
  • A little bit of background. One of the Siemens employees attended my talk on Aadhaar in 2010 and wanted a repeat of the same for their employees. This one therefore is a mashup of technology trends that uses Aadhaar and Flipkart examples for illustration. This was the keynote address at Siemens TECh Days - Aug 2012, Bangalore.
  1. Is the Elephant in the room? Regunath B, regunathb@gmail.com, Twitter: @RegunathB
  2. Quick read 1.8 million words? The story is about a battle between great kings and sons, with the principal characters being Arjuna, Pandu, Bhishma, Bharata, Karna, Duryodhana, Yudhishthira etc. Source: The Gramener blog for visualizations – Analysis of the entire text contained in the Mahabharatha (http://blog.gramener.com/category/visualisations)
  3. Insights from Social Media. Source: ttwick Billionaires page (Bill Gates Twitter Social Media profile) (http://ttwick.com/blog/bill-gates-twitter-social-media/)
  4. Insights from Social Media. Source: Impact page of Satyamevjayate (http://www.satyamevjayate.in/impact/impact.php/)
  5. What is Big Data?
     ● Big Data challenges and opportunities arise when information in an enterprise exhibits the following characteristics:
       – Volume
         ● Transaction data from enterprise systems. For example: financial transactions, orders
       – Variety
         ● Structured and unstructured data. For example: customer contact, social media, biometrics
       – Velocity
         ● High information arrival rates. For example: application events, tagging, rating of content
     ● Big Data opportunities arise when the enterprise is able to derive Value from the data characteristics defined above
  6. Food for thought... on theorems and laws
     ● Do hardware and technology trends affect your technology selection?
       – CPU, RAM and disk size double every 18-24 months [Moore’s law]
       – Disk seek time remains nearly constant, improving by only around 5% per year
     ● Data Seek vs. Data transfer
       – Software leverages one of the above, or a combination: B+ tree index, LSM tree index, “Fractal tree”
     ● CAP theorem effect
       – Ability to achieve only 2 of 3 properties of shared-data systems: data Consistency, system Availability and tolerance to network Partitions
     ● Bandwidth is the most scarce commodity in a Data Center
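The "Data Seek vs. Data transfer" point on slide 6 can be made concrete with some back-of-the-envelope arithmetic. The sketch below compares updating 1% of the records in a dataset by seeking to each one against streaming the whole dataset sequentially. Every figure in it (10 ms seek, 100 MB/s transfer, 100-byte records, a 1 TB dataset) is an assumption chosen for illustration and does not come from the slides.

```python
# Back-of-the-envelope comparison of seek-bound vs. transfer-bound access.
# All constants are illustrative assumptions, not measurements.

def seek_bound_seconds(num_records, seek_ms):
    """Time to touch records one at a time, paying one disk seek per record."""
    return num_records * seek_ms / 1000.0


def transfer_bound_seconds(total_bytes, transfer_mb_per_s):
    """Time to stream the whole dataset sequentially at the transfer rate."""
    return total_bytes / (transfer_mb_per_s * 1_000_000)


SEEK_MS = 10.0                    # assumed average seek time
TRANSFER_MB_PER_S = 100.0         # assumed sequential transfer rate
RECORD_BYTES = 100                # assumed record size
TOTAL_BYTES = 1_000_000_000_000   # assumed dataset size: 1 TB

total_records = TOTAL_BYTES // RECORD_BYTES
updated_records = total_records // 100   # touch 1% of the records

random_s = seek_bound_seconds(updated_records, SEEK_MS)
stream_s = transfer_bound_seconds(TOTAL_BYTES, TRANSFER_MB_PER_S)

print(f"seek to 1% of records: {random_s / 86400:.1f} days")
print(f"stream entire dataset: {stream_s / 3600:.1f} hours")
```

Under these assumptions, seeking to 1% of the records takes far longer than rewriting the entire dataset sequentially. That gap is one reason index structures that favor sequential transfer, such as LSM trees, can suit write-heavy workloads better than seek-heavy structures such as B+ trees.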
  7. Aadhaar Patterns & Technologies
     • Principles
       • POJO based application implementation
       • Light-weight, custom application container
       • Http gateway for APIs
     • Compute Patterns
       • Data Locality
       • Distribute compute (within an OS process and across)
     • Compute Architectures
       • SEDA – Staged Event Driven Architecture
       • Master-Worker(s) Compute Grid
     • Data Access types
       • High throughput streaming: bio-dedupe, analytics
       • High volume, moderate latency: workflow, UID records
       • High volume, low latency: auth, demo-dedupe, search – eAadhaar, KYC
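To illustrate the SEDA pattern named on slide 7, here is a minimal sketch in which each stage owns an input queue drained by a small pool of worker threads, and hands its result to the next stage. It is a generic illustration of the pattern, not the Aadhaar implementation; the `Stage` class, the `run_pipeline` helper, and the parse/square/collect stages are all hypothetical names invented for this example.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker thread to exit


class Stage:
    """One SEDA stage: an input queue plus a pool of worker threads."""

    def __init__(self, name, handler, workers=2, next_stage=None):
        self.name = name
        self.handler = handler          # function applied to each event
        self.next_stage = next_stage    # where results are forwarded, if anywhere
        self.inbox = queue.Queue()
        self.threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]

    def start(self):
        for t in self.threads:
            t.start()

    def submit(self, event):
        self.inbox.put(event)

    def _run(self):
        while True:
            event = self.inbox.get()
            if event is _STOP:
                break
            result = self.handler(event)
            if self.next_stage is not None:
                self.next_stage.submit(result)

    def stop(self):
        # One sentinel per worker; events queued earlier are drained first.
        for _ in self.threads:
            self.inbox.put(_STOP)
        for t in self.threads:
            t.join()


def run_pipeline(texts):
    """Hypothetical three-stage pipeline: parse -> square -> collect."""
    results = []
    collect = Stage("collect", results.append, workers=1)
    square = Stage("square", lambda n: n * n, workers=2, next_stage=collect)
    parse = Stage("parse", int, workers=2, next_stage=square)
    stages = [parse, square, collect]
    for stage in stages:
        stage.start()
    for text in texts:
        parse.submit(text)
    for stage in stages:      # stop upstream stages first so nothing is lost
        stage.stop()
    return sorted(results)    # worker interleaving makes arrival order arbitrary


print(run_pipeline(["1", "2", "3", "4"]))
```

Because each stage has its own queue and thread pool, a slow stage can be given more workers without touching the others, which is the main appeal of the pattern for uneven workloads.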
  8. Aadhaar Architecture
     • Real-time monitoring using Events
     • Work distribution using SEDA & Messaging
     • Ability to scale within JVM and across
     • Recovery through check-pointing
     • Sync Http based Auth gateway
     • Protocol Buffers & XML payloads
     • Sharded clusters
     • Near Real-time data delivery to warehouse
     • Nightly data-sets used to build dashboards, data marts and reports
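"Recovery through check-pointing" on slide 8 can be sketched as follows: a batch job records the index of the next unprocessed record after each step, so a restart resumes where the previous run stopped. This is a minimal single-process illustration, not a description of how Aadhaar does it; the JSON checkpoint file, the function names, and the `fail_at` crash simulation are assumptions made for the example.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next record to process (0 if no checkpoint)."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_index):
    """Write the checkpoint to a temp file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(tmp, path)  # rename so readers never see a half-written file


def process_batch(records, path, handle, fail_at=None):
    """Process records from the last checkpoint; fail_at simulates a crash."""
    for i in range(load_checkpoint(path), len(records)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated crash")
        handle(records[i])
        save_checkpoint(path, i + 1)


def demo():
    """Crash midway through a batch, then resume from the checkpoint."""
    seen = []
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "checkpoint.json")
        records = ["a", "b", "c", "d"]
        try:
            process_batch(records, path, seen.append, fail_at=2)
        except RuntimeError:
            pass  # first run dies after handling "a" and "b"
        process_batch(records, path, seen.append)  # resumes at "c"
    return seen


print(demo())
```

A crash between `handle` and `save_checkpoint` would cause that one record to be handled again on restart, so this scheme gives at-least-once processing; handlers need to tolerate a repeated record.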
  9. Putting data to work at Aadhaar
  10. Deployment Monitoring
  11. Big Data at Flipkart
     ● Website traffic
       – Millions of page hits per day: product catalogs, item availability, promotions, search
       – Millions of active sessions and shopping carts
       – Latencies measured in low single-digit milliseconds
     ● Growing list of categories (Books, Mobiles, Toys, Personal, Home, Baby, Digital music...)
       – Electronic inventory: MP3, eBooks, movies
     ● New business models, newer channels
     ● Understanding users, user profiles, social media, experience
       – Terabytes of logs containing browsing behavior, data from multiple engagement channels
       – Recommendations based on millions of possible item matches and relevance algorithms
  12. Is the Elephant in the room? From Wikipedia: "Elephant in the room" is an English metaphorical idiom for an obvious truth that is being ignored or goes unaddressed. Big Data opportunities and challenges are real and present. It is the Elephant in the room.
  13. Some takeaways from experience
     ● Make everything API based
     ● Everything fails (hardware, software, network, storage)
       – System must recover, retry transactions, and largely self-heal
     ● Security and privacy should not be an afterthought
     ● Scalability does not come from one product
       – Watch out for solution and technology stereotyping
     ● Open scale out is the only way to go
       – Heterogeneous, multi-vendor, commodity compute, growing in a linear fashion. Nothing else can adapt!
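The "everything fails" takeaway on slide 13 says a system must recover and retry transactions. One common way to do that is to retry with exponential backoff, sketched below. The `retry` helper, the `TransientError` class, and the default attempt count and delays are hypothetical choices for illustration; the slides do not say which retry policy was used.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call operation(); on TransientError, back off exponentially and retry.

    The delay doubles after each failure: base_delay, 2*base_delay, ...
    The final failure is re-raised so the caller can decide what to do.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


def make_flaky(failures):
    """Build an operation that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("simulated network glitch")
        return "ok"

    return operation


print(retry(make_flaky(2)))
```

Only errors known to be transient are retried here; retrying on every exception would mask programming bugs. Retried operations also need to be safe to repeat, since the first attempt may have partly succeeded before failing.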