Data Analytics using the Cloud - Challenges and Opportunities for India
A presentation on opportunities and challenges for India's economy in the cloud computing era

Published in: Technology

Transcript

  • 1. Data Analytics using the Cloud - Challenges and Opportunities for India
  • 2. Introduction
      • AJAY OHRI
      • Author 1,2,3; Thinker 1,2; Founder, DECISIONSTATS
      • ohri2007@gmail.com
      • http://linkedin.com/in/ajayohri
  • 3. What comes next?
      • Data Analytics: older paradigms
      • Thoughts on stats and computer science
      • Overview: data storage, cloud computing
  • 4. Data Analytics
      • Old(er) paradigms: SAS and SPSS languages, ETL and DWs
      • Newer paradigms: R and Python, Scala and Hadoop
      • More machine learning, less classical stats
  • 5. Is statistics lagging behind computer science?
      • Classical statistics: too little data
      • Big Data era: the cost of throwing data away is more than the cost of storing it
      • Machine learning: seems to be the flavor
  • 6. Data Storage
      • Older paradigms: RDBMS and spreadsheets (structure and interactivity)
      • New paradigms: NoSQL, Hadoop, cloud-enabled spreadsheets (?)
  • 7. Cloud Computing, as defined by NIST
      • http://www.nist.gov/itl/csd/cloud-102511.cfm
      • "Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction"
      • or http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
  • 8. Service Models for Cloud Computing
      • SaaS: Software as a Service
      • IaaS: Infrastructure as a Service
      • PaaS: Platform as a Service
  • 9. Service Models for Cloud Computing: IaaS (Infrastructure as a Service)
      • http://media.amazonwebservices.com/IDC_Business_Value_of_AWS_Accelerates_Over_time.pdf
      • http://www.gartner.com/technology/reprints.do?id=1-1IMDMZ5&ct=130819&st=sb
  • 10. Service Models for Cloud Computing: PaaS (Platform as a Service)
      • http://www.gartner.com/technology/research/cloud-computing/report/paas-cloud.jsp
      • http://www.forrester.com/search?N=20033+10001&sort=3&everything=true&source=browse&
  • 11. Service Models for Cloud Computing: SaaS (Software as a Service)
      • http://www.forrester.com/Software--as--a--Service-%28SaaS%29
      • http://www.gartner.com/newsroom/id/1963815
      • http://www.forbes.com/sites/louiscolumbus/2013/02/19/gartner-predicts-infrastructure-services-will-accelerate-cloud-computing-growth/
      • http://my.gartner.com/portal/server.pt?open=512&objID=202&&PageID=5553&mode=2&in_hi_userid=2&cached=true&resId=2332215&ref=AnalystProfile
      • http://www.gartner.com/it-glossary/software-as-a-service-saas/
  • 12. Deployment Models for Cloud Computing
      • Private
      • Community
      • Public
      • Hybrid
  • 13. Data Analytics (traditional) - Porter’s Model
      • Threat of Mobility: Low (lock-in)
      • Industry Rivalry: Medium (many)
      • Supplier Power: High (S/W, H/W)
      • Buyer Power: Medium
      • Substitutes: Low (not many alternatives to SAS, SPSS)
  • 14. Data Analytics (cloud based) - Porter’s Model
      • Threat of Mobility: High (easy switch, as data and analytics are cloud based)
      • Industry Rivalry: High (global providers)
      • Supplier Power: Low (open source, free, GPL)
      • Buyer Power: High (lots of options: outsource, insource, crowdsource)
      • Substitutes: High (lots of options: Python, R, Julia, etc.)
  • 15. Data Analytics in India - Porter’s Diamond Model
      • Chance: favorable supply of engineers, mature outsource and service industry, rapid growth domestically
      • Factor Conditions: good service industry
      • Firm Strategy: relative lack of ecosystem hampers analytics entrepreneurs
      • Demand Conditions: high
      • Government: little or no interference
  • 16. India in traditional Data Analytics
      • Strengths: reliable pool of experienced engineering talent
      • Weaknesses: inability or unwillingness to invest in huge upfront capex for hardware and software for analytics
      • Opportunities: ability to navigate upstream
      • Threats: based on cost-based arbitrage rather than skill-based value addition, thus vulnerable to competition
  • 17. India in Cloud Based Data Analytics
      • Strengths: experienced service industry with a huge pool of trained engineering and analytical talent
      • Weaknesses: lack of deep domain depth; relative lack of ecosystem for cutting-edge analytics entrepreneurship; slow to embrace open source
      • Opportunities: no more capital expenditure needed in software and hardware; virtualization offers secure delivery from any location
      • Threats: risk management needs to be more mature; lack of data privacy regulations
  • 18. Biggest Challenge to using Cloud
      • Google, Amazon, Oracle Cloud, Salesforce, Zoho and Microsoft Azure are some well-known cloud vendors
      • Most of the cloud infrastructure is based in the United States of America
  • 19. Biggest Challenge to using Cloud ==NSA?
  • 20. Biggest Challenge to using Cloud
      • Google, Amazon, Oracle Cloud, Salesforce, Zoho and Microsoft Azure are some well-known cloud vendors
      • Most of the cloud infrastructure is based in the United States of America
      • Unfortunately, the US government taps the information for both security and economic advantage
      • Unfortunately, American companies seek and get economic advantages for such cooperation
      • Unfortunately, in the age of cyber war, with its biggest proponent across the border, we have no critical infrastructure as a service for economic players
      • In the future, you won't need the United Nations to sanction countries. You just switch off their internet and their economy will shut off.
      • Foreign digital infrastructure can be used to infiltrate Stuxnet-like viruses into the domestic supply chain?
      • India may be self-reliant in agriculture and semi-reliant in manufacturing arms, but we are totally dependent on others for new-generation and even current-generation computing
  • 21. Biggest Opportunities to using Cloud
      • Build our critical digital grid using local companies - POSSIBLE
      • Build our next generation of cyber warriors and cyber farmers - VERY POSSIBLE
      • Teach more distributed computing earlier ;)
      • Regulation like the EU's to ensure Indian citizens' data stays within the Indian state’s administrative boundaries and within reach of the Indian legal system
      • Compare the Aadhaar card with information in emails, social networks, on the personal computer ??
      • Better regulation - POSSIBLE OR NOT POSSIBLE - DEPENDS ON ELECTIONS?
  • 22. Moving onto Cloud Based Data Analytics
      • Open source analytics like Python and R
      • Support distributed computing
      • Memory is no problem now (especially for R) on the cloud
  • 23. Existing Data Analytics in India
      • Lots of analytics outsourcing
      • Both SAS and SPSS are present
      • Open source analytics is on the rise, but there is still a palpable lack of awareness
      • Data - ETL - Data Warehouse - SQL Query - Stats Software MINDSET
  • 24. Existing Data Analytics in India
      • Cloud computing explicitly uses Linux for efficiency
      • Your Windows CERTIFICATIONS can hinder your IT department’s mindset on the cloud
      • Data science requires cross-functional learning
  • 25. Developments in Stats Software
      • A New Hope - Julia, Pandas: http://julialang.org/ http://pandas.pydata.org/
      • The Empire Strikes Back - SAS: http://www.sas.com/en_us/software/cloud.html https://www.sas.com/en_us/software/sas-hadoop.html
      • Return of the Jedi: http://www.r-bloggers.com/
  • 26. A few Developments in Analytics
      • Revolution R on the cloud (AWS): www.revolutionanalytics.com/RRE-AWS
      • SAS on the cloud: http://blogs.sas.com/content/sascom/2013/04/29/start-planning-now-for-sas-9-4/ http://www.allanalytics.com/author.asp?section_id=1411&doc_id=262924
      • Apache Spark and R: http://amplab-extras.github.io/SparkR-pkg/
  • 27. A few Developments on the Cloud
      • Amazon: http://aws.amazon.com/
      • Google: https://cloud.google.com/products/
      • IBM: http://www.ibm.com/cloud-computing/in/en/
      • Oracle: https://cloud.oracle.com/java
  • 28. A few Developments in R
      • RHadoop Project: https://github.com/RevolutionAnalytics/RHadoop/wiki
      • OpenCPU Project: https://www.opencpu.org/
      • rOpenSci Project: http://blog.programmableweb.com/2013/03/20/pw-interview-karthik-ram-ropensci-wrapping-all-science-apis/
  • 29. The future of Open Cloud
      • R + Python on OpenStack?
      • There is a fair chance that Apache Hadoop-related projects like Shark / Spark would be there
      • We need a Hadoop-based data warehouse solution (?)
      • We need to hedge against US policy interference
      • Education and developer ecosystems have to keep pace
  • 30. Thank You
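
Slide 22 mentions distributed computing with open source tools such as Python. The following is only a minimal sketch of that idea and is not taken from the presentation: it splits a dataset into chunks, summarises each chunk independently (the "map" step), then merges the summaries (the "reduce" step). All function names here are hypothetical, and a local thread pool stands in for what would be separate cloud workers in a real deployment.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List, Sequence, Tuple


def split_into_chunks(values: Sequence[float], chunk_size: int) -> List[Sequence[float]]:
    """Split the data into consecutive chunks of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_stats(chunk: Sequence[float]) -> Tuple[int, float]:
    """Map step: summarise one chunk as (count, sum)."""
    return len(chunk), float(sum(chunk))


def combine(parts: Iterable[Tuple[int, float]]) -> float:
    """Reduce step: merge the per-chunk summaries into an overall mean."""
    total_count = 0
    total_sum = 0.0
    for count, subtotal in parts:
        total_count += count
        total_sum += subtotal
    if total_count == 0:
        raise ValueError("cannot take the mean of no data")
    return total_sum / total_count


def distributed_mean(values: Sequence[float], chunk_size: int = 1000, workers: int = 4) -> float:
    """Compute a mean chunk by chunk.

    Assumption: a thread pool on one machine is used here purely for
    illustration; on a cluster each chunk would go to a different node.
    """
    chunks = split_into_chunks(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_stats, chunks))
    return combine(parts)


if __name__ == "__main__":
    data = list(range(1, 101))
    print(distributed_mean(data, chunk_size=10))
```

Because each chunk is reduced to a small (count, sum) pair before anything is combined, no single worker needs to hold the whole dataset in memory, which is the property that makes this pattern suit cloud-hosted clusters.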