Hadoop in the Microsoft Enterprise

Published in: Technology

  1. Hadoop in the Microsoft Enterprise. @DanRosanova, West Monroe Partners, http://danrosanova.wordpress.com
  2. What does the Microsoft Enterprise look like?
  3. Microsoft Enterprise - Basics
  4. The reality is slightly more complicated: Client OS, Office, Active Directory, System Center, Applications, Server OS, Virtualization, NTFS, ETL, OLTP, OLAP, SAN / NAS
  5. Data lives here
  6. Users live here
  7. System administrators live HERE
  8. Many enterprises are fragmented
  9. VERY fragmented
  10. How do we make sense of this fragmentation?
  11. Into this
  12. Existing and new tools: SSAS, ODBC for Hive + SSIS
  13. Applied to Customer. Data sources: Transactional, Social Media, Web Logs, IVR Logs, Email & Chat. Tools: Sqoop, Hive, Flume, Pig, HDFS
  14. Transactional Data

      ID  Sector  Rev  Code  Acct   Amount
      01  Q       X    QW    10000  1200
      02  P       X    AB    20000  1020
      03  O       X    CD    20000  11221
      04  N       XX   QW    50000  2323
      05  M       X    CD    30000  33
      06  L       XX   AB    10000  323231
      07  K       X    ER    20000  12
      08  J       XXX  CD    20000  3233
      09  I       X    QW    40000  5468
      10  H       X    ER    50000  234
      11  G       O    AB    55000  765
      12  F       X    ER    70000  34538
      13  E       XX   ER    25000  3456476
      14  D       XXX  AB    10000  4564
      15  C       X    QW    10000  456
      16  B       XX   YZ    11000  44
      17  A       X    AB    15000  456

  15. Transactional + Temporal Data
  16. Complete information awareness: Transactional, Temporal, Spatial, Contextual
  17. This sounds familiar. How is this different from other consolidation approaches?
  18. Work with departmentalization, not against it
  19. Agile Analytics: Self Service, Accessible, Easy to use
  20. How do you eat an elephant?
  21. Start simple: Storage [chart: SAN, HDInsight (Hadoop); units $ and TB]
  22. Create log storage
      • IIS (Web Server) logs
      • Event logs
      • Trace logs
  23. Incorporate log analysis
      • Give users access to HDInsight / Hadoop with tools they know
      • Encourage departmental experimentation
      • Development and IT departments are good early adopters
      • So is marketing
  24. Tap into existing ETL
      • You already have a TON of ETL
      • Reuse as much as you can
      • At first aim for replication
      • Incrementally add value
  25. Why do all this?
  26. How most organizations view their Customer
  27. Some organizations see
  28. What we should see
  29. Understand your data. Understand your ... @DanRosanova, West Monroe Partners, http://danrosanova.wordpress.com
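
Slides 22 and 23 suggest starting with IIS logs and simple log analysis. As a rough illustration of what that first analysis step might look like, here is a minimal Python sketch that reads lines in the W3C extended log format (the format IIS writes by default) and counts requests by HTTP status. It is not from the deck: the function names `parse_w3c_log` and `count_by` and the sample log lines are hypothetical, and at real volume the same grouping would more likely run as a Hive or Pig job over files stored in HDFS.

```python
from collections import Counter


def parse_w3c_log(lines):
    """Yield one dict per request from W3C extended log lines.

    Column names come from the '#Fields:' directive, so the parser
    follows whatever fields the server was configured to write.
    """
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            # Other directives (#Software, #Version, #Date) carry no requests.
            continue
        values = line.split()
        # Skip rows that do not match the declared field list.
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


def count_by(records, field):
    """Count records by the value of one field, e.g. 'sc-status'."""
    return Counter(r[field] for r in records if field in r)


# Hypothetical sample data, made up for this sketch.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 8.0
#Version: 1.0
#Date: 2013-05-01 00:00:00
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2013-05-01 00:00:01 GET /default.aspx 200 15
2013-05-01 00:00:02 GET /missing.aspx 404 3
2013-05-01 00:00:05 POST /login.aspx 200 120
"""

if __name__ == "__main__":
    records = list(parse_w3c_log(SAMPLE_LOG.splitlines()))
    for status, hits in sorted(count_by(records, "sc-status").items()):
        print(status, hits)
```

Reading the column names from the `#Fields:` directive rather than hard-coding them keeps the sketch usable when departments configure their servers to log different fields.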
