Hadoop in the Microsoft Enterprise

Usage Rights

© All Rights Reserved

    Hadoop in the Microsoft Enterprise: Presentation Transcript

    • Hadoop in the Microsoft Enterprise. @DanRosanova, West Monroe Partners, http://danrosanova.wordpress.com
    • What does the Microsoft Enterprise look like?
    • Microsoft Enterprise - Basics
    • The reality is slightly more complicated: Client OS, Office, Active Directory, System Center, Applications, Server OS, Virtualization, NTFS, ETL, OLTP, OLAP, SAN / NAS
    • Data lives here
    • Users live here
    • System administrators live HERE
    • Many enterprises are fragmented
    • VERY fragmented
    • How do we make sense of this fragmentation?
    • Into this
    • Existing and new tools: SSAS, ODBC for Hive + SSIS
    • Applied to Customer: Transactional, Social Media, Web Logs, IVR Logs, Email & Chat (tools shown: Sqoop, Hive, Flume, HDFS, Pig)
    • Transactional Data
        ID  Sector  Rev  Code  Acct   Amount
        01  Q       X    QW    10000  1200
        02  P       X    AB    20000  1020
        03  O       X    CD    20000  11221
        04  N       XX   QW    50000  2323
        05  M       X    CD    30000  33
        06  L       XX   AB    10000  323231
        07  K       X    ER    20000  12
        08  J       XXX  CD    20000  3233
        09  I       X    QW    40000  5468
        10  H       X    ER    50000  234
        11  G       O    AB    55000  765
        12  F       X    ER    70000  34538
        13  E       XX   ER    25000  3456476
        14  D       XXX  AB    10000  4564
        15  C       X    QW    10000  456
        16  B       XX   YZ    11000  44
        17  A       X    AB    15000  456
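A minimal sketch of the kind of question a purely transactional table can answer: total Amount per Code. The rows are transcribed from the slide. Reading the six values in each row as ID, Sector, Rev, Code, Acct, Amount is an assumption based on the header order, and the function name `total_amount_by_code` is hypothetical. The deck lists Hive among its tools; this plain-Python version only illustrates the grouping and is not the presenter's method.

```python
from collections import defaultdict

# Rows transcribed from the slide: (ID, Sector, Rev, Code, Acct, Amount).
# The column meanings are assumed from the header order on the slide.
TRANSACTIONS = [
    ("01", "Q", "X", "QW", 10000, 1200),
    ("02", "P", "X", "AB", 20000, 1020),
    ("03", "O", "X", "CD", 20000, 11221),
    ("04", "N", "XX", "QW", 50000, 2323),
    ("05", "M", "X", "CD", 30000, 33),
    ("06", "L", "XX", "AB", 10000, 323231),
    ("07", "K", "X", "ER", 20000, 12),
    ("08", "J", "XXX", "CD", 20000, 3233),
    ("09", "I", "X", "QW", 40000, 5468),
    ("10", "H", "X", "ER", 50000, 234),
    ("11", "G", "O", "AB", 55000, 765),
    ("12", "F", "X", "ER", 70000, 34538),
    ("13", "E", "XX", "ER", 25000, 3456476),
    ("14", "D", "XXX", "AB", 10000, 4564),
    ("15", "C", "X", "QW", 10000, 456),
    ("16", "B", "XX", "YZ", 11000, 44),
    ("17", "A", "X", "AB", 15000, 456),
]


def total_amount_by_code(rows):
    """Sum the Amount column for each distinct value of the Code column."""
    totals = defaultdict(int)
    for _id, _sector, _rev, code, _acct, amount in rows:
        totals[code] += amount
    return dict(totals)


totals = total_amount_by_code(TRANSACTIONS)
for code, amount in sorted(totals.items()):
    print(code, amount)
```

A table like this says what happened and how much, but not when, where, or in what context, which is the gap the next slides address.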
    • Transactional + Temporal Data
    • Complete information awareness: transactional, temporal, spatial, contextual
    • This sounds familiar. How is this different from other consolidation approaches?
    • Work with departmentalization, not against it
    • Agile analytics: self-service, accessible, easy to use
    • How do you eat an elephant?
    • Start simple: storage [Chart: SAN vs. HDInsight (Hadoop); axes labeled $1000 and TB]
    • Create log storage
      • IIS (Web Server) logs
      • Event logs
      • Trace logs
    • Incorporate log analysis
      • Give users access to HDInsight / Hadoop with tools they know
      • Encourage departmental experimentation
      • Development and IT departments are good early adopters
      • So is marketing
    • Tap into existing ETL
      • You already have a TON of ETL
      • Reuse as much as you can
      • At first aim for replication
      • Incrementally add value
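As a rough illustration of what a first log-analysis step might look like, the sketch below parses IIS logs in the W3C extended format (space-separated values, with a `#Fields:` directive naming the columns) and counts requests per HTTP status. The sample log lines and the function names `parse_w3c_log` and `requests_by_status` are made up for this example. It runs locally in plain Python rather than on HDInsight, so treat it as a picture of the analysis, not of the deployment.

```python
from collections import Counter


def parse_w3c_log(lines):
    """Yield one dict per request from W3C extended log format lines.

    Column names come from the most recent '#Fields:' directive; other
    '#' directives, blank lines, and malformed rows are skipped.
    """
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue
        values = line.split()
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


def requests_by_status(records):
    """Count requests for each value of the sc-status field."""
    return Counter(record["sc-status"] for record in records)


# Hypothetical sample data, invented for this sketch.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 8.0
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2013-05-01 10:00:00 GET /default.aspx 200 120
2013-05-01 10:00:01 GET /products.aspx 200 340
2013-05-01 10:00:02 GET /missing.aspx 404 15
""".splitlines()

records = list(parse_w3c_log(SAMPLE_LOG))
status_counts = requests_by_status(records)
print(status_counts)
```

Once logs from many servers sit in one place, the same per-record parsing can feed the tools users already know, which is the point of the slide.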
    • Why do all this?
    • How most organizations view their Customer
    • Some organizations see
    • What we should see
    • Understand your data. Understand your ... @DanRosanova, West Monroe Partners, http://danrosanova.wordpress.com