5. Hadoop
native libraries
for Windows
Contributed
FileSystem
implementation
for Azure Storage
Hive
interactive
query
execution
10,000+
Code line
contributions
HDFS
permissions
model mapped
to Windows
Azure VM
donation
intended for
Jenkins Servers
supporting
Continuous
Integration efforts
6,000+
Engineering
hours
Native Task
Controller
for Hadoop
on Windows
2 minsUser readinessReachTechnology enablers IntroductionWhy we are here- MSFT understands that Hadoop technology will transform business capabilities Hadoop in nascent stage, audience are early adopters MSFT’s role is to help make democratization of Hadoop happen
1.5 minsExample of democratization of technology throughout history: Cell Phone What was the trend that enabled this transformation? AMPS was a pioneering technology that helped drive mass market usage of cellular technology, but it had several serious issues by modern standards. It was unencrypted and easily vulnerable to eavesdropping via a scanner. These phones were eventually superseded by Digital AMPS (D-AMPS) in 1990, and AMPS service was shut down by most North American carriers by 2008. In the 1990s, the 'second generation' mobile phone systems emerged. Two systems competed for supremacy in the global market: GSM standard and CDMA standard. These differed from the previous generation by using digital instead of analog transmission, and also fast out-of-band phone-to-network signaling. When 2G emerged, mobile phone usage skyrocketed and has progressed from there with Mobile broadband data (3G) and eventually Native IP Networks (4G). What did we hit a billion cell phone users? In 2003 there were 1 billon cell phone users. (A History of Popular Culture: More of Everything, Faster and Brighter’ Raymond F. Betts)1 Billions users in 2003
1.5minsExample of democratization of technology throughout history:TravelWhat was the trend that enabled this transformation? The introduction of the Commercial Jet Airliner in 1952 allowed safe, long distance, non-stop air travel across the Atlantic ocean. What was the point in time in which we hit a billion airline travelers?TBD
3 minsWhy Are We Here/Crossroads Ripeness for the democratization of Hadoop technology Adoption challengesMSFT solutionsEvery person in the business will benefit from the value the technology brings Adoption Blockers: Market Analysis Data Research Hadoop is perceived to deliver multiple benefits specifically around unstructured data, however, lack of in-house expertise stands out as the main impediment to its implementation.Top three concerns/blockers:Lack of in-house expertise 59%Lack of integration with existing BI tools 41%Lack of security & Complex to manage and administer 37% MSFT Solutions to blockers SQL Server- 46% share. Most widely deployed database on the planet. Office- Excel is the most widely used BI tool with over 1 billion people using Office Windows Azure Active Directory- Of all the organizations that have a directory, roughly 95% of them use Active DirectoryBlog: http://blogs.msdn.com/b/windowsazure/archive/2012/11/27/windows-azure-active-directory-processes-200-billion-authentications-connecting-people-data-and-devices-around-the-globe.aspx
2 minsMSFT investments and contributions to Hadoop We are starting at the bottom, in the source code. Making the source base work better not just for Windows, but for everyoneCommunity Contributions:10s of thousands of code line contributed (across all deliverables) 6000+ engineering hours contributed (since February 2012)Others:Apache Build/Verification Infrastructure:Working with Apache Infrastructure team & Hadoop Core PMC on donation of Azure VM’s to be used as Jenkins Servers for Continuous IntegrationInteractive Query: Contributing code and query processing experience to help with Hive query performance (Stinger, ORC & Tez projects)Hadoop on Windows (1.0 & 2.0):Contributed back our porting efforts for Hadoop on Windows including:Command-line scripts for the Hadoop surface areaMapping the HDFS permissions model to WindowsNative Task Controller for Windows Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)ASV Driver:Contributed our FileSystem implementation for Azure Storage
2 minsThese contributions allow us to deliver great products to the market with HadoopHortonworks Data Platform on Windows (Highlight Hortonworks partnership)HDInight Service on Windows Azure
1 minMomentum: Here is what some of our customers are doing with BI Ascribe: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=710000002092 Ascribe created a BI solution that monitors infectious disease outbreaks on the national level, and also improves operations for local care providers.BI solution that helps healthcare providers detect, predict, and respond more quickly to outbreaks of infectious disease and other health threats.Delivering healthcare tools faster, speeding response, and providing actionable insight into large volumes of data.About: One of the UK’s leading suppliers of clinically focused IT solutions for the healthcare industry.Need: Provide rapid insight into large volumes of data from multiple sources to help clinicians improve services. Solution: Designed an end-to-end Big Data solution with BI tools based on Microsoft SQL Server 2012 and Windows Azure HDInsight Service.Benefits:Transforms healthcare with near-real-time access to information Speeds response to health threats Provides actionable insight into large volumes of structured and unstructured data
1 min343 Industries/Halo 4: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=710000002102 Helping to improve the gaming experience – we make weekly tweaks on the game itself based on heuristics gathered from player data and spot trends that even allow us to prevent cheaters.About: The 343 Industries development team hosts and manages Halo 4Need: The team needed to provide BI insight about the game to internal and external customers.Solution: The team implemented a solution that uses Windows Azure HDInsight Service, based on the Apache Hadoop data-processing framework, and Microsoft BI technologies.Benefits:Enhances user experience through increased agility and faster response times. Connects Halo 4 team directly to customers through weekly updates. Keeps playing field level by providing in-game analysis to detect cheaters. Facilitates customized campaigns aimed at retaining players. Opportunity to plug Mike Flasko’s “Master Chief Love Hive: Hadoop in the Cloud” session at 4:25-5:05pm.
It's not enough to have great Hadoop runtime, incredibly scalable, easy access, usable and great TCO, etc. You have to get the value out of the end. Show a demo about how we are delivering end value from data managed by HadoopData Explorer & GeoFlowGeoFlow builds:Summit attendees time lapse Hadoop/Big data heat map Summit attendees + Hadoop/Big data together