DC HUG Hadoop for Windows



Presentation on the new beta release of the Hortonworks Data Platform for Windows

Published in: Technology


  1. Hadoop for Windows
     Terry Padgett, Solution Engineer, Hortonworks
     28 February 2013
     © Hortonworks Inc. 2013
  2. Why Apache Hadoop on Windows?
     • About 70% of the Earth’s surface is covered by water … so we build ships
     • About 72% of servers run Microsoft Windows … so we build Hadoop for Windows
  3. What’s in the box?

     Component   Version   Patches
     Hadoop      1.0.3     119
     Pig         0.9.3     19
     Hive        0.9.0     12
     HCatalog    0.4.1     2
     WebHCat     0.1.4     None
     Sqoop       1.4.2     None
     Oozie       3.2.0     None

     Initial Hadoop Core Apache JIRA: https://issues.apache.org/jira/browse/HADOOP-8079
     Apache Hadoop branch-1-win github: https://github.com/apache/hadoop-common/tree/branch-1-win
     Apache Hadoop branch-trunk-win github: https://github.com/apache/hadoop-common/tree/branch-trunk-win
  4. What’s changed?
     - Command-line scripts for the Hadoop surface area
     - HDFS permissions model mapped to Windows
     - Resolved issues with path semantics between Java and Windows
     - Native Task Controller for Windows
     - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure
     - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)
     - Resolved several reliability issues
     - Several new unit test cases written for the above changes
  5. What do you get?
     • More deployment choices
     • Hadoop for Windows is for on-premise deployment
       – Good fit for organizations with Hadoop operational experience
       – Next step for those who are ready to move from POC to production
     • Use HDInsight for public and private cloud deployments
       – HDInsight Service -> Windows Azure (available in Preview today)
       – HDInsight Server -> for interoperability across platforms of Hadoop, with Microsoft tools, on premise (Developer Preview available today)
     • Full interoperability across platforms
     • Created through partnership between Hortonworks and Microsoft
       – Eighteen months of development time
  6. Installation Tidbits
     • Prereqs
       – Microsoft Visual C++ 2010 Redistributable Package (64 bit)
       – Microsoft .NET Framework 4.0
       – JDK 6u31 or higher
         – For the love of God do not install in a directory path containing a space
       – Python 2.7.3
         – Add the installation directory to path
       – Hive metastore
         – Embedded Derby is provided
         – Alternately, using SQL Server requires table and user set up and SQL Server JDBC Driver
       – Server time must be in sync
       – Enable remote scripting
       – Ports
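A few of the prerequisites on the Installation Tidbits slide can be checked mechanically before running the installer. The sketch below is only an illustration and is not part of the HDP installer: the function names are hypothetical, the JDK check assumes the legacy `1.6.0_31` style of version string, and the PATH check assumes a semicolon-separated Windows PATH value passed in as a string.

```python
import ntpath
import re


def has_space(path):
    """True if the directory path contains a space (the slide warns against this)."""
    return " " in path


def jdk_at_least_6u31(version):
    """Check a legacy Java version string such as '1.6.0_31' against JDK 6u31.

    Only the old '1.<major>.<minor>_<update>' scheme is handled; anything
    else is reported as not meeting the requirement.
    """
    match = re.fullmatch(r"1\.(\d+)\.\d+(?:_(\d+))?", version)
    if match is None:
        return False
    major = int(match.group(1))
    update = int(match.group(2) or 0)
    return (major, update) >= (6, 31)


def on_path(directory, path_value):
    """True if `directory` appears in a semicolon-separated Windows PATH value.

    ntpath is used so the comparison follows Windows rules (case-insensitive,
    trailing backslash ignored) even when this runs on another OS.
    """
    wanted = ntpath.normcase(ntpath.normpath(directory))
    entries = [entry.strip() for entry in path_value.split(";") if entry.strip()]
    return any(ntpath.normcase(ntpath.normpath(entry)) == wanted for entry in entries)
```

On a real host the inputs would come from the environment (for example `os.environ["PATH"]`); the directory names used to exercise these helpers are placeholders, not paths prescribed by the presentation.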
  7. Installation
     • MSI installer executed on each host
     • Cluster configuration file assigns node roles

     # Sample clusterproperties.txt file

     #Log directory
     HDP_LOG_DIR=d:\hadoop\logs

     #Data directory
     HDP_DATA_DIR=d:\hdp\data

     #Hosts
     NAMENODE_HOST=NAMENODE_MASTER.acme.com
     SECONDARY_NAMENODE_HOST=SECONDARY_NAMENODE_MASTER.acme.com
     JOBTRACKER_HOST=JOBTRACKER_MASTER.acme.com
     HIVE_SERVER_HOST=HIVE_SERVER_MASTER.acme.com
     OOZIE_SERVER_HOST=OOZIE_SERVER_MASTER.acme.com
     TEMPLETON_HOST=TEMPLETON_MASTER.acme.com
     SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com

     #Database host
     DB_FLAVOR=derby
     DB_HOSTNAME=DB_myHostName

     #Hive properties
     HIVE_DB_NAME=hive
     HIVE_DB_USERNAME=hive
     HIVE_DB_PASSWORD=hive

     #Oozie properties
     OOZIE_DB_NAME=oozie
     OOZIE_DB_USERNAME=oozie
     OOZIE_DB_PASSWORD=oozie
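To make the role assignment in the sample clusterproperties.txt concrete, here is a minimal sketch that reads the `KEY=value` lines and lists which roles land on which host. It is an illustration only: how the MSI installer actually parses the file is not described in the slides, the function names are hypothetical, and the sample text reuses just the host entries shown on the slide.

```python
SAMPLE = """\
#Hosts
NAMENODE_HOST=NAMENODE_MASTER.acme.com
SECONDARY_NAMENODE_HOST=SECONDARY_NAMENODE_MASTER.acme.com
JOBTRACKER_HOST=JOBTRACKER_MASTER.acme.com
HIVE_SERVER_HOST=HIVE_SERVER_MASTER.acme.com
OOZIE_SERVER_HOST=OOZIE_SERVER_MASTER.acme.com
TEMPLETON_HOST=TEMPLETON_MASTER.acme.com
SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
#Database host
DB_FLAVOR=derby
DB_HOSTNAME=DB_myHostName
"""


def parse_cluster_properties(text):
    """Parse KEY=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line; ignore it
        props[key.strip()] = value.strip()
    return props


def roles_by_host(props):
    """Map each host name to the roles assigned to it.

    Keys ending in _HOST name a single host; SLAVE_HOSTS is a
    comma-separated list. Other keys (for example DB_HOSTNAME) are ignored.
    """
    roles = {}
    for key, value in props.items():
        if key == "SLAVE_HOSTS":
            for host in value.split(","):
                host = host.strip()
                if host:
                    roles.setdefault(host, []).append("SLAVE")
        elif key.endswith("_HOST"):
            roles.setdefault(value, []).append(key[: -len("_HOST")])
    return roles
```

Calling `roles_by_host(parse_cluster_properties(SAMPLE))` gives one entry per host, which is a quick way to spot a typo in a host name before the same file is handed to the installer on every node.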
  8. Cluster Management
     Start/Stop via Services Administration Tool
  9. CLI Consistency
  10. Our Old Friends, Still Here
  11. What’s next
      • HDP for Windows will be GA in Q2
      • Eventual alignment with other Hortonworks distributions
      • Contact Microsoft for HDInsight information
  12. Questions?