DC HUG Hadoop for Windows

Presentation on the new beta release of the Hortonworks Data Platform for Windows

Published in: Technology

  1. Hadoop for Windows
     Terry Padgett, Solution Engineer, Hortonworks
     28 February 2013
     © Hortonworks Inc. 2013
  2. Why Apache Hadoop on Windows?
     • About 70% of the Earth’s surface is covered by water … so we build ships
     • About 72% of servers run Microsoft Windows … so we build Hadoop for Windows
  3. What’s in the box?

     Component   Version   Patches
     Hadoop      1.0.3     119
     Pig         0.9.3     19
     Hive        0.9.0     12
     HCatalog    0.4.1     2
     WebHCat     0.1.4     None
     Sqoop       1.4.2     None
     Oozie       3.2.0     None

     Initial Hadoop Core Apache JIRA: https://issues.apache.org/jira/browse/HADOOP-8079
     Apache Hadoop branch-1-win github: https://github.com/apache/hadoop-common/tree/branch-1-win
     Apache Hadoop branch-trunk-win github: https://github.com/apache/hadoop-common/tree/branch-trunk-win
  4. What’s changed?
     - Command-line scripts for the Hadoop surface area
     - HDFS permissions model mapped to Windows
     - Resolved issues with path semantics between Java and Windows
     - Native Task Controller for Windows
     - Implementation of a Block Placement Policy to support cloud environments, more specifically Azure
     - Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)
     - Resolved several reliability issues
     - Several new unit test cases written for the above changes
  5. What do you get?
     • More deployment choices
     • Hadoop for Windows is for on-premise deployment
       – Good fit for organizations with Hadoop operational experience
       – Next step for those who are ready to move from POC to production
     • Use HDInsight for public and private cloud deployments
       – HDInsight Service -> Windows Azure (available in Preview today)
       – HDInsight Server -> for interoperability across platforms of Hadoop, with Microsoft tools, on premise (Developer Preview available today)
     • Full interoperability across platforms
     • Created through partnership between Hortonworks and Microsoft
       – Eighteen months of development time
  6. Installation Tidbits
     • Prereqs
       – Microsoft Visual C++ 2010 Redistributable Package (64 bit)
       – Microsoft .NET Framework 4.0
       – JDK 6u31 or higher
         – For the love of God do not install in a directory path containing a space
       – Python 2.7.3
         – Add the installation directory to path
       – Hive metastore
         – Embedded Derby is provided
         – Alternately, using SQL Server requires table and user set up and SQL Server JDBC Driver
       – Server time must be in sync
       – Enable remote scripting
       – Ports
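The two path-related prerequisites on the Installation Tidbits slide (no spaces in the JDK path, Python's directory on the path) could be checked before running the installer. The following is a minimal Python sketch, not part of the HDP installer: the function name `check_prereq_paths` and the example directories are hypothetical, and it assumes that PATH entries are separated by `;` as on Windows.

```python
import ntpath


def check_prereq_paths(jdk_dir, python_dir, path_env):
    """Return a list of problems found with the JDK and Python locations.

    jdk_dir and python_dir are Windows directory paths; path_env is the
    value of the PATH environment variable (';'-separated entries).
    """
    problems = []

    # The slide warns against installing the JDK under a path with a space.
    if " " in jdk_dir:
        problems.append("JDK path contains a space")

    # Compare PATH entries case-insensitively with normalized separators,
    # since Windows treats C:\Python27 and c:/python27/ as the same place.
    wanted = ntpath.normcase(ntpath.normpath(python_dir))
    entries = [
        ntpath.normcase(ntpath.normpath(entry))
        for entry in path_env.split(";")
        if entry.strip()
    ]
    if wanted not in entries:
        problems.append("Python directory is not on PATH")

    return problems


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    for problem in check_prereq_paths(
        r"C:\Program Files\Java\jdk1.6.0_31",
        r"C:\Python27",
        r"C:\Windows;C:\Windows\System32",
    ):
        print(problem)
```

Using `ntpath` rather than `os.path` keeps the comparison Windows-style even if the check is run from a non-Windows admin machine.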
  7. Installation
     • MSI installer executed on each host
     • Cluster configuration file assigns node roles

       # Sample clusterproperties.txt file
       #Log directory
       HDP_LOG_DIR=d:\hadoop\logs
       #Data directory
       HDP_DATA_DIR=d:\hdp\data
       #Hosts
       NAMENODE_HOST=NAMENODE_MASTER.acme.com
       SECONDARY_NAMENODE_HOST=SECONDARY_NAMENODE_MASTER.acme.com
       JOBTRACKER_HOST=JOBTRACKER_MASTER.acme.com
       HIVE_SERVER_HOST=HIVE_SERVER_MASTER.acme.com
       OOZIE_SERVER_HOST=OOZIE_SERVER_MASTER.acme.com
       TEMPLETON_HOST=TEMPLETON_MASTER.acme.com
       SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
       #Database host
       DB_FLAVOR=derby
       DB_HOSTNAME=DB_myHostName
       #Hive properties
       HIVE_DB_NAME=hive
       HIVE_DB_USERNAME=hive
       HIVE_DB_PASSWORD=hive
       #Oozie properties
       OOZIE_DB_NAME=oozie
       OOZIE_DB_USERNAME=oozie
       OOZIE_DB_PASSWORD=oozie
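The sample clusterproperties.txt on the Installation slide is a flat KEY=VALUE file with `#` comment lines. Assuming that format holds for the whole file, a short Python sketch can read it and list the node roles before the MSI is run on each host. The names `parse_cluster_properties` and `slave_hosts` are hypothetical helpers, and `SAMPLE` is a shortened copy of the slide's file.

```python
def parse_cluster_properties(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected KEY=VALUE, got: %r" % raw)
        props[key.strip()] = value.strip()
    return props


def slave_hosts(props):
    """Split the comma-separated SLAVE_HOSTS value into a list of host names."""
    raw = props.get("SLAVE_HOSTS", "")
    return [host.strip() for host in raw.split(",") if host.strip()]


# Shortened copy of the sample file from the slide.
SAMPLE = """\
#Hosts
NAMENODE_HOST=NAMENODE_MASTER.acme.com
JOBTRACKER_HOST=JOBTRACKER_MASTER.acme.com
SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com
#Database host
DB_FLAVOR=derby
"""

if __name__ == "__main__":
    props = parse_cluster_properties(SAMPLE)
    print("NameNode:", props["NAMENODE_HOST"])
    print("JobTracker:", props["JOBTRACKER_HOST"])
    for host in slave_hosts(props):
        print("Slave:", host)
```

Parsing the file up front makes it easy to catch a missing role or a malformed line once, rather than discovering it host by host during installation.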
  8. Cluster Management: Start/Stop via Services Administration Tool
  9. CLI Consistency
  10. Our Old Friends, Still Here
  11. What’s next
     • HDP for Windows will be GA in Q2
     • Eventual alignment with other Hortonworks distributions
     • Contact Microsoft for HDInsight information
  12. Questions?
