451 Research Impact Report


Learn why 451 Research believes Infochimps is well-positioned with an easy-to-consume managed service for those without Hadoop expertise, as well as a stack of technologically interesting projects for the 'devops' crowd.

Opening with a market positioning statement and ending with a competitive and SWOT analysis, Matt Aslett provides a comprehensive impact report.

Published in: Technology

  1. Infochimps targets enterprises with stream-processing additions to big data PaaS

     Analyst: Matt Aslett
     14 Nov, 2012

     Big data PaaS provider Infochimps has updated its Infochimps Platform with the addition of stream-processing capabilities to the Infochimps Data Delivery Service based on technologies first developed at Twitter and LinkedIn. With its first paying customer on board, the company is now seeking partnerships to help support its enterprise-focused PaaS offering.

     The 451 Take

     There's a big difference between offering Hadoop as a service to be configured, deployed and managed, and offering a managed service that masks the complexity of configuring and deploying Hadoop. We believe the latter will gain traction as more late adopters begin to look at adopting the benefits of Hadoop without investing upfront in the expertise and infrastructure required to support it. While Infochimps will need to establish the trust of its target customers, it is well-positioned with an easy-to-consume managed service for those without Hadoop expertise, as well as a stack of technologically interesting projects for the 'devops' crowd.

     Context

     We first covered Infochimps earlier this year when the company pivoted from being a data marketplace provider to releasing the technology that supported its data marketplace, both as open source projects and as PaaS. The initial focus was on making it easier to deploy the Hadoop

     Copyright 2012 - The 451 Group
  2. data-processing framework via a Chef-based systems provisioning, deployment and updating tool called IronFan. Infochimps has expanded since then with the addition in April of an operations dashboard called Dashpot, and in August with the addition of the Apache Flume-based Data Delivery Service (DDS) for integrating with existing data sources, as well as early data-streaming functionality in DDS via extensions to Wukong, the company's Ruby for Hadoop. The latest addition to the platform expands its support for stream processing through the integration of open source stream-processing projects Storm and Kafka.

     Initially developed by BackType and released as an open source project by Twitter in August 2011 following its acquisition of the social analytics provider, Storm is a stream-processing engine. Kafka, meanwhile, is a distributed message queue originally developed by LinkedIn and used by the company in a number of projects, including feeding all activity events to its data warehouse and Hadoop, as well as keeping its search engine up to date with network activity in real time. Storm and Kafka are used by Infochimps as the foundation of DDS, which is used to connect the company's Hadoop-based PaaS with multiple existing data sources, enabling real-time integration of relevant data for processing and analysis.

     DDS is a key component of the Infochimps Platform that elevates it beyond a platform for Hadoop deployment to being a potential big data management and analytics platform of choice. It is DDS that will enable businesses to adopt the Infochimps Platform alongside existing data management technologies and quickly gain insight from new and existing sources of data.

     Infochimps' main selling point is in lowering the barriers to adopting Hadoop. While there is a lot of complex technology involved – such as IronFan, elastic Hadoop, DDS, elasticsearch, NoSQL and NewSQL databases, Wukong and Dashpot – the platform is delivered as a service designed to mask that complexity. The company maintains that it can take customers from nowhere to generating business insight from the Infochimps Platform in 30 days, without the need to hire specialist support and analytics staff, or invest in specialist infrastructure.

     Infochimps has attracted nine paying customers since its platform went live in the second quarter, with an average selling price of $200,000. The company charges customers per node per month for what is currently a public cloud offering hosted on Amazon Web Services or Rackspace Cloud. Infochimps has established relationships (soon to be announced) to deliver both private cloud and virtual private cloud offerings supported in its customers' own datacenters or via their trusted datacenter provider. The company is launching its cloud services across a network of tier four datacenters in North America and will begin offering its big data cloud services in the first quarter of 2013. The potential to support private cloud deployments will be aided by the fact that IronFan is
  3. a key component in VMware's Serengeti project to make it easy to configure and deploy Hadoop on virtual machines, while the Infochimps Platform also supports the OpenStack API.

     The shift toward more enterprise-focused services and partnerships is being led by former Teradata and StackIQ executive (and Xerox PARC EIR) Jim Kaskade, who joined the company as CEO in August, replacing cofounder Joe Kelly, who became COO. Kaskade has also been busy lining up a new major financing round. Infochimps had previously raised a total of $3m from investors including DFJ Mercury, although that was during its previous incarnation as a data marketplace provider. The company currently has 23 employees, up from 14 in March.

     Competition

     There are an increasing number of vendors offering Hadoop as a service, with Amazon and Google being the biggest players at this point. While they therefore pose a competitive threat to Infochimps, the value proposition is quite different, since it still requires a degree of expertise to configure, deploy and manage a cloud-based Hadoop service in comparison to Infochimps' managed services approach. We've seen limited uptake of cloud-based Hadoop services to date, with the main use case being development and testing. Indeed, we've noted before that if a company begins to move toward a larger-scale deployment, the costs can be prohibitive enough to require on-premises deployment. While Infochimps' service is initially based on the public cloud, it has designs on supporting deployment choice. The company also believes that with the added value of IronFan, DDS, Wukong, Dashpot and the rest, along with its managed services approach, it has enough to justify the additional cost above that of running Hadoop on a public cloud service with the required expertise.

     Other Hadoop service providers include SunGard, Treasure Data, Qubole, Mortar Data and Guavus, while Infochimps believes its closest competition will come from MetaScale, the Hadoop managed services subsidiary of Sears Holdings, and tresata, the stealthy data platform provider founded by former Bank of America managing director for big data and analytics Abhi Mehta. Other vendors are trying to mask the complexity of configuring and deploying Hadoop by building it into larger on-premises application stacks, so we might also expect would-be customers to consider the likes of Drawn to Scale, Splice Machine or Digital Reasoning, depending on the specific application. The company must also be considered a rival to some extent with Hadoop distributors such as Cloudera, Hortonworks, MapR, IBM and EMC, although there is also the potential for partnerships here, as indicated by the fact that Cloudera CEO Mike Olson is an adviser to Infochimps.
  4. SWOT Analysis

     Strengths: We were already fans of the Chef-based cluster platform tuned for the needs of enterprises using Hadoop. DDS adds all-important integration with existing tools that will help drive wider adoption.

     Weaknesses: Managed services relationships are built on trust. While Infochimps has technological expertise, it will need to establish itself before some would-be customers will consider it.

     Opportunities: We are seeing an increasing need for technologies and services that mask the complexity of configuring, deploying and managing Hadoop for late adopters. Infochimps has both.

     Threats: The big services and software providers are unlikely to sit back and let demand for Hadoop managed services go elsewhere. Expect the competition to increase with demand.
  5. Reproduced by permission of The 451 Group; © 2012. This report was originally published within 451 Research’s Market Insight Service. For additional information on 451 Research or to apply for trial access, go to: www.451research.com