Big data company and startup investigation


An investigation of big data company products: Datameer, Qubole, Alpine, Pentaho, SiSense, 1010data, Palantir, Platfora, Data Torrent, Continuuity, Nuevora

Published in: Technology, Business
  • Thomas, thanks for the clarification.
  • Nice summary.

    One point of clarification -- Alpine does not simply use databases and Hadoop as a data source, it actually runs inside those platforms (through SQL or MR push-down). In the analytics domain, that is an important distinction.

    Also, Alpine is unique among these vendors in its support for predictive analytics. The rest are mostly limited to BI-type analytics. Palantir can perform predictive analytics, but it offers custom solutions rather than off-the-shelf software.
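
To illustrate the push-down distinction raised in the comment above, here is a minimal sketch using Python's built-in `sqlite3` module. It is not Alpine's implementation; the `sales` table, its columns, and both function names are hypothetical, chosen only to contrast pulling rows out of a database with running the computation inside it.

```python
import sqlite3

# Hypothetical in-memory table standing in for a large warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 30.0), ("west", 5.0)],
)


def mean_by_region_pulled(conn):
    """Database as a data source: fetch every row, aggregate in the client."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        running_sum, count = totals.get(region, (0.0, 0))
        totals[region] = (running_sum + amount, count + 1)
    return {region: s / n for region, (s, n) in totals.items()}


def mean_by_region_pushed_down(conn):
    """Push-down: the database computes the aggregate, only results move."""
    cursor = conn.execute(
        "SELECT region, AVG(amount) FROM sales GROUP BY region"
    )
    return dict(cursor.fetchall())
```

Both functions return the same per-region means, but the second transfers one row per region instead of one row per sale, which is why the distinction matters at scale.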

Big data company and startup investigation

  1. Big Data Company and Startup Investigation
       • Chen Ting Zhao
  2. Edit Log
       • 2013.12: First edition
       • 2014.4: Added companies in P11-15
  3. Big Data Landscape
  4. Datameer
       • Products:
           • A full-lifecycle product for big data: data integration, analytics and visualization
           • Open ecosystem: lets end users publish and reuse analytics templates as applications
       • Technologies:
           • 55 types of data sources
           • Statistical analytics on a spreadsheet
           • Mining: clustering, decision tree, association and recommendation
           • 22 infographic widgets, arranged by drag and drop
       • Business Model:
           • Product and service license
           • Consulting on industry solutions
       • Reference:
           • Full lifecycle and open ecosystem
           • Basic, but can solve 80% of problems
           • Great user experience
  5. Qubole
       • Founded by a former Facebook analytics lead
       • Infrastructure service: a big data service for data professionals
           • 100% managed Hadoop cluster in the cloud (auto-scaling, high performance, self-maintenance)
           • Built-in data connectors to apps and data sources (AppNexus, RedShift, mongoDB and more)
           • 24/7 customer support (via chat, phone or e-mail) from a data infrastructure SWAT team
           • All these services are hosted in the Amazon AWS cloud
       • API and Web UI to:
           • Launch a new Hadoop cluster
           • Run Hadoop jobs, Hive queries, Pig jobs, workflows, etc.
           • Monitor and schedule analytics tasks
       • Business Model
           • $199/month: Startup: 3 users, 550 QCUH, 170 invocations
           • $999/month: Premium: 10 users, 4300 QCUH, 1050 invocations
       • Customers: Pinterest, Decide, MediaMath, etc.
  6. Alpine
       • World’s First Collaborative, Code-Free, Advanced Analytics Solution for Big Data
       • Web GUI for creating an analytics workflow (like SPSS Modeler), with full built-in statistical functionality including time-series analysis, classification, regression, decision trees and more. Complete customization with native support for Hadoop and relational databases
       • Data sources: Greenplum Database, Hadoop clusters (Pivotal, Greenplum, Cloudera, Apache, and MapR), HAWQ
       • Easy visualization of data: frequency, histogram, heatmap, time series and boxplot
       • Easy collaboration with others on an analytics task using workspaces
       • Business Model: online SaaS service
  7. Pentaho
       • Features:
           • Web-based data access wizard for business users
           • Powerful data integration and federation for IT and developers
           • Access to any data type, from Excel to big data sources such as Hadoop, NoSQL and high-performance analytic databases
       • Technical highlights:
           • Visual designer for MapReduce jobs to reduce development cycles
           • Data preparation, modeling and exploration of unstructured data sets
           • Cluster support, enabling distributed processing of jobs across multiple nodes
           • Unique in-Hadoop execution for extremely fast performance
  8. SiSense
       • Standard BI software:
           • Interactive dashboards and reports
           • Manage and mash up data
           • Integration with other applications
  9. 1010data
       • A cloud-based analytical platform that lets businesses glean new insights from unlimited amounts and varieties of data, both their own and other companies'
       • Business Model
           • Data consumers (business executives, analysts and application developers)
           • Data providers
           • Third-party integrators
           • Independent software vendors (ISVs)
       • The 1010data analytics platform is an integrated stack that includes:
           • A proprietary database management system (DBMS), data integration tools, and a unique user interface (the Trillion-Row Spreadsheet)
       • Not a Hadoop-based solution
  10. Palantir
       • Palantir stacks its analysis on a data platform: data integration, search and discovery, knowledge management and collaboration
       • Palantir Gotham and Palantir Metropolis are the core technologies that power any Palantir solution
           • Gotham: Get a clear picture of all your data, structured or unstructured. Pivot analysis seamlessly between the semantic, geospatial, and temporal views. Collaborate at the colleague, team, and organizational level.
           • Metropolis: a quantitative analysis platform that provides a suite of analytical tools enabling complex, multi-study research demands
       • Industry solutions: Government, Health and Commercial
       • Palantir focuses on data integration and on technologies for processing different types of data effectively, not on automating analysis to replace the work of data scientists
  11. Platfora
       • Product: simplifies how users work with Hadoop by streamlining the data collection and analytics process with tools and a UI
       • Demo:
  12. Data Torrent
       • A real-time stream processing engine on Hadoop/YARN, designed to enable highly scalable, massively distributed real-time computations with minimal overhead
       • Data Torrent Architecture: Dev Process Execution Logic Hadoop Grid
  13. Continuuity
       • Coo
  14. Nuevora
       • Focusing on marketing and customer behavior analytics
       • Process
       • Scenarios:
           • Customer Acquisition Growth: Customer acquisition is a broad term that is used to identify the processes and procedures used to locate, qualify and ultimately secure the business of new customers.
           • Customer Retention Growth: The true health of a business closely relates to how well it can acquire and retain customers. An organization’s ability to provide value drives its customer retention, which impacts the success of the entire business.
           • Customer Value Growth: The growth of a business is fueled by maximizing the wallet share and life-time profitability of its customer base … whether you call it “customer relationship management” or just good business.
       • Inside Technologies:
           • Hadoop: data processing
           • R: predictive model
           • Tableau: data visualization
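
As a rough way to compare the two Qubole plans listed on slide 5, the sketch below divides each monthly price by its QCUH and user allowances. The prices and allowances are taken from the slide; the `Plan` class and its method names are hypothetical helpers for this illustration and are not part of any Qubole API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """One pricing tier as listed on the Qubole slide (hypothetical helper)."""
    name: str
    monthly_usd: float
    users: int
    qcuh: int  # QCUH allowance per month, as given on the slide

    def usd_per_qcuh(self) -> float:
        # Effective price of one QCUH if the full allowance is used.
        return self.monthly_usd / self.qcuh

    def usd_per_user(self) -> float:
        # Effective monthly price per user seat.
        return self.monthly_usd / self.users


PLANS = [
    Plan("Startup", 199, users=3, qcuh=550),
    Plan("Premium", 999, users=10, qcuh=4300),
]

for plan in PLANS:
    print(
        f"{plan.name}: ${plan.usd_per_qcuh():.3f} per QCUH, "
        f"${plan.usd_per_user():.2f} per user"
    )
```

Under these figures the Premium plan is cheaper per QCUH but more expensive per user, assuming the full allowance is consumed each month.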