
Cloud platform aws-gcp-azure-bluemix

Cloud services from AWS, Azure, GCP, and Bluemix mapped to their open source origins and/or counterparts

Published in: Data & Analytics

  1. 1. A couple of days ago I came across the article "Mapping AWS, Google Cloud, Azure Services to Big Data Warehouse Architecture" here. I know a bit about data warehousing, and even about big data warehouse architecture. What interests me, however, is the "map of various cloud services against the big data warehouse architecture". More precisely, cloud services from "the three most popular cloud platforms: Microsoft Azure, Google Cloud Platform, and Amazon AWS" are mapped to their open source origins and/or counterparts. As a technical IBMer, my primary area is Big Data & Advanced Analytics, but I happen to know a little about the IBM Bluemix platform. So, for (more) completeness, here comes Bluemix! Note, though, that only the Bluemix services involved in big data warehouse architecture are listed here. To explore more, see the Bluemix website.

     Disclaimer

     1. While I'm employed by IBM, this article represents entirely my personal viewpoints. Furthermore, I've tried my best, but I still can't guarantee 100% completeness or accuracy, or account for potential service changes.
     2. The original author of the aforementioned article owns the copyright, and I am by no means modifying the content. I neither agree nor disagree with the author on the content. However, for convenience, I'm putting the original table (or map) side by side with its IBM Bluemix counterparts.

     PS: Due to space limitations, all the open source entries in the Bluemix column refer to the cloud service provisioned by IBM Bluemix rather than the original open source software. For example, HDFS/Hadoop/Hive, etc. mean the individual components within the BigInsights for Apache Hadoop or BigInsights for Apache Hadoop (Subscription) service, and PostgreSQL refers to the ElephantSQL and/or Compose for PostgreSQL service.
| Layer | Open Source | Amazon AWS | Microsoft Azure | Google Cloud | IBM Bluemix |
|---|---|---|---|---|---|
| Batch Ingest | Sqoop, File Transfer, Flume, StreamSets | AWS Data Transfer Services (various options) | Import/Export Service, Data Factory | Cloud Dataflow | Sqoop, File Transfer, Lift (Aspera), Flume, various services |
| Streaming Ingest | Flume, StreamSets | Amazon Kinesis Firehose | Event Hubs, IoT Hub | Cloud Dataflow | Flume, Spark, Streaming Analytics |
| Persistent Storage | HDFS, RDBMS | S3, Glacier, RDS | Storage Blob, HDFS, SQL Database | Persistent Disk, Google Cloud Storage, Cloud SQL | HDFS; RDBMS (IBM proprietary: Db2, dashDB, Informix ...; open source: MySQL, PostgreSQL ...); NoSQL: MongoDB, Redis, Cloudant ...; Block Storage, Cloud Object Storage, File Storage, CDN, etc. |
| Transient Storage | Kafka | Kinesis | Event Hubs, IoT Hub, HDInsight (Kafka) | Cloud Pub/Sub, Cloud IoT Core | Kafka, Message Hub |
| Batch Processing | Hive, Flink, Spark, MapReduce, PostgreSQL | EMR Spark, EMR Hadoop, EMR Presto, AWS Batch, Redshift | Azure Batch, HDInsight (Spark/MapReduce), SQL Data Warehouse, Data Lake Analytics | Cloud Dataflow (open source Apache Beam), Cloud Dataproc (Spark, Hadoop) | Hive, Spark, MapReduce, MySQL, PostgreSQL, Db2, Information Server on Cloud, etc. |
| Stream Processing | Flink, Spark, Beam | Amazon Kinesis Streams, Amazon Kinesis Analytics, EMR Spark | Stream Analytics, HDInsight (Storm, Spark) | Cloud Dataflow (open source Apache Beam), Dataproc (Spark, Hadoop) | Spark, Streaming Analytics |
| Machine Learning | Scikit, TensorFlow, Spark MLlib, etc. (huge number of libraries) | Lex, Polly, Rekognition, Amazon Machine Learning | Azure ML, Cognitive Services | Natural Language, Speech, Translation, Vision, Video, ML Engine | Data Science Experience (includes support for R, Python with scikit, TensorFlow, Spark with MLlib, etc.), Watson Machine Learning |
| Serving Storage: Graph | JanusGraph | N/A (Marketplace only, e.g. OrientDB) | N/A (Marketplace only, e.g. OrientDB) | N/A | IBM Graph |
| Serving Storage: BI/EDW | Impala + Kudu | Redshift, Athena | SQL Data Warehouse | BigQuery | Db2 for Warehouse, BigSQL |
| Serving Storage: Search (keywords + facets) | Solr | Amazon CloudSearch, Amazon Elasticsearch | Azure Search | N/A (Marketplace, e.g. Solr) | Solr, Compose for Elasticsearch |
| Serving Storage: RDBMS | PostgreSQL | RDS | SQL DB | Cloud SQL | IBM proprietary: Db2, dashDB, Informix ...; open source: MySQL, PostgreSQL ... |
| Serving Storage: NoSQL | HBase | DynamoDB | HDInsight (HBase), CosmosDB | Bigtable, Spanner, Datastore | HBase, MongoDB, Redis, Cloudant ... |
| Sandboxes: Notebook | Zeppelin | EMR Zeppelin | Azure Notebooks | Cloud Datalab | Data Science Experience (Jupyter), Spark |
| Sandboxes: Data Science or Preparation Platform | Dataiku DSS Community Edition (not open source) | N/A (Marketplace only, e.g. Dataiku DSS) | N/A (Marketplace only, e.g. Dataiku DSS) | Cloud Dataprep (beta); under the hood this is Trifacta | Data Science Experience |
| Clients/Data Apps | Superset (BI) | QuickSight | Power BI | Google Data Studio | Data Science Experience, Watson Machine Learning, Decision Optimization |
| Orchestration | Airflow | AWS Data Pipeline | Data Factory | N/A (Marketplace) | Workload Scheduler (?) |
| ETL Tool | N/A | AWS Glue (beta) | Data Factory | N/A (Marketplace) | Data Connect, Information Server on Cloud |
| MDM Hub | N/A | N/A (Marketplace) | N/A (Marketplace) | N/A (Marketplace) | MDM on Cloud |
| Lineage | N/A | AWS Glue (beta) | N/A | N/A | Information Server on Cloud |
| Catalog | N/A | AWS Glue (beta) | Data Catalog | N/A (Marketplace) | Information Server on Cloud |
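
For readers who want to query the map rather than scan it, the table can be treated as a simple lookup keyed by architecture layer and platform. The following is only a minimal Python sketch under stated assumptions: it transcribes three rows (BI/EDW serving storage, Lineage, and Catalog), the names `SERVICE_MAP`, `counterparts`, and `where_used` are hypothetical helpers invented for illustration, and an empty list stands in for an "N/A" or Marketplace-only cell.

```python
# A partial, illustrative transcription of the service map.
# Assumption: an empty list means "N/A" (no native service, possibly
# Marketplace only). All names here are hypothetical helpers.
SERVICE_MAP = {
    "Serving Storage: BI/EDW": {
        "Open Source": ["Impala + Kudu"],
        "Amazon AWS": ["Redshift", "Athena"],
        "Microsoft Azure": ["SQL Data Warehouse"],
        "Google Cloud": ["BigQuery"],
        "IBM Bluemix": ["Db2 for Warehouse", "BigSQL"],
    },
    "Lineage": {
        "Open Source": [],
        "Amazon AWS": ["AWS Glue (beta)"],
        "Microsoft Azure": [],
        "Google Cloud": [],
        "IBM Bluemix": ["Information Server on Cloud"],
    },
    "Catalog": {
        "Open Source": [],
        "Amazon AWS": ["AWS Glue (beta)"],
        "Microsoft Azure": ["Data Catalog"],
        "Google Cloud": [],
        "IBM Bluemix": ["Information Server on Cloud"],
    },
}


def counterparts(layer, platform):
    """Return the services listed for one layer on one platform."""
    return list(SERVICE_MAP[layer][platform])


def where_used(service):
    """Return (layer, platform) pairs whose cell lists the given service."""
    return [
        (layer, platform)
        for layer, row in SERVICE_MAP.items()
        for platform, services in row.items()
        if service in services
    ]


if __name__ == "__main__":
    print(counterparts("Serving Storage: BI/EDW", "IBM Bluemix"))
    print(where_used("Information Server on Cloud"))
```

Keeping each cell as a list makes multi-service cells and empty cells uniform, so the same lookup works for every row that is added later.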
