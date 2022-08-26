
Data Con LA 2022-Open Source or Open Core in Your Data Layer? What Needs to Be Evaluated

Aug. 26, 2022

Data & Analytics

Anil Inamdar, VP & Head of Data Solutions, Instaclustr
Most organizations considering open source and open core cloud technologies as part of their all-important data stack understand they need to rigorously evaluate the software's licensing terms and gauge the long-term health of its community and ecosystem. What still happens less frequently, but is just as crucial to these risk assessments, is developing a thorough understanding of the business models governing the commercial organizations attached to each data-layer technology being considered. You must discern the underlying motivations of the vendors or technology providers you depend on to deliver or support open source data-layer software (as well as those vendors with strong influence over its development and maintenance). By understanding these incentives clearly, you can identify if, where, and how they may map to possible risks to your enterprise's adoption and ongoing open source implementation. Don't limit the assessment to licenses and community health, although both remain key variables.

This session will discuss specifics on what you need to look for and consider when vetting open source data technologies in the cloud as offered by:

-- Businesses using OSS as the foundation of their own intellectual property
-- Businesses that maintain total control over the OSS they offer
-- Major cloud providers


  1. 1. © Instaclustr Pty Limited, 2020. August 13, 2022. Open Source or Open Core? What Needs to Be Evaluated Before Diving In. October 19, 2021. Anil Inamdar, VP & Head of Data Solutions @Instaclustr (Now Part of SPOT by NetApp) (Open-source that is really open-source)
  2. 2. Open-source: A Brief History
  3. 3. Open-Source: History
     • 1985 – Richard Stallman founds the Free Software Foundation (“FSF”) (www.fsf.org)
     • 1989 – GNU General Public License (“GPL”) first released
     • 1994 – Linux 1.0 is released under the GPL by Linus Torvalds
     • 1998 – Netscape releases its software as free software; the term “Open Source” is used for the first time
     • 2003 – Linux OS and Apache Web Server become mainstream
     • Progress continues …
  4. 4. Open-Source: Progression
     | | Beginning | Expansion | Maturity |
     |---|---|---|---|
     | Timeline | 1990s | 2000s | 2015 |
     | Purpose | Revolution against closed source | Built with commercial customers in mind | Built for enterprise customers, offered on cloud as SaaS |
     | Modus Operandi | Asynchronous collaboration | Typically developed in a company in the beginning, moved to a foundation at a later stage | Primarily developed by a commercial business |
     | Market | Early adopters, startups | Mid market, enterprise | Enterprise |
     | Licensing | Free | Partly free, could charge for enhancements | Partly free, could charge for enhancements |
     | Example | Linux, MySQL | Hadoop, Kafka, Cassandra | Confluent, Mongo, Elastic.co |
  5. 5. Open-Source: Popularity
  6. 6. Why Open-source?
  7. 7. Open-Source: Advantages
     Economies of Scale
     • No licensing cost
     • Faster time to market
     • No vendor lock-in
     • No technology lock-in
     Security
     • Safety in numbers
     • The more people use and test the software, the more loopholes will be found
     • “Given enough eyeballs, all bugs are shallow.” (Linus’ Law)
     Intellectual Property
     • Open-source foundations provide flexible licenses
     • Enterprises benefit from clarity and transparency
     Scalability & Reliability
     • The software is peer-reviewed, so reliable
     • Adoption by multiple, fast-changing organizations makes it robust and scalable
     Supportive Community
     • Thriving, vibrant communities
     • Common purpose and passion
     • Apache Software Foundation: over 8000 committers
     Innovation
     • Bill Joy’s Law: No matter who you are, most of the smartest people work for someone else!
     • You can decide on the direction of your product
     • Unusual use cases
  8. 8. What is Open-core?
  9. 9. Open-Core: Definition
     • Open-core is a business model for monetizing free OSS
     • The “core” of the software is open, with added proprietary features
     • Open-source is always a “project” while open-core is a “product”
     • Open-core exploited some of the challenges with open-source
       - Absence of support
       - Need for features like monitoring, auto-provisioning
     • Successfully exploited by vendors like Cloudera, DataStax, Confluent
     • But that’s not the whole story!
  10. 10. Open-Core: March towards proprietary model (diagram labels: Support Gap, Feature Gap, Delivery Model; “Open-source catches up” appears twice)
  11. 11. Open-source vs Open-Core: 5 Considerations
  12. 12. 5 Key Considerations in Evaluating Open-Source vs. Open-Core
      1. Licensing
      2. Governance
      3. Brand
      4. Business Models
      5. Ecosystem
  13. 13. Licensing
  14. 14. Open Source Licences: Copyright
      ● Somebody owns the copyright in all code (exactly who can be complicated and depends on project structure)
      ● Licences are how the copyright holder grants other people permission to use the code they hold copyright in
      ● Open-source licences are generally perpetual, but the copyright holder can change the licence for new versions
  15. 15. Open-Source Licences: It’s complicated!
      ● https://en.wikipedia.org/wiki/Comparison_of_free_and_open-source_software_licences#cite_note-73
      ● https://choosealicense.com/licenses/
  16. 16. Open-Source Licences Summary
      | Licence Type | Description | Examples |
      |---|---|---|
      | Permissive Licenses | Allow broad, free use and modification with minimal obligation | Apache 2.0; MIT License; 3-clause BSD |
      | Copyleft Licenses | Allow broad, free use but require any modifications to be made public under the same license. Often prohibited by corporate open-source policies. | GPL; Mozilla Public License; Eclipse Public License |
      | Custom Licenses | Can do whatever they want. Increasingly used by open core companies to protect themselves against AWS, etc., with provisions against deploying as a managed service. Each one requires legal review before use. | Confluent Community License; Server-Side Public License (Mongo, Elastic.co); Elastic License; Redis Source Available License |
  17. 17. Governance
  18. 18. Governance: Governance and Ownership
      ● Ultimately, the project owner gets to decide what changes go into the codebase
      ● Foundation: Where the project owner is a foundation, this provides open governance and decision making
        ○ Apache Foundation (Apache Web Server, Cassandra, Kafka, Lucene etc)
        ○ Cloud Native Computing Foundation - CNCF (Kubernetes, Vitess, etc)
      ● Corporation: Where the project owner is a corporation, it has unilateral decision-making ability as to what changes to accept and whether to change the license for future versions
      ● Elastic.co is the copyright holder for Elasticsearch, although it is significantly based on Apache Lucene
      ● Redis governance is much less clear, but it has made moves toward more open governance of the core
  19. 19. Brand
  20. 20. Brand: Who owns the logo?
      ● The brand is the logo that goes along with the software
      ● It is not critical to the decision to use the software, but it often causes confusion. It usually goes hand in hand with governance.
      ● The three logos at the top are owned by the Apache Foundation, a non-profit entity
      ● The logos for MariaDB and Scylla are owned by for-profit corporations, even though the software is open-source
      ● Similarly, the logo for open-source Drupal is owned by an individual: Dries Buytaert, as BDFL (Benevolent Dictator For Life)
  21. 21. Business Models
  22. 22. Business Models
      ● Pure open-source (Apache Cassandra, Apache Kafka etc.)
        ○ Community, key corporations behind the projects
      ● Open core (e.g., DataStax, Confluent)
        ○ Generally want to keep the core strong but have a big interest in protecting the value in their proprietary extensions
      ● Open-source IP owner (e.g., Elastic.co, Scylla)
        ○ Really using the “open source” mantle as marketing
      ● Cloud Providers
        ○ Provide stability and a stamp of maturity, but minimal investment in community dev. Core product investment varies.
        ○ Elastic.co changed its licensing to compete with Amazon’s hosting solution, thereby forcing AWS and Netflix to fork the code and start Open Distro.
  23. 23. White Paper
      Whitepaper: https://docs.google.com/document/d/1TKeguk2_5pspymiPcGR5qQfmlYuCa9zH5oxvZ3QMmmg/edit
      Project Assessment Reports:
      Kafka: https://drive.google.com/file/d/1-2ESj1vRw2__6B4T39FNJjTpsZyw4KEN/view?usp=sharing
      Cassandra: https://docs.google.com/document/u/0/d/1AIMbC-556rH2XfqqkacytDMFaFvzWmhn/edit?usp=docs_home&ths=true&rtpof=true
  24. 24. Ecosystem
  25. 25. Ecosystem: Usage
      ● Many large users of an open-source project provide assurance of future stability
        ○ Apple and Netflix as major contributors to Cassandra
        ○ elastic.co muddling Elasticsearch licenses resulted in AWS, Netflix and others producing Open Distro
      ● An ecosystem of consultants and people providing integrated solutions also provides stability
      ● Often, a key piece of software is adopted together with related open-source components (see the example on the next slide)
  26. 26. Ecosystem: Confluent Kafka vs Apache Kafka
      | Function | Confluent Kafka | Open-Source Alternative |
      |---|---|---|
      | Query | KSQL | Apache Calcite |
      | Monitoring | Confluent Control Center | Prometheus |
      | Alerting | Confluent Control Center | Prometheus (Alertmanager) |
      | Replication / Mirroring | Confluent Control Center | MirrorMaker 2 |
      | Security | FIPS, PCI and SOC2 certified | FIPS, PCI and SOC2 certified (Instaclustr Platform) |
      | Backup / Restore | Confluent Control Center | Instaclustr backup (ZK configs, data directory, and transaction directories to S3) |
      | Connectors | Proprietary + open-source support | Open-source alternatives available |
      | Ecosystem Support | Restricted to Kafka | Instaclustr: Kafka, Cassandra, Elasticsearch, Spark etc. |
  27. 27. Elasticsearch Saga
      ● Elasticsearch began under the Apache 2.0 license
      ● Elastic.co experimented with mixed in-tree licensing (x-pack)
      ● Governance and decisions were made by Elastic.co, a commercial company
      ● Elastic.co made money by providing hosting services plus proprietary add-ons
      ● AWS started offering a similar hosting service for Elasticsearch, undercutting Elastic.co
      ● Elastic.co changed its licensing model from Apache 2.0 to the Elastic License and SSPL (Server-Side Public License)
      ● So basically, AWS could no longer use Elasticsearch going forward
      ● AWS, on the other hand, forked the then-current version of Elasticsearch (along with Netflix) to create OpenSearch
      ● So, we now have two choices for Elasticsearch!
  28. 28. Some Popular OSS
      | Software | License | Governance |
      |---|---|---|
      | Cassandra | Apache 2.0 | Non-Profit (Apache Foundation) |
      | DataStax Cassandra | Apache 2.0 (Core) + Proprietary | Commercial (DataStax) |
      | Scylla | GNU AGPL, Scylla Corporation | Commercial (Scylla) |
      | Kafka | Apache 2.0 | Non-Profit (Apache Foundation) |
      | Confluent Kafka | Apache 2.0 (Core) + Proprietary | Commercial (Confluent) |
      | MariaDB | GPLv2, MariaDB Corporation | Commercial (MariaDB) |
      | Drupal | GPLv2, BDFL (Individual) | Non-Profit (Drupal Association) |
  29. 29. Instaclustr Open Source That’s Really Open Source No license fees. No lock-in.
  30. 30. The New World Instaclustr Focus
  31. 31. How Instaclustr Delivers
      • Support
      • Consulting
      • Managed Platform
      • Full Lifecycle
      • Operations First
      • Certified and Tested Builds
      • Fills Gaps From Open Source Ecosystem
      • 100% Open Source
      • No Licenses / No Lock In
  32. 32. Open-Source Alternatives
      | Open Source | Closed Source/Open Core |
      |---|---|
      | Cassandra | DataStax |
      | Mongo | Mongo |
      | Kafka | Confluent |
      | Redis | Redis Labs |
      | Elasticsearch | ElasticCo |
      Open Source: $0 License Fees, No Lock-In, Open Source Enterprise Enhancements
      Closed Source/Open Core (each vendor): License $, Lock In, Proprietary
  33. 33. 33
  34. 34. Company Overview
      • Founded in 2013, CBR, Australia
      • 250+ employees in 4 offices
      • HQ in Redwood City, CA, USA
      • 300+ customers globally
      • Global presence: 24x7 TechOps and Dev in Australia and the USA
  35. 35. Conclusion
      • Use 100% open-source technologies if you:
        - Want to continue using true open-source software and not move towards proprietary software
        - Don’t want to pay software licensing costs
        - Want to avoid vendor lock-in
        - Want to avoid technology lock-in
        - Want to be part of the open-source community
        - Want to continue to innovate your solution
  36. 36. Special Promotion
      • 2-hour open-source consulting advice
      • 1-week free access to Instaclustr platform
        - Spin up 3-node clusters in 3-5 minutes!
        - Cassandra, Kafka, ElasticSearch, Redis
      • Mention “Data Con LA August 2022” to get the above promotions
  37. 37. THANK YOU!
      Anil Inamdar, VP & Head of Data Solutions
      anil.inamdar@instaclustr.com
      www.instaclustr.com | info@instaclustr.com | @instaclustr
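
The licence summary on slide 16 groups licences into permissive, copyleft, and custom types, and notes that each custom licence needs legal review before use. As a rough illustration only, that grouping could be encoded as a lookup when vetting a dependency list. This is a minimal sketch: the names `LICENCE_CATEGORIES`, `LicenceCheck`, and `check_licence` are hypothetical and not part of the talk, and flagging copyleft and unrecognised licences for review is an assumption based on the slide's remark that copyleft is often prohibited by corporate open-source policies.

```python
from dataclasses import dataclass

# Categories and example licences are copied from the slide-16 summary table.
# The dictionary layout and all identifiers below are hypothetical.
LICENCE_CATEGORIES = {
    "Apache 2.0": "permissive",
    "MIT License": "permissive",
    "3-clause BSD": "permissive",
    "GPL": "copyleft",
    "Mozilla Public License": "copyleft",
    "Eclipse Public License": "copyleft",
    "Confluent Community License": "custom",
    "Server-Side Public License": "custom",
    "Elastic License": "custom",
    "Redis Source Available License": "custom",
}


@dataclass(frozen=True)
class LicenceCheck:
    licence: str
    category: str
    needs_legal_review: bool


def check_licence(name: str) -> LicenceCheck:
    """Classify a licence name using the slide's three categories."""
    category = LICENCE_CATEGORIES.get(name, "unknown")
    # Slide: custom licences each require legal review before use.
    # Assumption: copyleft and unknown licences are also flagged, since
    # corporate policies often restrict copyleft and unknowns are unvetted.
    needs_review = category != "permissive"
    return LicenceCheck(name, category, needs_review)


if __name__ == "__main__":
    for licence in ("Apache 2.0", "Server-Side Public License", "Some Other License"):
        print(check_licence(licence))
```

A lookup like this covers only the licensing consideration; governance, brand, business model, and ecosystem still need the case-by-case assessment the talk describes.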

