SLC Snowflake User Group - Mar 12, 2020

From the Data Work Out event: Performant and scalable Data Science with Dataiku DSS and Snowflake Managing the whole process of setting up a machine learning environment from end-to-end becomes significantly easier when using cloud-based technologies. The ability to provision infrastructure on demand (IaaS) solves the problem of manually requesting virtual machines. It also provides immediate access to compute resources whenever they are needed. But that still leaves the administrative overhead of managing the ML software and the platform to store and manage the data. A fully managed end-to-end machine learning platform like Dataiku Data Science Studio (DSS) that enables data scientists, machine learning experts, and even business users to quickly build, train and host machine learning models at scale, needs to access data from many different sources and can also access data provided by Snowflake. Storing data in Snowflake has three significant advantages: a single source of truth, shorten the data preparation cycle, scale as you go.

Zero to Snowflake Presentation

Brett VanderPlaats

This document outlines an agenda for a 90-minute workshop on Snowflake. The agenda includes introductions, an overview of Snowflake and data warehousing, demonstrations of how users utilize Snowflake, hands-on exercises loading sample data and running queries, and discussions of Snowflake architecture and capabilities. Real-world customer examples are also presented, such as a pharmacy building new applications on Snowflake and an education company using it to unify their data sources and achieve a 16x performance improvement.

New! Real-Time Data Replication to Snowflake

Precisely

Your business is adopting the Snowflake cloud data platform to rapidly deliver data insights and lower the costs of your data warehouse. But you have a problem – what happens when data changes on your mainframe and IBM i systems? How do you make sure Snowflake is always up-to-date and in sync with these systems of record? If you can’t integrate changes occurring on your mainframe and IBM i systems to Snowflake, your business will miss the critical data it needs to drive real-time insights and decision making. Join us to learn how the latest enhancements to Precisely Connect help your business meet its data-driven goals by sharing changes made on legacy, mainframe, and IBM systems to Snowflake in real time. During this webinar, you will learn more about: - How to easily support data replication from mainframe and IBM i to Snowflake - Connect’s enhanced data replication capabilities for cloud data platforms - How customers are using Connect to support their cloud data platform strategies

AWS Summit Singapore 2019 | Snowflake: Your Data. No Limits

AWS Summits

This document discusses Snowflake, a cloud data platform. It describes Snowflake's mission to enable organizations to be data-driven. It outlines problems with traditional data architectures like complexity, limited scalability, inability to consolidate data, and rigid costs. Snowflake's solution is a cloud-native data warehouse delivered as a service that offers instant elasticity, end-to-end security, and the ability to query structured and semi-structured data using SQL. Key benefits of Snowflake include supporting any scale of data, users and workloads; paying only for resources used; and providing simplicity, scalability, flexibility and elasticity.

Snowflake for Data Engineering

Melbourne: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cl...

Certus Solutions

Snowflake is a cloud data warehouse that provides elasticity, scalability, and simplicity. It allows organizations to consolidate their diverse data sources in one place and instantly scale up or down their compute capacity as needed. Aptus Health, a digital marketing company, used Snowflake to break down data silos, integrate disparate data sources, enable broad data sharing, and provide a scalable and cost-effective solution to meet their analytics needs. Snowflake addressed both business needs for timely access to centralized data and IT needs for flexibility, extensibility, and reducing ETL work.

Actionable Insights with AI - Snowflake for Data Science

Talk @ ScaleUp 360° AI Infrastructures DACH, 2021: Data scientists spend 80% and more of their time searching for and preparing data. This talk explains Snowflake’s Platform capabilities like near-unlimited data storage and instant and near-infinite compute resources and how the platform can be used to seamlessly integrate and support the machine learning libraries and tools data scientists rely on.

This document discusses launching a data platform on Snowflake and the skills and technology required. It outlines that Snowflake provides a low barrier to entry with pay-per-use pricing and the ability to scale compute resources up and down as needed. Running a data platform requires data modeling skills and being able to work in an agile environment. The company's platform is a wrapper service built on Snowflake that extracts, loads, transforms data and provides a semantic layer for business users.

Delivering Data Democratization in the Cloud with Snowflake

Delivering rapid-fire Analytics with Snowflake and Tableau

Until recently, advancements in data warehousing and analytics were largely incremental. Small innovations in database design would herald a new data warehouse every 2-3 years, which would quickly become overwhelmed with rapidly increasing data volumes. Knowledge workers struggled to access those databases with development intensive BI tools designed for reporting, rather than exploration and sharing. Both databases and BI tools were strained in locally hosted environments that were inflexible to growth or change. Snowflake and Tableau represent a fundamentally different approach. Snowflake’s multi-cluster shared data architecture was designed for the cloud and to handle logarithmically larger data volumes at blazing speed. Tableau was made to foster an interactive approach to analytics, freeing knowledge workers to use the speed of Snowflake to their greatest advantage.

Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...

Introducing the Snowflake Computing Cloud Data Warehouse

Introducing Snowflake, an elastic data warehouse delivered as a service in the cloud. It aims to simplify data warehousing by removing the need for customers to manage infrastructure, scaling, and tuning. Snowflake uses a multi-cluster architecture to provide elastic scaling of storage, compute, and concurrency. It can bring together structured and semi-structured data for analysis without requiring data transformation. Customers have seen significant improvements in performance, cost savings, and the ability to add new workloads compared to traditional on-premises data warehousing solutions.

Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...

Amazon Web Services

Snowflake is a cloud-based data warehouse that is built for the cloud. It was founded in 2012 and has raised $1 billion in funding. Snowflake's architecture separates storage, compute, and metadata services, allowing it to offer unlimited scalability, multiple clusters that can access shared data with no downtime, and full transactional consistency across the system. Snowflake has over 2000 customers including large enterprises that use it for analytics, data science, and sharing large volumes of data securely.

Intro to Data Vault 2.0 on Snowflake

This document provides an introduction and overview of implementing Data Vault 2.0 on Snowflake. It begins with an agenda and the presenter's background. It then discusses why customers are asking for Data Vault and provides an overview of the Data Vault methodology including its core components of hubs, links, and satellites. The document applies Snowflake features like separation of workloads and agile warehouse scaling to support Data Vault implementations. It also addresses modeling semi-structured data and building virtual information marts using views.

Master the Multi-Clustered Data Warehouse - Snowflake

Matillion

Snowflake is one of the most powerful, efficient data warehouses on the market today—and we joined forces with the Snowflake team to show you how it works! In this webinar: - Learn how to optimize Snowflake - Hear insider tips and tricks on how to improve performance - Get expert insights from Craig Collier, Technical Architect from Snowflake, and Kalyan Arangam, Solution Architect from Matillion - Find out how leading brands like Converse, Duo Security, and Pets at Home use Snowflake and Matillion ETL to make data-driven decisions - Discover how Matillion ETL and Snowflake work together to modernize your data world - Learn how to utilize the impressive scalability of Snowflake and Matillion

Does it only have to be ML + AI?

The document discusses machine learning and artificial intelligence applications inside and outside of Snowflake's cloud data warehouse. It provides an overview of Snowflake and its architecture. It then discusses how machine learning can be implemented directly in the database using SQL, user-defined functions, and stored procedures. However, it notes that pure coding is not suitable for all users and that automated machine learning outside the database may be preferable to enable more business analysts and power users. It provides an example of using Amazon Forecast for time series forecasting and integrating it with Snowflake.

Analyzing Semi-Structured Data At Volume In The Cloud

Robert Dempsey

Presentation from Snowflake Computing at the November 2015 Data Wranglers DC meetup. The Cloud, Mobile and Web Applications are producing semi-structured data at an unprecedented rate. IT professionals continue to struggle capturing, transforming, and analyzing these complex data structures mixed with traditional relational style datasets using conventional MPP and/or Hadoop infrastructures. Public cloud infrastructures such as Amazon and Azure provide almost unlimited resources and scalability to handle both structured and semi-structured data (XML, JSON, AVRO) at Petabyte scale. These new capabilities coupled with traditional data management access methods such as SQL allow organizations and businesses new opportunities to leverage analytics at an unprecedented scale while greatly simplifying data pipeline architectures and providing an alternative to the "data lake".

Altis AWS Snowflake Practice

SamanthaSwain7

Demystifying Data Warehouse as a Service

Data Sharing with Snowflake

Every day, businesses across a wide variety of industries share data to support insights that drive efficiency and new business opportunities. However, existing methods for sharing data involve great effort on the part of data providers to share data, and involve great effort on the part of data customers to make use of that data. However, existing approaches to data sharing (such as e-mail, FTP, EDI, and APIs) have significant overhead and friction. For one, legacy approaches such as e-mail and FTP were never intended to support the big data volumes of today. Other data sharing methods also involve enormous effort. All of these methods require not only that the data be extracted, copied, transformed, and loaded, but also that related schemas and metadata must be transported as well. This creates a burden on data providers to deconstruct and stage data sets. This burden and effort is mirrored for the data recipient, who must reconstruct the data. As a result, companies are handicapped in their ability to fully realize the value in their data assets. Snowflake Data Sharing allows companies to grant instant access to ready-to-use data to any number of partners or data customers without any data movement, copying, or complex pipelines. Using Snowflake Data Sharing, companies can derive new insights and value from data much more quickly and with significantly less effort than current data sharing methods. As a result, companies now have a new approach and a powerful new tool to get the full value out of their data assets.

Snowflake: The Good, the Bad, and the Ugly

Tyler Wishnoff

A 30 day plan to start ending your data struggle with Snowflake

This document outlines a 30-day plan to address common data struggles around loading, integrating, analyzing, and collaborating on data using Snowflake's data platform. It describes setting up a team, defining goals and scope, loading sample data, testing and deploying business logic transformations, creating warehouses for business intelligence tools, and connecting BI tools to the data. The goal is that after 30 days, teams will be collaborating more effectively, able to easily load and combine different data sources, have accurate business logic implemented, and gain more insights from their data.

Introducing Direct Database Access with Snowflake + Intrinio

Intrinio

Data Vault 2.0 - Getting Started | Certus Solutions

Certus Solutions

Demystifying Data Warehouse as a Service (DWaaS)

This is from the talk I gave at the 30th Anniversary NoCOUG meeting in San Jose, CA. We all know that data warehouses and best practices for them are changing dramatically today. As organizations build new data warehouses and modernize established ones, they are turning to Data Warehousing as a Service (DWaaS) in hopes of taking advantage of the performance, concurrency, simplicity, and lower cost of a SaaS solution or simply to reduce their data center footprint (and the maintenance that goes with that). But what is a DWaaS really? How is it different from traditional on-premises data warehousing? In this talk I will: • Demystify DWaaS by defining it and its goals • Discuss the real-world benefits of DWaaS • Discuss some of the coolest features in a DWaaS solution as exemplified by the Snowflake Elastic Data Warehouse.

SQL Analytics Powering Telemetry Analysis at Comcast

Comcast is one of the leading providers of communications, entertainment, and cable products and services. At the heart of it is Comcast RDK providing the backbone of telemetry to the industry. RDK (Reference Design Kit) is pre-bundled opensource firmware for a complete home platform covering video, broadband and IoT devices. RDK team at Comcast analyzes petabytes of data, collected every 15 minutes from 70 million devices (video and broadband and IoT devices) installed in customer homes. They run ETL and aggregation pipelines and publish analytical dashboards on a daily basis to reduce customer calls and firmware rollout. The analysis is also used to calculate WIFI happiness index which is a critical KPI for Comcast customer experience. In addition to this, RDK team also does release tracking by analyzing the RDK firmware quality. SQL Analytics allows customers to operate a lakehouse architecture that provides data warehousing performance at data lake economics for up to 4x better price/performance for SQL workloads than traditional cloud data warehouses. We present the results of the “Test and Learn” with SQL Analytics and the delta engine that we worked in partnership with the Databricks team. We present a quick demo introducing the SQL native interface, the challenges we faced with migration, The results of the execution and our journey of productionizing this at scale.

KoprowskiT_SQLRelay2014#9_London_FromPlanToBackupToCloud

Tobias Koprowski

Microsoft released SQL Azure more than two years ago - that's enough time for testing (I hope!). So, are you ready to move your data to the Cloud? If you’re considering a business (i.e. a production environment) in the Cloud, you need to think about methods for backing up your data, a backup plan for your data and, eventually, restoring with Red Gate Cloud Services. In this session, you’ll see the differences, functionality, restrictions, and opportunities in SQL Azure and On-Premise SQL Server 2008/2008 R2/2012. We’ll consider topics such as how to be prepared for backup and restore, and which parts of a cloud environment are most important: keys, triggers, indexes, prices, security, service level agreements, etc.

Data Culture Series - Keynote - 3rd Dec

Jonathan Woodward

Big data. Small data. All data. You have access to an ever-expanding volume of data inside the walls of your business and out across the web. The potential in data is endless – from predicting election results to preventing the spread of epidemics. But how can you use it to your advantage to help move your business forward? Drive a Data Culture within your organisation Keynote include Ric Howe & Anthony Saxby

CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori

walk2talk srl

The document discusses considerations for migrating databases to Microsoft Azure SQL Database. It covers cloud options like Infrastructure as a Service (IaaS) using SQL Server on Azure VMs and Platform as a Service (PaaS) options like Azure SQL Database. It also discusses analyzing database compatibility, different migration methods like using BACPAC files or the Data Migration Assistant, and ways to optimize the migration process like monitoring tempdb usage.

What's hot

Launching a Data Platform on Snowflake

KETL Limited

Delivering Data Democratization in the Cloud with Snowflake

Delivering rapid-fire Analytics with Snowflake and Tableau

Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...

Introducing the Snowflake Computing Cloud Data Warehouse

Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...

Amazon Web Services

Intro to Data Vault 2.0 on Snowflake

Master the Multi-Clustered Data Warehouse - Snowflake

Matillion

Does it only have to be ML + AI?

Analyzing Semi-Structured Data At Volume In The Cloud

Robert Dempsey

Altis AWS Snowflake Practice

SamanthaSwain7

Demystifying Data Warehouse as a Service

Data Sharing with Snowflake

Snowflake: The Good, the Bad, and the Ugly

Tyler Wishnoff

A 30 day plan to start ending your data struggle with Snowflake

Introducing Direct Database Access with Snowflake + Intrinio

Intrinio

Data Vault 2.0 - Getting Started | Certus Solutions

Certus Solutions

What's hot (17)

Launching a Data Platform on Snowflake

Delivering Data Democratization in the Cloud with Snowflake

Delivering rapid-fire Analytics with Snowflake and Tableau

Smartsheet’s Transition to Snowflake and Databricks: The Why and Immediate Im...

Introducing the Snowflake Computing Cloud Data Warehouse

Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...

Intro to Data Vault 2.0 on Snowflake

Master the Multi-Clustered Data Warehouse - Snowflake

Does it only have to be ML + AI?

Analyzing Semi-Structured Data At Volume In The Cloud

Altis AWS Snowflake Practice

Demystifying Data Warehouse as a Service

Data Sharing with Snowflake

Snowflake: The Good, the Bad, and the Ugly

A 30 day plan to start ending your data struggle with Snowflake

Introducing Direct Database Access with Snowflake + Intrinio

Data Vault 2.0 - Getting Started | Certus Solutions

Similar to SLC Snowflake User Group - Mar 12, 2020

Demystifying Data Warehouse as a Service (DWaaS)

SQL Analytics Powering Telemetry Analysis at Comcast

Siwawong Wuttipongprasert

KoprowskiT_SQLRelay2014#9_London_FromPlanToBackupToCloud

Tobias Koprowski

Data Culture Series - Keynote - 3rd Dec

Jonathan Woodward

CCI2017 - Considerations for Migrating Databases to Azure - Gianluca Sartori

walk2talk srl

Tableau & MongoDB: Visual Analytics at the Speed of Thought

MongoDB

This document discusses how Tableau and MongoDB can work together for visual analytics of big data. It describes how MongoDB is a NoSQL database that can handle unstructured and semi-structured data like JSON, and how Tableau allows users to connect to MongoDB through an ODBC driver and visualize the data without needing to write code. The document outlines scenarios where big data comes from human, machine, and process sources and how the combination of Tableau and MongoDB's schema-on-read approach reduces the need for ETL. It also previews demos of connecting Tableau to MongoDB using both the ODBC driver and a PostgreSQL interface.

Modernizing Global Shared Data Analytics Platform and our Alluxio Journey

Alluxio, Inc.

Refactoring your EDW with Mobile Analytics Products

Luke Han

The document discusses refactoring an enterprise data warehouse (EDW) at China Construction Bank (CCB) to leverage mobile analytics and big data. CCB has a large existing EDW infrastructure handling over 1PB of core data and 4TB of incremental data daily. They have transformed their EDW over time, adding a Hadoop platform and migrating some data and queries. Kyligence products help accelerate queries and enable self-service analytics on the large data volumes.

It ready dw_day3_rev00

SQLCAT: Tier-1 BI in the World of Big Data

Denny Lee

This document summarizes a presentation on tier-1 business intelligence (BI) in the world of big data. The presentation will cover Microsoft's BI capabilities at large scales, big data workloads from Yahoo and investment banks, Hadoop and the MapReduce framework, and extracting data out of big data systems into BI tools. It also shares a case study on Yahoo's advertising analytics platform that processes billions of rows daily from terabytes of data.

Add Redis to Postgres to Make Your Microservices Go Boom!

Dave Nielsen

Cloud-native Semantic Layer on Data Lake

Getting value from IoT, Integration and Data Analytics

Oow2016 review-db-dev-bigdata-BI

Oracle OpenWorld 2016 focused on several key themes: 1. A shift away from a single, central Oracle database and toward distributed architectures like PDBs, sharding, Hadoop, and machine learning. 2. Adopting open source technologies and industry trends like Node.js, Docker, microservices, and Python. 3. Advancing Oracle's cloud strategy through migration tools, cloud@customer, and subscription models while improving the user experience of SaaS applications.

Oracle OpenWorld 2016 Review - Focus on Data, BigData, Streaming Data, Machin...

Lucas Jellema

One key area of Oracle OpenWorld 2016 was data in various shapes. Big Data, streaming data and traditional transactional data. The power of SQL to access and unleash all data - even data in NoSQL databases. The advent of the citizen data scientist. Streaming data analysis in real time on vast and fast and vast data, data discovery. And the new Oracle Database 12cR2 release. Forms, APEX, SQL and PL/SQL.

Azure Synapse Analytics Overview (r2)

James Serra

Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service, that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs. This is a huge deck with lots of screenshots so you can see exactly how it works.

Myth Busters II: BI Tools and Data Virtualization are Interchangeable

Denodo

Watch Here: https://bit.ly/2NcqU6F We take on the 2nd myth about data virtualization and it’s one that suggests a BI tool can substitute a data virtualization software. You might be thinking: If I can have multi-source queries and define a logical model in my reporting tool, why would I need a data virtualization software? Reporting tools, no doubt important and necessary, focus on the visualization of data and it’s presentation to the business user. Data virtualization is a governed data access layer designed to connect to and provide transparency of all enterprise data. Yet the myth suggests that these technologies are interchangeable. So we’re going to take it on! Watch this webinar as we compare and contrast BI tools and data virtualization to draw a final conclusion.

AMIS Oracle OpenWorld 2013 Review Part 1 - Intro Overview Innovation, Hardwar...

Getting value from IoT, Integration and Data Analytics

The document provides an agenda and overview of announcements from Oracle OpenWorld 2013. Key announcements include the Oracle Database In Memory option, Sparc M6-32 server, Backup Logging and Recovery Appliance, expanded cloud services, and new capabilities for big data and JSON. Oracle aims to lead in areas around big data, in-memory computing, and cloud services and hopes to ease customers' transition to mobile, cloud, and big data technologies.

Presto @ Treasure Data - Presto Meetup Boston 2015

Taro L. Saito

Treasure Data simplifies event analytics for the complex digital world. Our customers send us 1,000,000 events per second and issue 30,000+ Presto queries everyday to understand their customers better. One of the challenges is designing a cloud database with zero downtime to support a global customer base. We have achieved this goal by developing several open-source technologies; Fluentd and Embulk enable seamless log collection from stream/batch sources, and with MessagePack we can provide an extensible columnar store that accommodates future schema changes. Finally, Presto allows us to serve a wide variety of data processing our customers perform on our service. In this talk, I will present an overview of our system, and how our customers keep using Presto while collecting and extending their data set.

Ensuring Quality in Data Lakes (D&D Meetup Feb 22)

lakeFS

The document discusses improving data quality in a data lake. It describes three levels (L1-L3) of data lake maturity: L1 involves storing data in an object store in a basic format like CSV files. This provides good performance, cost efficiency, and developer experience. L2 adds optimized table formats like Delta Lake, Hudi and Iceberg that maintain metadata and transaction logs to enable features like schema enforcement, data versioning and isolation. L3 adds data version control systems like lakeFS that extend the object store with Git-like source control operations. This allows instantly reverting bad data, developing data in isolation, and simplifying data reproducibility. LakeFS was demonstrated as an example solution

Free Training: How to Build a Lakehouse