A deep dive into how digital analytics stacks need to evolve with the businesses they serve, and how self-describing data and event data modeling are the key elements that enable Snowplow data pipelines to evolve elegantly over time.
How to evolve your analytics stack with your business using Snowplow (Giuseppe Gaviani)
Presented at Snowplow London Meetup, 8 February 2017
Christophe Bogaert, Data Scientist at Snowplow, talked about how businesses are constantly evolving, why that means their analytics stacks need to evolve with them, and how Snowplow supports that evolution. With Snowplow, you can flexibly define the events and entities that represent your business. Finally, he talked about event data modeling and how to handle the evolution of your data pipeline.
A talk about the journey we have been on at Snowplow in thinking about event data, starting with our focus on web and then mobile analytics, and exploring our current and future technical and analytical approaches.
How Gousto is moving to just-in-time personalization with Snowplow (Giuseppe Gaviani)
Presented at Snowplow London Meetup, 8 February 2017
Dejan Petelin, Head of Data Science at Gousto, gave a presentation about their data journey, explaining how data reflects the customer’s voice and the importance of joining up all data sources. The goal is to delight and retain customers – critical for a subscription business like Gousto’s. Gousto is using Snowplow as a unified log to scale up its data capabilities, listen to its customers and provide them with a more personalized experience. Finally, Gousto is moving to the real-time pipeline to enable just-in-time personalization.
Our cofounder Alex Dean gave an introduction to Snowplow and then talked about our roadmap for 2017. Alex touched on several topics including support for more clouds, support for more storage targets, tailoring Snowplow to your industry, more intelligent event sources, moving our batch pipeline to Spark, mega-scale Snowplow and real-time support for Sauna, our decisioning and response system. Presented on 5 April 2017.
Why use big data tools to do web analytics? And how to do it using Snowplow a... (yalisassoon)
There are a number of mature web analytics products that have been on the market for ~20 years, while big data tools have only really taken off in the last 5 years. So why use big data tools to mine web analytics data?
In this presentation, I explore the limitations of traditional approaches to web analytics, and explain how big data tools can be used to address those limitations and drive more value from the underlying data. I explain how a combination of Snowplow and Qubole can be used to do this in practice.
Technical introduction to Snowplow, given at Big Data Beers on 25th September 2014. Explored how we use a variety of "big data" technologies including Hadoop, Kinesis and Redshift.
Snowplow Analytics: from NoSQL to SQL and back again (Alexander Dean)
A talk I gave to London NoSQL about Snowplow's journey from using NoSQL (via Amazon S3 and Hive), to columnar storage (via Amazon Redshift and PostgreSQL), and most recently to a mixed model of NoSQL and SQL, including S3, Redshift and Elasticsearch.
Use cases and examples using Apache Spark, presented at the Hadoop User Group (UK) November 2014 Hadoop Meetup
http://www.meetup.com/hadoop-users-group-uk/events/217791892/
Real-time user profiling based on Spark Streaming and HBase by Arkadiusz Jach... (Big Data Spain)
Agora owns dozens of themed, classified, entertainment and social services: news and sports portals, forums, advertising services, blogs and many other thematic websites. All sites together generate over 400 page views per second (under normal conditions) and considerably more events (likes, focus, click and scrolling events). This raises one question: how do you build user profiles in real time in such a dynamic and changing environment?
Session presented at Big Data Spain 2015 Conference
15th Oct 2015
Kinépolis Madrid
http://www.bigdataspain.org
Event promoted by: http://www.paradigmatecnologico.com
Abstract: http://www.bigdataspain.org/program/thu/slot-16.html
On large-scale websites, users leave thousands of traces every second. Businesses need to process and interpret these traces in real time to be able to react to the behavior of their users.
In this talk, Andreas will show a real world example of the power of a modern open-source stack.
He will walk you through the design of a real-time clickstream analysis PaaS solution based on Apache Spark, Kafka, Parquet and HDFS, explain our decision making and present our lessons learned.
Explore Your Data Using Amazon QuickSight and Build Your First Machine Learni... (Amazon Web Services)
In this session we will demonstrate how non-experts in machine learning can easily analyze their data with QuickSight and build scalable, production-ready predictive models with Amazon Machine Learning. After the session you will have a good understanding of how to frame business problems in terms of data and predictive models, and you will be able to apply analytics and machine learning concepts as a competitive advantage.
AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203) (Amazon Web Services)
Making earth observation data available by using Amazon S3 is accelerating scientific discovery and enabling the creation of new products. Attend and learn how the scale and performance of Amazon S3 lets earth scientists, researchers, startups, and GIS professionals gather and analyze planetary-scale data without worrying about limitations of bandwidth, storage, memory, or processing power. Learn how AWS is being used to combine satellite imagery, social data, and telemetry data to produce new products and services. Learn also how Amazon S3 provides much more than storage, and how an open geospatial data lake on Amazon S3 can be used as the basis for planetary-scale applications built with Amazon EMR, Amazon API Gateway, and AWS Lambda. As part of this talk, AWS customer Digital Globe demonstrates how they use open data stored in S3 to distribute high-resolution satellite imagery to their customers around the world.
Introduction to Amazon Kinesis Firehose - AWS August Webinar Series (Amazon Web Services)
Streaming data applications can deliver compelling, near real-time user experiences, but building the back-end infrastructure to collect and process streaming data is difficult. Amazon Kinesis Firehose makes it easy for you to load streaming data into AWS without having to build custom stream processing applications. In this webinar, we will introduce Amazon Kinesis Firehose and discuss how to ingest streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service. We will also highlight key use cases based on real-world examples from IoT, AdTech, E-Commerce, and Gaming.
Join us to:
- Get an introduction to streaming data and an overview of Amazon Kinesis Firehose
- Learn about common streaming data use cases from IoT, Ad Tech, E-Commerce, and Gaming
- Understand how to use Amazon Kinesis Firehose to load streaming data into Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service
Who should attend: developers, data analysts, data engineers, architects
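For a concrete flavour of the ingestion step, here is a minimal Python sketch using boto3; the delivery stream name, region and event shape are made up for illustration, and the stream itself would already need to be configured with an S3, Redshift or Elasticsearch destination.

import json

import boto3

# Sketch only: assumes a Firehose delivery stream named "example-stream"
# already exists and points at S3 / Redshift / Elasticsearch.
firehose = boto3.client("firehose", region_name="eu-west-1")

event = {"event_type": "page_view", "user_id": "u1"}

# Firehose buffers records and delivers them to the destination in batches.
firehose.put_record(
    DeliveryStreamName="example-stream",
    Record={"Data": (json.dumps(event) + "\n").encode("utf-8")},
)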
Analysing data analytics use cases to understand big data platform (dataeaze systems)
Get the big picture of data platform architecture by understanding its purpose and the problems it solves.
These slides take a top-down approach, starting with the basic purpose of a data platform, i.e. to serve analytics use cases. They categorise those use cases and analyse what each expects from the data platform.
Google BigQuery for Everyday Developer (Márton Kodok)
IV. IT&C Innovation Conference - October 2016 - Sovata, Romania
A. Every scientist who needs big data analytics to save millions of lives should have that power
Legacy systems don’t provide the power.
B. The simple fact is that you are brilliant but your brilliant ideas require complex analytics.
Traditional solutions are not applicable.
The Plan: have oversight over developments as they happen.
Goal: Store everything accessible by SQL immediately.
What is BigQuery?
Analytics-as-a-Service - Data Warehouse in the Cloud
Fully-Managed by Google (US or EU zone)
Scales into Petabytes
Ridiculously fast
Decent pricing (queries $5/TB, storage: $20/TB) *October 2016 pricing
100,000 rows/sec Streaming API
Open Interfaces (Web UI, BQ command line tool, REST, ODBC)
Familiar DB Structure (table, views, record, nested, JSON)
Convenience of SQL + Javascript UDF (User Defined Functions)
Integrates with Google Sheets + Google Cloud Storage + Pub/Sub connectors
Client libraries available in YFL (your favorite languages)
Our benefits
no provisioning/deploy
no running out of resources
no more focus on large scale execution plan
no need to re-implement tricky concepts
(time windows / join streams)
pay only for the columns used in your queries
run raw ad-hoc queries (either by analysts/sales or Devs)
no more throwing away, expiring, or aggregating old data.
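To make the "run raw ad-hoc queries" point concrete, here is a minimal sketch using the google-cloud-bigquery Python client; the project, dataset and table names are placeholders invented for the example.

from google.cloud import bigquery  # pip install google-cloud-bigquery

# Sketch only: assumes credentials are configured and the table exists.
client = bigquery.Client()

query = """
    SELECT location, AVG(temperature) AS avg_temp
    FROM `my-project.sensors.temperature_measure`
    GROUP BY location
"""

# result() waits for the query to finish, then yields rows; you pay only
# for the columns the query actually scans.
for row in client.query(query).result():
    print(row.location, row.avg_temp)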
Analytics on AWS with Amazon Redshift, Amazon QuickSight, and Amazon Machine ... (Amazon Web Services)
AWS has a large and growing portfolio of big data management and analytics services, designed to be integrated into solution architectures that meet the needs of your business. In this session, we look at analytics through the eyes of a business intelligence analyst, a data scientist, and an application developer, and we explore how to quickly leverage Amazon Redshift, Amazon QuickSight, RStudio, and Amazon Machine Learning to create powerful, yet straightforward, business solutions.
Speaker:
Paul Armstrong, Solutions Architect, Amazon Web Services
Design for Scale - Building Real Time, High Performing Marketing Technology p... (Amazon Web Services)
DynamoDB presented by David Pearson from AWS
Bizo Business Audience Marketing success story on AWS by Alex Boisvert, Director of Engineering, Bizo
In today's world, consumer habits change fast and marketing decisions need to be made within seconds, not days. Delivering engaging advertising experiences requires real time, high performing architectures that provide digital advertisers the ability to measure and improve the performance of their campaigns and tie them more closely to corporate goals. The insights gleaned from the massive amounts of data collected can then be used to dynamically adjust media spend and creative execution for optimal performance. The AWS Cloud enables you to deliver marketing content and advertisements with the levels of availability, performance, and personalization that your customers expect. Plus, AWS lowers your costs. Join us to learn about how big data and low latency / high performing architectures are changing the game for digital advertising.
Lambda-B-Gone: In-memory Case Study for Faster, Smarter and Simpler Answers (VoltDB)
Dennis Duckworth presented at the In-Memory Computing Summit 2016, walking through a case study of how MaxCDN replaced the complex Lambda Architecture with VoltDB for a faster, simpler and smarter platform.
Snowplow at the heart of Busuu's data & analytics infrastructure (Giuseppe Gaviani)
Presented at Snowplow London Meetup, 8 February 2017
Bruce Pannaman, data scientist at Busuu, talked about why they are using Snowplow to validate and enrich data, enable one source of truth across different data sources, cope with peaks and troughs in the data stream, and easily integrate with third party systems such as Intercom, a customer messaging platform. One of Busuu’s future projects is to load multiple A/B tests into the apps and monitor their results in real time.
An outline of the differing role of KPIs at startups vs mature businesses, drawing out the implications for the approach and methodology to their development.
Span Conference: Why your company needs a unified log (Alexander Dean)
Apache Kafka and Amazon Kinesis are more than just message queues — they can serve as a unified log which you can put at the heart of your business, effectively creating a "digital nervous system" which your company's applications and processes can be re-structured around.
In this talk, Alex will provide an introduction to unified log technology, highlight some killer use cases and also show how Kinesis is being used "in anger" at Snowplow. Alex's talk will draw on his experiences working with event streams over the last two and a half years at Snowplow; it’s also heavily influenced by Jay Kreps’ unified log monograph, and by Alex's recent work penning Unified Log Processing, a Manning book. Alex's talk will show how event streams inside a unified log are an incredibly powerful primitive for building rich event-centric applications, unbundling local transactional silos and creating a single version of truth for a company.
Alex's talk will conclude with a live demo of Amazon Kinesis in action processing Snowplow events.
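As a flavour of what writing to a unified log looks like, here is a hedged boto3 sketch that puts a single event onto a Kinesis stream; the stream name, event shape and partition key are illustrative, not Snowplow's actual implementation.

import json

import boto3

# Sketch only: assumes a Kinesis stream named "unified-log" exists.
kinesis = boto3.client("kinesis", region_name="eu-west-1")

event = {"event_type": "declare_war", "player_id": "p7"}

# The partition key keeps one player's events ordered within a shard.
kinesis.put_record(
    StreamName="unified-log",
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey=event["player_id"],
)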
Implementing improved and consistent arbitrary event tracking company-wide us... (yalisassoon)
Talk on the role Snowplow plays as part of the larger project to make data accessible to product marketing and other data-driven teams at StumbleUpon. Touches on technical and organizational challenges
2016 09 MeasureCamp - event data modeling (yalisassoon)
Presentation by Christophe Bogaert to Measurecamp London September 2016. Christophe discussed what makes consuming and analysing event-streams difficult, and outlined a number of techniques for overcoming those obstacles.
Yali presentation for Snowplow Amsterdam meetup number 2 (yalisassoon)
Digital analytics is a very exciting place to work because digital event data is becoming more interesting as more of our lives are intermediated by digital platforms.
In this presentation I explain how at Snowplow we are working to make it easier to build insight from, and act on, digital event data.
Snowplow is at the core of everything we do (yalisassoon)
Presentation authored by Simon Rumble covering the journey that Bauer Media Australia have gone through implementing Snowplow, and the central role Snowplow now plays in their data strategy / products.
On the importance of evolving your data pipeline with your business, and how Snowplow enables that through self-describing data and the ability to recompute your data models on the entire event data set.
Programmatic Advertising: How To Join In On the Fun (Hanapin Marketing)
Carrie Albright, Associate Director of Services at Hanapin Marketing, discusses how to get into Programmatic Advertising, vendors you can hire, the definitions of different programmatic strategies, and steps to building your strategic plan.
Originally presented at Hero Conf London in October 2016.
[WSO2Con Asia 2018] Patterns for Building Streaming Apps (WSO2)
This slide deck explains how to enable digital transformation through streaming analytics and how easily streaming applications can be implemented.
Learn more: https://wso2.com/library/conference/2018/08/wso2con-asia-2018-patterns-for-building-streaming-apps/
Today’s highly connected world is flooding businesses with big and fast-moving data. The ability to trawl this data ocean and identify actionable insights can deliver a competitive advantage to any organization. The WSO2 Analytics Platform enables businesses to do just that by providing batch, real-time, interactive and predictive analysis capabilities all in one place.
In this tutorial we will
* Plug in the WSO2 Analytics Platform to some common business use cases
* Showcase the numerous capabilities of the platform
* Demonstrate how to collect data, analyze, predict and communicate effectively
* Demonstrate how it can analyze integration, security and IoT scenarios
Stick around till the end and you will walk away with the necessary skills to create a winning data strategy for your organization to stay ahead of its competition.
To view a recording of this webinar, please use the URL below:
http://wso2.com/library/webinars/2016/06/analytics-in-your-enterprise/
Big data spans many fields and brings together technologies like distributed systems, machine learning, statistics and Internet of Things (IoT). It has now become a multi-billion dollar industry with use cases ranging from targeted advertising and fraud detection to product recommendations and market surveys.
Some use cases, such as urban planning, can be slower (done in batch mode), while others, such as the stock market, need results in milliseconds (done in a streaming fashion). Different technologies are used for each case: MapReduce for batch analytics, complex event processing for real-time analytics and machine learning for predictive analytics. Furthermore, the type of analysis ranges from basic statistics to complicated prediction models.
This webinar will discuss the big data landscape including
Concepts, use cases and technologies
Capabilities and applications of the WSO2 analytics platform
WSO2 Data Analytics Server
WSO2 Complex Event Processor
WSO2 Machine Learner
New feature overview of Cubes 1.0 – lightweight Python OLAP and pluggable data warehouse. Video: https://www.youtube.com/watch?v=-FDTK80zsXc Github sources: https://github.com/databrewery/cubes
GraphQL Summit 2019 - Configuration Driven Data as a Service Gateway with Gra... (Noriaki Tatsumi)
In this talk, you’ll learn about techniques used to build a scalable GraphQL based data gateway with the capability to dynamically on-board various new data sources. They include runtime schema evolution and resolver wiring, abstract resolvers, auto GraphQL schema generation from other schema types, and construction of appropriate cache key-values.
Snowplow: open source game analytics powered by AWS (Giuseppe Gaviani)
This is a presentation by Alex Dean and Yali Sassoon at Snowplow about open source game analytics powered by AWS. It was presented at the Games Developer Conference (GDC) in San Francisco, February 2017
Real-time big data analytics based on product recommendations case study (deep.bi)
We started as an ad network. The challenge was to recommend the best product (out of millions) to the right person at a given moment (thousands of users within a second). We have delivered 5 billion ad views over the last 24 months. To put that in context: serving 1 ad per second, it would take 160 years to deliver 5 billion ads.
So we needed a solution. SQL databases did not work. Popular NoSQL databases did not work. Standard data warehouse approaches (pre-aggregations, creating schemas) did not work either.
Rethinking all the problems posed by the huge data streams flowing in every second, we built a complete solution based on open-source technologies and fresh, smart ideas from our engineering team. It is called deep.bi, and we now make it available to other companies.
deep.bi lets high-growth companies solve fast data problems by providing scalable, flexible and real-time data collection, enrichment and analytics.
It was built using:
- Node.js - API
- Kafka - collecting and distributing data
- Spark Streaming - ETL, data enrichments
- Druid - real-time analytics
- Cassandra - user events store
- Hadoop + Parquet + Spark - raw data store + ad-hoc queries
(MBL305) You Have Data from the Devices, Now What?: Getting the Value of the IoT (Amazon Web Services)
We are collecting tons of sensor data from billions of devices. How do you get value from your IoT data sources? In this session, we will explore different strategies for collecting and ingesting data, understanding its frequency, and leveraging the potential of the cloud to analyze and predict trends and behavior to get the most out of your deployed devices.
WSO2Con EU 2016: An Introduction to the WSO2 Analytics Platform (WSO2)
In today’s connected world, organizations have access to an enormous amount of data but use only a very small subset of it. This data can give you hindsight, oversight, insight and foresight about your enterprise and the world it communicates with. It can be leveraged to gain a considerable competitive advantage in the market.
The WSO2 Data Analytics platform lets you collect data, explore it through batch, real-time, interactive and predictive processing technologies and communicate your results. In this talk, we will discuss the WSO2 Data Analytics platform and how it brings together all analytics technologies into a single platform and user experience.
Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce (Deep.BI)
Deep.bi helps ecommerce teams improve their performance by providing current and detailed insights.
It brings operational excellence and performance for:
- Category Managers / Merchandisers
- Marketers
- Customer service
- UX / Design Team
- Tech / IT
- Executives / Managers
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform (WSO2)
With Hadoop, we can easily process data from disk, but this consumes a lot of time. The value of certain insights, such as traffic alerts or heart attack alerts, degrades with time, and handling this time-sensitive data requires real-time technologies that can produce output within milliseconds. Moreover, some use cases need advanced analytics like machine learning.
In this talk, we will discuss the WSO2 Data Analytics platform, which brings together all of these technologies into one platform. It lets you collect data through a single sensor API, process it using batch, real-time or predictive technologies, and communicate your results, all within a single platform and user experience.
Presenter:
Srinath Perera
Vice President – Research,
WSO2
During this session we will cover the best practices for implementing a product catalog with MongoDB. We will cover how to model an item properly when it can have thousands of variations and thousands of properties of interest. You'll learn how to index properly and allow for faceted search with milliseconds response latency and how to implement per-store, per-sku pricing while still keeping a sane number of documents. We will also cover operational considerations, like how to bring the data closer to users to cut down the network latency.
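One widely used way to model items with thousands of variant properties is the attribute pattern; the Python sketch below (pymongo, with invented collection and field names) shows how a single compound index can then serve faceted queries with low latency. It illustrates the general technique, not necessarily the exact modeling shown in the session.

from pymongo import ASCENDING, MongoClient  # pip install pymongo

# Sketch only: database, collection and field names are placeholders.
client = MongoClient()
items = client["catalog"]["items"]

items.insert_one({
    "sku": "sku-123",
    "name": "T-shirt",
    # Attribute pattern: variant properties live in one key/value array...
    "attributes": [
        {"k": "color", "v": "red"},
        {"k": "size", "v": "M"},
    ],
})

# ...so one compound index covers faceted search on any attribute.
items.create_index([("attributes.k", ASCENDING), ("attributes.v", ASCENDING)])

# Faceted query: all red items.
for doc in items.find({"attributes": {"$elemMatch": {"k": "color", "v": "red"}}}):
    print(doc["sku"])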
Similar to Snowplow: evolve your analytics stack with your business
Presentation given by Christophe Bogaert at the inaugural Snowplow Meetup New York in March 2016. Christophe described the event data modeling process at a high level before diving into specific tools and techniques for developing performant models.
2. Our businesses are constantly evolving…
• Our digital products (apps and platforms) are constantly developing
• The questions we ask of our data are constantly changing
• It is critical that our analytics stack can evolve with our business
3. Self-describing data + event data modeling = an analytics stack that evolves with your business
How Snowplow users evolve their analytics stacks with their business
6. As a Snowplow user, you can define your own events and entities
Events:
• Build castle, form alliance, declare war (a game)
• View product, buy product, deliver product (a retailer)
Entities (contexts):
• Player, game, level, currency (a game)
• Product, customer, basket, delivery van (a retailer)
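To illustrate with the retail example (hypothetical schema URIs, in the self-describing JSON format shown on the next slide), a "view product" event might travel with product and customer entities attached as contexts:

# Hypothetical schema URIs; the shapes follow Snowplow's
# self-describing JSON format.
view_product_event = {
    "schema": "iglu:com.acme/view_product/jsonschema/1-0-0",
    "data": {"product_id": "sku-123"},
}

# Entities (contexts) ride along with the event as an array.
contexts = [
    {
        "schema": "iglu:com.acme/product/jsonschema/1-0-0",
        "data": {"product_id": "sku-123", "category": "books", "price": 9.99},
    },
    {
        "schema": "iglu:com.acme/customer/jsonschema/1-0-0",
        "data": {"customer_id": "c-42", "segment": "new"},
    },
]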
8. Then send data into Snowplow as self-describing JSONs
[Pipeline diagram: 1. Validation → 2. Dimension widening → 3. Data modeling]
Event (with a schema reference pointing at its schema):
{
  "schema": "iglu:com.israel365/temperature_measure/jsonschema/1-0-0",
  "data": {
    "timestamp": "2016-11-16 19:53:21",
    "location": "Berlin",
    "temperature": 3,
    "units": "Centigrade"
  }
}
Schema:
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "description": "Schema for a temperature measure event",
  "self": {
    "vendor": "com.israel365",
    "name": "temperature_measure",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "timestamp": { "type": "string" },
    "location": { "type": "string" },
    …
  },
  …
}
9. The schemas can then be used in a number of ways
• Validate the data (important for data quality)
• Load the data into tidy tables in your data warehouse
• Make it easy / safe to write downstream data processing applications (e.g. for real-time users)
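As an illustration of the validation step, here is a minimal sketch using the Python jsonschema package against a trimmed-down version of the temperature_measure schema above; the "required" list is an assumption added for the example.

from jsonschema import ValidationError, validate  # pip install jsonschema

# Trimmed version of the schema shown earlier; "required" is an assumption.
schema = {
    "type": "object",
    "properties": {
        "timestamp": {"type": "string"},
        "location": {"type": "string"},
        "temperature": {"type": "number"},
    },
    "required": ["timestamp", "location"],
}

good = {"timestamp": "2016-11-16 19:53:21", "location": "Berlin", "temperature": 3}
bad = {"timestamp": "2016-11-16 19:53:21", "temperature": "three"}

for event in (good, bad):
    try:
        validate(instance=event, schema=schema)
        print("valid:", event)
    except ValidationError as err:
        # In a pipeline, invalid events would be routed to a bad-rows sink
        # rather than silently loaded into the warehouse.
        print("invalid:", err.message)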
11. What is event data modeling?
[Pipeline diagram: 1. Validation → 2. Dimension widening → 3. Data modeling]
Event data modeling is the process of using business logic to aggregate over event-level data to produce 'modeled' data that is simpler for querying.
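A minimal sketch of that idea in pandas, with invented sample data and a 30-minute session rule (illustrative business logic, not Snowplow's actual model): aggregate event-level rows into a session-level table that is simpler to query.

import pandas as pd

# Tiny illustrative event-level sample: one row per event.
events = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2"],
    "event": ["page_view", "add_to_basket", "page_view", "page_view"],
    "ts": pd.to_datetime([
        "2017-02-08 10:00:00", "2017-02-08 10:05:00",
        "2017-02-08 12:00:00", "2017-02-08 10:01:00",
    ]),
}).sort_values(["user_id", "ts"])

# Business logic: a new session starts after 30 minutes of inactivity.
gap = events.groupby("user_id")["ts"].diff()
events["session_id"] = (gap.isna() | (gap > pd.Timedelta(minutes=30))).cumsum()

# The modeled table: one row per session instead of one row per event.
sessions = events.groupby(["user_id", "session_id"]).agg(
    started=("ts", "min"),
    ended=("ts", "max"),
    n_events=("event", "size"),
).reset_index()
print(sessions)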
13. In general, event data modeling is performed on the complete event stream
• Late-arriving events can change the way you understand earlier-arriving events
• If we change our data models, this gives us the flexibility to recompute historical data based on the new model
15. How do we handle pipeline evolution?
Push factors: what is being tracked will change over time.
Pull factors: what questions are being asked of the data will change over time.
Businesses are not static, so event pipelines should not be either.
[Architecture diagram: event sources (web, apps, servers, comms channels, push, smart car / home, …) feed the collection and processing stages, which load a data warehouse (for data exploration, predictive modeling and real-time dashboards) and power real-time, data-driven applications (RT bidder, voucher, personalization, …).]
16. Push example: a new source of event data
• If data is self-describing, it is easy to add additional sources
• Self-describing data is good for managing bad data and pipeline evolution
"I'm an email send event and I have information about the recipient (email address, customer ID) and the email (id, tags, variation)"
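Sketched as a self-describing JSON (shown here as a Python dict), with a hypothetical vendor and schema URI, that email send event might look like this:

# Hypothetical schema URI; the fields follow the slide's description of
# an email send event (recipient details plus email details).
email_send_event = {
    "schema": "iglu:com.acme/email_send/jsonschema/1-0-0",
    "data": {
        "recipient_email": "jane@example.com",
        "customer_id": "c-42",
        "email_id": "welcome-01",
        "tags": ["onboarding"],
        "variation": "B",
    },
}

Because the event carries a reference to its own schema, the pipeline can validate it and load it into its own table without changing how any other event type is processed.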
18. Answering the question: 3 possibilities
1. Existing data model supports the answer: it is possible to answer the question with the existing modeled data.
2. Need to update the data model: the data collected already supports the answer, but additional computation (additional logic) is required in the data modeling step.
3. Need to update the data model and data collection: we need to extend event tracking and update the data models to incorporate the additional data (and potentially additional logic).
19. Self-describing data and the ability to recompute data models are essential to enable pipeline evolution
Self-describing data enables:
• Updating existing events and entities in a backward-compatible way, e.g. adding optional new fields
• Updating existing events and entities in a backwards-incompatible way, e.g. changing field types, removing fields, adding compulsory fields
• Adding new event and entity types
Recomputing data models on the entire data set enables:
• Adding new columns to existing derived tables, e.g. a new audience segmentation
• Changing the way existing derived tables are generated, e.g. changing sessionization logic
• Creating new derived tables
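As a concrete illustration of the backward-compatible case, the sketch below adds an optional field to the temperature_measure schema from earlier. Snowplow versions schemas as MODEL-REVISION-ADDITION (SchemaVer), so an optional addition bumps 1-0-0 to 1-0-1; the new field itself is invented for the example.

# Backward-compatible evolution: an optional field is added and the
# ADDITION component of the version is bumped.
schema_1_0_1 = {
    "self": {
        "vendor": "com.israel365",
        "name": "temperature_measure",
        "format": "jsonschema",
        "version": "1-0-1",  # was 1-0-0
    },
    "type": "object",
    "properties": {
        "timestamp": {"type": "string"},
        "location": {"type": "string"},
        "temperature": {"type": "number"},
        "humidity": {"type": "number"},  # new, optional, hypothetical
    },
}

# Old events (without "humidity") still validate against 1-0-1, so
# historical and new data can sit side by side and data models can be
# recomputed over the entire event set.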