Companies like Buffer, SeatGeek, and Asana aren’t just talking about the value of data; they’re building data infrastructure that can actually deliver it. Join this 45-minute webinar to learn why these companies are investing in data and what you need to know to keep up.
Customer analysis is becoming more popular, and more complicated. As a result, many companies are turning to Mixpanel to move beyond simple visitor information, and start tracking the actions their users are taking. But what’s the best way to analyze this data?
That’s the subject of our latest webinar, Analyzing Mixpanel Data with SQL. Andy Granowitz from Wagon and Shaun McAvinney from RJMetrics will discuss the importance of analyzing Mixpanel data alongside data from your other data sources, the challenges in consolidating that data, and how to query that data using SQL to find valuable insights into your users' actions.
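As a sketch of the kind of SQL analysis the webinar describes, the following joins Mixpanel-style event data with a users table from another source. The table and column names are invented for illustration; an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# In-memory stand-in for a warehouse holding Mixpanel-style event data
# alongside a users table from another source (e.g. a CRM).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (user_id INTEGER, action TEXT, occurred_at TEXT);
    CREATE TABLE users  (user_id INTEGER, plan TEXT);
    INSERT INTO events VALUES
        (1, 'signup', '2016-01-01'), (1, 'upload', '2016-01-02'),
        (2, 'signup', '2016-01-01'), (2, 'upload', '2016-01-03'),
        (2, 'upload', '2016-01-04'), (3, 'signup', '2016-01-05');
    INSERT INTO users VALUES (1, 'free'), (2, 'pro'), (3, 'free');
""")

# Join user actions with user attributes: uploads per plan.
rows = conn.execute("""
    SELECT u.plan, COUNT(*) AS uploads
    FROM events e
    JOIN users u ON u.user_id = e.user_id
    WHERE e.action = 'upload'
    GROUP BY u.plan
    ORDER BY u.plan
""").fetchall()
print(rows)  # [('free', 1), ('pro', 2)]
```

The point of consolidating the data first is exactly this join: event counts alone say little until they can be grouped by attributes that live in another system.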
Salesforce & SQL: Get More from Your CRM Data Using the Tools You Love - Janessa Lantz
Salesforce is one of your company’s core data sources, so why is it so difficult to work with the data? This webinar explains how you can take back your CRM data and explore it in completely new ways.
Quick iteration and reusability of metric calculations for powerful data exploration.
At Looker, we want to make it easier for data analysts to service the needs of the data-hungry users in their organizations. We believe too much of their time is spent responding to ad hoc data requests and not enough time is spent building, experimenting, and embellishing a robust model of the business. Worse yet, business users are starving for data, but are forced to make important decisions without access to data that could guide them in the right direction. Looker addresses both of these problems with a YAML-based modeling language called LookML.
This paper walks through a number of data modeling examples, demonstrating how to use LookML to generate, alter, and update reports—without the need to rewrite any SQL. With LookML, you build your business logic by defining your important metrics once and then reusing them throughout a model—allowing rapid iteration of data exploration while also ensuring the accuracy of the SQL that’s generated. Small updates are quick and can be made immediately available to business users to manipulate, iterate, and transform in any way they see fit.
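The define-once, reuse-everywhere idea can be illustrated with a toy sketch. This is plain Python, not actual LookML, and all names are invented; it only shows why generating SQL from a single metric definition keeps every report consistent.

```python
# Toy illustration of the define-once idea: each metric's SQL fragment
# is declared exactly once, and queries are generated from it, so
# analysts never rewrite the aggregation by hand. All names invented.
METRICS = {
    "total_revenue": "SUM(order_amount)",
    "order_count": "COUNT(*)",
}

def explore(table, metric, group_by):
    """Generate SQL for a metric grouped by a dimension."""
    return (f"SELECT {group_by}, {METRICS[metric]} AS {metric} "
            f"FROM {table} GROUP BY {group_by}")

# The same metric definition is reused for two different explorations:
print(explore("orders", "total_revenue", "region"))
print(explore("orders", "total_revenue", "month"))
```

Changing the definition in one place (say, excluding refunds from `total_revenue`) would update every report generated from it, which is the accuracy guarantee described above.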
From Architecture to Analytics: A Look at Simply Business’s Data Strategy - Looker
Revamping Your Data Approach to Enable the Pace & Flexibility Needed to Make Timely Business Decisions
This slide deck is from Stewart Duncan of Simply Business and Zach Taylor of Looker. They discussed how Simply Business revamped their data approach to enable the pace and flexibility they needed to make timely decisions based on their data.
Simply Business offers a B2B web product that simplifies the experience of selecting and purchasing business insurance. With over 300,000 policyholders, Simply Business is the largest SME insurer in the UK. Removing friction for users as they complete the insurance quote process is critical to providing a hassle-free user experience and driving continuous improvements in conversion rates. Learn how data analytics helps them strive for these improvements.
Get ideas on the following:
- Using data for rapid A/B testing and to analyze user journeys to inform product development.
- Transitioning from a traditional data architecture to a modern cloud-based stack integrating MongoDB, AWS Redshift, Hadoop, and Looker.
- Designing a data platform and organizational process that drives data-driven behavior across an organization.
- Empowering product and marketing teams to dig into granular online engagement and attribution data, removing the bottleneck to making informed decisions.
Presented at UnGagged Los Angeles, November 7, 2019
Google Analytics 360 is great – if you can afford it.
But what if you’re in that awkward phase: a growing business that’s starting to bump up against the limits of what you can do with the free version, but not yet large enough to cough up the cash for 360?
In this session, Dana will cover a plan for getting more out of Google Analytics without resorting to expensive add-ons.
Lessons from Digital Natives: How Retailers Power their Businesses with DataOps - Nexla
The next generation of digital native retailers are fluent in data. They’re using it to create new customer experiences and rapidly scale their businesses. Learn how companies like Instacart and Poshmark are leveraging data operations to delight customers, onboard new vendor partners, and work smarter. Presented at the NRF's Shop.org conference, this presentation explains the three simple rules for taking control of your data.
Data Science Resources for Project and Product Managers - Brian Lynch
Presentation slides for the Charlottetown Product & Project Managers and Business Analyst Group by Darcy Norman. Provides a range of recommended data science resources for managers.
customTask: Your New Google Analytics BFF - Dana DiTomaso
If you’re an advanced Google Analytics and Google Tag Manager user, you might be familiar with the Measurement Protocol and how it works. But are you using it to its fullest potential? Introducing customTask, one of the most powerful ways to customize and control your Google Analytics implementation. In this talk you’ll learn the basics of how customTask works, along with implementation ideas and tips to get you started.
Presented at BrightonSEO, April 2019
Building the Ideal Stack for Machine Learning - SingleStore
Machine Learning is not new, but its application across memory-optimized distributed systems has led to an explosion in both the number and capability of its uses. Pandora develops personalized content recommendations with machine learning algorithms, Tesla has produced the first widely distributed autonomous vehicle, and Amazon uses autonomous robots to move packages within its warehouses and even deliver packages. When coupled with real-time data, advanced analytics approaches like machine learning and deep learning create immediate business opportunities.
Machine learning has never been more accessible—if your data pipelines support real-time analysis. Attendees will learn tools and techniques for integrating machine learning models across industries and organizations. Steven Camiña, MemSQL Product Manager, will walk through critical technologies needed in your technology ecosystem, including Python, Apache Kafka, Apache Spark, and a real-time database.
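As a minimal sketch of the scoring loop such a real-time pipeline ends in: here a plain Python queue stands in for a Kafka topic, and a trivial threshold rule stands in for a trained model. Both substitutions are assumptions purely for illustration; a real deployment would use a Kafka consumer and a fitted model.

```python
from queue import Queue

# A Queue stands in for a Kafka topic; in a real stack a consumer
# would read events from the broker as they arrive.
topic = Queue()
for amount in [12.0, 950.0, 33.5, 1200.0]:
    topic.put({"amount": amount})

def score(event):
    """Trivial stand-in for a trained model: flag large transactions."""
    return event["amount"] > 500

flagged = []
while not topic.empty():
    event = topic.get()
    if score(event):  # score each event as it streams through
        flagged.append(event["amount"])

print(flagged)  # [950.0, 1200.0]
```

The structural point is that the model is applied per event as data flows, rather than in a nightly batch, which is what makes the "immediate business opportunities" above possible.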
Learn how to grow your mobile app.
Lessons from Kamo Asatryan, founder of LOLApps and Revsmirk.
Learn about Ad Networks, User Acquisition, Mobile Growth.
5 Google Analytics Features You Should Be Using - MatchCraft
Google Analytics is jam-packed with features that give you an instant status on your website’s health, mishaps and opportunities. The only problem? Most marketers don’t have some of the best features of Google Analytics enabled—leaving opportunity for optimization on the table.
These five features of Google Analytics are just a few of our favorites!
Overcoming Technical SEO Challenges for Enterprise Sites - LearnInbound 2019 - Sam Marsden
You might have a standard set of processes and fixes when dealing with normal-sized sites, but how does that change when you start working with large enterprise sites? How do you adapt SEO processes to work effectively for clients with these needs? In this session, Sam will provide efficient and effective strategies on how to tackle complex SEO challenges for enterprise level sites.
Achieving Massive Concurrency & Sub-second Query Latency on Cloud Warehouses ... - Alluxio, Inc.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Achieving Massive Concurrency & Sub-second Query Latency on Cloud Warehouses & Data Lakes with Kyligence Cloud
George Demarest, Head of Marketing, Kyligence
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
SMX London 2019 - Automating Reporting - Data Studio for Search Marketers - Sam Marsden
How can you show off your work to your clients and your boss? Give them a great report showing your work. How can you tell a story with the data that shows why things are working as they are? Create a data-driven story around your report.
Google Data Studio is a free tool that allows you to unlock the power of your data with interactive dashboards and reports that help make smarter business decisions. Even better, you can automate much of the work so it happens while you sleep. This session shows you how.
From Spark to Ignition: Fueling Your Business on Real-Time Analytics - SingleStore
What’s in Store For This Presentation?
1. MemSQL: A real-time database for transactions and analytics
2. Spark Use Cases
3. Example: Geospatial Enhancements
Amazon Neptune is a managed graph database service that stores data as graph structures of nodes and edges, making highly connected data easy to explore. You can find more in our blog entry: https://tinyurl.com/y623ff5j
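Neptune itself is queried through graph languages such as Gremlin or SPARQL. As a rough illustration of the underlying idea, here is a toy in-memory graph and traversal in plain Python; the data and names are invented, and this is not how you would talk to Neptune in practice.

```python
# Toy adjacency-list graph illustrating the kind of connected data a
# graph database like Neptune stores. All data is invented; real
# Neptune queries would use Gremlin or SPARQL instead.
edges = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}

def reachable(start):
    """Return every node reachable from `start` (breadth-first)."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop(0)
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return sorted(seen)

print(reachable("alice"))  # ['bob', 'carol', 'dave']
```

Traversals like this (who is connected to whom, through what) are the queries graph databases are built to answer efficiently at scale.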
All the sources are linked in the presentation.
Enjoy and don't forget to check out our blog and other social media!
LCloud Blog https://bit.ly/2Vgooz4
Facebook https://bit.ly/2tCqBJS
Twitter https://twitter.com/LCLOUD16
LinkedIn https://bit.ly/2syaQCr
YouTube https://bit.ly/2tGV62b
Questions? Feel free to ask:
kontakt@lcloud.pl
https://lcloud.pl/
How to Build a $24 Million Ecommerce Company in 2 Years - Janessa Lantz
Learn the drivers fueling success for best-in-class ecommerce companies, how to identify breakout success within the first six months of your business, and the three things you can't afford to get wrong.
2012 Online User Behavior and Engagement Study - Harris Interactive - Hemant Charya
More than seven in 10 US tablet owners ages 18 to 34, and more than eight in 10 tablet owners ages 35 to 44, looked up product information on their devices after seeing something interesting about it on TV. The behavior was significantly less common among older tablet owners (and, in addition, older consumers are already less likely to own tablets), but still, more than half of those 45 and older did the same.
Hiring Hacks: Under Armour’s Formula for Data-Centric & Personalized Recruiting - GreenhouseSoftware
You think you know how much work it takes to make a strong hire, but have you actually put a number to it? Under Armour has.
Over the course of 2 years, they’ve gathered enough data for MyFitnessPal, a wholly owned subsidiary of Under Armour based in San Francisco, to understand that it takes 200 outbound emails to make a single engineering hire. How did they figure this out? And how does this information affect their recruiting strategy?
Leslie Dutton, Sr. Manager, Talent Acquisition at Under Armour, has used data to help build the foundation of the recruiting program for MyFitnessPal, which tripled the size of the company in under two years and helped lead to their acquisition by Under Armour.
It’s not just about numbers, though. Leslie has found that personalization is the key component of driving a successful recruiting campaign. Join our webinar to learn how you can combine data and personalization to scale your hiring from just barely getting by to off the charts.
In this webinar, you’ll learn:
- Which metrics recruiters should be tracking
- How to make sense of the data you’ve gathered
- Tips to align your recruiting strategies with your product roadmap
- How to differentiate your employee value proposition with personalized messaging
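The 200-emails-per-hire figure cited above is just funnel arithmetic. The sketch below shows how such a number decomposes into stage conversion rates; only the 200:1 ratio comes from the text, and the intermediate rates are invented purely for illustration.

```python
# Funnel arithmetic behind an emails-per-hire number. The 200:1 ratio
# is the figure cited for MyFitnessPal; the stage rates below are
# invented to show how such a ratio decomposes.
stage_rates = {
    "email -> reply": 0.25,
    "reply -> interview": 0.20,
    "interview -> offer": 0.20,
    "offer -> hire": 0.50,
}

overall = 1.0
for rate in stage_rates.values():
    overall *= rate          # multiply through the funnel

emails_per_hire = 1 / overall
print(round(emails_per_hire))  # 200
```

Tracking each stage rate separately (rather than only the end-to-end ratio) is what lets a recruiting team see which step of the funnel to improve.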
[This is work presented at SIGMOD'13.]
The use of large-scale data mining and machine learning has proliferated through the adoption of technologies such as Hadoop, with its simple programming semantics and rich and active ecosystem. This paper presents LinkedIn's Hadoop-based analytics stack, which allows data scientists and machine learning researchers to extract insights and build product features from massive amounts of data. In particular, we present our solutions to the "last mile" issues in providing a rich developer ecosystem. This includes easy ingress from and egress to online systems, and managing workflows as production processes. A key characteristic of our solution is that these distributed system concerns are completely abstracted away from researchers. For example, deploying data back into the online system is simply a 1-line Pig command that a data scientist can add to the end of their script. We also present case studies on how this ecosystem is used to solve problems ranging from recommendations to news feed updates to email digesting to descriptive analytical dashboards for our members.
Using Elastic to Monitor Everything - Christoph Wurm, Elastic - DevOpsDays Tel Aviv
Elasticsearch has come a long way: started as a distributed search engine in 2009, it’s now the tool of choice for even the largest websites (e.g. Facebook, GitHub, eBay). By mid-2016 the ELK stack had helped it become firmly embedded in many centralised log management systems (e.g. Netflix, Uber).
We’re now midair in the next step, with the first folks using it for metrics. NASA is using it to monitor the Curiosity rover, and Blizzard and Riot to monitor vast online gaming worlds.
This talk will focus on what makes this transition from more unstructured to structured data possible.
The presentation provides an overview of how the TERN data infrastructure works. It was part of the Workshop on Approaches to Terrestrial Ecosystem Data Management: From Collection to Synthesis and Beyond, held on 9 March 2016 at the University of Queensland.
FSI201 FINRA’s Managed Data Lake – Next Gen Analytics in the Cloud - Amazon Web Services
FINRA’s Data Lake unlocks the value in its data to accelerate analytics and machine learning at scale. FINRA's Technology group has changed its customer's relationship with data by creating a Managed Data Lake that enables discovery on Petabytes of capital markets data, while saving time and money over traditional analytics solutions. FINRA’s Managed Data Lake includes a centralized data catalog and separates storage from compute, allowing users to query from petabytes of data in seconds. Learn how FINRA uses Spot instances and services such as Amazon S3, Amazon EMR, Amazon Redshift, and AWS Lambda to provide the 'right tool for the right job' at each step in the data processing pipeline. All of this is done while meeting FINRA’s security and compliance responsibilities as a financial regulator.
How to Modernize an IT Architecture with Data Virtualization - Denodo
Watch: https://bit.ly/347ImDf
In the digital era, efficient data management is a fundamental factor in a company’s competitiveness. Most companies, however, face data silos that make working with their data slow and costly. On top of that, the speed, diversity, and volume of data can overwhelm traditional IT architectures.
How can you improve data delivery to extract its full value?
How can you make data available and usable in real time?
The experts at Vault IT and Denodo offer this webinar to show how data virtualization makes it possible to modernize an IT architecture in a context of digital transformation.
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture - DATAVERSITY
Whether to take data ingestion cycles off the ETL tool and the data warehouse, or to facilitate competitive data science and algorithm building in the organization, the data lake – a place for vast, unmodeled data – will be provisioned widely in 2020.
Though it doesn’t have to be complicated, the data lake has a few key design points that are critical, and it does need to follow some principles for success. Build the data lake, but avoid building the data swamp! The tool ecosystem is building up around the data lake, and soon many organizations will have both a robust lake and a data warehouse. We will discuss policies for keeping them straight, sending data to its best platform, and keeping users’ confidence up in their data platforms.
Data lakes will be built in cloud object storage. We’ll discuss the options there as well.
Get this data point for your data lake journey.
Take Action: The New Reality of Data-Driven Business - Inside Analysis
The Briefing Room with Dr. Robin Bloor and WebAction
Live Webcast on July 23, 2014
Watch the archive:
https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=360d371d3a49ad256942f55350aa0a8b
The waiting used to be the hardest part, but not anymore. Today’s cutting-edge enterprises can seize opportunities faster than ever, thanks to an array of technologies that enable real-time responsiveness across the spectrum of business processes. Early adopters are solving critical business challenges by enabling the rapid-fire design, development and production of very specific applications. Functionality can range from improved customer engagement to dynamic machine-to-machine interactions.
Register for this episode of The Briefing Room to learn from veteran Analyst Dr. Robin Bloor, who will tout a new era in data-driven organizations, and why a data flow architecture will soon be critical for industry leaders. He’ll be briefed by Sami Akbay of WebAction, who will showcase his company’s real-time data management platform, which combines all the component parts needed to access, process and leverage data big and small. He’ll explain how this new approach can provide game-changing power to organizations of all types and sizes.
Visit InsideAnalysis.com for more information.
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris... - DATAVERSITY
Thirty years is a long time for a technology foundation to be as active as relational databases. Are their replacements here?
In this webinar, we look at this foundational technology for modern Data Management and show how it evolved to meet the workloads of today, as well as when other platforms make sense for enterprise data.
Watch Paul's session from Fast Data Strategy on-demand here: https://goo.gl/3veKqw
"Through 2020, 50% of enterprises will implement some form of data virtualization as one enterprise production option for data integration" according to Gartner. It is clear that data virtualization has become a driving force for companies to implement an agile, real-time and flexible enterprise data architecture.
Attend this session to learn:
• What data virtualization actually means and how it differs from traditional data integration approaches
• The most important use cases and key patterns of data virtualization
• The benefits of data virtualization
Transforming Devon’s Data Pipeline with an Open Source Data Hub—Built on Data... - Databricks
How did Devon move from a traditional reporting and data warehouse approach to a modern data lake? What did it take to go from a slow and brittle technical landscape to a flexible, scalable, and agile platform? In the past, Devon addressed data solutions in dozens of ways depending on the user and the requirements. Through a visionary program, driven by Databricks, Devon has begun a transformation of how it consumes data and enables engineers, analysts, and IT developers to deliver data-driven solutions along all levels of the data analytics spectrum. We will share the vision, technical architecture, influential decisions, and lessons learned from our journey. Join us to hear the unique Databricks success story at Devon.
Data Engineer's Lunch #60: Series - Developing Enterprise Consciousness - Anant Corporation
In Data Engineer's Lunch #60, Rahul Singh, CEO here at Anant, will discuss modern data processing/pipeline approaches.
Want to learn about modern data engineering patterns & practices for global data platforms? A high-level overview of different types, frameworks, and workflows in data processing and pipeline design.
Managing Large Amounts of Data with Salesforce - Sense Corp
Critical "design skew" problems and solutions - Engaging Big Objects, MuleSoft, Snowflake and Tableau at the right time
Salesforce’s ability to handle large workloads and participate in high-consumption, mobile-application-powering technologies continues to evolve. Pub/sub models and the investment in adjacent properties like Snowflake, Kafka, and MuleSoft have broadened the development scope of Salesforce. Solutions now range from internal, in-platform applications to fueling world-scale mobile applications and integrations. Unfortunately, guidance on these extended capabilities is not well understood or documented. Knowing when to move your solution to a higher-order architecture is an important architect skill.
In this webinar, Paul McCollum, UXMC and Technical Architect at Sense Corp, will present an overview of data and architecture considerations. You’ll learn to identify reasons and guidelines for updating your solutions to larger-scale, modern reference infrastructures, and when to introduce products like Big Objects, Kafka, MuleSoft, and Snowflake.
Against the backdrop of Big Data, the Chief Data Officer, by any name, is emerging as the central player in the business of data, including cybersecurity. The MITCDOIQ Symposium explored the developing landscape, from local organizational issues to global challenges, through case studies from industry, academic, government and healthcare leaders.
Joe Caserta, president at Caserta Concepts, presented "Big Data's Impact on the Enterprise" at the MITCDOIQ Symposium.
Presentation Abstract: Organizations are challenged with managing an unprecedented volume of structured and unstructured data coming into the enterprise from a variety of verified and unverified sources. With that is the urgency to rapidly maximize value while also maintaining high data quality.
Today we start with some history and the components of data governance and information quality necessary for successful solutions. I then bring it all to life with 2 client success stories, one in healthcare and the other in banking and financial services. These case histories illustrate how accurate, complete, consistent and reliable data results in a competitive advantage and enhanced end-user and customer satisfaction.
To learn more, visit www.casertaconcepts.com
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod... - Hortonworks
Many enterprises are turning to Apache Hadoop to enable Big Data Analytics and reduce the costs of traditional data warehousing. Yet, it is hard to succeed when 80% of the time is spent on moving data and only 20% on using it. It’s time to swap the 80/20! The Big Data experts at Attunity and Hortonworks have a solution for accelerating data movement into and out of Hadoop that enables faster time-to-value for Big Data projects and a more complete and trusted view of your business. Join us to learn how this solution can work for you.
50-55 hours Training + Assignments + Actual Project Based Case Studies
All attendees will receive,
Assignment after each module, Video recording of every session
Notes and study material for examples covered.
Access to the Training Blog & Repository of Materials
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. I’ll include use cases so you can see what approach will work best for your big data needs.
2. #datastack
What you’re going to learn:
1. How top engineering organizations are building their data infrastructure
2. The 7 core challenges of data integration
3. Why companies like Asana, Buffer, and SeatGeek choose Redshift for their analytics warehouse
...and much more!
3. #datastack
Shaun
4. #datastack
The traditional approach: ETL (Dillon)
[Diagram: data flows from source systems through an ETL team, an EDW team, and a BI team before reaching the end user; heavy up-front transformation, OLAP cubes and silos, and restricted Q&A, with only summary tables exposed]
5. #datastack
How companies are doing it today: ELT (Dillon)
[Diagram: data is extracted and loaded into the database in raw form, then transformed (and explored!) at query time through a modeling layer, with viz & exploration in the analytics layer. Example LookML from the slide:]
- name: first_purchasers
  type: single_value
  base_view: orders
  measures: [orders.customer.all]
6. #datastack
Benefits of this approach (Dillon)
1. Redshift is performant enough to handle most transformations
2. Users prefer performing transformations in a language they already use (SQL) or with a UI
3. Transformations are much simpler and more transparent
4. Performing transformations alongside raw data is great for auditability
15. #datastack
Quick poll (Shaun)
What top five data sources are a top priority for you to integrate/keep integrated?
● production databases
● events
● error logs
● billing
● email marketing
● crm
● advertising
● erp
● a/b testing
● support
16. #datastack
“A year ago, we were facing a lot of stability problems with our data processing. When there was a major shift in a graph, people immediately questioned the data integrity. It was hard to distinguish interesting insights from bugs. Data science is already an art, so you need the infrastructure to give you trustworthy answers to the questions you ask. 99% correctness is not good enough. And on the data infrastructure team, we were spending a lot of time churning on fighting urgent fires, and that prevented us from making much long-term progress. It was painful.”
- Marco Gallotta, Asana, How to Build Stable, Accessible Data Infrastructure at a Startup
17. #datastack
“Our story would end here if real-time processing were perfect. But it’s not: some events can come in days late, some time ranges need to be re-processed after initial ingestion due to code changes or data revisions, various components of the real-time pipeline can fail, and so on.”
- Gian Merlino, MetaMarkets, Building a Data Pipeline That Handles Billions of Events in Real-Time
18. #datastack
7 core challenges of data integration (Shaun)
1. Connections: every API is a unique and special snowflake
2. Accuracy: ordering data on a distributed system
3. Latency: large object data stores (Amazon S3, Redshift) are optimized for batches, not streams
4. Scale: data will grow exponentially as your company grows
5. Flexibility: you’re interacting with systems you don’t control
6. Monitoring: notifications for expired credentials, errors, and disruptions
7. Maintenance: justifying investment in ongoing maintenance/improvement
27. #datastack
A broken model (Dillon)
● Feedback loop is broken
● Disparate reporting
● Non-unified decision making
● Versioning
● Reusability is lost
[Diagram: siloed Marketing, Finance, and AM teams]
28. #datastack
Constraints of SQL (Dillon)
● SQL is versatile, but shares the same flavor as write-only languages such as Perl
● Can write but not read
● Promotes one-off, piecemeal analysis
● Disparate interpretation
29. #datastack
The critical multiplier: modeling (Dillon)
[Diagram: a modeling layer sits between users and any SQL data warehouse, answering questions like:]
● What’s our most successful marketing campaign?
● How does our Q4 pipeline look?
● Who are our healthiest / happiest customers?
Good afternoon, everyone! Thanks so much for joining us today. I’m going to introduce you to my co-host in just a second, but first, let me run through just a few housekeeping details.
We have a lot on the agenda for today. The core of our presentation is going to focus on how companies like yours are solving their data infrastructure challenges. We’re going to cover the challenges engineers should expect around data integration, why Amazon Redshift is quickly becoming the data warehouse of choice, cultural barriers to building a data-driven company, and a lot more.
First thing we’re going to cover is data infrastructure, or the actual architecture of legacy and modern data pipelines
For the last 30 years or so, really since the inception of modern databases, data warehousing has been the standard model to aggregate data and provide business-directed analytics
Data is extracted from various sources…. databases, third-party applications, flat files, etc…. and transformed into a predefined model, then loaded into the data warehouse
This ETL process results in data cubes and data silos, where analytics are separated by key groupings for various departments, such as marketing, product, sales, etc.
This results in a few issues that are fundamentally prohibitive to creating a data-driven organization
First, it’s very resource intensive (and expensive) to manage all of the transformations and data loading
Second, it results in latency in the analytics process. End users only have access to pre-defined metrics, which are typically too broad or inflexible to guide nimble decision making. This means that end-users aren’t really getting any actionable insights from these metrics - they’re just looking at high level analysis
Third, it restricts drilling. If an end-user finds an interesting piece of information…. say sales accelerated drastically for a certain user age group, and you want to know why… that end-user needs to make another data request from the ETL or IT team, who will then take some time to return the request. This latency constrains end users from making data-driven decisions.
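To make the contrast concrete, here is a minimal sketch of the legacy ETL pattern just described, using an in-memory SQLite database as a stand-in warehouse (the table, columns, and data are illustrative, not from any real pipeline): data is rolled up into a predefined summary *before* loading, so end users can only query the pre-aggregated cube.

```python
import sqlite3

# Toy "source system" rows: (order_id, region, amount) -- illustrative data.
source_rows = [(1, "US", 120.0), (2, "EU", 80.0), (3, "US", 45.0)]

# Transform BEFORE loading: roll raw orders up into a predefined summary.
summary = {}
for _, region, amount in source_rows:
    summary[region] = summary.get(region, 0.0) + amount

# Load only the aggregate into the "warehouse" -- the raw detail never arrives.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_summary (region TEXT, revenue REAL)")
warehouse.executemany("INSERT INTO sales_summary VALUES (?, ?)", summary.items())

# End users can read the cube, but cannot drill back down to order level.
print(warehouse.execute(
    "SELECT region, revenue FROM sales_summary ORDER BY region").fetchall())
```

Because only `sales_summary` exists in the warehouse, any new question (“why did US sales spike?”) means a fresh request back to the ETL team, which is exactly the latency problem above.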
These were commonly recognized problems. So nowadays, as Shaun was mentioning, modern tech companies have reworked this process
Nowadays, companies are collecting more data than ever before
Additionally, database technology has witnessed significant advances in the last several years... Databases themselves are now capable of performing sophisticated analysis very quickly
This removes the need for data silos and data cubes - all analytics can be performed directly on the central database
What this means, is that it now makes sense to shift the burden of complex transformations to the front of the pipeline - to the BI tool - where transformations can be performed on-the-fly, at query time
Several benefits to this approach, some of which I mentioned a minute ago but are worth repeating:
First, you no longer require a huge, resource-intensive engineering or ETL team to move all of your data - so it’s much cheaper on the resource side
Secondly, technical users can pull data in a language they’re used to, SQL…. and if you have a modeling layer, like Looker provides, then users can actually query the data directly from the UI, without any technical knowledge.
Transformations aren’t being done by engineers on the backend, they’re being performed as the user pulls the data, so they’re much easier to repeat and easier to understand
Lastly, this allows you to audit transformations, so your users understand the components behind the analysis - they’ll understand how a metric is defined
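The ELT alternative can be sketched just as minimally (again with an in-memory SQLite database standing in for the warehouse, and illustrative names): raw rows are loaded untouched, and the “transformation” is just a SQL view evaluated at query time, so it can be read, audited, and redefined without re-running a pipeline.

```python
import sqlite3

# Load raw rows untouched into the "warehouse" (illustrative data).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
warehouse.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                      [(1, "US", 120.0), (2, "EU", 80.0), (3, "US", 45.0)])

# Transform at query time: the "model" is plain SQL, visible and auditable.
warehouse.execute("""
    CREATE VIEW revenue_by_region AS
    SELECT region, SUM(amount) AS revenue FROM orders GROUP BY region
""")

# Users query the transformed view AND can still drill down to raw rows.
print(warehouse.execute("SELECT * FROM revenue_by_region ORDER BY region").fetchall())
print(warehouse.execute("SELECT * FROM orders WHERE region = 'US'").fetchall())
```

Note that the raw `orders` table is still there next to the derived view, which is what makes drilling and auditing possible.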
And Shaun has a few examples of this in practice
Data engineering has gone from being a clumsy, multi-year project to something with real geek cred. Over the past year we’ve watched as one company after the next shared their “how we built our data infrastructure” blog posts. Yes, even Looker. We were really interested in the details behind all these projects, so we did a “meta-analysis” looking at how these companies solved core data engineering challenges.
We looked at Zulily
Spotify
Seatgeek, Buffer, Asana, and many more.
Some of these companies (like Netflix and Spotify) are building data products -- recommendation engines. That stack can look slightly different. For this event, we’re going to focus on companies who are building data infrastructure for analytics. And for these companies what we saw is that the process looks very much like what Dillon was just describing. First, they extract data from the variety of sources. Then they load it into the data warehouse. Then they do transformations on top of that.
Let’s start at the first part of the conversation. Extract & Load, or more simply, data integration.
And just to clarify, the reason this step is so important is because all future insights depend on it. Here are some of the use cases that the Asana team laid out.
“It’s difficult work – but an absolute requirement of great intelligence.”
Here are the most common data sources that we saw companies connecting to. Our analysis of how companies built their data infrastructure was based largely on blog posts (and some conversations) on the topic. One limitation there is that engineers tend to write these pieces fairly soon after completion of the project and there’s often the understanding that more data sources will be added on later. Asana built data connections to the most sources, but there’s an enormous amount of data that can be derived just from connecting ad spend to purchase history living in your production databases.
Now, for some audience participation, could you grab your mouse and fill in this poll? What top five data sources are a top priority for you to integrate and keep integrated?
While you’re filling in your answers, let me just say that data consolidation comes with its own special challenges. When Asana first started building their data infrastructure they did it using Python scripts and MySQL. And if you’re just starting out this can work for you too, but you will outgrow it eventually. And I’m going to say more on that in a second, but first let’s take a look at the results.
So in the Asana team’s own words, here are some of the challenges they faced during consolidation -- doubts about data integrity due to a lack of monitoring and logging, difficulty distinguishing insights from bugs, and urgent fires when systems went down.
And this is from MetaMarkets. Braintree’s team said: deletes are nearly impossible to keep track of, you have to keep track of data that changed, batch updates are slow and it’s difficult to know how long they’ll take.
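The “deletes are nearly impossible to keep track of” problem is easy to see in a toy sketch (invented data, plain Python dicts standing in for a source table and its warehouse copy): an incremental “pull rows newer than the last sync” extraction never observes a delete, so the warehouse silently diverges until you compare full key sets.

```python
# Toy illustration: incremental extraction never sees deletes.
source = {1: "alice", 2: "bob", 3: "carol"}   # current source rows
warehouse = dict(source)                      # initially in sync

del source[2]                                 # a row is deleted upstream

# Incremental sync only pulls NEW keys -- the delete produces no new row.
new_rows = {k: v for k, v in source.items() if k not in warehouse}
warehouse.update(new_rows)                    # sync "succeeds", sees nothing

# Only a full key-set comparison reveals the divergence.
deleted_keys = set(warehouse) - set(source)
print(deleted_keys)  # {2}
```

That full comparison is exactly the kind of slow batch operation the Braintree quote is complaining about.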
A big part of my job involves talking to people every day about their data infrastructure. These posts touch on some of the problems you can expect, but keep in mind -- these people are the successful ones. I’ve been on calls with many a frustrated engineer throwing in the towel on their data infrastructure projects after 1 year at the task. Data consolidation is hard. Here are 7 of the core challenges.
Early last month we released a SaaS product designed to solve this problem -- called Pipeline. It takes data from any number of integrations and that data flows into a data warehouse with super low latency. We’re aggressively releasing new integrations each month, so if you need an integration you don’t see here today, let us know!
If you want to learn more about this, stick around at the end for a demo.
The next step in the process is data warehousing. Hands down the top pick for warehousing was Redshift.
Among the companies that we looked at, Redshift was the most popular choice for an analytics warehouse.
The most common reason? speed. People are seeing dramatic improvements in query time using Redshift.
Asana said that queries that were taking hours now take a few seconds.
Similarly, seatgeek had a critical query that took 20 minutes, now takes half a minute in redshift.
Here are the results of AirBnB tests that show performance in both query time and cost.
Source: http://nerds.airbnb.com/redshift-performance-cost/
Here’s some research from Periscope showing Redshift vs. Postgres shows similar performance gains.
And here is research from DiamondStream showing how much better their internal dashboards performed when built on Redshift vs. MS SQL. I think it’s this final reason why Looker is such a big fan of Redshift and recommends it to their clients.
source: http://www.datasciencecentral.com/profiles/blogs/why-5-companies-chose-amazon-redshift
Right, thanks Shaun... So earlier I talked a bit about the structural differences between old data architecture vs modern data architecture - now I’m going to elaborate a bit on how that architecture impacts business intelligence and analytics work flows
This slide shows workflows with the legacy architecture I described earlier
As a reminder, with legacy architecture, each department is working in silos, all serviced by a central IT or Analyst team
This is fundamentally prohibitive to a data-driven culture for a few reasons:
First, it’s extremely resource-intensive for the central data team to service the needs of their business users.
Second, it creates a bottleneck in the analytics process. You’ll see that the arrows are flowing away from the central data team, and that’s for a specific reason. The data team will provide pre-determined metrics for various departments, then rerun and distribute those metrics periodically. These metrics are typically overly broad and not actionable. If a user has further questions about the analysis…. and that is often the case. How do you know what questions to ask about the data, unless you’ve seen the data already?... If a user has a further question, they need to submit a request to the data team, who may take a few days to turn it around. This latency restricts end-users from making quick, informed business decisions based on their data.
Plus, in most companies, there is typically a hierarchy to who receives data. The Executive team can get all the data they want, while requests from sales reps, marketing managers, etc. are pushed to the back of the line. These groups rarely have the ability to make strategic decisions based on the analysis they request
Lastly, this model results in disparate reporting. If 5 different departments request the same metric from 5 different database analysts, it’s highly likely that those analysts will have differing ideas about the appropriate way to calculate a metric. Especially when you get into the more sophisticated stuff - things like Affinity Analysis... if I buy X what is the likelihood I buy Y?.... There are a few statistically defensible ways to calculate that metric. In practice, it’s very common for large organizations to have non-unified definitions, which leads to headaches, data chaos, and an inability to make decisions based on data
One of the factors that contributes to these workflow issues, which is sort of the last point I touched on, is the difficulty in consistently defining metrics across a company
Part of this is because of the nature of SQL, the de facto language for querying databases
SQL can be easy to write, but difficult to read / audit
If you give 10 analysts the same metrics, you’ll very likely get 10 different queries, some of which may yield the same results, some of which may not
In practice, this often results in data analysts recycling and slightly modifying old queries, without ever really understanding the inner workings of the query
This then jeopardizes the integrity of the data, which makes it difficult to consistently interpret results
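Here’s a toy illustration of that “disparate interpretation” problem (SQLite, invented data): two analysts write superficially similar queries for “average order value”, but a definitional choice — here, whether refunded orders count — is buried inside the SQL and produces two different numbers for the “same” metric.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, amount REAL, refunded INTEGER)")
db.executemany("INSERT INTO orders VALUES (?, ?, ?)",
               [(1, 100.0, 0), (2, 50.0, 0), (3, 30.0, 1)])

# Analyst 1: average over all orders.
aov_1 = db.execute("SELECT AVG(amount) FROM orders").fetchone()[0]

# Analyst 2: average excluding refunds -- same metric name, different number.
aov_2 = db.execute(
    "SELECT SUM(amount) / COUNT(*) FROM orders WHERE refunded = 0"
).fetchone()[0]

print(aov_1, aov_2)  # 60.0 vs 75.0 -- both reported as "average order value"
```

Unless someone reads both queries closely, the two departments will argue over whose dashboard is “right”.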
How do we solve this issue of one-off queries and silo’d reporting?
We create a data model as an intermediary
All definitions of metrics, and data transformations, are defined in one place, where all users can access and understand them
Now, you don’t need those 10 analysts, you only need 1-2 who monitor the modeling layer, and you can be confident all users are working off of the same definitions and interpretations of the results
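As a drastically simplified sketch of what a modeling layer does (this is not LookML or any real product’s API — just a hypothetical dictionary of metric definitions): each metric’s SQL lives in exactly one place, and every query is generated from that single definition, so all users necessarily get the same calculation.

```python
import sqlite3

# One shared metric definition -- the "model". Names are hypothetical.
METRICS = {
    "total_revenue": "SUM(amount)",
    "order_count":   "COUNT(*)",
}

def build_query(metric: str, table: str, group_by: str) -> str:
    """Generate SQL from the single shared definition of a metric."""
    return (f"SELECT {group_by}, {METRICS[metric]} AS {metric} "
            f"FROM {table} GROUP BY {group_by}")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (region TEXT, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?)",
               [("US", 120.0), ("EU", 80.0), ("US", 45.0)])

sql = build_query("total_revenue", "orders", "region")
print(sql)
print(db.execute(sql + " ORDER BY region").fetchall())
```

Changing the definition in `METRICS` changes it everywhere at once, which is the reusability and governance point above.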
You can also link together data from different sources, so you can link Salesforce, Marketo, and Zendesk data together to get a comprehensive view of your customer
This allows us to maintain “data governance”, which is a term that you probably hear a lot lately
So, how does this modeling layer impact workflows?
This slide depicts BI and analysis workflows with modern architecture
This creates a truly data-driven environment
All users have equal access to the data through a UI, they don’t need to know SQL. So now Sales, marketing, finance, customer success, teams that previously could not directly access data, have the ability to explore their database in full detail
Since everyone is looking at the same numbers and reports, business users can collaborate and facilitate meaningful conversations, based on shared insights
Business users can make informed strategic decisions on the fly, which results in tangible, significant competitive advantages
So, how do you set up this kind of architecture?
I think a good example of this is one of our customers, Infectious Media, who offer digital advertising for a myriad of Fortune companies. With Looker, their Sales optimization team has the ability to see, in real time, how various advertising campaigns are performing across every website and publisher. If a certain type of website is driving the most clicks or conversions, the optimization team can immediately determine why, then redirect future campaign efforts towards those specific websites or publishers, and perhaps new, similar ones. In a world where advertisements sometimes only last a week or two, the ability to constantly iterate on, and refine campaign strategy, results in tangible differences in top line sales. This represents the most significant competitive advantage a company in this space can possess… This model is required for a company to survive.
Now that we understand the benefits, I’ll explain how the set-up of these modern infrastructures is easier than ever. And I’ll illustrate this with an example using RJ Pipeline
Say you’re a company that collects data from a number of various sources, such as 3rd party applications
Rather than needing to perform complex transformations (like with legacy architecture), you can dump all of your data directly into a centralized location using a middleware tool such as RJ Pipeline. This completely centralizes all of your data, and prepares it for analytics, with a few clicks. No need for heavy engineering resources and workloads
Once the data is centralized, you can quickly add a tool with modeling layers to help distribute data to all of your end users (again, the modeling layer is key here)
Working with a tool like Looker, for example, we have an offering called Looker Blocks, which is essentially pre-templated code for your modeling layer for all sorts of third-party applications and types of analysis…. These Blocks can be copied into your data model, so now even most of the actual data model development is initially taken care of for you
The result, is going from having silo’d data in several disparate applications with unequal access for users…. to having data centralized in a modern database, with a full analytics suite on top, that can be accessed by any user
What would have taken… quite literally…. months of intensive engineering effort is now accomplished in one, two, or three weeks… Which is pretty astounding. That time-to-value from your data is something we’ve never really seen before in the data space.