Treasure Data is a cloud data service that provides data collection, storage, and analysis capabilities. It aims to simplify big data analytics by handling infrastructure setup and maintenance. The presentation discusses Treasure Data's platform and architecture, including its use of open source technologies like Fluentd for data collection and Hadoop for storage and processing. Case studies are presented showing how customers were able to get analytics systems into production within two weeks using Treasure Data.
Azure Days 2019: Keynote Azure Switzerland – Status Quo and Outlook (Primo A... – Trivadis
The Azure cloud has arrived in Switzerland. In this session, Primo Amrein, Cloud Lead at Microsoft Switzerland, examines the introduction of the Azure cloud in Switzerland, reports on success stories and lessons learned, and rounds the session off with an outlook on the roadmap.
AWS Summit 2013 | India - Petabyte Scale Data Warehousing at Low Cost, Abhish... – Amazon Web Services
Amazon Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. In this session we'll give an introduction to the service and its pricing before diving into how it delivers fast query performance on data sets ranging from hundreds of gigabytes to a petabyte or more.
Data Saturday Malta - ADX Azure Data Explorer overview – Riccardo Zamana
This session takes a step-by-step approach to the entire ecosystem of features driven by Azure Data Explorer. It includes many examples in the Kusto dialect, showing how to acquire data, process it, and build complete web interfaces using only one service: ADX.
Customer Experience at Disney+ Through Data Perspective – Databricks
Disney+ has rapidly scaled to provide a personalized and seamless experience to tens of millions of customers. This experience is powered by a robust data platform that ingests, processes and surfaces billions of events per hour using Delta Lake, Databricks, and AWS technologies. The data produced by the platform is used by a multitude of services, including a recommendation engine for personalized experiences, watch-experience optimization (including Group Watch), and fraud and abuse prevention.
In this session, you will learn how Disney+ built these capabilities, the architecture, technologies, design principles, and technical details that make it possible.
Analyzing big data is a challenge, requiring lots of processing power and storage.
Cloud Computing is an ideal platform to tackle this problem. HDInsight on Microsoft Azure deploys Hadoop and other open source big data tools to the cloud, making it easier to take advantage of the high scalability of this platform.
In this session, you will learn what tools are available in HDInsight and how to use them to store, process, and analyze large amounts of data.
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs. This is a huge deck with lots of screenshots, so you can see exactly how it works.
Getting to 1.5M Ads/sec: How DataXu manages Big Data – Qubole
DataXu sits at the heart of the all-digital world, providing a data platform that manages tens of millions of dollars of digital advertising investments from Global 500 brands. The DataXu data platform evaluates 1.5 million online ad opportunities every second for our customers, allowing them to manage and optimize their marketing investments across all digital channels. DataXu employs a wide range of AWS services to run its workloads at scale: CloudFront, CloudTrail, CloudWatch, Data Pipeline, Direct Connect, DynamoDB, EC2, EMR, Glacier, IAM, Kinesis, RDS, Redshift, Route 53, S3, SNS, SQS, and VPC.
In addition, DataXu uses Qubole Data Service (QDS) to offer a unified analytics interface to DataXu customers. Qubole, a member of the AWS Partner Network (APN), provides self-managing Big Data infrastructure in the cloud that leverages spot pricing for cost efficiency, delivers fast performance, and, most importantly, offers a streamlined user interface for ease of use.
Attendees will learn how Qubole's self-managing Hadoop clusters in the AWS Cloud accelerated DataXu's batch-oriented analysis jobs, and how Qubole's integration with Amazon Redshift enabled DataXu to perform low-latency, interactive analysis. We'll also look at how DataXu opened up QDS access to their customers through the QDS user interface, providing them with a single tool for both batch-oriented and interactive analysis. Using the QDS user interface, buyers of the DataXu data service could perform all manner of analysis against the data stored in their AWS S3 bucket.
Speakers:
Scott Ward
Solutions Architect at Amazon Web Services
Ashish Dubey
Solutions Architect at Qubole
Yekesa Kosuru
VP Engineering at DataXu
Organizations are grappling with manually classifying and inventorying distributed, heterogeneous data assets in order to deliver value. The new Azure service for enterprises, Azure Synapse Analytics, is poised to help organizations fill the gap between data warehouses and data lakes.
Using real time big data analytics for competitive advantage – Amazon Web Services
Many organisations find it challenging to successfully perform real-time data analytics using their own on premise IT infrastructure. Building a system that can adapt and scale rapidly to handle dramatic increases in transaction loads can potentially be quite a costly and time consuming exercise.
Most of the time, infrastructure is under-utilised and it’s near impossible for organisations to forecast the amount of computing power they will need in the future to serve their customers and suppliers.
To overcome these challenges, organisations can instead utilise the cloud to support their real-time data analytics activities. Scalable, agile and secure, cloud-based infrastructure enables organisations to quickly spin up infrastructure to support their data analytics projects exactly when it is needed. Importantly, they can ‘switch off’ infrastructure when it is not.
BluePi Consulting and Amazon Web Services (AWS) are giving you the opportunity to discover how organisations are using real time data analytics to gain new insights from their information to improve the customer experience and drive competitive advantage.
Privacy has become one of the most critical topics in data today. It is about more than how we ingest and consume data; it is about how you protect your customers' rights while balancing business needs. In our session, Privacera CTO Don Bosco Durai joins Northwestern Mutual to detail an important privacy use case, and then shows how to scale privacy with a focus on business needs, making scaling effortless.
Building a Data Hub that Empowers Customer Insight (Technical Workshop) – Cloudera, Inc.
We have seen the BI and Data Science fields evolve from the structured data warehouse to the data lake and, finally, to the data hub. This session will cover the key steps required to build a data hub, examining how best to align and engage stakeholders and develop architectural sanction to enable your organisation to realise new customer insights and better achieve business objectives.
Azure Synapse is Microsoft's new cloud analytics service offering that combines enterprise data warehouse and Big Data analytics capabilities. It offers a powerful and streamlined platform to facilitate the process of consolidating, storing, curating and analysing your data to generate reliable and actionable business insights.
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ... – Data Con LA
Learn how to benefit from IoT (internet of things) to reduce costs and spur transformation for your company and clients. Attendees will learn about building blocks to create an IoT solution, and walk through real life architectural decisions in building a solution.
Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ... – HostedbyConfluent
Due to the explosion of IoT, we have streaming data that needs to be processed in real time and made available both to applications and to analytics scenarios such as anomaly detection. This workshop presents a solution using Confluent Cloud on Azure, Azure Cosmos DB and Azure Synapse Analytics, connected securely within an Azure VNet using Azure Private Link configured on the Kafka clusters.
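As a toy illustration of the anomaly-detection scenario the workshop mentions (not part of its Confluent/Azure stack), a rolling z-score check over a stream of sensor readings might look like this; all values and thresholds here are invented for the example:

```python
from collections import deque
import math

def make_zscore_detector(window=50, threshold=3.0):
    """Flag readings that deviate strongly from the recent rolling window."""
    history = deque(maxlen=window)

    def check(value):
        if len(history) >= 10:  # need a minimal baseline first
            mean = sum(history) / len(history)
            var = sum((x - mean) ** 2 for x in history) / len(history)
            std = math.sqrt(var)
            anomalous = std > 0 and abs(value - mean) / std > threshold
        else:
            anomalous = False
        history.append(value)
        return anomalous

    return check

detect = make_zscore_detector()
# 40 stable readings around 20, then one spike
readings = [20.0 + 0.1 * (i % 5) for i in range(40)] + [95.0]
flags = [detect(r) for r in readings]
print(flags[-1])  # only the spike is flagged
```

A production pipeline would of course run this (or a proper model) over a Kafka consumer rather than a Python list, but the windowed-statistics idea is the same.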
Part 3 - Modern Data Warehouse with Azure Synapse – Nilesh Gule
Slide deck for the third part of building a Modern Data Warehouse using Azure. This session covered Azure Synapse, formerly SQL Data Warehouse. We look at the Azure Synapse architecture, external files, and integration with Azure Data Factory.
The recording of the session is available on YouTube
https://www.youtube.com/watch?v=LZlu6_rFzm8&WT.mc_id=DP-MVP-5003170
AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi... – Amazon Web Services
Redis Labs' CMO hosts a fireside chat with leaders from multiple industries, including Groupon (e-commerce), Intuit (finance), and LifeLock (identity protection). This conversation-style session will cover the Big Data challenges these leading companies face as they scale their applications, ensure high availability, serve the best user experience at the lowest latencies, and optimize between cloud and on-premises operations. The introductory-level session will appeal to both developer and DevOps functions. Attendees will hear about diverse use cases such as recommendation engines, hybrid transaction and analytics operations, and time-series data analysis, and will learn how the Redis in-memory database platform addresses these use cases cost-effectively with its multi-model capability to meet the needs of next-generation applications. Session sponsored by Redis Labs.
In this presentation, Kaz Ohta, Kiyoto Tamura, and Ankush Rustagi from Treasure Data describe the company's Cloud Data Warehouse service.
"The Treasure Data Cloud Data Warehouse service enables companies to get big data analytics running in days not months without specialist IT resources and for a tenth the cost of other alternatives. Traditional data warehousing solutions - even modern alternatives such as Hadoop - are too expensive, complex and take too long for many companies to implement, so the idea of quickly launching a data warehouse service that uses the power and economics of the Cloud for companies of any size, opens up a huge potential market."
Learn more at: http://treasure-data.com * Watch the presentation video: http://inside-bigdata.com/?p=3531
Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of business... – Looker
Enterprise companies are struggling to manage increasing demands for data with legacy BI tools. By centralizing their data in Vertica, SnagAJob, an online marketplace for hourly jobs with over 60 million users, can now use Looker to create a single source of truth and put data in the hands of decision-makers across the company.
Simply Business is a leading insurance provider for small businesses in the UK, and we are now expanding to the USA. In this presentation, I explain how our data platform is evolving to keep delivering value and to adapt to a company that changes really fast.
Big Data brings big promise and also big challenges, the most important being the ability to deliver value to business stakeholders who are not data scientists!
Start Getting Your Feet Wet in Open Source Machine and Deep Learning – Ian Gomez
At H2O.ai we see a world where all software will incorporate AI, and we’re focused on bringing AI to business through software. H2O.ai is the maker behind H2O, the leading open source machine and deep learning platform for smarter applications and data products. H2O operationalizes data science by developing and deploying algorithms and models for R, Python and the Sparkling Water API for Spark.
In this webinar, you will learn about the scalable H2O core platform and the distributed algorithms it supports. H2O integrates seamlessly with the R and Python environments. We will show you how to leverage the power of H2O algorithms in R, Python, and the H2O Flow interface. Come with an open mind and some high-level knowledge of machine learning, and you will take away a stream of knowledge for your next ML/DL project.
Amy Wang is a math hacker at H2O, as well as the Sales Engineering Lead. She graduated from Hunter College in NYC with a Masters in Applied Mathematics and Statistics with a heavy concentration on numerical analysis and financial mathematics.
Her interest in applicable math eventually led her to big data and to finding the appropriate mediums for data analysis.
Desmond is a Senior Director of Marketing at H2O.ai. In his 15+ years of career in Enterprise Software, Desmond worked in Distributed Systems, Storage, Virtualization, MPP databases, Streaming Analytics Platform, and most recently Machine Learning. He obtained his Master’s degree in Computer Science from Stanford University and MBA degree from UC Berkeley, Haas School of Business.
Data has been around for a long time, but in only two formats: analog and digital. Digital data is now growing exponentially year over year. Understand best practices in data integration.
Real Time Data Warehousing Mastering Business Objects June 11 – jeffmonico
This is a copy of a presentation I gave at the Mastering Business Objects conference in Sydney, June 2011. It explains the move Star Track Express is making towards Active Data Warehousing to support both Analytical and Operational needs from a single platform.
Applications need data, but the legacy approach of n-tiered application architecture doesn’t solve for today’s challenges. Developers aren’t empowered to build and iterate their code quickly without lengthy review processes from other teams. New data sources cannot be quickly adopted into application development cycles, and developers are not able to control their own requirements when it comes to data platforms.
Part of the challenge here is the existing relationship between two groups: developers and DBAs. Developers are trying to go faster, automating build/test/release cycles with CI/CD, and thrive on the autonomy provided by microservices architectures. DBAs are stewards of data protection, governance, and security. Both of these groups are critically important to running data platforms, but many organizations deal with high friction between these teams. As a result, applications get to market more slowly, and it takes longer for customers to see value.
What if we changed the orientation between developers and DBAs? What if developers consumed data products from data teams? In this session, Pivotal’s Dormain Drewitz and Solstice’s Mike Koleno will speak about:
- Product mindset and how balanced teams can reduce internal friction
- Creating data as a product to align with cloud-native application architectures, like microservices and serverless
- Getting started bringing lean principles into your data organization
- Balancing data usability with data protection, governance, and security
Presenter : Dormain Drewitz, Pivotal & Mike Koleno, Solstice
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey – Alluxio, Inc.
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Sandipan Chakraborty, Director of Engineering (Rakuten)
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Actively looking for a new opportunity in Business Intelligence, Data Integration, and Data Warehousing. Hands-on experience in data analysis, providing ETL solutions, and building reports, dashboards, and frameworks for business.
Tools/Technologies:
-Databases: SQL Server 2012, Teradata, MySQL.
-Reporting Tools: Pentaho Report Designer, Tableau.
-Dashboard Tool: Pentaho CDF, Pentaho CDE, Saiku.
-ETL Tools: Pentaho PDI, SSIS
-Scripting Languages: Python, UNIX Shell Scripting, JavaScript.
-Cloud: AWS (S3, RDS, EC2, DMS, Glue), Snowflake, Matillion
Apache Hadoop and its role in Big Data architecture - Himanshu Bari – jaxconf
In today's world of exponentially growing big data, enterprises are becoming increasingly aware of the business utility and necessity of harnessing, storing and analyzing this information. Apache Hadoop has rapidly evolved to become a leading platform for managing and processing big data, with the vital management, monitoring, metadata and integration services organizations require to glean maximum business value and intelligence from their burgeoning amounts of information on customers, web trends, products and competitive markets. In this session, Hortonworks' Himanshu Bari will discuss the opportunities for deriving business value from big data by looking at how organizations utilize Hadoop to store, transform and refine large volumes of this multi-structured information. He will also discuss the evolution of Apache Hadoop and where it is headed, the component requirements of a Hadoop-powered platform, and solution architectures that allow for Hadoop integration with existing data discovery and data warehouse platforms. In addition, he will look at real-world use cases where Hadoop has helped to produce more business value, augment productivity, or identify new and potentially lucrative opportunities.
The new GDPR regulation went into effect on May 25th. While most conversations have revolved around the security and IT aspects of the law, marketing teams will play a crucial role in helping organizations meet GDPR standards and in playing a strategic role across the organization. Join us to learn more, engage with your peers and get prepared.
This webinar will cover:
- How complying with the GDPR will drive better marketing and raise the standard of the quality of your customer engagement
- The GDPR elements marketers must know about
- The elements of PII that will be affected and what marketers need to do about it
- A deep dive on how GDPR regulations will affect your marketing channels - email, programmatic advertising, cold calls, etc.
- Tactical marketing updates needed to meet GDPR guidelines
AR and VR by the Numbers: A Data First Approach to the Technology and Market – Treasure Data, Inc.
With AR and VR technologies, it’s the first time that data collection has been part of the front-end strategy vs back-end process. As companies compete to create new, interactive experiences, data is the tool of choice to measure all aspects of player engagement and marketing effectiveness. In this webinar, two industry experts, Nicolas Nadeau and Andrew Mayer, will talk about the trends driving AR and VR markets today, and what data-driven approaches companies need to think about to compete in these markets tomorrow.
An overview of Customer Data Platforms (CDP) with the industry leader who coined the term, David Raab. Find out how to use Live Customer Data to create a better customer experience and how Live Data Management can give you a competitive edge with a 360 degree view of your clients.
Learn:
- The definition and requirements for Customer Data Platforms
- The differences between Customer Data Platforms and comparative technologies such as Data Warehousing and Marketing Automation
- Reference architectures/approaches to building CDP
- How Treasure Data is used to build Customer Data Platforms
And here's the song: https://youtu.be/RalMozVq55A
In this hands-on webinar, we will cover how to leverage the Treasure Data JavaScript SDK library to ensure user stitching of web data into the Treasure Data Customer Data Platform, providing a holistic view of prospects and customers.
We will demo the native SDK, as well as deploying the SDK inside Adobe DTM and Google Tag Manager.
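User stitching itself is a simple idea: link a visitor's anonymous cookie ID to a known user ID once the visitor identifies themselves (for example, at login). The TD SDK handles this client-side; the sketch below is only a simplified, hypothetical server-side illustration of the concept, with all names invented for the example:

```python
def stitch_users(events, id_links):
    """Resolve anonymous cookie IDs to canonical user IDs via known links.

    events:   list of dicts, each with an 'anonymous_id' field
    id_links: mapping of anonymous_id -> canonical user_id, captured when
              a visitor logs in or submits a form
    """
    stitched = []
    for event in events:
        event = dict(event)  # don't mutate the caller's data
        event["user_id"] = id_links.get(event["anonymous_id"])
        stitched.append(event)
    return stitched

events = [
    {"anonymous_id": "cookie-a", "page": "/pricing"},
    {"anonymous_id": "cookie-b", "page": "/signup"},
]
id_links = {"cookie-a": "user-42"}  # cookie-a later identified at login
result = stitch_users(events, id_links)
print(result[0]["user_id"])  # -> user-42; cookie-b stays anonymous for now
```

Events whose cookies are never linked simply keep a null user ID until a later identification event arrives.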
Hands-On: Managing Slowly Changing Dimensions Using TD Workflow – Treasure Data, Inc.
In this hands-on webinar we'll explore the data warehousing concept of Slowly Changing Dimensions (SCDs) and common use cases for managing SCDs when dealing with customer data. This webinar will demonstrate different methods for tracking SCDs in a data warehouse, and how Treasure Data Workflow can be used to create robust data pipelines to handle these processes.
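For readers unfamiliar with the concept: a Type 2 SCD preserves history by expiring the current dimension row and inserting a new versioned row whenever a tracked attribute changes. The webinar implements this with Treasure Data Workflow; the following is only a minimal, self-contained Python sketch of the same logic, with illustrative field names:

```python
from datetime import date

def apply_scd2(dimension, updates, today):
    """Type 2 SCD: expire the current row and insert a new versioned row
    whenever a tracked attribute (here, 'segment') changes."""
    current = {row["customer_id"]: row for row in dimension if row["is_current"]}
    for upd in updates:
        row = current.get(upd["customer_id"])
        if row is None:  # brand-new customer
            dimension.append({**upd, "valid_from": today, "valid_to": None,
                              "is_current": True})
        elif row["segment"] != upd["segment"]:
            row["valid_to"] = today      # close out the old version
            row["is_current"] = False
            dimension.append({**upd, "valid_from": today, "valid_to": None,
                              "is_current": True})
    return dimension

dim = [{"customer_id": 1, "segment": "trial",
        "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True}]
dim = apply_scd2(dim, [{"customer_id": 1, "segment": "paid"}], date(2020, 6, 1))
print(len(dim))  # two rows: the expired 'trial' version plus the current 'paid' one
```

In a real warehouse the same merge would be expressed in SQL over the dimension table, but the expire-and-insert pattern is identical.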
Brand Analytics Management: Measuring CLV Across Platforms, Devices and Apps – Treasure Data, Inc.
Gaming companies with multiple products often struggle to calculate accurate Customer Lifetime Value (CLTV) across their portfolio. Because user data is often analyzed in silos, companies are unable to get a clear picture of ROI and CLTV across platforms, devices and apps.
In this webinar we’ll look at how you can apply a holistic and complete approach to your CLTV and ROI through the lens of gaming companies, though the technique is applicable to any company with products spanning platforms.
We’ll also explore:
How the integral power of data in business has shifted over the past 10 years.
The current technologies and processes used to analyze data across different platforms by combining multiple data streams, with examples in brand- and portfolio-based LTV.
How to process and centralize dozens of varying data streams.
Nicolas Nadeau will speak from his extensive experience and show how leveraging data from multiple product strategies spanning many platforms can be highly beneficial for your company.
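The cross-platform CLTV idea above reduces to a simple aggregation once user identities are unified: roll revenue up to the user, not the platform. A minimal sketch (the event records and values are hypothetical):

```python
from collections import defaultdict

# Hypothetical per-platform revenue events; in practice these arrive from
# separate mobile, PC, and console pipelines keyed to a stitched user id.
events = [
    {"user": "u1", "platform": "mobile", "revenue": 4.99},
    {"user": "u1", "platform": "pc", "revenue": 19.99},
    {"user": "u2", "platform": "mobile", "revenue": 0.99},
]

# Sum revenue per unified user id so LTV reflects the whole portfolio
# rather than a single platform silo.
ltv = defaultdict(float)
for e in events:
    ltv[e["user"]] += e["revenue"]

print(round(ltv["u1"], 2))  # 24.98
```

The hard part in practice is the identity stitching that makes the `user` key trustworthy, not the arithmetic itself.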
Do you know what your top ten 'happy' customers look like? Would you like to find ten more just like them? Come learn how to leverage 1st & 3rd party data to map your customer journey and drive users down a path where every interaction is personalized, fun, & data-driven. No more detractors, power your Customer Experience with data!
In this webinar you will learn:
-When, why, and how to leverage 1st, 2nd, and 3rd party data
-Tips & Tricks for marketers to become more data driven when launching their campaigns
-Why all marketers need a 360-degree customer view
The reality is virtual, but successful VR games still require cold, hard data. For wildly popular games like Survios’ Raw Data, the first VR-exclusive game to reach #1 on Steam’s Global Top Sellers list, data and analytics are the key to success.
And now online gaming companies have the full-stack analytics infrastructure and tools to measure every aspect of a virtual reality game and its ecosystem in real time. You can keep tabs on lag, which ruins a VR experience; improve gameplay and identify issues before they become showstoppers; and create fully personalized, completely immersive experiences that blow minds and boost adoption. All with the right tools.
Make success a reality: Register now for our latest interactive VB Live event, where we’ll tap top experts in the industry to share insights into turning data into winning VR games.
Attendees will:
* Understand the role of VR in online gaming
* Find out how VR company Survios successfully leverages the Exostatic analytics infrastructure for commercial and gaming success
* Discover how to deploy full-stack analytics infrastructure and tools
Speakers:
Nicolas Nadeau, President, Exostatic
Kiyoto Tamura, VP Marketing, Treasure Data
Ben Solganik, Producer, Survios
Stewart Rogers, Director of Marketing Technology, VentureBeat
Wendy Schuchart, Moderator, VentureBeat
Harnessing Data for Better Customer Experience and Company SuccessTreasure Data, Inc.
As big data has exploded, the ability for companies to easily leverage it has imploded. Organizations are drowning in their own information, unable to see the forest for the trees, while the big players consistently outperform in their ability to deliver a great customer experience faster and cheaper. As a result, the vast majority of companies are scrambling to catch up and become more agile and data-driven, using their data more effectively so they can attract and retain their elusive customers.
In this joint deck by 451 Research and Treasure Data, you will learn how to enable your line of business team to own their own data (instead of relying on IT) to be able to:
- deliver a single, persistent view of your customer based on behavior data
- make that data accessible to the right people at the right time
- increase organizational effectiveness by (finally) breaking down silos with data
- enable powerful marketing tools to enhance the customer experience
How to make your open source project MATTER
Let’s face it: most open source projects die. For every Rails, Docker and React, there are thousands of projects that never take off. They die in the lonely corners of GitHub, only to be discovered by bots scanning for SSH private keys.
Over the last 5 years, I worked on and off on marketing a piece of infrastructure middleware called Fluentd. We tried many things to ensure that it did not die: speaking at events, talking to strangers, giving away stickers, getting people to install Fluentd on their laptops. Almost everything I tried had a small, incremental effect, but there were several initiatives/hacks that raised Fluentd’s awareness to the next level. As I listed these “ideas that worked”, I noticed the common thread: they all brought Fluentd into a new ecosystem via packaging.
* Event: presented at the 'Let's Play with Data' one-day conference at MARU180, October 14, 2016
* Speaker: Dylan Ko (Younghyuk Ko), Data Scientist / Data Architect at Treasure Data
* Contents:
- Introduction of data scientist Younghyuk Ko
- Introduction of Treasure Data
- Global case study #1: making money with data
>> MUJI: from traditional retail to data-driven O2O
- Global case study #2: making money with data
>> WISH: shopping optimization through personalization & automation
- Global case study #3: making money with data
>> Oisix: predicting & preventing customer churn with machine learning
- Global case study #4: making money with data
>> Warner Bros.: saving time and money through process automation
- Global case study #5: making money with data
>> Adtech companies such as Dentsu
- What you must check before trying to make money with data
Keynote at Fluentd Meetup Summer
Related slides:
- Fluentd ServerEngine Integration & Windows Support http://www.slideshare.net/RittaNarita/fluentd-meetup-2016-serverengine-integration-windows-support
- Fluentd v0.14 Plugin API Details http://www.slideshare.net/tagomoris/fluentd-v014-plugin-api-details
John Hammink's talk at Great Wide Open 2016. We discuss: 1) the need for data analytics infrastructure that can scale exponentially; 2) what such an infrastructure must contain; and 3) the need for an infrastructure that can handle unstructured and semi-structured data.
Treasure Data: Move your data from MySQL to Redshift with (not much more tha...Treasure Data, Inc.
Migrate your semi-structured data from MySQL to Amazon Redshift in as few steps as possible. From Amazon Web Services Bay Area meetup @ Sumo Logic, December 3, 2015.
This presentation describes common issues with application logging and introduces how to solve most of them by implementing a unified logging layer with Fluentd.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk encourages a more independent approach to PHP frameworks, moving towards more flexible and future-proof PHP development.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio, using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to, plus whatever makes up our current company’s observability stack.
While the dev and ops silos continue to crumble, many organizations still relegate monitoring & observability to ops, infra, and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party. I'll share these foundational concepts to build on:
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
DevOps and Testing slides at DASA ConnectKari Kakkonen
My slides, with Rik Marselis, from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We closed with a lovely workshop in which participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
情報処理学会 Exciting Coding! Treasure Data
1. Treasure Data
Exciting Coding!
Nov 2013
Presented by Masahiro Nakagawa, Senior Software Engineer
www.treasuredata.com
2. Who am I
• Masahiro Nakagawa
– @repeatedly
– masa@treasure-data.com or d@
• Treasure Data, Inc.
– Senior Software Engineer
• Fluentd / client libraries / etc.
– Since 2012/11
• Open source projects
– D Programming Language
– MessagePack: D, Python, etc.
– Fluentd: Core, Mongo, Logger, etc.
– Etc.
3. Company & Service Introduction
Board Meeting Presentation
August 15th, 2013 - 3:30PM PDT
Presented by
Hironobu Yoshikawa – CEO
Kazuki Ohta – CTO
Rich Ghiossi – VP, Marketing
Keith Goldstein – VP, Sales
Kengo Hirouchi – Director, Japan
Ankush Rustagi – Director, Marketing
www.treasuredata.com
4. Company Background
• Founded 2011 in Mountain View, CA
– The first cloud service for the entire data pipeline
– Including: acquisition, storage, & analysis
• Provide a "Cloud Data Service"
– Fast time to value
– Cloud flexibility and economics
– Simple and well supported
The Treasure Data Team
• Hiro Yoshikawa – CEO: open source business veteran
• Kaz Ohta – CTO: founder of the world's largest Hadoop group
• Jeff Yuan – Director, Engineering: LinkedIn, MIT / Michael Stonebraker Lab
• Keith Goldstein – VP Sales & Bus Dev: VP of Bus Dev from Tibco and Talend
• Rich Ghiossi – VP Marketing: VP of Marketing from ParAccel
• Treasure Data has over 100 customers in production
– Incl. Fortune 500 companies
– 500+ billion new records / month
– Around 2 trillion records under management
– Variety of use cases and verticals
Notable Investors
• Othman Laraki – ex-VP of Growth at Twitter
• Jerry Yang – founder of Yahoo!
• Yukihiro "Matz" Matsumoto – creator of the Ruby programming language
• James Lindenbaum – founder of Heroku
5. Problem Statement
• Lots of companies today produce Big Data by having "new data sources" (sensors, weblogs, etc.)
– But few have the resources to build a Big Data analytics system
• 60-70% of a company's Big Data time & budget is consumed by:
– Infrastructure setup & maintenance
– Building collection & storage flows
– Hiring/training Hadoop expertise
• On average, it takes 6 months to get a Hadoop environment into production
9. Treasure Data Service: Overview
Acquire → Store → Analyze, through the Treasure Data Cloud:
• Acquire
– Streaming: web logs, app logs, and sensor data collected by Treasure Agent, a streaming log collector (JSON)
– Bulk import: parallel upload from RDBMS, CRM, ERP, CSV, MySQL, etc.
• Store
– Flexible, scalable, columnar storage
• Analyze
– REST API, SQL, Pig, JDBC / ODBC
– BI connectivity: Tableau, Metric Insights, QlikView, Excel, etc.
– Result push to dashboards, custom apps, local DBs, FTP servers, etc.
Value themes: time to value; economy & flexibility; simple & supported.
10. Our Value Propositions
• Faster time to value: on-demand cloud infrastructure & a versatile streaming data collection agent
– Instantly provision a fully tuned & managed infrastructure
– Go live into production on average in 14 days (collection, analytics, & BI)
• Cloud flexibility and economics: a fraction of the cost of traditional solutions by leveraging cloud storage and processing, which scales to meet your needs
– Leverage the cost advantage of the cloud
– Leverage the elasticity of the cloud – scale on demand
– Predictable monthly subscription fee
– No upfront costs & no long-term commitment
• Simple and well supported: we are passionate about simplicity and customer support excellence
– Focus your time on analyzing your data
– Rely on us to keep your data secure & online
– We love making customers successful & building long-term relationships
11. Initial Setup & Onboarding – Two Weeks
1. Data Collection
• Setup, tuning, and monitoring of Treasure Agent
• Embed Treasure Agent code into applications
2. Data Storage
• Basic log templates (register, pay, login, etc.)
• Basic KPI queries (DAU, MAU, ARPU, etc.)
3. Data Analysis
• Setup dashboards with basic KPIs
• Training on creating customized reports and ad-hoc querying
4. Service & Support
• Assigned a dedicated technical account manager
• Real-time support via email, online chat, and calls
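The "basic KPI queries" mentioned above (DAU, MAU) boil down to counting distinct users per time window. A minimal Python sketch under hypothetical event data (in the actual service these would be SQL queries over collected logs):

```python
from datetime import date

# Hypothetical login events: (user_id, event_date)
events = [
    ("a", date(2013, 8, 1)), ("b", date(2013, 8, 1)),
    ("a", date(2013, 8, 1)), ("a", date(2013, 8, 2)),
]

def dau(events, day):
    """Daily active users: distinct users seen on the given day."""
    return len({u for u, d in events if d == day})

def mau(events, year, month):
    """Monthly active users: distinct users seen in the given month."""
    return len({u for u, d in events if (d.year, d.month) == (year, month)})

print(dau(events, date(2013, 8, 1)))  # 2
print(mau(events, 2013, 8))           # 2
```

ARPU follows the same shape: total revenue in the window divided by the distinct-user count.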
12. Solution Accelerators
Out-of-the-box reporting on top of the Treasure Data Platform and a configured Treasure Agent.
Solution components:
– Treasure Data Platform
– Event collection template
– Pre-configured Treasure Agent configuration
– BI dashboard with KPIs
17. Data Storage
Treasure Data Cloud
• Stored "schema-less" as JSON
– Schema can be applied/updated AFTER storage
• Compressed & columnar format
• Quickly scale up processing power
– WITHOUT reloading/redistributing the data
• Optimized for time-based filtering, for higher query performance

Default (schema-less) layout:
time | v
1384160400 | {"ip":"135.52.211.23", "code":"0"}
1384162200 | {"ip":"45.25.38.156", "code":"-1"}
1384164000 | {"ip":"97.12.76.55", "code":"99"}
Queried as: SELECT v['ip'] as ip, v['code'] as code …

Schema applied (~30% faster):
time | ip : string | code : int
1384160400 | 135.52.211.23 | 0
1384162200 | 45.25.38.156 | -1
1384164000 | 97.12.76.55 | 99
Queried as: SELECT ip, code …
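The schema-on-read idea on this slide can be sketched in a few lines: raw events are kept as JSON next to a first-class time column, and typed columns are projected out at query time (a simplified illustration, not the actual storage engine):

```python
import json

# Rows as stored: a unix timestamp plus the raw JSON payload (schema-less).
rows = [
    (1384160400, '{"ip": "135.52.211.23", "code": "0"}'),
    (1384162200, '{"ip": "45.25.38.156", "code": "-1"}'),
    (1384164000, '{"ip": "97.12.76.55", "code": "99"}'),
]

# "Applying" a schema after storage: project typed columns out of the JSON
# at read time, roughly what SELECT v['ip'], v['code'] does on the slide.
def apply_schema(rows):
    for time, v in rows:
        record = json.loads(v)
        yield {"time": time, "ip": record["ip"], "code": int(record["code"])}

# Time-based filtering stays cheap because `time` is a first-class column.
first_hour = [r for r in apply_schema(rows) if r["time"] < 1384164000]
print(len(first_hour))  # 2
```

Because the schema lives in the projection rather than the storage layout, changing it never requires reloading the stored data.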
18. Data Analysis
Via the REST API to the Treasure Data Cloud:
• Heavy-lifting SQL (Hive)
– Hive's built-in UDFs
– TD-added functions: time functions; first, last, rank; sessionize
• Interactive SQL
– Treasure Query Accelerator (Impala)
• Scheduled jobs
– SQL, Pig scripts
– Data pushes
• JDBC connectivity
– Custom Java apps
– Standards-based BI tool integration
• Tableau ODBC connector
– Leverages Impala
• Push query results
– MySQL, PostgreSQL
– Google Spreadsheet
– Web, FTP, S3
– Leftronic, Indicee
• Scripted processing (Pig)
– DataFu (LinkedIn)
– Piggybank (Apache)
– Treasure Data table
19. Treasure Data: General Use Cases
20. A Case: "14 Days" from Signup to Success
1. Europe's largest mobile ad exchange.
2. Serving >60 billion impressions/month for >30,000 mobile apps (Q4 2013).
3. Immediate need for analytics infrastructure: ASAP!
4. With TD, MobFox got into production in only 14 days, with one engineer.
"Time is the most precious asset in our fast-moving business, and Treasure Data saved us a lot of it."
– Julian Zehetmayr, CEO & Founder
21. A Case: Replacing In-House Hadoop with TD
Before:
1. Global "Hulu" – an online video service with millions of users.
2. Video content is distributed in over 150 languages.
3. Had a hard time maintaining their Hadoop cluster.
After:
4. With TD, Viki deprecated their in-house Hadoop cluster and freed engineers for the core business.
"Treasure Data has always given us thorough and timely support peppered with insightful tips to make the best use of their service."
– Huy Nguyen, Software Engineer
22. A Case: Treasure Data with a BI Tool (Tableau)
1. World's largest Android application market.
2. Serving >3 billion app downloads for >100 million users.
3. Only one engineer managing the data infrastructure.
4. With TD, the data engineer can focus on analyzing data with the existing BI tool.
"I will recommend Treasure Data to my friends in a heartbeat because it benefits all three stakeholders: Operations, Engineering and Business."
– Simon Dong, Principal Architect, Data Engineering
23. Treasure Data Platform: Fluentd Overview
24. What is Fluentd?
• Open-sourced log collector written in Ruby
– Easy to use, reliable, and performant
– Streaming event processing
• Uses the RubyGems ecosystem to distribute plugins
Fluentd – the missing log collector
fluentd.org
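As an illustration of what driving Fluentd looks like, here is a minimal configuration sketch (hedged: the directive names follow Fluentd's documented tail and stdout plugins of this era; the log path, position file, and tag are hypothetical):

```
# Tail an Apache access log and parse each line into a structured event
<source>
  type tail
  path /var/log/apache2/access_log         # hypothetical path
  pos_file /var/log/td-agent/access_log.pos
  tag apache.access
  format apache2
</source>

# Route events tagged apache.access to stdout
# (swap the output plugin for td, s3, mongo, etc.)
<match apache.access>
  type stdout
</match>
```

The tag-based `<match>` routing is what the plugin slides below build on: inputs, filters, and outputs are all interchangeable plugins.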
29. Resolve Your Requirements by Writing Plugins
Inputs:
– Access logs (Apache)
– App logs (frontend, backend)
– System logs (syslogd)
– Databases
↓ Fluentd (filter / buffer / routing) ↓
Outputs:
– Alerting (Nagios)
– Analysis (MongoDB, MySQL, Hadoop)
– Archiving (Amazon S3)
30. Treasure Agent (td-agent)
• Open-sourced distribution package of Fluentd
– The ETL part of Treasure Data
– deb / rpm / Homebrew
• Includes useful components
– Ruby, jemalloc, Fluentd
– 3rd-party gems: td, mongo, webhdfs, etc.
– Init script
• http://packages.treasuredata.com/
34. Plazma (Hadoop, Storage, Queue and Workers)
Architecture: Frontend → Queue → Workers → Hadoop.
Metrics pipeline:
– Applications push metrics to Fluentd (via a local Fluentd instance)
– Fluentd sums up the data per minute (partial aggregation)
– Aggregates go to Treasure Data for historical analysis and to Librato Metrics for realtime analysis
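The "partial aggregation" step described here, where Fluentd sums metrics into minute buckets before forwarding them, can be sketched as follows (the events and values are hypothetical; real Fluentd does this inside an aggregation plugin):

```python
from collections import defaultdict

# Hypothetical raw metric events: (unix_timestamp, value). On the slide,
# applications push these to a local Fluentd instance.
events = [(1384160401, 1), (1384160415, 1), (1384160461, 1)]

# Partial aggregation: sum values into one-minute buckets before
# forwarding, so the downstream historical and realtime stores receive
# far fewer records than the raw event stream.
buckets = defaultdict(int)
for ts, value in events:
    buckets[ts - ts % 60] += value

print(sorted(buckets.items()))  # [(1384160400, 2), (1384160460, 1)]
```

Pushing this aggregation to the edge is what keeps both the Hadoop path and the Librato Metrics path cheap.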
35. Treasure Data: Development Philosophy
36. Open-Source Culture
• TD prefers engineers who contribute to OSS products
– MessagePack, Fluentd, ZeroMQ, Hadoop, MongoDB, Angular.js, Huahin, D-Lang, etc.
– https://github.com/treasure-data?tab=members
• Reasons
– Fixing & improving other people's code is crucial for a distributed team.
– TD's engineering workflow is very similar to an OSS product workflow.
– A+ OSS engineers will bring in other A+ OSS engineers!
37. OSS vs. Proprietary
• OSS everything on the client side
– http://github.com/treasure-data/
– http://fluentd.org/
• TD is helping the world collect more data in an analytics-ready format.
• 2,000+ companies (e.g. Nintendo, SlideShare/LinkedIn) use it as an OSS product; 3-4% of the users are TD's customers.
• We also leverage other OSS products as much as possible.
• Closed source on the cloud side
– The core value must be proprietary to sustain the business.
– Individual components can be OSS, but most of the system will remain proprietary to create the value chain.
38. How to Decide the Product Roadmap?
• Solving customer pain is the #1 priority
– Developers directly provide customer support, spending 30-40% of their development time talking with customers.
– Developers are the best people to come up with the solution.
– # of code lines != value
• Suffering-oriented development
– First, make it possible
– Then, make it beautiful
– Then, make it fast
• The largest customer's pain is NOT always applicable to other customers.
– Need to be brave enough to say NO. NO. NO. NO. NO…
• TD doesn't have a 1-year product roadmap. A 3-month roadmap accelerates development, and helps the other teams (marketing / sales) too.
39. Distributed Team (International)
• 13 engineers as of Nov. 2013
– 5 engineers in Tokyo, Japan
– 8 engineers in Mountain View, USA
– 40% of the whole company
• Asynchronous communication
– Use async communication tools as much as possible: chat, JIRA, email, GitHub, etc.
– Use video conferencing for the weekly sync-up
• English is the primary communication language
– If you cannot speak English, your value is nearly zero on the Treasure Data engineering team.
40. Distributed Team (Deployment)
• Predictable deployment cycle: weekly deployment
– Continuous deployment didn't fit our B2B SaaS application; our customers want predictability of changes.
– As a distributed team, it's hard to track every change and its deployment status.
– Track every change in JIRA; the QA engineer is responsible for deployment too.
• Continuous deployment for staging
– Single branch, always automatically deployed to the staging environment
– Monitoring is continuous testing
• On-call alert schedule based on timezone
– No need to get up at around 3am
41. Leverage Cloud Services
• Use cloud services as much as possible
– Don't hire people; use cloud services.
– Outsource everything except your core value.
– Developers tend to forget their own cost: if you spend one hour, it already costs the company around $50.
• Examples
– EC2 (IaaS)
– CopperEgg (infrastructure monitoring)
– NewRelic (application performance management)
– Hosted Chef (configuration management)
– Librato Metrics (application metrics)
– PagerDuty (alerting)
– Logentries (log search)
– CircleCI, TravisCI (continuous integration)
– HipChat, JIRA, Confluence (development tools)
– Etc.
42. Treasure Data: Conclusion
43. Key Points
• Treasure Data, Inc.
– Cloud-based data service for the world
– Customer-oriented development
• Our unique products and culture
– Fluentd / Plazma (backend)
– OSS enthusiasts
• Use cloud or not?
– The cloud leverages an idea, but it is not a differentiator
– Focus on your own vision!
Editor's Notes
Time to value: setup time and load time for data collection (td-agent) – 1 week; analysis capabilities out of the box; simple integration with the existing ecosystem (DI & BI).
Cloud flexibility and economics: scalable (cloud), extensible (elastic), flexible (schemaless); lower TCO compared to on-premise, hosted, or homegrown; on-demand ability to scale, adjust, and meet future business requirements.
Simple and supported: "full" solutions from collection to visualization; great customer service, support, setup, and SLAs; easy to extend on your own / self-service – DIY big data.