Treasure Data 

情報処理学会 (Information Processing Society of Japan) Exciting Coding!
Nov 2013
Presented by



Masahiro Nakagawa
Senior Software Engineer


www.treasuredata.com

1
Who are you
•  Masahiro Nakagawa
–  @repeatedly
–  masa@treasure-data.com or d@

•  Treasure Data, Inc
–  Senior Software Engineer
•  Fluentd / Client libraries / etc...

–  Since 2012/11

•  Open Source projects
–  D Programming Language
–  MessagePack: D, Python, etc…
–  Fluentd: Core, Mongo, Logger, etc…
–  Etc…

2
Company & Service Introduction

Board Meeting Presentation
August 15th, 2013 - 3:30PM PDT

Presented by
Hironobu Yoshikawa – CEO
Kazuki Ohta – CTO
Rich Ghiossi – VP, Marketing
Keith Goldstein – VP, Sales
Kengo Hirouchi – Director, Japan
Ankush Rustagi – Director, Marketing

www.treasuredata.com

3
Company Background
•  Founded 2011 in Mountain View, CA
–  The first cloud service for the entire data pipeline
–  Including: Acquisition, Storage, & Analysis

•  Provide a “Cloud Data Service”
–  Fast Time to Value
–  Cloud Flexibility and Economics
–  Simple and Well Supported

The Treasure Data Team
•  Hiro Yoshikawa – CEO: Open source business veteran
•  Kaz Ohta – CTO: Founder of world’s largest Hadoop Group
•  Jeff Yuan – Director, Engineering: LinkedIn, MIT / Michael Stonebraker Lab
•  Keith Goldstein – VP Sales & Bus Dev: VP of Bus Dev from Tibco and Talend
•  Rich Ghiossi – VP Marketing: VP of Marketing from ParAccel

•  Treasure Data has 100+ customers in production
–  Incl. Fortune 500 companies
–  500+ Billion new records / month
–  Around 2 Trillion records under management
–  Variety of use cases and verticals

Notable Investors
•  Othman Laraki: Ex-VP of Growth at Twitter
•  Jerry Yang: Founder of Yahoo!
•  Yukihiro “Matz” Matsumoto: Creator of “Ruby” programming language
•  James Lindenbaum: Founder of Heroku
4
Problem Statement
•  Lots of companies today produce Big Data by having
“New Data Sources” (Sensor, Weblog, etc)
–  But few have the resources to build a
Big Data Analytics system

•  60-70% of a company’s Big Data time & budget
consumed by:
–  Infrastructure setup & Maintenance
–  Building Collection & Storage Flows
–  Hiring/Training Hadoop Expertise

•  On average, it takes 6 months to get
a Hadoop environment into production
5
6
Treasure Data’s Focus (80% of the needs)

7
8
Treasure Data Service: Overview
Acquire
•  Treasure Agent: Streaming Log Collector (JSON)
–  Sources: Web logs, App logs, Sensor
•  Bulk Import: Parallel Upload from CSV, MySQL, etc.
–  Sources: RDBMS, CRM, ERP

Store
•  Treasure Data Cloud: Flexible, Scalable, Columnar Storage

Analyze
•  BI Connectivity: REST API, SQL, Pig, JDBC / ODBC
–  BI Tools: Tableau, Metric Insights, QlikView, Excel, etc.
•  Result Push: REST API, SQL, Pig
–  Dashboards, Custom App, Local DB, FTP Server, etc.

Time to Value
Economy & Flexibility
Simple & Supported

9
Our Value Propositions 
•  Faster time to value

On-demand cloud infrastructure & versatile streaming data collection agent
–  Instantly provision a fully tuned & managed infrastructure
–  Go live into production on average in 14 days (collection, analytics, & BI)

•  Cloud flexibility and economics

Fraction of the cost of traditional solutions by leveraging cloud storage and processing,
which scales to meet your needs
–  Leverage the cost-advantage of the cloud
–  Leverage the elasticity of the cloud – scale on demand
–  Predictable monthly subscription fee
–  No upfront costs & no long-term commitment

•  Simple and well supported
We are passionate about simplicity, and customer support excellence
–  Focus your time on analyzing your data
–  Rely on us to keep your data secure & online
–  We love making customers successful & building long-term relationships

10
Initial Setup & Onboarding – Two Weeks
1. Data Collection
•  Setup, tuning, and monitoring of Treasure Agent
•  Embed Treasure Agent code into applications

2. Data Storage
•  Basic log templates (register, pay, login, etc.)
•  Basic KPI queries (DAU, MAU, ARPU, etc.)

3. Data Analysis
•  Set up dashboards with basic KPIs
•  Training on creating customized reports and ad hoc querying

4. Service & Support
•  Assigned a dedicated technical account manager
•  Real-time support via email, online chat, and call

11
Solutions Accelerators

…
Out-of-the-Box Reporting
Treasure Data Platform
Configured Treasure Agent

Solution Components:
-  Treasure Data Platform
-  Event Collection Template
-  Pre-configured Treasure Agent Configuration
-  BI Dashboard with KPIs

12
- Vision -
gle Analytics Platform for the Wo

13
Treasure Data Platform
Architecture Overview

14
Data Acquisition – Streaming Capture
Application Server
# Application Code
...
...
# Post event to Treasure Data
TD.event.post('access', {:uid=>123})
...
...

Treasure Data Library
Java, Ruby, PHP, Perl, Python, Scala, Node.js

Treasure Agent (local)
•  Automatic Microbatching
•  Local buffering Fallback
•  Network Tolerance
Open-Sourced as Fluentd Project ( http://fluentd.org/ )

Treasure Data Cloud
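
The microbatching and local-buffering behavior listed for this slide can be illustrated with a small sketch. This is not the Treasure Data library or Fluentd code: the BufferedPoster class, its batch_size parameter, and the sender callable are hypothetical names chosen for illustration. It only shows the idea that events are queued locally, sent in batches, and kept in the buffer when a send fails.

```ruby
# Hypothetical sketch: BufferedPoster and its sender callable are illustrative
# names, not part of the Treasure Data library or Fluentd.
class BufferedPoster
  attr_reader :buffer

  # sender: any callable that takes an array of events and raises on failure.
  def initialize(batch_size:, sender:)
    @batch_size = batch_size
    @sender = sender
    @buffer = []
  end

  # Queue one event; flush once a full batch has accumulated.
  def post(tag, record)
    @buffer << { tag: tag, time: Time.now.to_i, record: record }
    flush if @buffer.size >= @batch_size
  end

  # Try to send everything buffered. On failure the events stay in the
  # local buffer so that a later flush can retry them.
  def flush
    return true if @buffer.empty?

    @sender.call(@buffer.dup)
    @buffer.clear
    true
  rescue StandardError
    false
  end
end

sent = []
poster = BufferedPoster.new(batch_size: 2, sender: ->(batch) { sent << batch })
poster.post('access', { uid: 123 })
poster.post('access', { uid: 456 })
puts sent.first.size
```

A real agent would also persist the buffer and retry with backoff; the sketch leaves both out to keep the batching logic visible.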

15
Data Acquisition – Bulk Loader
Sources: RDBMS, App, SaaS, FTP
CSV, TSV, JSON, MessagePack, Apache, regex, MySQL, FTP

Bulk Loader
Prepare → Upload → Perform → Commit

Treasure Data Cloud

16
Data Storage

Treasure Data Cloud

Default (schema-less)
time          v
1384160400    {"ip":"135.52.211.23", "code":"0"}
1384162200    {"ip":"45.25.38.156", "code":"-1"}
1384164000    {"ip":"97.12.76.55", "code":"99"}

SELECT v['ip'] as ip, v['code'] as code …

Schema applied (~30% Faster)
time          ip : string      code : int
1384160400    135.52.211.23    0
1384162200    45.25.38.156     -1
1384164000    97.12.76.55      99

SELECT ip, code …

•  Stored “schema-less” as JSON
–  Schema can be applied/updated AFTER storage
•  Compressed & columnar format
•  Quickly scale-up processing power
–  WITHOUT reloading/redistributing the data
•  Optimized for time-based filtering
–  For higher query performance
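
To make the "schema applied after storage" idea concrete, here is a minimal sketch using the three rows from this slide. It is an illustration only, not Treasure Data's storage engine: SCHEMA, apply_schema, and select_range are hypothetical names, and the column layout is a plain Ruby hash of arrays.

```ruby
require 'json'

# Rows as stored schema-less: a time plus a JSON document (values from the slide).
ROWS = [
  [1384160400, '{"ip":"135.52.211.23", "code":"0"}'],
  [1384162200, '{"ip":"45.25.38.156", "code":"-1"}'],
  [1384164000, '{"ip":"97.12.76.55", "code":"99"}']
].freeze

# Hypothetical schema description: column name => conversion method.
SCHEMA = { 'ip' => :to_s, 'code' => :to_i }.freeze

# Build one typed array per column, keeping the time column alongside.
def apply_schema(rows, schema)
  columns = { 'time' => [] }
  schema.each_key { |name| columns[name] = [] }
  rows.each do |time, json|
    doc = JSON.parse(json)
    columns['time'] << time
    schema.each { |name, cast| columns[name] << doc[name].public_send(cast) }
  end
  columns
end

# Time-based filtering only needs to scan the time column to pick row indexes.
def select_range(columns, from, to)
  idx = columns['time'].each_index.select { |i| (from...to).cover?(columns['time'][i]) }
  columns.transform_values { |values| values.values_at(*idx) }
end

columns = apply_schema(ROWS, SCHEMA)
p columns['code']
p select_range(columns, 1384160400, 1384164000)['ip']
```

The stored rows are never rewritten here; the schema is only a view applied at read time, which is the property the slide emphasizes.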

17
Data Analysis
REST API

Heavy Lifting SQL (Hive):
-  Hive’s Built-in UDFs
-  TD Added Functions:
   -  Time Functions
   -  First, Last, Rank
   -  Sessionize

Interactive SQL:
-  Treasure Query Accelerator (Impala)

Scripted Processing (Pig):
-  DataFu (LinkedIn)
-  Piggybank (Apache)

Scheduled Jobs:
-  SQL, Pig Scripts
-  Data Pushes

JDBC Connectivity:
-  Custom Java Apps
-  Standards-based
-  BI Tool Integration

Tableau ODBC connector:
-  Leverages Impala

Push Query Results:
-  MySQL, PostgreSQL
-  Google Spreadsheet
-  Web, FTP, S3
-  Leftronic, Indicee
-  Treasure Data Table

Treasure Data Cloud

18
Treasure Data
General Use Cases

19
A case: “14 Days” from Signup to Success

1.  Europe’s largest mobile ad exchange.
2.  Serving >60 billion impressions/month for >30,000 mobile apps (Q4 2013)
3.  Immediate need for analytics infrastructure: ASAP!
4.  With TD, MobFox went into production in only 14 days, with one engineer.

"Time is the most precious asset in our fast-moving business, and Treasure Data saved us a lot of it."


Julian Zehetmayr, CEO & Founder
20
A case: “Replace” in-house Hadoop with TD

1.  Global “Hulu” - Online Video Service with millions of users
2.  Video content is distributed in over 150 languages.

Before
3.  Had a hard time maintaining the Hadoop cluster

After
4.  With TD, Viki deprecated their in-house Hadoop cluster and uses its engineers for core business.

"Treasure Data has always given us thorough and timely support peppered with insightful tips to make the best use of their service."

Huy Nguyen, Software Engineer
21
A case: Treasure Data with BI Tool (Tableau)

1.  World’s largest Android application market
2.  Serving >3 billion app downloads for >100 million users
3.  Only one engineer managing the data infrastructure
4.  With TD, the data engineer can focus on analyzing data with the existing BI tool

"I will recommend Treasure Data to my friends in a heartbeat because it benefits all three stakeholders: Operations, Engineering and Business."

Simon Dong, Principal Architect - Data Engineering


22
Treasure Data Platform
Fluentd Overview

23
What is Fluentd?
•  Open source log collector written in Ruby
–  Easy to use, reliable, and high-performance
–  Streaming event processing

•  Uses the RubyGems ecosystem to distribute plugins

Fluentd
the missing log collector
fluentd.org
24
Data processing pipeline
Data source
Collect

Store

Process

Visualize

Reporting
Monitoring
25
Data processing pipeline
Important, but no de facto middleware!

Data source
Collect

Store

Process

Visualize

Reporting
Monitoring
26
Fluentd general example
Web Server: apache.log
127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:30] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:32] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:26:40] "GET / ...
127.0.0.1 - - [11/Dec/2012:07:27:01] "GET / ...
...

tail

Fluentd (event buffering)
2012-02-04 01:33:51
apache.log
{
"host": "127.0.0.1",
"method": "GET",
...
}

insert
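
The step from a tailed log line to a structured event can be sketched as follows. This is a simplified illustration, not Fluentd's Apache parser: LINE_PATTERN and parse_line are hypothetical names, the pattern only covers the shortened line format shown on the slide (no timezone in the timestamp), and the timestamp is assumed to be UTC.

```ruby
require 'json'

# Pattern for the simplified access-log lines shown on the slide:
# host, two "-" placeholders, bracketed timestamp, quoted request.
LINE_PATTERN = %r{\A(?<host>\S+) \S+ \S+ \[(?<day>\d{2})/(?<mon>[A-Za-z]{3})/(?<year>\d{4}):(?<hms>\d{2}:\d{2}:\d{2})\] "(?<method>[A-Z]+) (?<path>\S+)}

MONTHS = %w[Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec].freeze

# Turn one log line into a [tag, time, record] triple, or nil if it does not match.
def parse_line(tag, line)
  m = LINE_PATTERN.match(line)
  return nil unless m

  month = MONTHS.index(m[:mon])
  return nil unless month

  hour, min, sec = m[:hms].split(':').map(&:to_i)
  # Assumption: the timestamp is treated as UTC because the slide shows no zone.
  time = Time.utc(m[:year].to_i, month + 1, m[:day].to_i, hour, min, sec)
  record = { 'host' => m[:host], 'method' => m[:method], 'path' => m[:path] }
  [tag, time, record]
end

tag, time, record = parse_line('apache.log', '127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / HTTP/1.1" 200 512')
puts "#{time.strftime('%Y-%m-%d %H:%M:%S')} #{tag} #{JSON.generate(record)}"
```

The triple mirrors the time, tag, and JSON record shown in the event on this slide.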
27
Pluggable Architecture
Input, Buffer, and Output are pluggable around the Engine.

Engine

Input
> Forward
> HTTP
> File tail
> dstat
> ...

Buffer
> File
> Memory

Output
> rewrite
> ...

Output
> Forward
> File
> MongoDB
> ...
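
A toy version of this arrangement shows why the pieces can be swapped independently. The classes below (ArrayInput, MemoryBuffer, CollectingOutput, Engine) are hypothetical and only mirror the Input / Buffer / Output roles named on the slide; they are not Fluentd's plugin API.

```ruby
# Hypothetical sketch: these classes only mirror the Input / Buffer / Output
# roles on the slide. They are not Fluentd's plugin API.
class ArrayInput
  def initialize(events)
    @events = events
  end

  # Hand each [tag, record] pair to the engine.
  def each_event(&block)
    @events.each(&block)
  end
end

class MemoryBuffer
  def initialize
    @chunks = Hash.new { |hash, tag| hash[tag] = [] }
  end

  def push(tag, record)
    @chunks[tag] << record
  end

  # Yield every buffered chunk once, then empty the buffer.
  def drain
    @chunks.each { |tag, records| yield tag, records }
    @chunks.clear
  end
end

class CollectingOutput
  attr_reader :written

  def initialize
    @written = []
  end

  def write(tag, records)
    @written << [tag, records]
  end
end

# The engine only wires the three roles together, so any part can be replaced.
class Engine
  def initialize(input:, buffer:, output:)
    @input = input
    @buffer = buffer
    @output = output
  end

  def run
    @input.each_event { |tag, record| @buffer.push(tag, record) }
    @buffer.drain { |tag, records| @output.write(tag, records) }
  end
end

input = ArrayInput.new([['apache.access', { 'method' => 'GET' }],
                        ['app.login', { 'uid' => 123 }],
                        ['apache.access', { 'method' => 'POST' }]])
output = CollectingOutput.new
Engine.new(input: input, buffer: MemoryBuffer.new, output: output).run
p output.written
```

Replacing MemoryBuffer with a file-backed class, or CollectingOutput with one that writes to a database, would not require touching Engine, which is the point of the pluggable design.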

28
Meet your requirements by writing a plugin

Inputs
•  Access logs: Apache
•  App logs: Frontend, Backend
•  System logs: syslogd
•  Databases

Fluentd: filter / buffer / routing

Outputs
•  Alerting: Nagios
•  Analysis: MongoDB, MySQL, Hadoop
•  Archiving: Amazon S3
29
Treasure Agent (td-agent)
•  Open sourced distribution package of Fluentd
–  ETL part of Treasure Data
–  deb / rpm / homebrew

•  Including useful components
–  Ruby, jemalloc, fluentd
–  3rd party gems: td, mongo, webhdfs, etc…
–  Init script

•  http://packages.treasuredata.com/
30
Fluentd users

31
Treasure Data Platform
Backend Overview

32
AWS components
•  RDS
–  Store user information, job, status, etc…
–  Queue Worker / Scheduler

•  EC2
–  API Server, Hadoop Cluster, Job Worker / Scheduler

•  S3
–  Columnar storage
•  Realtime / Archive storage
•  MessagePack columnar

•  ELB
33
Plazma (Hadoop, Storage, Queue and Workers)

Frontend, Queue, Worker, Hadoop
•  Applications push metrics to Fluentd (via local Fluentd)
•  Fluentd sums up data per minute (partial aggregation)
•  Treasure Data: for historical analysis
•  Librato Metrics: for realtime analysis
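
The per-minute partial aggregation step can be sketched like this. It is an illustration under assumptions, not the actual Fluentd configuration used at Treasure Data: aggregate_by_minute, the event shape, and the metric name 'jobs.finished' are all hypothetical.

```ruby
# Hypothetical sketch of per-minute partial aggregation; the metric names and
# event shape are illustrative, not taken from Treasure Data's setup.
def aggregate_by_minute(events)
  sums = Hash.new(0)
  events.each do |event|
    minute = event[:time] - (event[:time] % 60) # truncate to the start of the minute
    sums[[minute, event[:name]]] += event[:value]
  end
  sums.map { |(minute, name), value| { time: minute, name: name, value: value } }
end

events = [
  { time: 1384160400, name: 'jobs.finished', value: 2 },
  { time: 1384160430, name: 'jobs.finished', value: 3 },
  { time: 1384160465, name: 'jobs.finished', value: 1 }
]
aggregate_by_minute(events).each { |row| p row }
```

Summing before forwarding reduces the number of data points sent to the realtime service, while the raw events can still go to long-term storage for historical analysis.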

34
Treasure Data
Development Philosophy

35
Open-Source Culture
•  TD prefers engineers who contribute to OSS products
–  MessagePack, Fluentd, ZeroMQ, Hadoop, MongoDB, Angular.js, Huahin, D-Lang, etc.
–  https://github.com/treasure-data?tab=members

•  Reasons
–  Fixing & improving other people’s code is crucial for a distributed team.
–  TD’s engineering workflow is very similar to an OSS product workflow.
–  A+ OSS engineers will bring in other A+ OSS engineers!
36
OSS vs. Proprietary
•  OSS Everything on the Client Side
–  http://github.com/treasure-data/
–  http://fluentd.org/
•  TD is helping the world collect more data in an analytics-ready format
•  2000+ companies (e.g. Nintendo, SlideShare/LinkedIn) use it as an OSS product. 3-4% of the users are TD customers.
•  We also leverage other OSS products as much as possible.

•  Closed Source on the Cloud Side
–  The core value must be proprietary to sustain the business.
–  The components can be OSS, but most of the system will remain proprietary to create the value chain.
37
How to decide Product Roadmap?
•  Solving the Customer Pain is the #1 Priority

–  Developers directly provide support to customers, spending 30%-40% of development time talking with customers
–  Developers are the BEST people to come up with the solution.
–  # of code lines != value

•  Suffering Oriented Development
–  First, make it possible
–  Then, make it beautiful
–  Then, make it fast

•  The Largest Customer Pain is NOT always applicable to other customers.
–  Need to be brave to say NO. NO. NO. NO. NO….

•  TD doesn’t have a 1-year Product Roadmap. Having a 3-month roadmap accelerates development, and other teams (marketing / sales), too.
38
Distributed Team (International)
•  13 Engineers as of Nov. 2013
–  5 Engineers in Tokyo, Japan
–  8 Engineers in Mountain View, USA
–  40% of the whole company

•  Asynchronous Communication
–  Use async communication tools as much as possible:
Chat, JIRA, Email, Github, etc.
–  Use video conferencing for weekly sync-up

•  English is the primary communication language
–  If you cannot speak English, your value is nearly zero on the Treasure Data engineering team.
39
Distributed Team (Deployment)
•  Predictable Deployment Cycle
–  Weekly Deployment

•  Continuous Deployment didn’t fit a B2B SaaS application; our customers want changes to be predictable.
•  As a distributed team, it’s hard to track every change + deployment status.

–  Track every change in JIRA; the QA engineer is responsible for the deployment too.

•  Continuous Deployment for Staging

–  Single branch, always automatically deployed to the staging environment
–  Monitoring is continuous testing

•  On-Call Alert Schedule, based on the Timezone
–  No need to get up around 3am

40
Leverage Cloud Services
•  Use Cloud Services as Much as Possible

–  Don’t hire people, use cloud services.
–  Outsource everything, except your core value.
–  Developers tend to forget their own cost. If you spend 1 hour, it already costs the company around $50.

•  Examples
–  EC2 (IaaS)
–  CopperEgg (Infrastructure Monitoring)
–  NewRelic (Application Performance Management)
–  Hosted Chef (Configuration Management)
–  Librato Metrics (Application Metrics)
–  Pager Duty (Alerting)
–  Logentries (Log Search)
–  CircleCI, TravisCI (Continuous Integration)
–  HipChat, JIRA, Confluence (Development Tool)
–  Etc….
41
Treasure Data
Conclusion

42
Key points
•  Treasure Data, Inc
–  Cloud based Data Service for the world
–  Customer oriented development

•  Our Unique Products and Culture
–  Fluentd / Plazma (backend)
–  OSS enthusiast

•  Use Cloud or not?
–  The cloud gives an idea leverage, but it is not a differentiator
–  Focus on your own vision!
43

More Related Content

What's hot

Big data on AWS
Big data on AWSBig data on AWS
Big data on AWS
Stylight
 
Lecture1
Lecture1Lecture1
Lecture1
Manish Singh
 
Big Data on azure
Big Data on azureBig Data on azure
Big Data on azure
David Giard
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
James Serra
 
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataGetting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Qubole
 
Azure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analyticsAzure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analytics
Mark Kromer
 
How Workato creates robust data pipelines and automations for you?
How Workato creates robust data pipelines and automations for you?How Workato creates robust data pipelines and automations for you?
How Workato creates robust data pipelines and automations for you?
Jeraldine Phneah
 
Azure Synapse Analytics
Azure Synapse AnalyticsAzure Synapse Analytics
Azure Synapse Analytics
WinWire Technologies Inc
 
Netflix: Using Big Data in the Cloud to Drive Engagement
Netflix: Using Big Data in the Cloud to Drive EngagementNetflix: Using Big Data in the Cloud to Drive Engagement
Netflix: Using Big Data in the Cloud to Drive Engagement
Coy Dean
 
Using real time big data analytics for competitive advantage
 Using real time big data analytics for competitive advantage Using real time big data analytics for competitive advantage
Using real time big data analytics for competitive advantage
Amazon Web Services
 
Scaling Privacy in a Spark Ecosystem
Scaling Privacy in a Spark EcosystemScaling Privacy in a Spark Ecosystem
Scaling Privacy in a Spark Ecosystem
Databricks
 
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Cloudera, Inc.
 
Introduction to Azure Synapse Webinar
Introduction to Azure Synapse WebinarIntroduction to Azure Synapse Webinar
Introduction to Azure Synapse Webinar
Peter Ward
 
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Data Con LA
 
Google Cloud Platform (GCP)
Google Cloud Platform (GCP)Google Cloud Platform (GCP)
Google Cloud Platform (GCP)
Chetan Sharma
 
Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ...
Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ...Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ...
Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ...
HostedbyConfluent
 
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark Summit
 
Delivering business insights and automation utilizing aws data services
Delivering business insights and automation utilizing aws data servicesDelivering business insights and automation utilizing aws data services
Delivering business insights and automation utilizing aws data services
Bhuvaneshwaran R
 
Part 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure SynapsePart 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure Synapse
Nilesh Gule
 
AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi...
AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi...AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi...
AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi...
Amazon Web Services
 

What's hot (20)

Big data on AWS
Big data on AWSBig data on AWS
Big data on AWS
 
Lecture1
Lecture1Lecture1
Lecture1
 
Big Data on azure
Big Data on azureBig Data on azure
Big Data on azure
 
Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)Azure Synapse Analytics Overview (r2)
Azure Synapse Analytics Overview (r2)
 
Getting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big DataGetting to 1.5M Ads/sec: How DataXu manages Big Data
Getting to 1.5M Ads/sec: How DataXu manages Big Data
 
Azure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analyticsAzure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analytics
 
How Workato creates robust data pipelines and automations for you?
How Workato creates robust data pipelines and automations for you?How Workato creates robust data pipelines and automations for you?
How Workato creates robust data pipelines and automations for you?
 
Azure Synapse Analytics
Azure Synapse AnalyticsAzure Synapse Analytics
Azure Synapse Analytics
 
Netflix: Using Big Data in the Cloud to Drive Engagement
Netflix: Using Big Data in the Cloud to Drive EngagementNetflix: Using Big Data in the Cloud to Drive Engagement
Netflix: Using Big Data in the Cloud to Drive Engagement
 
Using real time big data analytics for competitive advantage
 Using real time big data analytics for competitive advantage Using real time big data analytics for competitive advantage
Using real time big data analytics for competitive advantage
 
Scaling Privacy in a Spark Ecosystem
Scaling Privacy in a Spark EcosystemScaling Privacy in a Spark Ecosystem
Scaling Privacy in a Spark Ecosystem
 
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
 
Introduction to Azure Synapse Webinar
Introduction to Azure Synapse WebinarIntroduction to Azure Synapse Webinar
Introduction to Azure Synapse Webinar
 
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
 
Google Cloud Platform (GCP)
Google Cloud Platform (GCP)Google Cloud Platform (GCP)
Google Cloud Platform (GCP)
 
Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ...
Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ...Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ...
Extracting Value from IOT using Azure Cosmos DB, Azure Synapse Analytics and ...
 
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
 
Delivering business insights and automation utilizing aws data services
Delivering business insights and automation utilizing aws data servicesDelivering business insights and automation utilizing aws data services
Delivering business insights and automation utilizing aws data services
 
Part 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure SynapsePart 3 - Modern Data Warehouse with Azure Synapse
Part 3 - Modern Data Warehouse with Azure Synapse
 
AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi...
AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi...AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi...
AWS re:Invent 2016: Fireside chat with Groupon, Intuit, and LifeLock on solvi...
 

Similar to 情報処理学会 Exciting Coding! Treasure Data

Treasure Data Cloud Strategy
Treasure Data Cloud StrategyTreasure Data Cloud Strategy
Treasure Data Cloud Strategy
Treasure Data, Inc.
 
Treasure Data Cloud Data Platform
Treasure Data Cloud Data PlatformTreasure Data Cloud Data Platform
Treasure Data Cloud Data Platform
inside-BigData.com
 
Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of busines s...
Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of busines s...Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of busines s...
Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of busines s...
Looker
 
Simply Business' Data Platform
Simply Business' Data PlatformSimply Business' Data Platform
Simply Business' Data Platform
Dani Solà Lagares
 
Nyc web perf-final-july-23
Nyc web perf-final-july-23Nyc web perf-final-july-23
Nyc web perf-final-july-23
Dan Boutin
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusers
Bob Hardaway
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Ian Gomez
 
Growth hacking in the age of Data
Growth hacking in the age of DataGrowth hacking in the age of Data
Growth hacking in the age of Data
Daniel Saito
 
Web Briefing: Unlock the power of Hadoop to enable interactive analytics
Web Briefing: Unlock the power of Hadoop to enable interactive analyticsWeb Briefing: Unlock the power of Hadoop to enable interactive analytics
Web Briefing: Unlock the power of Hadoop to enable interactive analyticsKognitio
 
Real Time Data Warehousing Mastering Business Objects June 11
Real Time Data Warehousing   Mastering Business Objects June 11Real Time Data Warehousing   Mastering Business Objects June 11
Real Time Data Warehousing Mastering Business Objects June 11
jeffmonico
 
Big Data in Azure
Big Data in AzureBig Data in Azure
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slides
Qubole
 
Running Data Platforms Like Products
Running Data Platforms Like ProductsRunning Data Platforms Like Products
Running Data Platforms Like Products
VMware Tanzu
 
Big Data - A Real Life Revolution
Big Data - A Real Life RevolutionBig Data - A Real Life Revolution
Big Data - A Real Life Revolution
Capgemini
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Alluxio, Inc.
 
Abhishek jaiswal
Abhishek jaiswalAbhishek jaiswal
Abhishek jaiswal
Abhishek jaiswal
 
Big Data LDN 2017: Unleash Data Science Upon Your Organisation
Big Data LDN 2017: Unleash Data Science Upon Your OrganisationBig Data LDN 2017: Unleash Data Science Upon Your Organisation
Big Data LDN 2017: Unleash Data Science Upon Your Organisation
Matt Stubbs
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
jaxconf
 
MS Azure with IoT - Final Version
MS Azure with IoT - Final VersionMS Azure with IoT - Final Version
MS Azure with IoT - Final VersionJanani Eshwaran
 
MS Azure with IoT - Final Version
MS Azure with IoT - Final VersionMS Azure with IoT - Final Version
MS Azure with IoT - Final VersionJanani Eshwaran
 

Similar to 情報処理学会 Exciting Coding! Treasure Data (20)

Treasure Data Cloud Strategy
Treasure Data Cloud StrategyTreasure Data Cloud Strategy
Treasure Data Cloud Strategy
 
Treasure Data Cloud Data Platform
Treasure Data Cloud Data PlatformTreasure Data Cloud Data Platform
Treasure Data Cloud Data Platform
 
Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of busines s...
Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of busines s...Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of busines s...
Webinar with SnagAJob, HP Vertica and Looker - Data at the speed of busines s...
 
Simply Business' Data Platform
Simply Business' Data PlatformSimply Business' Data Platform
Simply Business' Data Platform
 
Nyc web perf-final-july-23
Nyc web perf-final-july-23Nyc web perf-final-july-23
Nyc web perf-final-july-23
 
Big data4businessusers
Big data4businessusersBig data4businessusers
Big data4businessusers
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
 
Growth hacking in the age of Data
Growth hacking in the age of DataGrowth hacking in the age of Data
Growth hacking in the age of Data
 
Web Briefing: Unlock the power of Hadoop to enable interactive analytics
Web Briefing: Unlock the power of Hadoop to enable interactive analyticsWeb Briefing: Unlock the power of Hadoop to enable interactive analytics
Web Briefing: Unlock the power of Hadoop to enable interactive analytics
 
Real Time Data Warehousing Mastering Business Objects June 11
Real Time Data Warehousing   Mastering Business Objects June 11Real Time Data Warehousing   Mastering Business Objects June 11
Real Time Data Warehousing Mastering Business Objects June 11
 
Big Data in Azure
Big Data in AzureBig Data in Azure
Big Data in Azure
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slides
 
Running Data Platforms Like Products
Running Data Platforms Like ProductsRunning Data Platforms Like Products
Running Data Platforms Like Products
 
Big Data - A Real Life Revolution
Big Data - A Real Life RevolutionBig Data - A Real Life Revolution
Big Data - A Real Life Revolution
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
 
Abhishek jaiswal
Abhishek jaiswalAbhishek jaiswal
Abhishek jaiswal
 
Big Data LDN 2017: Unleash Data Science Upon Your Organisation
Big Data LDN 2017: Unleash Data Science Upon Your OrganisationBig Data LDN 2017: Unleash Data Science Upon Your Organisation
Big Data LDN 2017: Unleash Data Science Upon Your Organisation
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
 
MS Azure with IoT - Final Version
MS Azure with IoT - Final VersionMS Azure with IoT - Final Version
MS Azure with IoT - Final Version
 
MS Azure with IoT - Final Version
MS Azure with IoT - Final VersionMS Azure with IoT - Final Version
MS Azure with IoT - Final Version
 

IPSJ (Information Processing Society of Japan) Exciting Coding! Treasure Data

  • 1. Treasure Data 
 Exciting Coding! Nov 2013 Presented by Masahiro Nakagawa Senior Software Engineer www.treasuredata.com 1
  • 2. Who are you •  Masahiro Nakagawa –  @repeatedly –  masa@treasure-data.com or d@ •  Treasure Data, Inc –  Senior Software Engineer •  Fluentd / Client libraries / etc... –  Since 2012/11 •  Open Source projects –  D Programming Language –  MessagePack: D, Python, etc… –  Fluentd: Core, Mongo, Logger, etc… –  Etc… 2
  • 3. Company & Service Introduction. Board Meeting Presentation, August 15th, 2013 - 3:30PM PDT. Presented by: Hironobu Yoshikawa – CEO; Kazuki Ohta – CTO; Rich Ghiossi – VP, Marketing; Keith Goldstein – VP, Sales; Kengo Hirouchi – Director, Japan; Ankush Rustagi – Director, Marketing. www.treasuredata.com
  • 4. Company Background
    •  Founded 2011 in Mountain View, CA
      –  The first cloud service for the entire data pipeline
      –  Including: Acquisition, Storage, & Analysis
    •  Provide a “Cloud Data Service”
      –  Fast Time to Value
      –  Cloud Flexibility and Economics
      –  Simple and Well Supported
    •  Treasure Data has 100+ customers in production
      –  Incl. Fortune 500 companies
      –  500+ Billion new records / month
      –  Around 2 Trillion records under management
      –  Variety of use cases and verticals
    The Treasure Data Team
      –  Hiro Yoshikawa – CEO: Open source business veteran
      –  Kaz Ohta – CTO: Founder of world’s largest Hadoop Group
      –  Jeff Yuan – Director, Engineering: LinkedIn, MIT / Michael Stonebraker Lab
      –  Keith Goldstein – VP Sales & Bus Dev: VP of Bus Dev from Tibco and Talend
      –  Rich Ghiossi – VP Marketing: VP of Marketing from ParAccel
    Notable Investors
      –  Othman Laraki: Ex-VP of Growth at Twitter
      –  Jerry Yang: Founder of Yahoo!
      –  Yukihiro “Matz” Matsumoto: Creator of “Ruby” programming language
      –  James Lindenbaum: Founder of Heroku
  • 5. Problem Statement •  Lots of companies today produce Big Data by having “New Data Sources” (Sensor, Weblog, etc) –  But few have the resources to build a Big Data Analytics system •  60-70% of a company’s Big Data time & budget consumed by: –  Infrastructure setup & Maintenance –  Building Collection & Storage Flows –  Hiring/Training Hadoop Expertise •  On average, it takes 6 months to get a Hadoop environment into production 5
  • 9. Treasure Data Service: Overview
    Acquire
      –  Data sources: Web logs, App logs, Sensor, RDBMS, CRM, ERP
      –  Treasure Agent: Streaming Log Collector (JSON)
      –  Bulk Import: Parallel Upload from CSV, MySQL, etc.
    Store
      –  Treasure Data Cloud: Flexible, Scalable, Columnar Storage
    Analyze
      –  BI Connectivity (REST API, SQL, Pig, JDBC / ODBC) to BI Tools: Tableau, Metric Insights, QlikView, Excel, etc.
      –  Result Push (REST API, SQL, Pig) to Dashboards, Custom App, Local DB, FTP Server, etc.
    Time to Value / Economy & Flexibility / Simple & Supported
  • 10. Our Value Propositions •  Faster time to value On-demand cloud infrastructure & versatile streaming data collection agent –  Instantly provision a fully tuned & managed infrastructure –  Go live into production on average in 14 days (collection, analytics, & BI) •  Cloud flexibility and economics Fraction of the cost of traditional solutions by leveraging cloud storage and processing, which scales to meet your needs –  Leverage the cost-advantage of the cloud –  Leverage the elasticity of the cloud – scale on demand –  Predictable monthly subscription fee –  No upfront costs & no long-term commitment •  Simple and well supported We are passionate about simplicity, and customer support excellence –  Focus your time on analyzing your data –  Rely on us to keep your data secure & online –  We love making customers successful & building long-term relationships 10
  • 11. Initial Setup & Onboarding – Two Weeks 1. Data Collection 2. Data Storage •  Setup, tuning, and monitoring of Treasure Agent •  Embed Treasure Agent code into applications •  Basic log templates (register, pay, login, etc.) •  Basic KPI queries (DAU, MAU, ARPU, etc.) 3. Data Analysis 4. Service & Support •  Setup dashboards with basic KPIs •  Training on creating customized reports and adhoc querying •  Assigned a dedicated technical account manager •  Real-time support via email, online chat, and call 11
  • 12. Solutions Accelerators … Out-of-the Box Reporting Treasure Data Platform Configured Treasure Agent Solution Components: -  Treasure Data Platform -  Event Collection Template -  Pre-configured Treasure Agent Configuration -  BI Dashboard with KPIs 12
  • 13. - Vision - gle Analytics Platform for the Wo 13
  • 14. Treasure Data Platform: Architecture Overview
  • 15. Data Acquisition – Streaming Capture
    Application Server
      # Application Code
      ...
      # Post event to Treasure Data
      TD.event.post('access', {:uid=>123})
      ...
    Treasure Data Library: Java, Ruby, PHP, Perl, Python, Scala, Node.js
    Treasure Agent (local), open-sourced as the Fluentd project ( http://fluentd.org/ )
      •  Automatic Microbatching
      •  Local buffering Fallback
      •  Network Tolerance
    Treasure Data Cloud
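The three agent properties on this slide (micro-batching, local buffering as a fallback, and tolerance of network failures) can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `MicroBatcher`, its `post` and `flush` methods, and the `sender` callable are hypothetical names and are not the Treasure Data library or the Fluentd API.

```ruby
# Hypothetical sketch: not the Treasure Data library or the Fluentd API.
class MicroBatcher
  attr_reader :buffer

  # batch_size: number of events to collect before one upload attempt
  # sender: any callable that takes an array of events and raises on failure
  def initialize(batch_size:, sender:)
    @batch_size = batch_size
    @sender = sender
    @buffer = []
  end

  # Queue one event; flush automatically when the batch is full.
  def post(tag, record, time: Time.now.to_i)
    @buffer << { tag: tag, time: time, record: record }
    flush if @buffer.size >= @batch_size
  end

  # Try to send everything buffered. On failure the events stay in the
  # local buffer so that a later flush can retry them.
  def flush
    return true if @buffer.empty?

    batch = @buffer.dup
    @sender.call(batch)
    @buffer.shift(batch.size)
    true
  rescue StandardError
    false
  end
end

# Simulated network: uploads fail until network_up becomes true.
sent = []
network_up = false
sender = lambda do |batch|
  raise IOError, 'network down' unless network_up

  sent.concat(batch)
end

batcher = MicroBatcher.new(batch_size: 2, sender: sender)
batcher.post('access', { uid: 123 })
batcher.post('access', { uid: 456 }) # batch is full, upload fails, events are kept
network_up = true
batcher.flush                        # retry succeeds and empties the buffer
```

Keeping failed batches in memory is the simplest possible fallback; a real agent would more likely persist them to disk so that they survive a restart.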
  • 16. Data Acquisition – Bulk Loader
    Sources: RDBMS, App, SaaS, FTP
    Supported inputs: CSV, TSV, JSON, MessagePack, Apache, regex, MySQL, FTP
    Bulk Loader steps: Prepare → Upload → Perform → Commit
    Destination: Treasure Data Cloud
  • 17. Data Storage (Treasure Data Cloud)
    •  Stored “schema-less” as JSON
      –  Schema can be applied/updated AFTER storage
    •  Compressed & columnar format
    •  Quickly scale-up processing power
      –  WITHOUT reloading/redistributing the data
    •  Optimized for time-based filtering
      –  For higher query performance

    Default (schema-less)
      time         v
      1384160400   {"ip":"135.52.211.23", "code":"0"}
      1384162200   {"ip":"45.25.38.156", "code":"-1"}
      1384164000   {"ip":"97.12.76.55", "code":"99"}
      SELECT v['ip'] as ip, v['code'] as code …

    Schema applied (~30% Faster)
      time         ip : string      code : int
      1384160400   135.52.211.23    0
      1384162200   45.25.38.156     -1
      1384164000   97.12.76.55      99
      SELECT ip, code …
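A minimal Ruby sketch of the idea on this slide: rows are stored as a time plus a schema-less JSON value, and a schema is applied afterwards to produce typed columns. The names `ROWS`, `SCHEMA`, `apply_schema`, and `select_range` are assumptions made for illustration. The sketch uses the slide's sample data and says nothing about how Treasure Data's storage is actually implemented.

```ruby
require 'json'

# Rows as stored by default: a time column plus a schema-less JSON value.
ROWS = [
  { time: 1384160400, v: '{"ip":"135.52.211.23", "code":"0"}' },
  { time: 1384162200, v: '{"ip":"45.25.38.156", "code":"-1"}' },
  { time: 1384164000, v: '{"ip":"97.12.76.55", "code":"99"}' }
].freeze

# Schema declared after the data was stored: column name => type caster.
SCHEMA = {
  'ip'   => ->(x) { x.to_s },
  'code' => ->(x) { Integer(x) }
}.freeze

# Build one typed array per column (a toy columnar layout).
def apply_schema(rows, schema)
  columns = { 'time' => rows.map { |r| r[:time] } }
  parsed = rows.map { |r| JSON.parse(r[:v]) }
  schema.each do |name, cast|
    columns[name] = parsed.map { |rec| cast.call(rec[name]) }
  end
  columns
end

# Time-based filtering only has to scan the time column to pick row indexes.
def select_range(columns, from, to)
  times = columns['time']
  idx = times.each_index.select { |i| times[i].between?(from, to) }
  columns.transform_values { |col| col.values_at(*idx) }
end

columns = apply_schema(ROWS, SCHEMA)
recent = select_range(columns, 1384162200, 1384164000)
```

Because the schema is just a mapping applied at read time, changing a column's type means editing `SCHEMA`, not rewriting the stored rows.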
  • 18. Data Analysis (Treasure Data Cloud)
    REST API
    Heavy Lifting SQL (Hive):
      -  Hive’s Built-in UDFs
      -  TD Added Functions:
        -  Time Functions
        -  First, Last, Rank
        -  Sessionize
    Scripted Processing (Pig):
      -  DataFu (LinkedIn)
      -  Piggybank (Apache)
    Interactive SQL:
      -  Treasure Query Accelerator (Impala)
    Scheduled Jobs:
      -  SQL, Pig Scripts
      -  Data Pushes
    JDBC Connectivity:
      -  Custom Java Apps
      -  Standards-based
      -  BI Tool Integration
    Tableau ODBC connector:
      -  Leverages Impala
    Push Query Results:
      -  MySQL, PostgreSQL
      -  Google Spreadsheet
      -  Web, FTP, S3
      -  Leftronic, Indicee
      -  Treasure Data Table
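The slide lists "Sessionize" among the functions Treasure Data adds to Hive but does not show it. As a rough illustration of what sessionization means, the Ruby sketch below groups each user's events into sessions separated by an inactivity gap. The `sessionize` method, its `timeout` parameter, and the `session_id` format are hypothetical and are not the signature of the TD function.

```ruby
# Hypothetical illustration of sessionization; not TD's actual Hive function.
# events:  array of { uid:, time: } hashes (time in epoch seconds)
# timeout: a gap longer than this many seconds starts a new session
def sessionize(events, timeout: 1800)
  events.group_by { |e| e[:uid] }.flat_map do |uid, user_events|
    session = 0
    previous = nil
    user_events.sort_by { |e| e[:time] }.map do |e|
      session += 1 if previous && e[:time] - previous > timeout
      previous = e[:time]
      e.merge(session_id: "#{uid}-#{session}")
    end
  end
end

events = [
  { uid: 123, time: 1384160400 },
  { uid: 123, time: 1384160700 }, # 5 minutes later: same session
  { uid: 123, time: 1384164000 }, # 55 minutes later: new session
  { uid: 456, time: 1384160500 }
]
sessions = sessionize(events)
```

The 30-minute default is a common convention in web analytics and is an assumption here, not something the slides state.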
  • 19. Treasure Data: General Use Cases
  • 20. A case: “14 Days” from Signup to Success 1.  Europe’s largest mobile ad exchange. 2.  Serving >60 billion impressions/month for >30,000 mobile apps (Q4 2013) 3.  Immediate need for analytics infrastructure: ASAP! 4.  With TD, MobFox got into production in only 14 days, with one engineer. "Time is the most precious asset in our fast-moving business, and Treasure Data saved us a lot of it." Julian Zehetmayr, CEO & Founder
  • 21. A case: “Replace” in-house Hadoop with TD Before 1.  Global “Hulu” - Online Video Service with millions of users 2.  Video content is distributed in over 150 languages. After 3.  Had a hard time maintaining their Hadoop cluster 4.  With TD, Viki deprecated their in-house Hadoop cluster and now uses its engineers for core business. “Treasure Data has always given us thorough and timely support peppered with insightful tips to make the best use of their service." Huy Nguyen, Software Engineer
  • 22. A case: Treasure Data with BI Tool (Tableau) 1.  World’s largest Android application market 2.  Serving >3 billion app downloads for >100 million users 3.  Only one engineer managing the data infrastructure 4.  With TD, the data engineer can focus on analyzing data with the existing BI tool "I will recommend Treasure Data to my friends in a heartbeat because it benefits all three stakeholders: Operations, Engineering and Business." Simon Dong, Principal Architect - Data Engineering
  • 23. Treasure Data Platform: Fluentd Overview
  • 24. What is Fluentd? •  Open-source log collector written in Ruby –  Easy to use, reliable, and performant –  Streaming event processing •  Uses the rubygems ecosystem to distribute plugins. Fluentd: the missing log collector (fluentd.org)
  • 25. Data processing pipeline Data source Collect Store Process Visualize Reporting Monitoring 25
  • 26. Data processing pipeline Data source Collect Store Process Visualize Reporting Monitoring. Collect: important, but there is no de facto middleware!
  • 27. Fluentd general example
    Web Server (apache.log), read with tail:
      127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ...
      127.0.0.1 - - [11/Dec/2012:07:26:30] "GET / ...
      127.0.0.1 - - [11/Dec/2012:07:26:32] "GET / ...
      127.0.0.1 - - [11/Dec/2012:07:26:40] "GET / ...
      127.0.0.1 - - [11/Dec/2012:07:27:01] "GET / ...
    Fluentd (event buffering, then insert) turns each line into an event:
      2012-02-04 01:33:51 apache.log { "host": "127.0.0.1", "method": "GET", ... }
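To make the tail example concrete, here is a small Ruby sketch that turns one of the shortened log lines above into a time, tag, and record triple. It is an assumption-laden illustration: the `LINE` pattern and `parse_line` helper are hypothetical, the pattern only covers the fields visible on the slide, and the timestamp is treated as UTC because the slide omits the timezone. Fluentd's own tail input and Apache parser are not shown.

```ruby
require 'time'
require 'json'

# Simplified pattern for the shortened log lines shown on the slide.
# Real Apache formats carry more fields (timezone, status, size, ...).
LINE = /\A(?<host>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+)/

# Turn one log line into [tag, time, record], or nil if it does not match.
def parse_line(tag, line)
  m = LINE.match(line)
  return nil unless m

  # Assumption: no timezone in the sample, so the time is read as UTC.
  time = Time.strptime("#{m[:time]} +0000", '%d/%b/%Y:%H:%M:%S %z').to_i
  [tag, time, { 'host' => m[:host], 'method' => m[:method], 'path' => m[:path] }]
end

line = '127.0.0.1 - - [11/Dec/2012:07:26:27] "GET / ..."'
tag, time, record = parse_line('apache.log', line)
puts "#{Time.at(time).utc.strftime('%F %T')} #{tag} #{JSON.generate(record)}"
```

Lines that do not match return `nil`, so a caller can decide whether to drop them or route them elsewhere.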
  • 28. Pluggable Architecture
    Input (pluggable): > Forward > HTTP > File tail > dstat > ...
    Engine
    Output: > rewrite > ...
    Buffer (pluggable): > File > Memory
    Output (pluggable): > Forward > File > MongoDB > ...
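The value of a pluggable design is that the engine depends only on small interfaces, so any input, buffer, or output can be swapped. The Ruby sketch below shows that shape with made-up classes; `ArrayInput`, `MemoryBuffer`, `CollectOutput`, and `Engine` are hypothetical and do not match Fluentd's real plugin API.

```ruby
# Hypothetical mini-pipeline; class and method names are illustrative only
# and do not match Fluentd's real plugin API.
class ArrayInput
  def initialize(lines)
    @lines = lines
  end

  # Emit each line as a (tag, record) pair.
  def each_event
    @lines.each { |line| yield 'demo.log', { 'message' => line } }
  end
end

class MemoryBuffer
  def initialize(limit)
    @limit = limit
    @chunk = []
  end

  # Store an event; return a full chunk when the limit is reached, else nil.
  def push(event)
    @chunk << event
    drain if @chunk.size >= @limit
  end

  # Hand over whatever is buffered and start a new chunk.
  def drain
    chunk = @chunk
    @chunk = []
    chunk
  end
end

class CollectOutput
  attr_reader :written

  def initialize
    @written = []
  end

  def write(chunk)
    @written << chunk
  end
end

# The engine only knows the three small interfaces above.
class Engine
  def initialize(input:, buffer:, output:)
    @input = input
    @buffer = buffer
    @output = output
  end

  def run
    @input.each_event do |tag, record|
      chunk = @buffer.push([tag, record])
      @output.write(chunk) if chunk
    end
    rest = @buffer.drain
    @output.write(rest) unless rest.empty?
  end
end

output = CollectOutput.new
Engine.new(input: ArrayInput.new(%w[a b c]),
           buffer: MemoryBuffer.new(2),
           output: output).run
```

Replacing `CollectOutput` with a class that writes to a file or a database would not require any change to `Engine`, which is the point of the slide.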
  • 29. Resolve your requirements by writing plugins
    Inputs: Access logs (Apache); App logs (Frontend, Backend); System logs (syslogd); Databases
    Fluentd: filter / buffer / routing
    Outputs: Alerting (Nagios); Analysis (MongoDB, MySQL, Hadoop); Archiving (Amazon S3)
  • 30. Treasure Agent (td-agent) •  Open sourced distribution package of Fluentd –  ETL part of Treasure Data –  deb / rpm / homebrew •  Including useful components –  Ruby, jemalloc, fluentd –  3rd party gems: td, mongo, webhdfs, etc… –  Init script •  http://packages.treasuredata.com/ 30
  • 32. Treasure Data Platform: Backend Overview
  • 33. AWS components •  RDS –  Store user information, job, status, etc… –  Queue Worker / Scheduler •  EC2 –  API Server, Hadoop Cluster, Job Worker / Scheduler •  S3 –  Columnar storage •  Realtime / Archive storage •  MessagePack columnar •  ELB 33
  • 34. Plazma (Hadoop, Storage, Queue and Workers)
    Components: Frontend, Worker, Queue, Hadoop
    •  Applications push metrics to Fluentd (via local Fluentd)
    •  Fluentd sums up data per minute (partial aggregation)
    •  Librato Metrics for realtime analysis
    •  Treasure Data for historical analysis
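Partial aggregation, as mentioned on this slide, means summing raw metric points into coarser buckets before forwarding them, which cuts the volume sent downstream. The Ruby sketch below shows a per-minute version. The `aggregate_by_minute` method and the `jobs.finished` metric name are hypothetical, and the slides do not describe the actual Fluentd configuration used.

```ruby
# Hypothetical sketch of partial aggregation: sum raw metric points into
# one value per metric per minute before forwarding them.
# points: array of { name:, time:, value: } hashes (time in epoch seconds)
def aggregate_by_minute(points)
  sums = Hash.new(0)
  points.each do |pt|
    minute = pt[:time] - (pt[:time] % 60) # truncate to the start of the minute
    sums[[pt[:name], minute]] += pt[:value]
  end
  sums.map { |(name, minute), value| { name: name, time: minute, value: value } }
end

points = [
  { name: 'jobs.finished', time: 1384160400, value: 1 },
  { name: 'jobs.finished', time: 1384160430, value: 2 }, # same minute as above
  { name: 'jobs.finished', time: 1384160465, value: 4 }  # next minute
]
summary = aggregate_by_minute(points)
```

Sums can be aggregated again downstream without loss, which is why this kind of pre-aggregation is safe for counters; averages or percentiles would need a different intermediate representation.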
  • 35. Treasure Data: Development Philosophy
  • 36. Open-Source Culture •  TD prefers engineers who contribute to OSS products –  MessagePack, Fluentd, ZeroMQ, Hadoop, MongoDB, Angular.js, Huahin, D-Lang, etc. –  https://github.com/treasure-data?tab=members •  Reasons –  Fixing & improving other people’s code is crucial for a distributed team. –  TD’s engineering workflow is very similar to an OSS product workflow. –  A+ OSS engineers will bring in another A+ OSS engineer!
  • 37. OSS vs. Proprietary •  OSS Everything on the Client Side –  http://github.com/treasure-data/ –  http://fluentd.org/ •  TD is helping the world to collect more data in an analytics-ready format •  2000+ companies (e.g. Nintendo, SlideShare/LinkedIn) are using it as an OSS product. 3-4% of the users are TD’s customers. •  We also leverage other OSS products as much as possible. •  Closed Source on the Cloud Side –  The core value must be proprietary to sustain the business. –  Individual components can be OSS, but most of the system will remain proprietary to create the value chain.
  • 38. How to decide Product Roadmap? •  Solving the Customer Pain is the #1 Priority –  Developers support customers directly and spend 30%-40% of their development time talking with customers –  Developers are the BEST people to come up with the solution. –  # of code lines != value •  Suffering Oriented Development –  First, make it possible –  Then, make it beautiful –  Then, make it fast •  The Largest Customer Pain is NOT always applicable to other customers. –  Need to be brave to say NO. NO. NO. NO. NO…. •  TD doesn’t have a 1-year Product Roadmap. Having a 3-month roadmap accelerates development, and helps other teams (marketing / sales), too.
  • 39. Distributed Team (International) •  13 Engineers as of Nov. 2013 –  5 Engineers in Tokyo, Japan –  8 Engineers in Mountain View, USA –  40% of the whole company •  Asynchronous Communication –  Use async communication tools as much as possible: Chat, JIRA, Email, Github, etc. –  Use video conferencing for weekly sync-up •  English is the primary communication language –  If you cannot speak English, your value is nearly zero on the Treasure Data engineering team.
  • 40. Distributed Team (Deployment) •  Predictable Deployment Cycle –  Weekly Deployment •  Continuous Deployment didn’t fit our B2B SaaS application; our customers want changes to be predictable. •  As a distributed team, it’s hard to track every change + deployment status. –  Track every change on JIRA; the QA engineer is responsible for the deployment too. •  Continuous Deployment for Staging –  Single branch, always automatically deployed to the staging environment –  Monitoring is continuous testing •  On-Call Alert Schedule, based on the Timezone –  No need to get up around 3am
  • 41. Leverage Cloud Services
    •  Use Cloud Services as Much as Possible
      –  Don’t hire people, use cloud services.
      –  Outsource everything except your core value.
      –  Developers tend to forget their own cost. One hour of your time already costs the company around $50.
    •  Examples
      –  EC2 (IaaS)
      –  CopperEgg (Infrastructure Monitoring)
      –  New Relic (Application Performance Management)
      –  Hosted Chef (Configuration Management)
      –  Librato Metrics (Application Metrics)
      –  PagerDuty (Alerting)
      –  Logentries (Log Search)
      –  CircleCI, TravisCI (Continuous Integration)
      –  HipChat, JIRA, Confluence (Development Tool)
      –  Etc.
  • 42. Treasure Data: Conclusion
  • 43. Key points •  Treasure Data, Inc –  Cloud based Data Service for the world –  Customer oriented development •  Our Unique Products and Culture –  Fluentd / Plazma (backend) –  OSS enthusiast •  Use Cloud or not? –  The cloud gives an idea leverage, but it is not a differentiator –  Focus on your own vision!

Editor's Notes

  1. Time to Value
     –  Setup time and load time for data collection (td-agent) – 1 week
     –  Analysis capabilities out of the box
     –  Simple integration with existing ecosystem (DI & BI)
     Cloud flexibility and economies
     –  Scalable (cloud), extensible (elastic), flexible (schemaless)
     –  Lower TCO compared to on-premise, hosted, or homegrown
     –  On-demand ability to scale, adjust, meet future business requirements
     Simple and supported
     –  “Full” solutions from collection to visualization
     –  Great customer service, support, setup, and SLAs
     –  Easy to extend on your own / self-service – DIY big data