Gabriel PREDA
@eRadical
(Almost) Serverless Analytics System
with BigQuery & AppEngine
Agenda
Going Serverless with
AppEngine & Tasks
Pub/Sub, DataStore
BigQuery
Load
Batch
Streaming Inserts
Query
UDF
Export
...some BigQueries...
Some years ago...
~ 500,000 - 2,000,000 events / day
(on average)
Some time ago...
~2,000,000 - 22,000,000 events / day
Dec 2014: 57,430,000 events / day
1 day to recompute » 12 hours
NOW()
22,000,000 - 70,000,000 events / day
AVG » 40,000,000 events / day
Processing ~30GB-70GB / day
Recompute 1 day » 10-20 minutes
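For a sense of scale, the daily volumes above can be turned into average per-second rates. This is a minimal sketch that assumes events are spread evenly over the day (real traffic is likely burstier); the function name is illustrative and not from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_second(events_per_day: int) -> float:
    """Average event rate, assuming a uniform spread over 24 hours."""
    return events_per_day / SECONDS_PER_DAY


# Figures taken from the slides: current average and current upper bound.
for label, volume in [("average day", 40_000_000), ("busy day", 70_000_000)]:
    print(f"{label}: {events_per_second(volume):.0f} events/s")
```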
serverless?
Desired for: https://www.innertrends.com
other... (almost) serverless products
Cloud Functions (alpha - Node.JS)
Cloud DataFlow (Java, Python - beta)
BigQuery
https://cloud.google.com/bigquery/docs/
BigQuery - data types
● STRING - UTF-8 (2 bytes + encoded string size)
● BYTES - base64 encoded (except in Avro)
● INTEGER - 64-bit signed (8 bytes)
● FLOAT (8 bytes)
● BOOLEAN - true/false, 1/0 only in CSV (1 byte)
● TIMESTAMP e.g. "2014-08-19 12:41:35.220 UTC" (8 bytes)
● DATE, TIME, DATETIME - limited support in Legacy SQL
● RECORD - a collection of fields (size of fields)
https://cloud.google.com/bigquery/data-types
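The per-type sizes listed above are enough to estimate how many bytes a row contributes. The sketch below applies only the sizes stated on the slide; the schema layout (a list of dicts with `name`, `type`, `fields`) and the sample row are assumptions made for illustration, and NULL handling is left out.

```python
# Fixed sizes as listed on the slide.
FIXED_SIZES = {"INTEGER": 8, "FLOAT": 8, "BOOLEAN": 1, "TIMESTAMP": 8}


def estimate_row_bytes(row, schema):
    """Sum the per-field sizes of one row, recursing into RECORD fields."""
    total = 0
    for field in schema:
        value = row[field["name"]]
        ftype = field["type"]
        if ftype in FIXED_SIZES:
            total += FIXED_SIZES[ftype]
        elif ftype == "STRING":
            # 2 bytes of overhead plus the UTF-8 encoded length.
            total += 2 + len(value.encode("utf-8"))
        elif ftype == "RECORD":
            # A RECORD costs the sum of its fields.
            total += estimate_row_bytes(value, field["fields"])
        else:
            raise ValueError(f"no size rule listed for type {ftype}")
    return total


# Hypothetical schema and row, for illustration only.
schema = [
    {"name": "user_id", "type": "INTEGER"},
    {"name": "name", "type": "STRING"},
    {"name": "active", "type": "BOOLEAN"},
    {"name": "geo", "type": "RECORD", "fields": [
        {"name": "country", "type": "STRING"},
        {"name": "lat", "type": "FLOAT"},
    ]},
]
row = {"user_id": 42, "name": "Zoë", "active": True,
       "geo": {"country": "RO", "lat": 44.43}}
print(estimate_row_bytes(row, schema))
```

Multiplying such an estimate by the daily event count gives a rough figure for the data volume processed per day.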
BigQuery -> loadData()
Formats: CSV, JSON (newline delimited), Avro, Parquet (experimental)
Tools: Web UI, bq, API
Source:
local files,
Cloud Storage, [demo]
Cloud Datastore (backup files),
POST requests,
SQL DML*
Google Sheets
- Federated Data Sources
- Streaming Inserts
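Newline-delimited JSON means one complete JSON object per line, with no enclosing array and no commas between rows. A minimal sketch of producing such a payload before loading it; the field names are hypothetical.

```python
import json


def to_ndjson(rows):
    """Serialize an iterable of dicts as newline-delimited JSON."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


# Hypothetical events, for illustration only.
events = [
    {"user_id": 1, "event": "signup"},
    {"user_id": 2, "event": "login"},
]
print(to_ndjson(events), end="")
```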
BigQuery -> loadData()
bq load ...
BigQuery -> loadData()
Got some rows?
BigQuery -> SELECT … FROM surprise…
query:
SELECT { * | field_path.* | expression } [ [ AS ] alias ] [ , ... ]
[ FROM from_body
[ WHERE bool_expression ]
[ OMIT RECORD IF bool_expression]
[ GROUP [ EACH ] BY [ ROLLUP ] { field_name_or_alias } [ , ... ] ]
[ HAVING bool_expression ]
[ ORDER BY field_name_or_alias [ { DESC | ASC } ] [, ... ] ]
[ LIMIT n ]
];
from_body:
from_item [, ...] | # Warning: Comma means UNION ALL here
from_item [ join_type ] JOIN [ EACH ] from_item [ ON join_predicate ] |
(FLATTEN({ table_name | (query) }, field_name_or_alias)) |
table_wildcard_function
from_item:
{ table_name | (query) } [ [ AS ] alias ]
join_type:
{ INNER | [ FULL ] [ OUTER ] | RIGHT [ OUTER ] | LEFT [ OUTER ] | CROSS }
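The warning about the comma is worth a concrete illustration: in Legacy SQL, `FROM a, b` appends the rows of both tables (UNION ALL), whereas in standard SQL the same comma is a cross join. The sketch below only mimics the two behaviours on in-memory lists; the table contents are made up.

```python
from itertools import product


def legacy_comma(*tables):
    """Legacy SQL: FROM a, b returns the rows of a followed by the rows of b."""
    return [row for table in tables for row in table]


def standard_comma(left, right):
    """Standard SQL: FROM a, b pairs every row of a with every row of b."""
    return list(product(left, right))


# Hypothetical monthly event tables.
events_jan = [("u1", "signup"), ("u2", "login")]
events_feb = [("u1", "login"), ("u3", "signup"), ("u2", "logout")]

print(len(legacy_comma(events_jan, events_feb)))    # 2 + 3 rows
print(len(standard_comma(events_jan, events_feb)))  # 2 * 3 row pairs
```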
BigQuery -> SELECT … FROM surprise…
Date-Partitioned Tables [demo]
Table Decorators - See the past w/ @
Table Wildcard Functions - TABLE_DATE_RANGE() & TABLE_QUERY()
Interesting functions
- DateTime » UTC_USEC_TO_DAY/HOUR/MONTH/WEEK/YEAR()
» Shifts a UNIX timestamp in microseconds to the beginning of the period it occurs in.
- JSON_EXTRACT[_SCALAR]()
- URL functions » HOST(), DOMAIN(), TLD()
- REGEXP_MATCH(), REGEXP_EXTRACT()
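The UTC_USEC_TO_DAY/HOUR behaviour described above amounts to rounding a microsecond timestamp down to a period boundary. The sketch below emulates only the DAY and HOUR variants, which have fixed lengths; WEEK, MONTH and YEAR need calendar logic and are not covered. The helper names are illustrative, not BigQuery API.

```python
from datetime import datetime, timedelta, timezone

USEC_PER_HOUR = 3_600 * 1_000_000
USEC_PER_DAY = 24 * USEC_PER_HOUR
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_usec(dt):
    """UNIX timestamp in microseconds for a timezone-aware datetime."""
    return (dt - EPOCH) // timedelta(microseconds=1)


def utc_usec_to_day(t_usec):
    """Shift a microsecond timestamp to the start of its UTC day."""
    return t_usec - t_usec % USEC_PER_DAY


def utc_usec_to_hour(t_usec):
    """Shift a microsecond timestamp to the start of its UTC hour."""
    return t_usec - t_usec % USEC_PER_HOUR


# The TIMESTAMP example from the data types slide.
ts = to_usec(datetime(2014, 8, 19, 12, 41, 35, 220000, tzinfo=timezone.utc))
print(utc_usec_to_day(ts), utc_usec_to_hour(ts))
```

Grouping by such a truncated value is a common way to bucket events per day or per hour.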
BigQuery -> User Defined Functions
bigquery.defineFunction(
  'expandAssetLibrary', // Name of the function exported to SQL
  ['user_id', 'video_id', 'stage_settings'], // Names of input columns
  [ {'name': 'user_id', 'type': 'integer'}, // Output schema
    {'name': 'video_id', 'type': 'string'},
    {'name': 'asset', 'type': 'string'} ],
  expandAssetLibrary // Reference to JavaScript UDF
);
function expandAssetLibrary(row, emit) { …………………………
  emit({ user_id: row.user_id, video_id: row.video_id, asset: ss.url.replace('http://', '') });
}
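The key idea of the UDF is that one input row may emit zero or more output rows. The body is elided on the slide, so the sketch below is only an analogy in Python: it assumes, hypothetically, that `stage_settings` holds a JSON array of objects with a `url` key, and it uses a generator in place of `emit()`.

```python
import json


def expand_asset_library(row):
    """Yield one output row per asset, playing the role of emit() in the UDF."""
    # Assumption: stage_settings is a JSON array of objects with a "url" key.
    for setting in json.loads(row["stage_settings"]):
        yield {
            "user_id": row["user_id"],
            "video_id": row["video_id"],
            # count=1 mirrors JavaScript's replace() with a string pattern,
            # which substitutes only the first occurrence.
            "asset": setting["url"].replace("http://", "", 1),
        }


# Hypothetical input row.
sample = {
    "user_id": 7,
    "video_id": "v1",
    "stage_settings": json.dumps([
        {"url": "http://cdn.example.com/a.png"},
        {"url": "http://cdn.example.com/b.png"},
    ]),
}
for out in expand_asset_library(sample):
    print(out)
```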
BigQuery -> DML
Standard SQL only
Maximum UPDATE/DELETE statements per day per table: 48
Maximum UPDATE/DELETE statements per day per project: 500
Maximum INSERT statements per day per table: 1,000
Maximum INSERT statements per day per project: 10,000
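These quotas translate directly into a minimum spacing between statements if they are issued at a steady pace. A small sketch of that arithmetic, using the limits as quoted on this slide (they may have changed since):

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily limits as listed on the slide.
QUOTAS = {
    "UPDATE/DELETE per table": 48,
    "UPDATE/DELETE per project": 500,
    "INSERT per table": 1_000,
    "INSERT per project": 10_000,
}


def min_interval_seconds(statements_per_day):
    """Smallest steady interval that stays within a daily statement quota."""
    return SECONDS_PER_DAY / statements_per_day


for name, limit in QUOTAS.items():
    print(f"{name}: one statement every {min_interval_seconds(limit):.1f} s")
```

At 48 statements per day, a table can be updated at most once every 30 minutes on average, which is why DML here suits occasional corrections rather than row-by-row writes.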
BigQuery -> export()
To: Google Cloud Storage
Format: CSV, JSON [.gz], Avro
…1G files
BigQuery -> some (Big)Queries
SELECT year, count(1)
FROM [bigquery-public-data:samples.natality]
WHERE father_age < 18
GROUP BY year
ORDER BY year
SELECT year, count(1)
FROM [bigquery-public-data:samples.natality]
WHERE mother_age < 18
GROUP BY year
ORDER BY year
SELECT table_id, row_count, CEIL(size_bytes/POW(1024, 3)) AS gb
FROM [bigquery-public-data:ghcn_m.__TABLES__] ORDER BY gb DESC
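The `CEIL(size_bytes/POW(1024, 3))` expression converts bytes to whole gigabytes, rounding up. A minimal sketch of the same arithmetic; note that any non-empty table smaller than 1 GB is reported as 1.

```python
import math

BYTES_PER_GB = 1024 ** 3  # same divisor as POW(1024, 3) in the query


def size_gb(size_bytes):
    """Table size in gigabytes, rounded up like CEIL() in the query."""
    return math.ceil(size_bytes / BYTES_PER_GB)


for size in (0, 1, BYTES_PER_GB, BYTES_PER_GB + 1):
    print(size, "->", size_gb(size))
```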
BigQuery -> some (Big)Queries
SELECT REGEXP_EXTRACT(path, r'.*\.(.*)$') AS file_extension,
COUNT(1) AS k
FROM [bigquery-public-data:github_repos.files]
GROUP BY file_extension
ORDER BY k DESC
LIMIT 20
SELECT table_id, row_count,
CEIL(size_bytes/POW(1024, 3)) AS gb
FROM [bigquery-public-data:github_repos.__TABLES__]
ORDER BY gb DESC
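In the file-extension query, whether the dot in front of the capture group is escaped changes the result completely. The sketch below compares the unescaped pattern `.*.(.*)$` with the escaped `.*\.(.*)$` using Python's `re` module as a stand-in (an assumption: BigQuery uses RE2, which should behave the same for these simple patterns); the sample paths are made up.

```python
import re

UNESCAPED = re.compile(r'.*.(.*)$')   # '.' matches any character
ESCAPED = re.compile(r'.*\.(.*)$')    # '\.' matches a literal dot


def extract(pattern, path):
    """Return the first capture group, or None when nothing matches."""
    match = pattern.search(path)
    return match.group(1) if match else None


for path in ("src/main.py", "archive.tar.gz", "Makefile"):
    print(path, repr(extract(UNESCAPED, path)), repr(extract(ESCAPED, path)))
```

With the unescaped dot, the greedy prefix swallows the whole path and the capture group is always empty, so every file would land in a single bucket. With the escaped dot, the group holds the text after the last dot, and paths without a dot do not match at all.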

How Recreation Management Software Can Streamline Your Operations.pptx
 
Lecture 1 Introduction to games development
Lecture 1 Introduction to games developmentLecture 1 Introduction to games development
Lecture 1 Introduction to games development
 

(Almost) Serverless Analytics System with BigQuery & AppEngine

  • 1. Gabriel PREDA @eRadical (Almost) Serverless Analytics System with BigQuery & AppEngine
  • 2. Agenda Going Serverless with AppEngine & Tasks Pub/Sub, DataStore BigQuery Load Batch Streaming Inserts Query UDF Export ...some BigQueries...
  • 3. Aeons Some years ago... ~500,000 - 2,000,000 events / day (on average)
  • 4. Some time ago... ~2,000,000 - 22,000,000 events / day Dec 2014: 57,430,000 events / day 1 day to recompute » 12 hours
  • 5. NOW() 22,000,000 - 70,000,000 events / day AVG » 40,000,000 events / day Processing ~30GB-70GB / day Recompute 1 day » 10-20 minutes
  • 7. other... (almost) serverless products Cloud Functions (alpha - Node.JS) Cloud DataFlow (Java, Python - beta)
  • 9. BigQuery - data types
      ● STRING - UTF-8 (2 bytes + encoded string size)
      ● BYTES - base64 encoded (except in Avro)
      ● INTEGER - 64-bit signed (8 bytes)
      ● FLOAT (8 bytes)
      ● BOOLEAN - true/false, 1/0 only in CSV (1 byte)
      ● TIMESTAMP ex: "2014-08-19 12:41:35.220 UTC" (8 bytes)
      ● DATE, TIME, DATETIME - limited support in Legacy SQL
      ● RECORD - a collection of fields (size of fields)
      https://cloud.google.com/bigquery/data-types
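The per-type sizes on the data types slide can be turned into a rough row-size estimate. The sketch below is only an illustration: the helper name `estimateRowBytes`, the sample schema, and the sample row are hypothetical, and only the fixed sizes and the STRING rule (2 bytes + encoded size) come from the slide.

```javascript
// Fixed sizes in bytes, as listed on the data types slide.
const FIXED_SIZES = { INTEGER: 8, FLOAT: 8, BOOLEAN: 1, TIMESTAMP: 8 };

// Hypothetical helper: sum the per-field sizes of one row.
function estimateRowBytes(schema, row) {
  let total = 0;
  for (const { name, type } of schema) {
    if (type === 'STRING') {
      // 2 bytes of overhead plus the UTF-8 encoded length of the value.
      total += 2 + Buffer.byteLength(String(row[name]), 'utf8');
    } else if (type in FIXED_SIZES) {
      total += FIXED_SIZES[type];
    } else {
      // BYTES, RECORD, DATE, etc. are left out of this sketch.
      throw new Error(`Unsupported type in this sketch: ${type}`);
    }
  }
  return total;
}

// Hypothetical schema and row, for illustration only.
const schema = [
  { name: 'user_id', type: 'INTEGER' },
  { name: 'event', type: 'STRING' },
  { name: 'ts', type: 'TIMESTAMP' },
];
const sampleRow = { user_id: 42, event: 'click', ts: '2014-08-19 12:41:35.220 UTC' };

const rowBytes = estimateRowBytes(schema, sampleRow);
console.log(rowBytes); // 8 + (2 + 5) + 8 = 23
```

Multiplying such an estimate by the expected number of rows gives a first guess at how many bytes a query over those columns would process.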
  • 10. BigQuery -> loadData()
      Formats: CSV, JSON (newline delimited), Avro, Parquet (experimental)
      Tools: Web UI, bq, API
      Source: local files, Cloud Storage [demo], Cloud Datastore (backup files), POST requests, SQL DML*, Google Sheets
      - Federated Data Sources
      - Streaming Inserts
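"JSON (newline delimited)" means one complete JSON object per line, not a single JSON array. A minimal sketch of producing such a payload before handing it to a load job; the function name `toNdjson` and the sample events are hypothetical.

```javascript
// Serialize an array of row objects as newline-delimited JSON.
// JSON.stringify escapes embedded newlines, so every record stays on one line.
function toNdjson(rows) {
  return rows.map((row) => JSON.stringify(row) + '\n').join('');
}

// Hypothetical events, for illustration only.
const events = [
  { user_id: 1, event: 'signup' },
  { user_id: 2, event: 'login' },
];

const ndjson = toNdjson(events);
process.stdout.write(ndjson);
// {"user_id":1,"event":"signup"}
// {"user_id":2,"event":"login"}
```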
  • 13. BigQuery -> SELECT … FROM surprise…
      query:
        SELECT { * | field_path.* | expression } [ [ AS ] alias ] [ , ... ]
        [ FROM from_body
          [ WHERE bool_expression ]
          [ OMIT RECORD IF bool_expression ]
          [ GROUP [ EACH ] BY [ ROLLUP ] { field_name_or_alias } [ , ... ] ]
          [ HAVING bool_expression ]
          [ ORDER BY field_name_or_alias [ { DESC | ASC } ] [, ... ] ]
          [ LIMIT n ]
        ];
      from_body:
        from_item [, ...] |  # Warning: Comma means UNION ALL here
        from_item [ join_type ] JOIN [ EACH ] from_item [ ON join_predicate ] |
        (FLATTEN({ table_name | (query) }, field_name_or_alias)) |
        table_wildcard_function
      from_item:
        { table_name | (query) } [ [ AS ] alias ]
      join_type:
        { INNER | [ FULL ] [ OUTER ] | RIGHT [ OUTER ] | LEFT [ OUTER ] | CROSS }
  • 14. BigQuery -> SELECT … FROM surprise…
      Date-Partitioned Tables [demo]
      Table Decorators - See the past w/ @
      Table Wildcard Functions - TABLE_DATE_RANGE() & TABLE_QUERY()
      Interesting functions
      - DateTime » UTC_USEC_TO_DAY/HOUR/MONTH/WEEK/YEAR() » Shifts a UNIX timestamp in microseconds to the beginning of the period it occurs in.
      - JSON_EXTRACT[_SCALAR]()
      - URL functions » HOST(), DOMAIN(), TLD()
      - REGEXP_MATCH(), REGEXP_EXTRACT()
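The "shift to the beginning of the period" behaviour of UTC_USEC_TO_DAY() and UTC_USEC_TO_HOUR() can be mimicked locally with plain modulo arithmetic. This is a sketch of the idea, not BigQuery's implementation: `truncateUsec` is a hypothetical helper, it assumes non-negative timestamps, and it only covers fixed-length periods (MONTH, WEEK and YEAR need calendar logic).

```javascript
const USEC_PER_HOUR = 3600 * 1e6;
const USEC_PER_DAY = 24 * USEC_PER_HOUR;

// Hypothetical helper: drop everything past the start of the enclosing period.
// Assumes usec >= 0; current-era microsecond timestamps fit safely in a double.
function truncateUsec(usec, periodUsec) {
  return usec - (usec % periodUsec);
}

// The TIMESTAMP example from the data types slide, in microseconds since the epoch.
const usec = Date.UTC(2014, 7, 19, 12, 41, 35, 220) * 1000;

const dayStart = truncateUsec(usec, USEC_PER_DAY);
const hourStart = truncateUsec(usec, USEC_PER_HOUR);

console.log(new Date(dayStart / 1000).toISOString());  // 2014-08-19T00:00:00.000Z
console.log(new Date(hourStart / 1000).toISOString()); // 2014-08-19T12:00:00.000Z
```

Grouping by the truncated value is what makes these functions handy for per-day or per-hour event counts.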
  • 15. BigQuery -> User Defined Functions
      bigquery.defineFunction(
        'expandAssetLibrary',                        // Name of the function exported to SQL
        ['user_id', 'video_id', 'stage_settings'],   // Names of input columns
        [ {'name': 'user_id', 'type': 'integer'},    // Output schema
          {'name': 'video_id', 'type': 'string'},
          {'name': 'asset', 'type': 'string'} ],
        expandAssetLibrary                           // Reference to JavaScript UDF
      );
      function expandAssetLibrary(row, emit) {
        …………………………
        emit({ user_id: row.user_id, video_id: row.video_id, asset: ss.url.replace('http://', '') });
      }
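A UDF with the `(row, emit)` signature can be exercised outside BigQuery by passing it a stand-in `emit` callback. The sketch below is a guess at the general shape only: the slide elides the real body, so the layout assumed for `stage_settings` (a JSON string with an `assets` array of `{url}` objects), the name `expandAssetLibrarySketch`, and the `runUdf` harness are all hypothetical.

```javascript
// ASSUMPTION: stage_settings looks like {"assets":[{"url":"http://..."}]}.
// The real parsing logic is not shown on the slide.
function expandAssetLibrarySketch(row, emit) {
  var settings;
  try {
    settings = JSON.parse(row.stage_settings);
  } catch (e) {
    return; // skip rows whose settings are not valid JSON
  }
  var assets = (settings && Array.isArray(settings.assets)) ? settings.assets : [];
  for (var i = 0; i < assets.length; i++) {
    var ss = assets[i];
    if (!ss || typeof ss.url !== 'string') continue;
    // One output row per asset, as in the emit() call on the slide.
    emit({ user_id: row.user_id, video_id: row.video_id, asset: ss.url.replace('http://', '') });
  }
}

// Local harness (hypothetical): collect emitted rows instead of running in BigQuery.
function runUdf(udf, rows) {
  const out = [];
  for (const row of rows) udf(row, (r) => out.push(r));
  return out;
}

const emitted = runUdf(expandAssetLibrarySketch, [
  { user_id: 7, video_id: 'v1',
    stage_settings: '{"assets":[{"url":"http://cdn.example.com/a.png"},{"url":"http://cdn.example.com/b.mp3"}]}' },
  { user_id: 8, video_id: 'v2', stage_settings: 'not json' },
]);
console.log(emitted);
```

Because one input row may call `emit` zero or many times, this style of UDF behaves like a table-valued function: here the first row expands into two output rows and the second produces none.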
  • 16. BigQuery -> DML
      Standard SQL only
      Maximum UPDATE/DELETE statements per day per table: 48
      Maximum UPDATE/DELETE statements per day per project: 500
      Maximum INSERT statements per day per table: 1,000
      Maximum INSERT statements per day per project: 10,000
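The daily DML limits translate into a minimum average spacing between statements, which is a quick way to see why DML does not suit per-event writes. The arithmetic below uses only the numbers from the slide; the constant and function names are hypothetical.

```javascript
const MINUTES_PER_DAY = 24 * 60;

// Daily limits as quoted on the DML slide.
const DAILY_LIMITS = {
  updateDeletePerTable: 48,
  updateDeletePerProject: 500,
  insertPerTable: 1000,
  insertPerProject: 10000,
};

// Average spacing (in minutes) if statements are spread evenly over a day.
function minMinutesBetween(limitPerDay) {
  return MINUTES_PER_DAY / limitPerDay;
}

const updateSpacing = minMinutesBetween(DAILY_LIMITS.updateDeletePerTable);
const insertSpacing = minMinutesBetween(DAILY_LIMITS.insertPerTable);

// How many tables can use their full UPDATE/DELETE quota before the project limit is hit.
const tablesAtFullRate = Math.floor(
  DAILY_LIMITS.updateDeletePerProject / DAILY_LIMITS.updateDeletePerTable
);

console.log(updateSpacing);    // 30
console.log(insertSpacing);    // 1.44
console.log(tablesAtFullRate); // 10
```

At one UPDATE/DELETE per table every 30 minutes on average, changes have to be batched into a few large statements.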
  • 17. BigQuery -> export() To: Google Cloud Storage Format: CSV, JSON [.gz], Avro …1G files
  • 18. BigQuery -> some (Big)Queries
      SELECT year, count(1)
      FROM [bigquery-public-data:samples.natality]
      WHERE father_age < 18
      GROUP BY year
      ORDER BY year

      SELECT year, count(1)
      FROM [bigquery-public-data:samples.natality]
      WHERE mother_age < 18
      GROUP BY year
      ORDER BY year

      SELECT table_id, row_count, CEIL(size_bytes/POW(1024, 3)) AS gb
      FROM [bigquery-public-data:ghcn_m.__TABLES__]
      ORDER BY gb DESC
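The `CEIL(size_bytes/POW(1024, 3))` expression in the `__TABLES__` query rounds table sizes up to whole binary gigabytes. A small sketch of the same arithmetic, with a hypothetical helper name and made-up sizes:

```javascript
const BYTES_PER_GB = Math.pow(1024, 3);

// Same arithmetic as CEIL(size_bytes/POW(1024, 3)) in the query above.
function sizeInGbRoundedUp(sizeBytes) {
  return Math.ceil(sizeBytes / BYTES_PER_GB);
}

console.log(sizeInGbRoundedUp(1.5 * BYTES_PER_GB)); // 2
console.log(sizeInGbRoundedUp(BYTES_PER_GB));       // 1
console.log(sizeInGbRoundedUp(1));                  // 1
console.log(sizeInGbRoundedUp(0));                  // 0
```

Because of the rounding, any non-empty table smaller than 1 GB is reported as 1, so the `gb` column only ranks the larger tables meaningfully.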
  • 19. BigQuery -> some (Big)Queries
      SELECT REGEXP_EXTRACT(path, r'.*\.(.*)$') AS file_extension, COUNT(1) AS k
      FROM [bigquery-public-data:github_repos.files]
      GROUP BY file_extension
      ORDER BY k DESC
      LIMIT 20

      SELECT table_id, row_count, CEIL(size_bytes/POW(1024, 3)) AS gb
      FROM [bigquery-public-data:github_repos.__TABLES__]
      ORDER BY gb DESC
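The file-extension query can be tried on a handful of paths before running it over the whole `github_repos.files` table. The sketch below assumes the pattern is meant to match a literal dot (`\.`); the helper names and sample paths are hypothetical, and JavaScript's regex engine stands in for BigQuery's.

```javascript
// Greedy ".*" runs to the LAST dot, so the capture group holds the final extension.
const EXTENSION_RE = /.*\.(.*)$/;

// Mirrors REGEXP_EXTRACT: returns the captured group, or null when nothing matches.
function fileExtension(path) {
  const m = EXTENSION_RE.exec(path);
  return m ? m[1] : null;
}

// Mirrors GROUP BY file_extension ORDER BY k DESC LIMIT n on an in-memory list.
function topExtensions(paths, limit) {
  const counts = new Map();
  for (const p of paths) {
    const ext = fileExtension(p);
    counts.set(ext, (counts.get(ext) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]).slice(0, limit);
}

console.log(fileExtension('src/main/app.js'));  // js
console.log(fileExtension('archive.tar.gz'));   // gz
console.log(fileExtension('README'));           // null
console.log(fileExtension('dir.v2/Makefile'));  // v2/Makefile

const top = topExtensions(['a.js', 'b.js', 'README', 'c.py'], 2);
console.log(top[0]); // [ 'js', 2 ]
```

The last `fileExtension` call shows a quirk of the pattern: a dot in a directory name is treated as the start of the extension when the file itself has none. With an unescaped dot (`.*.(.*)$`) the capture group is always empty, which is why the escape matters.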