SlideShare a Scribd company logo
1 of 42
Download to read offline
HANDS ON WITH BIGQUERY JAVASCRIPT UDFS
THOMAS PARK
SOFTWARE ENGINEER - GOOGLE
Hands-on with BigQuery JavaScript
User-Defined Functions
Thomas Park
Software Engineer - Google
Felipe Hoffa
@felipehoffa
Developer Advocate - Google
Agenda
Background
Example: Cross-row intervals
Under the hood
Example: Codebreaking
I.
II.
III.
IV.
Agenda
Background
Example: Cross-row intervals
Under the hood
Example: Codebreaking
I.
II.
III.
IV.
What is BigQuery?
BigQuery: Big Data Analytics in the Cloud
Unrivaled
Performance and Scale
● Scan multiple TB’s in seconds
● Interactive query
performance
● No limits on amount of data
Ease of Use
and Adoption
● No administration /
provisioning
● Convenience of SQL
● Open interfaces
(REST, WebUI, ODBC)
● First 1 TB of data processed
per month is free
Advanced “Big Data”
Storage
● Familiar database structure
● Easy data management and
ACL’s
● Fast, atomic imports
Google confidential │ Do not distribute
How many pageviews does Wikipedia
have in a month?
SELECT COUNT(*)FROM
[fh-bigquery:wikipedia.wikipedia_views_201308]
https://bigquery.cloud.google.com/table/fh-bigquery:wikipedia.pagecounts_20140602_18
Google confidential │ Do not distribute
$500 in Cloud Platform credit
to launch your idea!
Build. Store. Analyze.
On the same infrastructure
that powers Google
Start building
Click ‘Apply Now’ and complete the
application with promo code: bigdata-spain
Starter Pack
Offer Description
1
2
3
Go to cloud.google.com/developers/starterpack
Agenda
Background
Example: Cross-row intervals
Under the hood
Example: Codebreaking
I.
II.
III.
IV.
Images by Connie Zhou
Scenario:
Door access records from a
Very Well-Secured Lab
where users must badge in
to enter or leave
Image by Tod Kurt
Images by Connie Zhou
Example:
Time-series analysis
from discrete user action data
Image by Tod Kurt
user_id timestamp
Beep!!
9h: arrive @ lab thomas 2014.07.15 09:00
Beep!!
10h: leave to pick up
prototype
thomas 2014.07.15 10:00
Beep!!
10h15: return with
prototype
thomas 2014.07.15 10:15
Beep!!
12h: out for lunch thomas 2014.07.15 12:00
How can we find out how much time
each user spent in the lab?
...where each scan of the user’s access card is
represented as a discrete row?
rownum user_id timestamp
1 thomas 2014.07.15 09:00
2 thomas 2014.07.15 10:00
3 thomas 2014.07.15 10:15
4 thomas 2014.07.15 12:00
60 minutes
105 minutes
Our analysis with data
in this format via SQL
is horrid and painful
A BigQuery + JS
friendly format:
data for each user in
separate rows
user_id timestamps
thomas [ 09:00, 10:00, 10:15, 12:00, ... ]
hoffa [ 08:10, 11:30, 12:00, 12:15, ... ]
SELECT user_id, NEST(timestamp) AS timestamps
FROM T
GROUP BY user_id;
Producing this format
is trivial in BigQuery...
// This function will be called once for each user,
// and receive an array of timestamps.
function(record, emit) {
var total_time = 0;
// Order of records built by NEST are not guaranteed!
// Sort to guarantee ascending timestamps.
var ts = record.ts.sort(
function (a, b) {return a > b;});
// Loop over timestamp pairs, calculate interval.
for (var i = 0; i < ts.length - 1; i += 2) {
total_time += (ts[i+1] - ts[i]);
}
// Emit total time for this user.
emit({user: record.user_id,
total_time: total_time});
}
JS: Total time for each
user
SELECT * FROM
js(
// Input table or query.
[secret-lab:door_scans.201411]
// Input columns.
user_id, timestamps,
// Output schema.
"[{name: 'user_id', type:'string'},
{name: 'tot_time', type:'integer'}]",
// The function.
"function(r, emit) {
...
emit(...);
}")
SELECT * FROM
js(
// Input table or query.
[secret-lab:door_scans.201411]
// Input columns.
user_id, timestamps,
// Output schema.
"[{name: 'user_id', type:'string'},
{name: 'tot_time', type:'integer'}]",
// The function.
"function(r, emit) {
...
emit(...);
}")
The JS
function
SELECT * FROM
js(
// Input table or query.
[secret-lab:door_scans.201411]
// Input columns.
user_id, timestamps,
// Output schema.
"[{name: 'user_id', type:'string'},
{name: 'tot_time', type:'integer'}]",
// The function.
"function(r, emit) {
...
emit(...);
}")
Input schema
(column
names only!)
SELECT * FROM
js(
// Input table or query.
[secret-lab:door_scans.201411]
// Input columns.
user_id, timestamps,
// Output schema.
"[{name: 'user_id', type:'string'},
{name: 'tot_time', type:'integer'}]",
// The function.
"function(r, emit) {
...
emit(...);
}")
Output schema
(full
declaration)
SELECT * FROM
js(
// Input table or query.
[secret-lab:door_scans.201411]
// Input columns.
user_id, timestamps,
// Output schema.
"[{name: 'user_id', type:'string'},
{name: 'tot_time', type:'integer'}]",
// The function.
"function(r, emit) {
...
emit(...);
}")
Input table or
subquery
SELECT * FROM
js(
// Input table or query.
[secret-lab:door_scans.201411]
// Input columns.
user_id, timestamps,
// Output schema.
"[{name: 'user_id', type:'string'},
{name: 'tot_time', type:'integer'}]",
// The function.
"function(r, emit) {
...
emit(...);
}")
Agenda
Background
Example: Cross-row intervals
Under the hood
Example: Codebreaking
I.
II.
III.
IV.
How BigQuery works
Get data from lower levels,
filter / join / transform,
send rows up
Tree Structured Query Dispatch and Aggregation
Distributed Storage
SELECT title, requests
Leaf Leaf Leaf Leaf
SUM(requests)
GROUP BY title
WHERE REGEX_MATCH(title, 'pat.*rn')
Mixer 1 Mixer 1 SUM(requests)
GROUP BY title
Mixer 0
LIMIT 10
ORDER BY c DESC
SUM(requests)
GROUP BY title
Data for each row is calculated and
streamed through a “Row Iterator”
Subquery0 Subquery1
JOIN
Row Iterator 0 Row Iterator 1
Row Iterator 2
Can insert JavaScript Functions
wherever we have a Row Iterator
Subquery0 Subquery1
JOIN
Row Iterator 0 Row Iterator 1
Row Iterator 2
UDF1
UDF0
Join order item info with web hits info
SELECT
item FROM
orders
SELECT query
string FROM hits
JOIN
Row Iterator 0 Row Iterator 1
Row Iterator 2
UDF1
UDF0
http://www.store.com/?q=7%2e1+Speakers
SELECT
item FROM
orders
SELECT query
string FROM hits
JOIN
Row Iterator 0 Row Iterator 1
Row Iterator 2
UDF1
UDF0
http://www.store.com/?q=7%2e1+Speakers
Extract and decode query term => “7.1 Speakers”
SELECT
item FROM
orders
SELECT query
string FROM hits
JOIN
Row Iterator 0 Row Iterator 1
Row Iterator 2
UDF1
UDF0
UDF execution
Subquery0 Subquery1
JOIN
UDF1
Process boundary
UDF0UDF0
User Code
Agenda
Background
Example: Cross-row intervals
Under the hood
Example: Codebreaking
I.
II.
III.
IV.
Demos:
Ñ
Image: El Hormiguero (Flickr CC)
http://jsfiddle.net/fhoffa/y4pt9s23/
Image: TheVanCats (Flickr CC)
Questions?
News: reddit.com/r/bigquery
Ask: stackoverflow.com
Share: bigqueri.es
Thomas Park
Felipe Hoffa @felipehoffa +FelipeHoffa
Rate us?
http://goo.gl/k3bzdw
Backup slides / screenshots
17TH ~ 18th NOV 2014
MADRID (SPAIN)

More Related Content

What's hot

BigQuery implementation
BigQuery implementationBigQuery implementation
BigQuery implementationSimon Su
 
Workshop 20140522 BigQuery Implementation
Workshop 20140522   BigQuery ImplementationWorkshop 20140522   BigQuery Implementation
Workshop 20140522 BigQuery ImplementationSimon Su
 
TDC2016SP - Trilha BigData
TDC2016SP - Trilha BigDataTDC2016SP - Trilha BigData
TDC2016SP - Trilha BigDatatdc-globalcode
 
Getting started with BigQuery
Getting started with BigQueryGetting started with BigQuery
Getting started with BigQueryPradeep Bhadani
 
Google BigQuery Best Practices
Google BigQuery Best PracticesGoogle BigQuery Best Practices
Google BigQuery Best PracticesMatillion
 
Google BigQuery 101 & What’s New
Google BigQuery 101 & What’s NewGoogle BigQuery 101 & What’s New
Google BigQuery 101 & What’s NewDoiT International
 
BigQuery for the Big Data win
BigQuery for the Big Data winBigQuery for the Big Data win
BigQuery for the Big Data winKen Taylor
 
Big Query - Women Techmarkers (Ukraine - March 2014)
Big Query - Women Techmarkers (Ukraine - March 2014)Big Query - Women Techmarkers (Ukraine - March 2014)
Big Query - Women Techmarkers (Ukraine - March 2014)Ido Green
 
Introduction to Google Cloud Platform
Introduction to Google Cloud PlatformIntroduction to Google Cloud Platform
Introduction to Google Cloud PlatformPradeep Bhadani
 
You might be paying too much for BigQuery
You might be paying too much for BigQueryYou might be paying too much for BigQuery
You might be paying too much for BigQueryRyuji Tamagawa
 
Faites évoluer votre accès aux données avec MongoDB Stitch
Faites évoluer votre accès aux données avec MongoDB StitchFaites évoluer votre accès aux données avec MongoDB Stitch
Faites évoluer votre accès aux données avec MongoDB StitchMongoDB
 
Google App Engine 7 9-14
Google App Engine 7 9-14Google App Engine 7 9-14
Google App Engine 7 9-14Tony Frame
 
Eagle6 Enterprise Situational Awareness
Eagle6 Enterprise Situational AwarenessEagle6 Enterprise Situational Awareness
Eagle6 Enterprise Situational AwarenessMongoDB
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaObjectRocket
 
Google Developer Group - Cloud Singapore BigQuery Webinar
Google Developer Group - Cloud Singapore BigQuery WebinarGoogle Developer Group - Cloud Singapore BigQuery Webinar
Google Developer Group - Cloud Singapore BigQuery WebinarRasel Rana
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudEduardo Silva Pereira
 
Augmenting Mongo DB with treasure data
Augmenting Mongo DB with treasure dataAugmenting Mongo DB with treasure data
Augmenting Mongo DB with treasure dataTreasure Data, Inc.
 

What's hot (20)

BigQuery implementation
BigQuery implementationBigQuery implementation
BigQuery implementation
 
Workshop 20140522 BigQuery Implementation
Workshop 20140522   BigQuery ImplementationWorkshop 20140522   BigQuery Implementation
Workshop 20140522 BigQuery Implementation
 
TDC2016SP - Trilha BigData
TDC2016SP - Trilha BigDataTDC2016SP - Trilha BigData
TDC2016SP - Trilha BigData
 
Getting started with BigQuery
Getting started with BigQueryGetting started with BigQuery
Getting started with BigQuery
 
Google BigQuery Best Practices
Google BigQuery Best PracticesGoogle BigQuery Best Practices
Google BigQuery Best Practices
 
Google BigQuery 101 & What’s New
Google BigQuery 101 & What’s NewGoogle BigQuery 101 & What’s New
Google BigQuery 101 & What’s New
 
BigQuery for the Big Data win
BigQuery for the Big Data winBigQuery for the Big Data win
BigQuery for the Big Data win
 
Big Query - Women Techmarkers (Ukraine - March 2014)
Big Query - Women Techmarkers (Ukraine - March 2014)Big Query - Women Techmarkers (Ukraine - March 2014)
Big Query - Women Techmarkers (Ukraine - March 2014)
 
Introduction to Google Cloud Platform
Introduction to Google Cloud PlatformIntroduction to Google Cloud Platform
Introduction to Google Cloud Platform
 
You might be paying too much for BigQuery
You might be paying too much for BigQueryYou might be paying too much for BigQuery
You might be paying too much for BigQuery
 
Faites évoluer votre accès aux données avec MongoDB Stitch
Faites évoluer votre accès aux données avec MongoDB StitchFaites évoluer votre accès aux données avec MongoDB Stitch
Faites évoluer votre accès aux données avec MongoDB Stitch
 
Google App Engine 7 9-14
Google App Engine 7 9-14Google App Engine 7 9-14
Google App Engine 7 9-14
 
Eagle6 Enterprise Situational Awareness
Eagle6 Enterprise Situational AwarenessEagle6 Enterprise Situational Awareness
Eagle6 Enterprise Situational Awareness
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and Kibana
 
Google Developer Group - Cloud Singapore BigQuery Webinar
Google Developer Group - Cloud Singapore BigQuery WebinarGoogle Developer Group - Cloud Singapore BigQuery Webinar
Google Developer Group - Cloud Singapore BigQuery Webinar
 
Unifying Events and Logs into the Cloud
Unifying Events and Logs into the CloudUnifying Events and Logs into the Cloud
Unifying Events and Logs into the Cloud
 
Google Cloud Spanner Preview
Google Cloud Spanner PreviewGoogle Cloud Spanner Preview
Google Cloud Spanner Preview
 
Augmenting Mongo DB with treasure data
Augmenting Mongo DB with treasure dataAugmenting Mongo DB with treasure data
Augmenting Mongo DB with treasure data
 
MongoDB + Spring
MongoDB + SpringMongoDB + Spring
MongoDB + Spring
 
Siddhi - cloud-native stream processor
Siddhi - cloud-native stream processorSiddhi - cloud-native stream processor
Siddhi - cloud-native stream processor
 

Viewers also liked

Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...Big Data Spain
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014Big Data Spain
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data Spain
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014Big Data Spain
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...Big Data Spain
 
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Big Data Spain
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceBig Data Spain
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...Big Data Spain
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012Big Data Spain
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...Big Data Spain
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...Big Data Spain
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Big Data Spain
 
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Big Data Spain
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Big Data Spain
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Big Data Spain
 
A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...Big Data Spain
 
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Big Data Spain
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data Spain
 
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...Big Data Spain
 
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...Big Data Spain
 

Viewers also liked (20)

Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
 
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conference
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0
 
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
 
A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...
 
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
 
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
 
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
 

Similar to BigQuery JavaScript User-Defined Functions by THOMAS PARK and FELIPE HOFFA at Big Data Spain 2014

Hadoop cluster performance profiler
Hadoop cluster performance profilerHadoop cluster performance profiler
Hadoop cluster performance profilerIhor Bobak
 
Intro To JavaScript Unit Testing - Ran Mizrahi
Intro To JavaScript Unit Testing - Ran MizrahiIntro To JavaScript Unit Testing - Ran Mizrahi
Intro To JavaScript Unit Testing - Ran MizrahiRan Mizrahi
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1MariaDB plc
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1MariaDB plc
 
Why you should be using structured logs
Why you should be using structured logsWhy you should be using structured logs
Why you should be using structured logsStefan Krawczyk
 
Re-Design with Elixir/OTP
Re-Design with Elixir/OTPRe-Design with Elixir/OTP
Re-Design with Elixir/OTPMustafa TURAN
 
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1MariaDB plc
 
#NewMeetup Performance
#NewMeetup Performance#NewMeetup Performance
#NewMeetup PerformanceJustin Cataldo
 
Migrating from Struts 1 to Struts 2
Migrating from Struts 1 to Struts 2Migrating from Struts 1 to Struts 2
Migrating from Struts 1 to Struts 2Matt Raible
 
Protractor framework – how to make stable e2e tests for Angular applications
Protractor framework – how to make stable e2e tests for Angular applicationsProtractor framework – how to make stable e2e tests for Angular applications
Protractor framework – how to make stable e2e tests for Angular applicationsLudmila Nesvitiy
 
Kapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EngineKapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EnginePrashant Vats
 
Presto anatomy
Presto anatomyPresto anatomy
Presto anatomyDongmin Yu
 
TypeScript for Java Developers
TypeScript for Java DevelopersTypeScript for Java Developers
TypeScript for Java DevelopersYakov Fain
 
SwampDragon presentation: The Copenhagen Django Meetup Group
SwampDragon presentation: The Copenhagen Django Meetup GroupSwampDragon presentation: The Copenhagen Django Meetup Group
SwampDragon presentation: The Copenhagen Django Meetup GroupErnest Jumbe
 
Charla EHU Noviembre 2014 - Desarrollo Web
Charla EHU Noviembre 2014 - Desarrollo WebCharla EHU Noviembre 2014 - Desarrollo Web
Charla EHU Noviembre 2014 - Desarrollo WebMikel Torres Ugarte
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak PROIDEA
 
A miało być tak... bez wycieków
A miało być tak... bez wyciekówA miało być tak... bez wycieków
A miało być tak... bez wyciekówKonrad Kokosa
 
Artem Storozhuk "Building SQL firewall: insights from developers"
Artem Storozhuk "Building SQL firewall: insights from developers"Artem Storozhuk "Building SQL firewall: insights from developers"
Artem Storozhuk "Building SQL firewall: insights from developers"Fwdays
 

Similar to BigQuery JavaScript User-Defined Functions by THOMAS PARK and FELIPE HOFFA at Big Data Spain 2014 (20)

Hadoop cluster performance profiler
Hadoop cluster performance profilerHadoop cluster performance profiler
Hadoop cluster performance profiler
 
Intro To JavaScript Unit Testing - Ran Mizrahi
Intro To JavaScript Unit Testing - Ran MizrahiIntro To JavaScript Unit Testing - Ran Mizrahi
Intro To JavaScript Unit Testing - Ran Mizrahi
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
 
Why you should be using structured logs
Why you should be using structured logsWhy you should be using structured logs
Why you should be using structured logs
 
Re-Design with Elixir/OTP
Re-Design with Elixir/OTPRe-Design with Elixir/OTP
Re-Design with Elixir/OTP
 
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
Die Neuheiten in MariaDB 10.2 und MaxScale 2.1
 
#NewMeetup Performance
#NewMeetup Performance#NewMeetup Performance
#NewMeetup Performance
 
Migrating from Struts 1 to Struts 2
Migrating from Struts 1 to Struts 2Migrating from Struts 1 to Struts 2
Migrating from Struts 1 to Struts 2
 
Protractor framework – how to make stable e2e tests for Angular applications
Protractor framework – how to make stable e2e tests for Angular applicationsProtractor framework – how to make stable e2e tests for Angular applications
Protractor framework – how to make stable e2e tests for Angular applications
 
Kapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing EngineKapacitor - Real Time Data Processing Engine
Kapacitor - Real Time Data Processing Engine
 
Google cloud Dataflow & Apache Flink
Google cloud Dataflow & Apache FlinkGoogle cloud Dataflow & Apache Flink
Google cloud Dataflow & Apache Flink
 
Presto anatomy
Presto anatomyPresto anatomy
Presto anatomy
 
TypeScript for Java Developers
TypeScript for Java DevelopersTypeScript for Java Developers
TypeScript for Java Developers
 
Slickdemo
SlickdemoSlickdemo
Slickdemo
 
SwampDragon presentation: The Copenhagen Django Meetup Group
SwampDragon presentation: The Copenhagen Django Meetup GroupSwampDragon presentation: The Copenhagen Django Meetup Group
SwampDragon presentation: The Copenhagen Django Meetup Group
 
Charla EHU Noviembre 2014 - Desarrollo Web
Charla EHU Noviembre 2014 - Desarrollo WebCharla EHU Noviembre 2014 - Desarrollo Web
Charla EHU Noviembre 2014 - Desarrollo Web
 
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak   CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
CONFidence 2015: DTrace + OSX = Fun - Andrzej Dyjak
 
A miało być tak... bez wycieków
A miało być tak... bez wyciekówA miało być tak... bez wycieków
A miało być tak... bez wycieków
 
Artem Storozhuk "Building SQL firewall: insights from developers"
Artem Storozhuk "Building SQL firewall: insights from developers"Artem Storozhuk "Building SQL firewall: insights from developers"
Artem Storozhuk "Building SQL firewall: insights from developers"
 

More from Big Data Spain

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data Spain
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Big Data Spain
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017Big Data Spain
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Big Data Spain
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Big Data Spain
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Big Data Spain
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Big Data Spain
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Big Data Spain
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...Big Data Spain
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Big Data Spain
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Big Data Spain
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...Big Data Spain
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Big Data Spain
 
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Big Data Spain
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Big Data Spain
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Big Data Spain
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...Big Data Spain
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Big Data Spain
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...Big Data Spain
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Big Data Spain
 

More from Big Data Spain (20)

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
 
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
 

Recently uploaded

costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 

Recently uploaded (20)

costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 

BigQuery JavaScript User-Defined Functions by THOMAS PARK and FELIPE HOFFA at Big Data Spain 2014

  • 1. HANDS ON WITH BIGQUERY JAVASCRIPT UDFS THOMAS PARK SOFTWARE ENGINEER - GOOGLE
  • 2. Hands-on with BigQuery JavaScript User-Defined Functions Thomas Park Software Engineer - Google Felipe Hoffa @felipehoffa Developer Advocate - Google
  • 3. Agenda Background Example: Cross-row intervals Under the hood Example: Codebreaking I. II. III. IV.
  • 4. Agenda Background Example: Cross-row intervals Under the hood Example: Codebreaking I. II. III. IV.
  • 6. BigQuery: Big Data Analytics in the Cloud Unrivaled Performance and Scale ● Scan multiple TB’s in seconds ● Interactive query performance ● No limits on amount of data Ease of Use and Adoption ● No administration / provisioning ● Convenience of SQL ● Open interfaces (REST, WebUI, ODBC) ● First 1 TB of data processed per month is free Advanced “Big Data” Storage ● Familiar database structure ● Easy data management and ACL’s ● Fast, atomic imports
  • 7. Google confidential │ Do not distribute How many pageviews does Wikipedia have in a month? SELECT COUNT(*)FROM [fh-bigquery:wikipedia.wikipedia_views_201308] https://bigquery.cloud.google.com/table/fh-bigquery:wikipedia.pagecounts_20140602_18
  • 8. Google confidential │ Do not distribute $500 in Cloud Platform credit to launch your idea! Build. Store. Analyze. On the same infrastructure that powers Google Start building Click ‘Apply Now’ and complete the application with promo code: bigdata-spain Starter Pack Offer Description 1 2 3 Go to cloud.google.com/developers/starterpack
  • 9. Agenda Background Example: Cross-row intervals Under the hood Example: Codebreaking I. II. III. IV.
  • 10. Images by Connie Zhou Scenario: Door access records from a Very Well-Secured Lab where users must badge in to enter or leave Image by Tod Kurt
  • 11. Images by Connie Zhou Example: Time-series analysis from discrete user action data Image by Tod Kurt
  • 12. user_id timestamp Beep!! 9h: arrive @ lab thomas 2014.07.15 09:00 Beep!! 10h: leave to pick up prototype thomas 2014.07.15 10:00 Beep!! 10h15: return with prototype thomas 2014.07.15 10:15 Beep!! 12h: out for lunch thomas 2014.07.15 12:00
  • 13. How can we find out how much time each user spent in the lab? ...where each scan of the user’s access card is represented as a discrete row?
  • 14. rownum user_id timestamp 1 thomas 2014.07.15 09:00 2 thomas 2014.07.15 10:00 3 thomas 2014.07.15 10:15 4 thomas 2014.07.15 12:00 60 minutes 105 minutes Our analysis with data in this format via SQL is horrid and painful A BigQuery + JS friendly format: data for each user in separate rows user_id timestamps thomas [ 09:00, 10:00, 10:15, 12:00, ... ] hoffa [ 08:10, 11:30, 12:00, 12:15, ... ] SELECT user_id, NEST(timestamp) AS timestamps FROM T GROUP BY user_id; Producing this format is trivial in BigQuery...
  • 15. // This function will be called once for each user, // and receive an array of timestamps. function(record, emit) { var total_time = 0; // Order of records built by NEST are not guaranteed! // Sort to guarantee ascending timestamps. var ts = record.ts.sort( function (a, b) {return a > b;}); // Loop over timestamp pairs, calculate interval. for (var i = 0; i < ts.length - 1; i += 2) { total_time += (ts[i+1] - ts[i]); } // Emit total time for this user. emit({user: record.user_id, total_time: total_time}); } JS: Total time for each user
  • 16. SELECT * FROM js( // Input table or query. [secret-lab:door_scans.201411] // Input columns. user_id, timestamps, // Output schema. "[{name: 'user_id', type:'string'}, {name: 'tot_time', type:'integer'}]", // The function. "function(r, emit) { ... emit(...); }")
  • 17. SELECT * FROM js( // Input table or query. [secret-lab:door_scans.201411] // Input columns. user_id, timestamps, // Output schema. "[{name: 'user_id', type:'string'}, {name: 'tot_time', type:'integer'}]", // The function. "function(r, emit) { ... emit(...); }") The JS function
  • 18. SELECT * FROM js( // Input table or query. [secret-lab:door_scans.201411] // Input columns. user_id, timestamps, // Output schema. "[{name: 'user_id', type:'string'}, {name: 'tot_time', type:'integer'}]", // The function. "function(r, emit) { ... emit(...); }") Input schema (column names only!)
  • 19. SELECT * FROM js( // Input table or query. [secret-lab:door_scans.201411] // Input columns. user_id, timestamps, // Output schema. "[{name: 'user_id', type:'string'}, {name: 'tot_time', type:'integer'}]", // The function. "function(r, emit) { ... emit(...); }") Output schema (full declaration)
  • 20. SELECT * FROM js( // Input table or query. [secret-lab:door_scans.201411] // Input columns. user_id, timestamps, // Output schema. "[{name: 'user_id', type:'string'}, {name: 'tot_time', type:'integer'}]", // The function. "function(r, emit) { ... emit(...); }") Input table or subquery
  • 21. SELECT * FROM js( // Input table or query. [secret-lab:door_scans.201411] // Input columns. user_id, timestamps, // Output schema. "[{name: 'user_id', type:'string'}, {name: 'tot_time', type:'integer'}]", // The function. "function(r, emit) { ... emit(...); }")
  • 22. Agenda Background Example: Cross-row intervals Under the hood Example: Codebreaking I. II. III. IV.
  • 23. How BigQuery works Get data from lower levels, filter / join / transform, send rows up Tree Structured Query Dispatch and Aggregation Distributed Storage SELECT title, requests Leaf Leaf Leaf Leaf SUM(requests) GROUP BY title WHERE REGEX_MATCH(title, 'pat.*rn') Mixer 1 Mixer 1 SUM(requests) GROUP BY title Mixer 0 LIMIT 10 ORDER BY c DESC SUM(requests) GROUP BY title
  • 24. Data for each row is calculated and streamed through a “Row Iterator” Subquery0 Subquery1 JOIN Row Iterator 0 Row Iterator 1 Row Iterator 2
  • 25. Can insert JavaScript Functions wherever we have a Row Iterator Subquery0 Subquery1 JOIN Row Iterator 0 Row Iterator 1 Row Iterator 2 UDF1 UDF0
  • 26. Join order item info with web hits info SELECT item FROM orders SELECT query string FROM hits JOIN Row Iterator 0 Row Iterator 1 Row Iterator 2 UDF1 UDF0
  • 27. http://www.store.com/?q=7%2e1+Speakers SELECT item FROM orders SELECT query string FROM hits JOIN Row Iterator 0 Row Iterator 1 Row Iterator 2 UDF1 UDF0
  • 28. http://www.store.com/?q=7%2e1+Speakers Extract and decode query term => “7.1 Speakers” SELECT item FROM orders SELECT query string FROM hits JOIN Row Iterator 0 Row Iterator 1 Row Iterator 2 UDF1 UDF0
  • 30. Agenda Background Example: Cross-row intervals Under the hood Example: Codebreaking I. II. III. IV.
  • 32. Image: El Hormiguero (Flickr CC)
  • 34. Questions? News: reddit.com/r/bigquery Ask: stackoverflow.com Share: bigqueri.es Thomas Park Felipe Hoffa @felipehoffa +FelipeHoffa Rate us? http://goo.gl/k3bzdw
  • 35. Backup slides / screenshots
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42. 17TH ~ 18th NOV 2014 MADRID (SPAIN)