SlideShare a Scribd company logo
Imhotep Workshop 
http://go.indeed.com/iws
engineering.indeed.com/talks
@IndeedEng Workshop: 
Interactive Analytics 
with Imhotep
Tom Bergman 
Product Manager
Is anybody there???
Does this thing work???
Are we winning???
Harder questions???
What is Imhotep? 
Imhotep is Indeed’s highly scalable 
open-source analytics platform.
Imhotep open source included: 
● Imhotep Daemons 
● Imhotep Query Language (IQL) 
● IQL Web Client 
● TSV/CSV Uploader
What does Imhotep do? 
● Easy upload & compression 
● Fast, Interactive queries
Imhotep Philosophy
Interactive 
● Quickly refine your questions
Time to the right question 
SOME TIME LATER… 
Oh, bummer. Wrong question. Let’s try again. 
Nope. Nope. YES! 
Next question? 
Hadoop and the Hadoop elephant logo are trademarks of the Apache Software Foundation.
Ask all the questions! 
Cool! Really? 
Wow... Awesome 
Oh… Ah! INSIGHT! …
Ground Truth 
● Data should not be down-sampled
Show me the data 
● Web-based to facilitate sharing
Cache Rules Everything Around Me 
● Instantaneous sharing
Easy Access 
● Easily queryable
Imhotep Data Structures 
● Dataset > 
● Document > 
● Field > 
● Term > 
● DB Table 
● Denormalized Row 
● Column 
● Value
Imhotep Query Language 
(IQL)
IQL - Imhotep Query Language 
Expressive SQL-like language for 
aggregate analytics.
IQL queries - requirements 
Dataset 
Date range
IQL queries - optional 
Dataset 
Date range 
Filters 
Group by 
Metrics
IQL - Dataset 
from searchresults 
‘2013-12-05’ 
‘2013-12-10’ 
where country=ie 
and jobagedays<1 
group by time(1d) 
select clicked, count() 
Dataset
IQL - Date Range 
from searchresults 
‘2013-12-05’ 
‘2013-12-10’ 
where country=ie 
and jobagedays<1 
group by time(1d) 
select clicked, count() 
Date Range
IQL - Filters 
from searchresults 
‘2013-12-05’ 
‘2013-12-10’ 
where country=ie 
and jobagedays<1 
group by time(1d) 
select clicked, count() 
Filters
IQL - Group by 
from searchresults 
‘2013-12-05’ 
‘2013-12-10’ 
where country=ie 
and jobagedays<1 
group by time(1d) 
select clicked, count() 
Groups
IQL - Metrics 
from searchresults 
‘2013-12-05’ 
‘2013-12-10’ 
where country=ie 
and jobagedays<1 
group by time(1d) 
select clicked, count() 
Metrics
IQL - Example Results
Extract Transform Load 
(ETL)
Extract 
● Up to your data schema
Transform 
● De-Normalize Data 
● Formating
Example Datasets 
● Jobsearch 
● Ad Clicks 
● Resume Contacts
Load 
● TSV/CSV Uploader
Load 
● TSV/CSV Uploader 
● Java API
Inverted Index 
● Massive compression 
● Fast Boolean Search
Opensource Package 
● Run on any* computer/s 
● Can deploy to AWS using handy 
AWS CloudFormation script
CloudFormation Setup 
● Create S3 buckets 
● Create EC2 Key Pair 
● Run CloudFormation script
DEMO
Q&A
Helpful Workshop Links 
http://go.indeed.com/iws

More Related Content

What's hot

Using AI for Providing Insights and Recommendations on Activity Data Alexis R...
Using AI for Providing Insights and Recommendations on Activity Data Alexis R...Using AI for Providing Insights and Recommendations on Activity Data Alexis R...
Using AI for Providing Insights and Recommendations on Activity Data Alexis R...
Databricks
 
Building a Production-ready Predictive App for Customer Service - Alex Ingerm...
Building a Production-ready Predictive App for Customer Service - Alex Ingerm...Building a Production-ready Predictive App for Customer Service - Alex Ingerm...
Building a Production-ready Predictive App for Customer Service - Alex Ingerm...
PAPIs.io
 
Reduce Query Time Up to 60% with Selective Search
Reduce Query Time Up to 60% with Selective SearchReduce Query Time Up to 60% with Selective Search
Reduce Query Time Up to 60% with Selective Search
Lucidworks
 
Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)
Turi, Inc.
 
Question Answering and Virtual Assistants with Deep Learning
Question Answering and Virtual Assistants with Deep LearningQuestion Answering and Virtual Assistants with Deep Learning
Question Answering and Virtual Assistants with Deep Learning
Lucidworks
 
Florian Douetteau @ Dataiku
Florian Douetteau @ DataikuFlorian Douetteau @ Dataiku
Florian Douetteau @ Dataiku
PAPIs.io
 
Sharing about my data science journey and what I do at Lazada
Sharing about my data science journey and what I do at LazadaSharing about my data science journey and what I do at Lazada
Sharing about my data science journey and what I do at Lazada
Eugene Yan Ziyou
 

What's hot (7)

Using AI for Providing Insights and Recommendations on Activity Data Alexis R...
Using AI for Providing Insights and Recommendations on Activity Data Alexis R...Using AI for Providing Insights and Recommendations on Activity Data Alexis R...
Using AI for Providing Insights and Recommendations on Activity Data Alexis R...
 
Building a Production-ready Predictive App for Customer Service - Alex Ingerm...
Building a Production-ready Predictive App for Customer Service - Alex Ingerm...Building a Production-ready Predictive App for Customer Service - Alex Ingerm...
Building a Production-ready Predictive App for Customer Service - Alex Ingerm...
 
Reduce Query Time Up to 60% with Selective Search
Reduce Query Time Up to 60% with Selective SearchReduce Query Time Up to 60% with Selective Search
Reduce Query Time Up to 60% with Selective Search
 
Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)
 
Question Answering and Virtual Assistants with Deep Learning
Question Answering and Virtual Assistants with Deep LearningQuestion Answering and Virtual Assistants with Deep Learning
Question Answering and Virtual Assistants with Deep Learning
 
Florian Douetteau @ Dataiku
Florian Douetteau @ DataikuFlorian Douetteau @ Dataiku
Florian Douetteau @ Dataiku
 
Sharing about my data science journey and what I do at Lazada
Sharing about my data science journey and what I do at LazadaSharing about my data science journey and what I do at Lazada
Sharing about my data science journey and what I do at Lazada
 

Viewers also liked

Data-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision Making
Data-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision MakingData-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision Making
Data-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision Making
indeedeng
 
Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...
Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...
Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...
MongoDB
 
@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day
@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day
@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day
indeedeng
 
[@IndeedEng] Redundant Array of Inexpensive Datacenters
[@IndeedEng] Redundant Array of Inexpensive Datacenters[@IndeedEng] Redundant Array of Inexpensive Datacenters
[@IndeedEng] Redundant Array of Inexpensive Datacenters
indeedeng
 
[@IndeedEng] Engineering Velocity: Building Great Software Through Fast Itera...
[@IndeedEng] Engineering Velocity: Building Great Software Through Fast Itera...[@IndeedEng] Engineering Velocity: Building Great Software Through Fast Itera...
[@IndeedEng] Engineering Velocity: Building Great Software Through Fast Itera...
indeedeng
 
Automation and Developer Infrastructure — Empowering Engineers to Move from I...
Automation and Developer Infrastructure — Empowering Engineers to Move from I...Automation and Developer Infrastructure — Empowering Engineers to Move from I...
Automation and Developer Infrastructure — Empowering Engineers to Move from I...
indeedeng
 
[@IndeedEng] Boxcar: A self-balancing distributed services protocol
[@IndeedEng] Boxcar: A self-balancing distributed services protocol [@IndeedEng] Boxcar: A self-balancing distributed services protocol
[@IndeedEng] Boxcar: A self-balancing distributed services protocol
indeedeng
 
Log Structured Merge Tree
Log Structured Merge TreeLog Structured Merge Tree
Log Structured Merge Tree
University of California, Santa Cruz
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
MIJIN AN
 
LSM
LSMLSM
LSM Trees
LSM TreesLSM Trees
LSM Trees
Chris Lohfink
 
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
indeedeng
 
失敗から学ぶ データ分析グループの チームマネジメント変遷 (デブサミ2016) #devsumi
失敗から学ぶデータ分析グループのチームマネジメント変遷 (デブサミ2016) #devsumi失敗から学ぶデータ分析グループのチームマネジメント変遷 (デブサミ2016) #devsumi
失敗から学ぶ データ分析グループの チームマネジメント変遷 (デブサミ2016) #devsumi
Tokoroten Nakayama
 

Viewers also liked (13)

Data-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision Making
Data-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision MakingData-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision Making
Data-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision Making
 
Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...
Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...
Retail Reference Architecture Part 3: Scalable Insight Component Providing Us...
 
@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day
@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day
@Indeedeng: RAD - How We Replicate Terabytes of Data Around the World Every Day
 
[@IndeedEng] Redundant Array of Inexpensive Datacenters
[@IndeedEng] Redundant Array of Inexpensive Datacenters[@IndeedEng] Redundant Array of Inexpensive Datacenters
[@IndeedEng] Redundant Array of Inexpensive Datacenters
 
[@IndeedEng] Engineering Velocity: Building Great Software Through Fast Itera...
[@IndeedEng] Engineering Velocity: Building Great Software Through Fast Itera...[@IndeedEng] Engineering Velocity: Building Great Software Through Fast Itera...
[@IndeedEng] Engineering Velocity: Building Great Software Through Fast Itera...
 
Automation and Developer Infrastructure — Empowering Engineers to Move from I...
Automation and Developer Infrastructure — Empowering Engineers to Move from I...Automation and Developer Infrastructure — Empowering Engineers to Move from I...
Automation and Developer Infrastructure — Empowering Engineers to Move from I...
 
[@IndeedEng] Boxcar: A self-balancing distributed services protocol
[@IndeedEng] Boxcar: A self-balancing distributed services protocol [@IndeedEng] Boxcar: A self-balancing distributed services protocol
[@IndeedEng] Boxcar: A self-balancing distributed services protocol
 
Log Structured Merge Tree
Log Structured Merge TreeLog Structured Merge Tree
Log Structured Merge Tree
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
LSM
LSMLSM
LSM
 
LSM Trees
LSM TreesLSM Trees
LSM Trees
 
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
[@IndeedEng] Managing Experiments and Behavior Dynamically with Proctor
 
失敗から学ぶ データ分析グループの チームマネジメント変遷 (デブサミ2016) #devsumi
失敗から学ぶデータ分析グループのチームマネジメント変遷 (デブサミ2016) #devsumi失敗から学ぶデータ分析グループのチームマネジメント変遷 (デブサミ2016) #devsumi
失敗から学ぶ データ分析グループの チームマネジメント変遷 (デブサミ2016) #devsumi
 

Similar to [@IndeedEng] Imhotep Workshop

Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Austin Ogilvie
 
Computational Advertising-The LinkedIn Way
Computational Advertising-The LinkedIn WayComputational Advertising-The LinkedIn Way
Computational Advertising-The LinkedIn Way
yingfeng
 
Building cost-effective mobile product & marketing app analytics based on GCP...
Building cost-effective mobile product & marketing app analytics based on GCP...Building cost-effective mobile product & marketing app analytics based on GCP...
Building cost-effective mobile product & marketing app analytics based on GCP...
GameCamp
 
Mozilla Foundation Metrics - presentation to engineers
Mozilla Foundation Metrics - presentation to engineersMozilla Foundation Metrics - presentation to engineers
Mozilla Foundation Metrics - presentation to engineers
John Schneider
 
Splunk in integration testing
Splunk in integration testingSplunk in integration testing
Splunk in integration testing
Albert Witteveen
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scale
Looker
 
Building a Beer Recommender with Yhat (PAPIs.io - November 2014)
Building a Beer Recommender with Yhat (PAPIs.io - November 2014)Building a Beer Recommender with Yhat (PAPIs.io - November 2014)
Building a Beer Recommender with Yhat (PAPIs.io - November 2014)
Austin Ogilvie
 
Measure everything you can
Measure everything you canMeasure everything you can
Measure everything you can
Ricardo Bánffy
 
danmcclary-pspresentation-katieboyle-171030115522.pdf
danmcclary-pspresentation-katieboyle-171030115522.pdfdanmcclary-pspresentation-katieboyle-171030115522.pdf
danmcclary-pspresentation-katieboyle-171030115522.pdf
ssuser3ee399
 
Why Big and Small Data Is Important by Google's Product Manager
Why Big and Small Data Is Important by Google's Product ManagerWhy Big and Small Data Is Important by Google's Product Manager
Why Big and Small Data Is Important by Google's Product Manager
Product School
 
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowMay 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
Adam Doyle
 
What Are the Basics of Product Manager Interviews by Google PM
What Are the Basics of Product Manager Interviews by Google PMWhat Are the Basics of Product Manager Interviews by Google PM
What Are the Basics of Product Manager Interviews by Google PM
Product School
 
Agile Test Management and Reporting—Even in a Non-Agile Project
Agile Test Management and Reporting—Even in a Non-Agile ProjectAgile Test Management and Reporting—Even in a Non-Agile Project
Agile Test Management and Reporting—Even in a Non-Agile Project
TechWell
 
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Aggregage
 
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
BrittanyShear
 
Playing Nice in the Product Playground #StrataHadoop
Playing Nice in the Product Playground #StrataHadoopPlaying Nice in the Product Playground #StrataHadoop
Playing Nice in the Product Playground #StrataHadoop
Intuit Inc.
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Databricks
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku
 
Making operations visible - Nick Gallbreath
Making operations visible - Nick GallbreathMaking operations visible - Nick Gallbreath
Making operations visible - Nick Gallbreath
Devopsdays
 
Making operations visible - devopsdays tokyo 2013
Making operations visible  - devopsdays tokyo 2013Making operations visible  - devopsdays tokyo 2013
Making operations visible - devopsdays tokyo 2013
Nick Galbreath
 

Similar to [@IndeedEng] Imhotep Workshop (20)

Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
Applied Data Science: Building a Beer Recommender | Data Science MD - Oct 2014
 
Computational Advertising-The LinkedIn Way
Computational Advertising-The LinkedIn WayComputational Advertising-The LinkedIn Way
Computational Advertising-The LinkedIn Way
 
Building cost-effective mobile product & marketing app analytics based on GCP...
Building cost-effective mobile product & marketing app analytics based on GCP...Building cost-effective mobile product & marketing app analytics based on GCP...
Building cost-effective mobile product & marketing app analytics based on GCP...
 
Mozilla Foundation Metrics - presentation to engineers
Mozilla Foundation Metrics - presentation to engineersMozilla Foundation Metrics - presentation to engineers
Mozilla Foundation Metrics - presentation to engineers
 
Splunk in integration testing
Splunk in integration testingSplunk in integration testing
Splunk in integration testing
 
Operationalizing analytics to scale
Operationalizing analytics to scaleOperationalizing analytics to scale
Operationalizing analytics to scale
 
Building a Beer Recommender with Yhat (PAPIs.io - November 2014)
Building a Beer Recommender with Yhat (PAPIs.io - November 2014)Building a Beer Recommender with Yhat (PAPIs.io - November 2014)
Building a Beer Recommender with Yhat (PAPIs.io - November 2014)
 
Measure everything you can
Measure everything you canMeasure everything you can
Measure everything you can
 
danmcclary-pspresentation-katieboyle-171030115522.pdf
danmcclary-pspresentation-katieboyle-171030115522.pdfdanmcclary-pspresentation-katieboyle-171030115522.pdf
danmcclary-pspresentation-katieboyle-171030115522.pdf
 
Why Big and Small Data Is Important by Google's Product Manager
Why Big and Small Data Is Important by Google's Product ManagerWhy Big and Small Data Is Important by Google's Product Manager
Why Big and Small Data Is Important by Google's Product Manager
 
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflowMay 2021 Spark Testing ... or how to farm reputation on StackOverflow
May 2021 Spark Testing ... or how to farm reputation on StackOverflow
 
What Are the Basics of Product Manager Interviews by Google PM
What Are the Basics of Product Manager Interviews by Google PMWhat Are the Basics of Product Manager Interviews by Google PM
What Are the Basics of Product Manager Interviews by Google PM
 
Agile Test Management and Reporting—Even in a Non-Agile Project
Agile Test Management and Reporting—Even in a Non-Agile ProjectAgile Test Management and Reporting—Even in a Non-Agile Project
Agile Test Management and Reporting—Even in a Non-Agile Project
 
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
 
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
Start With Why: Ask the "Right" Questions: Your Analytics-Guided Product Stra...
 
Playing Nice in the Product Playground #StrataHadoop
Playing Nice in the Product Playground #StrataHadoopPlaying Nice in the Product Playground #StrataHadoop
Playing Nice in the Product Playground #StrataHadoop
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce Setting
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
 
Making operations visible - Nick Gallbreath
Making operations visible - Nick GallbreathMaking operations visible - Nick Gallbreath
Making operations visible - Nick Gallbreath
 
Making operations visible - devopsdays tokyo 2013
Making operations visible  - devopsdays tokyo 2013Making operations visible  - devopsdays tokyo 2013
Making operations visible - devopsdays tokyo 2013
 

More from indeedeng

Weapons of Math Instruction: Evolving from Data0-Driven to Science-Driven
Weapons of Math Instruction: Evolving from Data0-Driven to Science-DrivenWeapons of Math Instruction: Evolving from Data0-Driven to Science-Driven
Weapons of Math Instruction: Evolving from Data0-Driven to Science-Driven
indeedeng
 
Alchemy and Science: Choosing Metrics That Work
Alchemy and Science: Choosing Metrics That WorkAlchemy and Science: Choosing Metrics That Work
Alchemy and Science: Choosing Metrics That Work
indeedeng
 
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
indeedeng
 
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
indeedeng
 
Improving the development process with metrics driven insights presentation
Improving the development process with metrics driven insights presentationImproving the development process with metrics driven insights presentation
Improving the development process with metrics driven insights presentation
indeedeng
 
Indeed My Jobs: A case study in ReactJS and Redux (Meetup talk March 2016)
Indeed My Jobs: A case study in ReactJS and Redux (Meetup talk March 2016)Indeed My Jobs: A case study in ReactJS and Redux (Meetup talk March 2016)
Indeed My Jobs: A case study in ReactJS and Redux (Meetup talk March 2016)
indeedeng
 
Data Day Texas - Recommendations
Data Day Texas - RecommendationsData Day Texas - Recommendations
Data Day Texas - Recommendations
indeedeng
 
Vectorized VByte Decoding
Vectorized VByte DecodingVectorized VByte Decoding
Vectorized VByte Decoding
indeedeng
 
[@IndeedEng] Building Indeed Resume Search
[@IndeedEng] Building Indeed Resume Search[@IndeedEng] Building Indeed Resume Search
[@IndeedEng] Building Indeed Resume Search
indeedeng
 
[@IndeedEng] From 1 To 1 Billion: Evolution of Indeed's Document Serving System
[@IndeedEng] From 1 To 1 Billion: Evolution of Indeed's Document Serving System[@IndeedEng] From 1 To 1 Billion: Evolution of Indeed's Document Serving System
[@IndeedEng] From 1 To 1 Billion: Evolution of Indeed's Document Serving System
indeedeng
 

More from indeedeng (10)

Weapons of Math Instruction: Evolving from Data0-Driven to Science-Driven
Weapons of Math Instruction: Evolving from Data0-Driven to Science-DrivenWeapons of Math Instruction: Evolving from Data0-Driven to Science-Driven
Weapons of Math Instruction: Evolving from Data0-Driven to Science-Driven
 
Alchemy and Science: Choosing Metrics That Work
Alchemy and Science: Choosing Metrics That WorkAlchemy and Science: Choosing Metrics That Work
Alchemy and Science: Choosing Metrics That Work
 
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
 
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
Indeed Engineering and The Lead Developer Present: Tech Leadership and Manage...
 
Improving the development process with metrics driven insights presentation
Improving the development process with metrics driven insights presentationImproving the development process with metrics driven insights presentation
Improving the development process with metrics driven insights presentation
 
Indeed My Jobs: A case study in ReactJS and Redux (Meetup talk March 2016)
Indeed My Jobs: A case study in ReactJS and Redux (Meetup talk March 2016)Indeed My Jobs: A case study in ReactJS and Redux (Meetup talk March 2016)
Indeed My Jobs: A case study in ReactJS and Redux (Meetup talk March 2016)
 
Data Day Texas - Recommendations
Data Day Texas - RecommendationsData Day Texas - Recommendations
Data Day Texas - Recommendations
 
Vectorized VByte Decoding
Vectorized VByte DecodingVectorized VByte Decoding
Vectorized VByte Decoding
 
[@IndeedEng] Building Indeed Resume Search
[@IndeedEng] Building Indeed Resume Search[@IndeedEng] Building Indeed Resume Search
[@IndeedEng] Building Indeed Resume Search
 
[@IndeedEng] From 1 To 1 Billion: Evolution of Indeed's Document Serving System
[@IndeedEng] From 1 To 1 Billion: Evolution of Indeed's Document Serving System[@IndeedEng] From 1 To 1 Billion: Evolution of Indeed's Document Serving System
[@IndeedEng] From 1 To 1 Billion: Evolution of Indeed's Document Serving System
 

Recently uploaded

Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
Antonios Katsarakis
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
ScyllaDB
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
Ajin Abraham
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
Miro Wengner
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Precisely
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Neo4j
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
Fwdays
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
operationspcvita
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 

Recently uploaded (20)

Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
AppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSFAppSec PNW: Android and iOS Application Security with MobSF
AppSec PNW: Android and iOS Application Security with MobSF
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and BioinformaticiansBiomedical Knowledge Graphs for Data Scientists and Bioinformaticians
Biomedical Knowledge Graphs for Data Scientists and Bioinformaticians
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk"Frontline Battles with DDoS: Best practices and Lessons Learned",  Igor Ivaniuk
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk
 
The Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptxThe Microsoft 365 Migration Tutorial For Beginner.pptx
The Microsoft 365 Migration Tutorial For Beginner.pptx
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 

[@IndeedEng] Imhotep Workshop

Editor's Notes

  1. https://bugs.indeed.com/browse/JASX-4581
  2. Can combine results from multiple datasets Allows for automation of data tools
  3. make sure to start with just -"dataset" + time range - view the fields - add a filter field=term - add inequality filter, lucene, regex. - group by time - group by time and user in X - group by buckets (tottime) - add metric (count) - add distinct(user) - add filter as metric (user=zak) - add inequality as a metric (% of queries under 5000ms) - average tottime/count() - percentile(tottime,50)