SlideShare a Scribd company logo
1 of 20
Download to read offline
Servo & SQLGitHub
A JOURNEY OF TRANSFORMING FAILURES INTO A GOOD IDEA
Servo
 A new browser engine written from scratch in Rust by Mozilla
 Original Project: Write an Android frontend for servo
 Issues, Issues and issues
 Zero knowledge about mobile development and Rust
 Outdated documentation
 Very poorly supported on Android
 Tried approaching in various ways, but to no avail
 Made build and packaging guide and reported issues
SQLGitHub – Motivation (I)
 Who is stealing all my easy bugs/issues?
 There are very limited tools for managing GitHub organizations
 Open-source organizations are usually understaffed
 The Servo project on GitHub for example,
contains 129 repositories
managed by practically 1 person.
SQLGitHub – Background
 Organization: Organizations are shared accounts where businesses
and open-source projects can collaborate across many projects at
once.
 Repository: A repository contains all of the project files, and stores
each file's revision history.
 Commit: An individual change to a set of files.
 Issue: Suggested improvements, tasks or questions
 Pull Request: Proposed change
to a repository.
Repository
Organization
Repository Repository
• Commits
• Issues
• Pull Requests
• …
• Commits
• Issues
• Pull Requests
• …
• Commits
• Issues
• Pull Requests
• …https://help.github.com/articles/github-glossary/
SQLGitHub – Motivation (II)
 Common questions/problems an organization admin include:
 Obtain certain metrics of the organization in machine-friendly format for
post-processing (eg. KPI report)
 Get the current list of projects hosted on GitHub
 List of the most popular repositories in the organization
 Get the list of issues closed (resolved) for the past 7 days
 What are the critical issues that are still left open?
 Who are the top contributors of the past month?
 … endless possible questions
Abstract
 SQLGitHub features a SQL-like syntax that allows you to:
Query information about an organization as a whole.
 You may also think of it as a better, enhanced frontend layer built
on top of GitHub’s RESTful API
Introduction – Supported Schema
 SELECT
select_expr [, select_expr ...]
FROM {org_name | org_name.{repos | issues | pulls | commits}}
[WHERE where_condition]
[GROUP BY {col_name | expr}
[ASC | DESC], ...]
[HAVING where_condition]
[ORDER BY {col_name | expr}
[ASC | DESC], ...]
[LIMIT row_count]
Introduction – Use Case (I)
 Get name and description from all the repos in apple.
 select name, description from apple.repos
Introduction – Use Case (II)
 Get last-updated time and title of the issues closed in the past week (7
days) in servo listed in descending order of last-updated time.
 select updated_at, title from servo.issues.closed.7 order by updated_at desc
Introduction – Use Case (III)
 Get top 10 most-starred repositories in servo.
 select concat(concat("(", stargazers_count, ") ", name), ": ", description) from
servo.repos order by stargazers_count desc, name limit 10
Introduction – Use Case (IV)
 Get top 10 contributors in servo for the past month (30 days) based
on number of commits.
 select login, count(login) from servo.commits.30 group by login order by
count(login) desc, login limit 10
Introduction – Technology Stack
 Python
 re & regex, regular expression libraries
 PyGithub (patched), an unofficial client library for GitHub API
 prompt_toolkit, a library for building prompts
 pygments, a library for syntax highlighting
Introduction – (Simplified) Flow
Fetch data (from)
Filter by where conditions
Evaluate partial exprs
Group by group exprs
Order by order exprs
Evaluate select exprs
Fetch data with required fields from GitHub API
Evaluate where conditions and filter fetched data
Evaluate group exprs and other “field” exprs
Generate table groups by values of group exprs
Sort within and between tables
Evaluate select exprs
Filter by having conditions Evaluate having conditions and filter tables
Introduction – Architecture
top_level
parser
tokenizersession
grouping ordering
table_fetcher expression
definitionutilitiestable PyGithub
process user input
process external data
fetches external data
base classes, functions
and SQL definitions
Introduction – Challenges (I)
 Algorithm of parsing is almost identical to that of expression
evaluation  waste of time
 Lazy Parsing: Only parse clauses (eg. select, from, where) and comma-
separated fields
 Comma-separated fields, strings and escape characters
Evaluate this: concat("[)"Stars"(: ", stargazers_count)
 concat("[)"Stars"(: ", stargazers_count)
concat("[)"Stars"(: ", stargazers_count)
 concat("[)"Stars"(: ", stargazers_count)
Introduction – Challenges (II)
 Extracting all relevant fields from expressions to fetch at once
 select concat("[)"-> avg(stargazers_count)"(: ", stargazers_count -
avg(stargazers_count), "] ", name) from apple.repos where description
like "%library%" order by id
 Algorithm: for each expression,
 Remove all literal strings. Use r""(?:[^"]|.)*"" to match.
 Find all possible tokens with r"([a-zA-Z_]+)(?:[^(a-zA-Z_]|$)".
 For each token, check if it’s a predefined token (ie. part of SQL).
Introduction – Challenges (III)
 Expression Evaluation is really complicated
 Regular (eg. concat, floor) and Aggregate functions (eg. max, min)
 Have to evaluate an entire table at once
 Nested functions (eg. sum(avg(field_a) + avg(field_b)))
 Use recursive regex patterns to extract tokens – r”((?:(?>[^()]+|(?R))*))”
 Assign special precedence and insert extra logic in place
 Operator Precedence
 Modified 2-stack evaluation approach +
 Finite State Machine + One-token Lookahead
Introduction – Challenges (IV)
 Python’s built-in sort is not customizable:
sorted(iterable, *, key=None, reverse=False)
 order by requires sorting with multiple keys each with potentially
different reverse:
order by field_a desc, field_b asc, field_c, desc
 Wrote custom sort that integrates better with the workflow
Future Directions
 Improve SQL, MySQL compatibility
 Extend to end users not just organizations
 Migrate to the new GraphQL backend (GitHub API v4)
 Integrate SQLGitHub directly on the server end (better efficiency
and perhaps better security!)
Acknowledgements
 We would like to thank:
 Shing Lyu, former software engineer at Mozilla Taiwan for the mentorship
 Irvin Chen, Liaison of MozTW (Mozilla Taiwan Community) for
coordinating the program
 Prof. Cheng-Chung Lin for organizing the program

More Related Content

What's hot

Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query ParsingErik Hatcher
 
Collections - Sorting, Comparing Basics
Collections - Sorting, Comparing Basics Collections - Sorting, Comparing Basics
Collections - Sorting, Comparing Basics Hitesh-Java
 
Session 15 - Collections - Array List
Session 15 - Collections - Array ListSession 15 - Collections - Array List
Session 15 - Collections - Array ListPawanMM
 
Collections - Lists, Sets
Collections - Lists, Sets Collections - Lists, Sets
Collections - Lists, Sets Hitesh-Java
 
Building your own search engine with Apache Solr
Building your own search engine with Apache SolrBuilding your own search engine with Apache Solr
Building your own search engine with Apache SolrBiogeeks
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)Erik Hatcher
 
New features in jdk8 iti
New features in jdk8 itiNew features in jdk8 iti
New features in jdk8 itiAhmed mar3y
 
A brief tour of modern Java
A brief tour of modern JavaA brief tour of modern Java
A brief tour of modern JavaSina Madani
 
Ingesting and Manipulating Data with JavaScript
Ingesting and Manipulating Data with JavaScriptIngesting and Manipulating Data with JavaScript
Ingesting and Manipulating Data with JavaScriptLucidworks
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginsearchbox-com
 
Python first day
Python first dayPython first day
Python first dayfarkhand
 
SQL for Elasticsearch
SQL for ElasticsearchSQL for Elasticsearch
SQL for ElasticsearchJodok Batlogg
 
Collections - Maps
Collections - Maps Collections - Maps
Collections - Maps Hitesh-Java
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and TricksErik Hatcher
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Turning a Search Engine into a Relational Database
Turning a Search Engine into a Relational DatabaseTurning a Search Engine into a Relational Database
Turning a Search Engine into a Relational DatabaseMatthias Wahl
 

What's hot (20)

Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query Parsing
 
Collections - Sorting, Comparing Basics
Collections - Sorting, Comparing Basics Collections - Sorting, Comparing Basics
Collections - Sorting, Comparing Basics
 
Session 15 - Collections - Array List
Session 15 - Collections - Array ListSession 15 - Collections - Array List
Session 15 - Collections - Array List
 
Collections - Lists, Sets
Collections - Lists, Sets Collections - Lists, Sets
Collections - Lists, Sets
 
Building your own search engine with Apache Solr
Building your own search engine with Apache SolrBuilding your own search engine with Apache Solr
Building your own search engine with Apache Solr
 
Object Class
Object Class Object Class
Object Class
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
 
New features in jdk8 iti
New features in jdk8 itiNew features in jdk8 iti
New features in jdk8 iti
 
A brief tour of modern Java
A brief tour of modern JavaA brief tour of modern Java
A brief tour of modern Java
 
Ingesting and Manipulating Data with JavaScript
Ingesting and Manipulating Data with JavaScriptIngesting and Manipulating Data with JavaScript
Ingesting and Manipulating Data with JavaScript
 
Elasticsearch speed is key
Elasticsearch speed is keyElasticsearch speed is key
Elasticsearch speed is key
 
Java 8 Lambda and Streams
Java 8 Lambda and StreamsJava 8 Lambda and Streams
Java 8 Lambda and Streams
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component plugin
 
Python first day
Python first dayPython first day
Python first day
 
Python first day
Python first dayPython first day
Python first day
 
SQL for Elasticsearch
SQL for ElasticsearchSQL for Elasticsearch
SQL for Elasticsearch
 
Collections - Maps
Collections - Maps Collections - Maps
Collections - Maps
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and Tricks
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Turning a Search Engine into a Relational Database
Turning a Search Engine into a Relational DatabaseTurning a Search Engine into a Relational Database
Turning a Search Engine into a Relational Database
 

Similar to SQLGitHub - Access GitHub API with SQL-like syntaxes

New Features in JDK 8
New Features in JDK 8New Features in JDK 8
New Features in JDK 8Martin Toshev
 
Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)Julian Hyde
 
SQL Saturday 28 - .NET Fundamentals
SQL Saturday 28 - .NET FundamentalsSQL Saturday 28 - .NET Fundamentals
SQL Saturday 28 - .NET Fundamentalsmikehuguet
 
.NET Fest 2018. Дмитрий Иванов. Иммутабельные структуры данных в .NET: зачем ...
.NET Fest 2018. Дмитрий Иванов. Иммутабельные структуры данных в .NET: зачем ....NET Fest 2018. Дмитрий Иванов. Иммутабельные структуры данных в .NET: зачем ...
.NET Fest 2018. Дмитрий Иванов. Иммутабельные структуры данных в .NET: зачем ...NETFest
 
Discover Tasty Query: The library for Scala program analysis
Discover Tasty Query: The library for Scala program analysisDiscover Tasty Query: The library for Scala program analysis
Discover Tasty Query: The library for Scala program analysisJames Thompson
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisSebastiano Panichella
 
Beyond Java: 자바 8을 중심으로 본 자바의 혁신
Beyond Java: 자바 8을 중심으로 본 자바의 혁신Beyond Java: 자바 8을 중심으로 본 자바의 혁신
Beyond Java: 자바 8을 중심으로 본 자바의 혁신Sungchul Park
 
20140908 spark sql & catalyst
20140908 spark sql & catalyst20140908 spark sql & catalyst
20140908 spark sql & catalystTakuya UESHIN
 
Java 8 - Project Lambda
Java 8 - Project LambdaJava 8 - Project Lambda
Java 8 - Project LambdaRahman USTA
 
The Road to Lambda - Mike Duigou
The Road to Lambda - Mike DuigouThe Road to Lambda - Mike Duigou
The Road to Lambda - Mike Duigoujaxconf
 
Building a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationBuilding a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationJonathan Katz
 
Spring Day | Spring and Scala | Eberhard Wolff
Spring Day | Spring and Scala | Eberhard WolffSpring Day | Spring and Scala | Eberhard Wolff
Spring Day | Spring and Scala | Eberhard WolffJAX London
 
Synapseindia reviews.odp.
Synapseindia reviews.odp.Synapseindia reviews.odp.
Synapseindia reviews.odp.Tarunsingh198
 
What to expect from Java 9
What to expect from Java 9What to expect from Java 9
What to expect from Java 9Ivan Krylov
 

Similar to SQLGitHub - Access GitHub API with SQL-like syntaxes (20)

New Features in JDK 8
New Features in JDK 8New Features in JDK 8
New Features in JDK 8
 
Java Tutorial
Java TutorialJava Tutorial
Java Tutorial
 
Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)Apache Calcite (a tutorial given at BOSS '21)
Apache Calcite (a tutorial given at BOSS '21)
 
SQL Saturday 28 - .NET Fundamentals
SQL Saturday 28 - .NET FundamentalsSQL Saturday 28 - .NET Fundamentals
SQL Saturday 28 - .NET Fundamentals
 
cheat-sheets.pdf
cheat-sheets.pdfcheat-sheets.pdf
cheat-sheets.pdf
 
.NET Fest 2018. Дмитрий Иванов. Иммутабельные структуры данных в .NET: зачем ...
.NET Fest 2018. Дмитрий Иванов. Иммутабельные структуры данных в .NET: зачем ....NET Fest 2018. Дмитрий Иванов. Иммутабельные структуры данных в .NET: зачем ...
.NET Fest 2018. Дмитрий Иванов. Иммутабельные структуры данных в .NET: зачем ...
 
Discover Tasty Query: The library for Scala program analysis
Discover Tasty Query: The library for Scala program analysisDiscover Tasty Query: The library for Scala program analysis
Discover Tasty Query: The library for Scala program analysis
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
 
Java SE 8 & EE 7 Launch
Java SE 8 & EE 7 LaunchJava SE 8 & EE 7 Launch
Java SE 8 & EE 7 Launch
 
Beyond Java: 자바 8을 중심으로 본 자바의 혁신
Beyond Java: 자바 8을 중심으로 본 자바의 혁신Beyond Java: 자바 8을 중심으로 본 자바의 혁신
Beyond Java: 자바 8을 중심으로 본 자바의 혁신
 
20140908 spark sql & catalyst
20140908 spark sql & catalyst20140908 spark sql & catalyst
20140908 spark sql & catalyst
 
Java 8 - Project Lambda
Java 8 - Project LambdaJava 8 - Project Lambda
Java 8 - Project Lambda
 
The Road to Lambda - Mike Duigou
The Road to Lambda - Mike DuigouThe Road to Lambda - Mike Duigou
The Road to Lambda - Mike Duigou
 
Lambdas and Laughs
Lambdas and LaughsLambdas and Laughs
Lambdas and Laughs
 
Building a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management ApplicationBuilding a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management Application
 
Spring Day | Spring and Scala | Eberhard Wolff
Spring Day | Spring and Scala | Eberhard WolffSpring Day | Spring and Scala | Eberhard Wolff
Spring Day | Spring and Scala | Eberhard Wolff
 
Spark sql meetup
Spark sql meetupSpark sql meetup
Spark sql meetup
 
Flink internals web
Flink internals web Flink internals web
Flink internals web
 
Synapseindia reviews.odp.
Synapseindia reviews.odp.Synapseindia reviews.odp.
Synapseindia reviews.odp.
 
What to expect from Java 9
What to expect from Java 9What to expect from Java 9
What to expect from Java 9
 

Recently uploaded

Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...Suhani Kapoor
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 

Recently uploaded (20)

Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 

SQLGitHub - Access GitHub API with SQL-like syntaxes

  • 1. Servo & SQLGitHub A JOURNEY OF TRANSFORMING FAILURES INTO A GOOD IDEA
  • 2. Servo  A new browser engine written from scratch in Rust by Mozilla  Original Project: Write an Android frontend for servo  Issues, Issues and issues  Zero knowledge about mobile development and Rust  Outdated documentation  Very poorly supported on Android  Tried approaching in various ways, but to no avail  Made build and packaging guide and reported issues
  • 3. SQLGitHub – Motivation (I)  Who is stealing all my easy bugs/issues?  There are very limited tools for managing GitHub organizations  Open-source organizations are usually understaffed  The Servo project on GitHub for example, contains 129 repositories managed by practically 1 person.
  • 4. SQLGitHub – Background  Organization: Organizations are shared accounts where businesses and open-source projects can collaborate across many projects at once.  Repository: A repository contains all of the project files, and stores each file's revision history.  Commit: An individual change to a set of files.  Issue: Suggested improvements, tasks or questions  Pull Request: Proposed change to a repository. Repository Organization Repository Repository • Commits • Issues • Pull Requests • … • Commits • Issues • Pull Requests • … • Commits • Issues • Pull Requests • …https://help.github.com/articles/github-glossary/
  • 5. SQLGitHub – Motivation (II)  Common questions/problems an organization admin include:  Obtain certain metrics of the organization in machine-friendly format for post-processing (eg. KPI report)  Get the current list of projects hosted on GitHub  List of the most popular repositories in the organization  Get the list of issues closed (resolved) for the past 7 days  What are the critical issues that are still left open?  Who are the top contributors of the past month?  … endless possible questions
  • 6. Abstract  SQLGitHub features a SQL-like syntax that allows you to: Query information about an organization as a whole.  You may also think of it as a better, enhanced frontend layer built on top of GitHub’s RESTful API
  • 7. Introduction – Supported Schema  SELECT select_expr [, select_expr ...] FROM {org_name | org_name.{repos | issues | pulls | commits}} [WHERE where_condition] [GROUP BY {col_name | expr} [ASC | DESC], ...] [HAVING where_condition] [ORDER BY {col_name | expr} [ASC | DESC], ...] [LIMIT row_count]
  • 8. Introduction – Use Case (I)  Get name and description from all the repos in apple.  select name, description from apple.repos
  • 9. Introduction – Use Case (II)  Get last-updated time and title of the issues closed in the past week (7 days) in servo listed in descending order of last-updated time.  select updated_at, title from servo.issues.closed.7 order by updated_at desc
  • 10. Introduction – Use Case (III)  Get top 10 most-starred repositories in servo.  select concat(concat("(", stargazers_count, ") ", name), ": ", description) from servo.repos order by stargazers_count desc, name limit 10
  • 11. Introduction – Use Case (IV)  Get top 10 contributors in servo for the past month (30 days) based on number of commits.  select login, count(login) from servo.commits.30 group by login order by count(login) desc, login limit 10
  • 12. Introduction – Technology Stack  Python  re & regex, regular expression libraries  PyGithub (patched), an unofficial client library for GitHub API  prompt_toolkit, a library for building prompts  pygments, a library for syntax highlighting
  • 13. Introduction – (Simplified) Flow Fetch data (from) Filter by where conditions Evaluate partial exprs Group by group exprs Order by order exprs Evaluate select exprs Fetch data with required fields from GitHub API Evaluate where conditions and filter fetched data Evaluate group exprs and other “field” exprs Generate table groups by values of group exprs Sort within and between tables Evaluate select exprs Filter by having conditions Evaluate having conditions and filter tables
  • 14. Introduction – Architecture top_level parser tokenizersession grouping ordering table_fetcher expression definitionutilitiestable PyGithub process user input process external data fetches external data base classes, functions and SQL definitions
  • 15. Introduction – Challenges (I)  Algorithm of parsing is almost identical to that of expression evaluation  waste of time  Lazy Parsing: Only parse clauses (eg. select, from, where) and comma- separated fields  Comma-separated fields, strings and escape characters Evaluate this: concat("[)"Stars"(: ", stargazers_count)  concat("[)"Stars"(: ", stargazers_count) concat("[)"Stars"(: ", stargazers_count)  concat("[)"Stars"(: ", stargazers_count)
  • 16. Introduction – Challenges (II)  Extracting all relevant fields from expressions to fetch at once  select concat("[)"-> avg(stargazers_count)"(: ", stargazers_count - avg(stargazers_count), "] ", name) from apple.repos where description like "%library%" order by id  Algorithm: for each expression,  Remove all literal strings. Use r""(?:[^"]|.)*"" to match.  Find all possible tokens with r"([a-zA-Z_]+)(?:[^(a-zA-Z_]|$)".  For each token, check if it’s a predefined token (ie. part of SQL).
  • 17. Introduction – Challenges (III)  Expression Evaluation is really complicated  Regular (eg. concat, floor) and Aggregate functions (eg. max, min)  Have to evaluate an entire table at once  Nested functions (eg. sum(avg(field_a) + avg(field_b)))  Use recursive regex patterns to extract tokens – r”((?:(?>[^()]+|(?R))*))”  Assign special precedence and insert extra logic in place  Operator Precedence  Modified 2-stack evaluation approach +  Finite State Machine + One-token Lookahead
  • 18. Introduction – Challenges (IV)  Python’s built-in sort is not customizable: sorted(iterable, *, key=None, reverse=False)  order by requires sorting with multiple keys each with potentially different reverse: order by field_a desc, field_b asc, field_c, desc  Wrote custom sort that integrates better with the workflow
  • 19. Future Directions  Improve SQL, MySQL compatibility  Extend to end users not just organizations  Migrate to the new GraphQL backend (GitHub API v4)  Integrate SQLGitHub directly on the server end (better efficiency and perhaps better security!)
  • 20. Acknowledgements  We would like to thank:  Shing Lyu, former software engineer at Mozilla Taiwan for the mentorship  Irvin Chen, Liaison of MozTW (Mozilla Taiwan Community) for coordinating the program  Prof. Cheng-Chung Lin for organizing the program