Parsing with Perl 6
Grammars
Perl 6 is available today
● brew install rakudo-star
● git clone https://github.com/tadzik/rakudobrew ~/.rakudobrew
● docker pull rakudo-star
Everything is an object
my $obj = Thingy.new();
$obj.method();
my @array = <one two three>; # like qw(...)
@array.elems.say; # 3
4.say; # 4
“this is a string”.words.perl.say;
# ("this", "is", "a", "string").Seq
Several namespace-y collections
● package { … }
● module { … }
● enum <zero one two three>
● class { … }
● subset PositiveInt of Int where * > 0
● role { … }
● grammar { … }
Grammars are collections of named
regexes
grammar Answer {
regex TOP { <yes> | <no> | <maybe> }
}
Grammars are collections of named
regexes
grammar Answer {
regex TOP { <yes> | <no> | <maybe> }
regex yes { :ignorecase 'y' | 'ye' | 'yes' }
regex no { :ignorecase 'n' | 'no' }
regex maybe { :ignorecase
'm' | 'ma' | 'may' | 'mayb' | 'maybe'
}
}
Grammars are collections of named
regexes
grammar Answer {
regex TOP { <yes> | <no> | <maybe> }
regex yes { :i 'y' | 'ye' | 'yes' }
regex no { :i 'n' | 'no' }
regex maybe { :i
'm' | 'ma' | 'may' | 'mayb' | 'maybe'
}
}
Grammars are collections of named
regexes
grammar Answer {
regex TOP { <yes> | <no> | <maybe> }
regex yes { :i 'y' | 'ye' | 'yes' }
regex no { :i 'n' | 'no' }
regex maybe { :i
<{ 'maybe'.comb.produce(&[~]) }>
}
}
grammar Answer {
regex TOP { <yes> | <no> | <maybe> }
regex yes { … }
regex no { … }
regex maybe { … }
}
my $response = prompt "What is your answer? ";
my $match = Answer.parse($response);
$match.say;
grammar Answer { … }
my $response = prompt "What is your answer? ";
my $match = Answer.parse($response);
$match.say;
if !$match {
say “I didn't understand”;
} elsif $match{'yes'} {
say “I'll proceed...”;
} elsif $match{'no'} {
say “Cancelled”;
} elsif $match{'maybe'} {
say “Come back later”;
}
grammar Answer { … }
my $response = prompt "What is your answer? ";
my $match = Answer.parse($response);
$match.say;
if !$match {
say “I didn't understand”;
} elsif $match<yes> {
say “I'll proceed...”;
} elsif $match<no> {
say “Cancelled”;
} elsif $match<maybe> {
say “Come back later”;
}
grammar Answer { … }
my $response = prompt "What is your answer? ";
my $match = Answer.parse($response);
$match.say;
given $match {
when $<yes> { say “I'll proceed” };
when $<no> { say “Cancelled” };
when $<maybe> { say “Come back later” };
default { say “I didn't understand” };
}
> get-answer
What is your answer? y
「 y 」
yes => 「 y 」
I'll proceed
> get-answer
What is your answer? MaY
「 MaY 」
maybe => 「 MaY 」
Come back later
> get-answer
What is your answer? nope
(Any)
I didn't understand
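The pieces above can be put together into a script that runs without a terminal. This is only a minimal sketch: it spells out the `maybe` alternatives literally, and the `interpret` helper and the fixed list of inputs are illustrative additions that do not appear on the slides.

```raku
# Sketch of the Answer grammar, driven by fixed inputs instead of prompt().
grammar Answer {
    regex TOP   { <yes> | <no> | <maybe> }
    regex yes   { :i 'y' | 'ye' | 'yes' }
    regex no    { :i 'n' | 'no' }
    regex maybe { :i 'm' | 'ma' | 'may' | 'mayb' | 'maybe' }
}

# Hypothetical helper: map a response to the name of the subrule that matched.
sub interpret(Str $response --> Str) {
    my $match = Answer.parse($response);
    return 'unknown' unless $match;        # a failed parse is false
    return 'yes'     if $match<yes>;
    return 'no'      if $match<no>;
    return 'maybe'   if $match<maybe>;
    return 'unknown';
}

say $_, ' => ', interpret($_) for <y MaY nope>;
```

Because `.parse` has to consume the whole string, an input such as `nope` fails even though it starts with a valid `no`.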
Tokens and Rules
grammar Fastq {
token TOP { <fastq-rec>+ }
rule fastq-rec {
'@'<seq-id>
<seq>
'+'<seq-id>?
<qual>
}
@first sequence
ATGCTTGACATGGA
+
Gje%(wk@~lefb.
@Second sequence
GGTAAAAT
+Second sequence
K#38rg@R
Tokens and Rules
grammar Fastq {
token TOP { <fastq-rec>+ }
rule fastq-rec {
'@'<seq-id>
<seq>
'+'$<seq-id>?
<qual>
}
@first sequence
ATGCTTGACATGGA
+
Gje%(wk@~lefb.
@Second sequence
GGTAAAAT
+Second sequence
K#38rg@R
Tokens and Rules
grammar Fastq {
token TOP { <fastq-rec>+ }
rule fastq-rec {
'@'<seq-id>
<seq>
'+'$<seq-id>?
<qual> <?{ $<seq>.chars == $<qual>.chars }>
}
Tokens and Rules
grammar Fastq {
…
token seq-id { .*? $$ }
token seq { <base>+ }
token base { <[A..Z] + [ - ] + [ * ]> }
token qual { <qual-letter>+ }
token qual-letter {
<[ \x21..\x7E ]> # every printable ASCII character, '!' through '~'
}
}
Model
class FastqRec {
has Str $.id;
has Str $.seq;
has Str $.qual;
}
class FastqFile {
has FastqRec @.sequences;
}
Actions
class FastqActions {
method fastq-rec($/) {
make FastqRec.new(
id => ~$<seq-id>,
seq => ~$<seq>,
qual => ~$<qual>,
);
}
}
rule fastq-rec {
'@'<seq-id>
<seq>
'+'$<seq-id>?
<qual>
}
Actions
class FastqActions {
method fastq-rec($/) { … }
method TOP($/) {
make FastqFile.new(
sequences =>
@<fastq-rec>>>.made
);
}
}
rule fastq-rec { … }
rule TOP {
<fastq-rec>+
}
Actions
grammar Fastq { … }
class FastqRec { … }
class FastqFile { … }
class FastqActions {
method fastq-rec { … }
method TOP { … }
}
my FastqFile $fq = Fastq.parsefile(
'file.fq',
actions => FastqActions).made;
say "File had ", $fq.sequences.elems, " sequences";
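The grammar, model and actions from the preceding slides can be combined into one self-contained script. This is a sketch under a few assumptions: the input comes from a heredoc rather than `file.fq`, `seq-id` is simplified to "the rest of the line", and the quality characters are written as an ASCII range. Those three choices are mine, not the slides'.

```raku
class FastqRec  { has Str $.id; has Str $.seq; has Str $.qual }
class FastqFile { has FastqRec @.sequences }

grammar Fastq {
    token TOP { <fastq-rec>+ }
    rule fastq-rec {
        '@'<seq-id>
        <seq>
        '+'$<seq-id>?
        <qual> <?{ $<seq>.chars == $<qual>.chars }>
    }
    token seq-id { \N* }                 # simplified: everything up to the newline
    token seq    { <[A..Z \- \*]>+ }
    token qual   { <[ \x21..\x7E ]>+ }   # printable ASCII, '!' through '~'
}

class FastqActions {
    method fastq-rec($/) {
        make FastqRec.new(
            id   => ~$<seq-id>,
            seq  => ~$<seq>,
            qual => ~$<qual>,
        );
    }
    method TOP($/) {
        make FastqFile.new(sequences => $<fastq-rec>.map(*.made).Array);
    }
}

# The sample records shown on the slides.
my $data = q:to/END/;
@first sequence
ATGCTTGACATGGA
+
Gje%(wk@~lefb.
@Second sequence
GGTAAAAT
+Second sequence
K#38rg@R
END

my FastqFile $fq = Fastq.parse($data, actions => FastqActions).made;
say "File had ", $fq.sequences.elems, " sequences";
say .id, ": ", .seq.chars, " bases" for $fq.sequences;
```

The named arguments passed to `FastqRec.new` have to match the attribute names (`id`, `seq`, `qual`); the default constructor does not complain about names it does not recognise.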
Translating SQL
● Comments
REM vs --
PROMPT vs echo
Translating SQL
● Comments
● Data types
VARCHAR2 vs VARCHAR
NUMBER(2) vs SMALLINT
BLOB vs BYTEA
9999999999999999999999999999
vs 9223372036854775807
Translating SQL
● Comments
● Data types
● CREATE $thing options
CREATE TABLE foo ( … ) ORGANIZATION HEAP;
CREATE INDEX bar ON foo ( … ) COMPRESS 1;
CREATE CONSTRAINT … DISABLE;
CREATE SEQUENCE baz NOCYCLE NOCACHE;
vs NO CYCLE;
Translating SQL
● Comments
● Data types
● CREATE $thing options
● Functions
decode(x, y, z) vs ( CASE x WHEN y THEN z END )
systimestamp() vs localtimestamp()
SQL grammar
grammar TranslateOracleDDL::Grammar {
rule TOP { <input-line>+ }
proto rule input-line { * }
rule input-line:sym<sqlplus-directive> { <sqlplus-directive> }
rule input-line:sym<sql-statement> { <sql-statement> ';' }
}
sqlplus comments
grammar TranslateOracleDDL::Grammar {
rule input-line:sym<sqlplus-directive> { <sqlplus-directive> }
token string-to-end-of-line { \V+ }
rule sqlplus-directive:sym<REM> {
['REM'<?before \v>]
| ['REM'<string-to-end-of-line>]
}
rule sqlplus-directive:sym<PROMPT> {
['PROMPT'<?before \v>]
| ['PROMPT'<string-to-end-of-line>]
}
}
Translation Actions
grammar TranslateOracleDDL::Grammar {
rule sqlplus-directive:sym<REM> {
['REM'<?before \v>]
| ['REM'<string-to-end-of-line>]
}
}
class TranslateOracleDDL::ToPostgres {
method sqlplus-directive:sym<REM>($/) {
make '--' ~ ($<string-to-end-of-line> || '' )
}
}
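A cut-down version of the directive handling can be run on its own. This sketch uses tokens instead of rules, reads "PROMPT vs echo" as psql's `\echo` meta-command, and uses the shortened names `Directives` and `ToPostgres`. All three are assumptions made for illustration.

```raku
grammar Directives {
    token TOP { [ <sqlplus-directive> \n ]+ }

    proto token sqlplus-directive {*}
    token sqlplus-directive:sym<REM>    { 'REM'    <string-to-end-of-line>? }
    token sqlplus-directive:sym<PROMPT> { 'PROMPT' <string-to-end-of-line>? }

    token string-to-end-of-line { \V+ }   # anything except vertical whitespace
}

class ToPostgres {
    method TOP($/) {
        make $<sqlplus-directive>.map(*.made).join("\n");
    }
    method sqlplus-directive:sym<REM>($/) {
        # A bare REM has no text, so fall back to the empty string.
        make '--' ~ ($<string-to-end-of-line> // '');
    }
    method sqlplus-directive:sym<PROMPT>($/) {
        make '\\echo' ~ ($<string-to-end-of-line> // '');
    }
}

my $ddl = "REM This DDL was reverse engineered\nPROMPT Creating tables\nREM\n";
say Directives.parse($ddl, actions => ToPostgres).made;
```

Each action method carries the same name, including the `:sym<...>` part, as the grammar rule it handles; a method with any other name is never called.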
SQL statements
grammar TranslateOracleDDL::Grammar {
rule input-line:sym<sql-statement> { <sql-statement> ';' }
proto rule sql-statement { * }
rule sql-statement:sym<CREATE-SEQUENCE> {
'CREATE' 'SEQUENCE' <entity-name> <create-seq-clause>*
}
proto rule create-seq-clause { * }
rule create-seq-clause:sym<START> { 'START' 'WITH' <bigint> }
rule create-seq-clause:sym<INC-BY> { 'INCREMENT' 'BY' <bigint> }
rule create-seq-clause:sym<CACHE> { 'CACHE' <bigint> }
rule create-seq-clause:sym<NOCACHE> { 'NOCACHE' }
}
Actions
grammar TranslateOracleDDL::Grammar {
rule sql-statement:sym<CREATE-SEQUENCE> {
'CREATE' 'SEQUENCE' <entity-name> <create-seq-clause>*
}
}
class TranslateOracleDDL::ToPostgres {
method sql-statement:sym<CREATE-SEQUENCE>($/) {
if @<create-seq-clause> {
make 'CREATE SEQUENCE '
~ $<entity-name>.made
~ ' ' ~ @<create-seq-clause>>>.made.join(' ');
}
}
}
Actions
grammar TranslateOracleDDL::Grammar {
rule create-seq-clause:sym<START> { 'START' 'WITH' <bigint> }
}
class TranslateOracleDDL::ToPostgres {
method bigint($/) {
make($/ > 9223372036854775807
?? 9223372036854775807
!! ~$/);
}
method create-seq-clause:sym<START>($/) {
make 'START WITH ' ~ $<bigint>.made;
}
}
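The way `made` values travel up from `bigint` through the clause to the whole statement can be tried in isolation. The grammar below is a reduced sketch: the `entity-name` and `bigint` tokens, the `SequenceDDL` and `ToPostgres` names and the `MAX-BIGINT` constant are stand-ins of my own, and only the START and INCREMENT clauses are handled.

```raku
grammar SequenceDDL {
    rule TOP { <sql-statement> ';' }
    rule sql-statement {
        'CREATE' 'SEQUENCE' <entity-name> <create-seq-clause>*
    }
    proto rule create-seq-clause {*}
    rule create-seq-clause:sym<START>  { 'START' 'WITH' <bigint> }
    rule create-seq-clause:sym<INC-BY> { 'INCREMENT' 'BY' <bigint> }
    token bigint      { \d+ }
    token entity-name { <[\w.]>+ }       # stand-in for the real entity-name rule
}

class ToPostgres {
    # Upper limit quoted on the data-types slide.
    constant MAX-BIGINT = 9223372036854775807;

    method bigint($/) {
        my $n = +$/;                      # numeric value of the matched digits
        make $n > MAX-BIGINT ?? MAX-BIGINT !! $n;
    }
    method create-seq-clause:sym<START>($/)  { make 'START WITH '   ~ $<bigint>.made }
    method create-seq-clause:sym<INC-BY>($/) { make 'INCREMENT BY ' ~ $<bigint>.made }
    method sql-statement($/) {
        my @parts = 'CREATE SEQUENCE', ~$<entity-name>;
        @parts.append: $<create-seq-clause>.map(*.made);
        make @parts.join(' ');
    }
    method TOP($/) { make $<sql-statement>.made ~ ';' }
}

my $oracle = 'CREATE SEQUENCE baz START WITH 99999999999999999999 INCREMENT BY 1;';
say SequenceDDL.parse($oracle, actions => ToPostgres).made;
```

Action methods fire as soon as their rule matches, so each `made` value is already in place by the time the enclosing rule's method reads it.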
SQL statements
grammar TranslateOracleDDL::Grammar {
rule input-line:sym<sql-statement> { <sql-statement> ';' }
proto rule sql-statement { * }
rule sql-statement:sym<CREATE-SEQUENCE> { … }
rule sql-statement:sym<COMMENT-ON> {
'COMMENT' 'ON' ['TABLE' | 'COLUMN'] <entity-name> IS <value>
}
}
SQL statements
grammar TranslateOracleDDL::Grammar {
rule input-line:sym<sql-statement> { <sql-statement> ';' }
proto rule sql-statement { * }
rule sql-statement:sym<CREATE-SEQUENCE> { … }
rule sql-statement:sym<COMMENT-ON> { … }
rule sql-statement:sym<CREATE-TABLE> {
'CREATE' 'TABLE' <entity-name>
'(' <create-table-column-def>+? % ','
[ ',' <table-constraint-def> ]*
')'
<create-table-extra-oracle-stuff>*
}
SQL statements
grammar TranslateOracleDDL::Grammar {
rule input-line:sym<sql-statement> { <sql-statement> ';' }
proto rule sql-statement { * }
rule sql-statement:sym<CREATE-SEQUENCE> { … }
rule sql-statement:sym<COMMENT-ON> { … }
rule sql-statement:sym<CREATE-TABLE> { … }
rule sql-statement:sym<CREATE-INDEX> {
'CREATE' [ $<unique>=('UNIQUE') ]?
'INDEX' <index-name=entity-name>
'ON' <table-name=entity-name>
'(' ~ ')' <columns=expr>+ % ','
<index-option>*
SQL statements
grammar TranslateOracleDDL::Grammar {
rule input-line:sym<sql-statement> { <sql-statement> ';' }
proto rule sql-statement { * }
rule sql-statement:sym<CREATE-SEQUENCE> { … }
rule sql-statement:sym<COMMENT-ON> { … }
rule sql-statement:sym<CREATE-TABLE> { … }
rule sql-statement:sym<CREATE-INDEX> { … }
rule sql-statement:sym<CREATE-VIEW> {
'CREATE' [ 'OR' 'REPLACE' ]? 'VIEW' <view-name=entity-name>
'(' ~ ')' <columns=expr> + % ','
'AS' <select-statement>
}
Table, Column, Sequence, etc. names
grammar TranslateOracleDDL::Grammar {
rule entity-name { <identifier>** 1..3 % '.' }
proto token identifier { * }
token identifier:sym<bareword> { <[$\w]>+ }
token identifier:sym<qq> {
'"'
[ <-["]>+:
| '""'
]*
'"'
}
}
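The `** 1..3 % '.'` quantifier can be tried on a few names in isolation. This is a small sketch that uses tokens throughout; the `Names` grammar, the `identifiers` helper and the sample names other than the first are illustrative only.

```raku
grammar Names {
    token TOP         { <entity-name> }
    token entity-name { <identifier> ** 1..3 % '.' }   # one to three dot-separated parts

    proto token identifier {*}
    token identifier:sym<bareword> { <[\w \$]>+ }
    token identifier:sym<qq>       { '"' [ <-[\"]>+ | '""' ]* '"' }
}

# Hypothetical helper: the identifiers in a name, or an empty list on failure.
sub identifiers(Str $name) {
    my $m = Names.parse($name);
    return $m ?? $m<entity-name><identifier>.map(*.Str).List !! ();
}

for 'SCHEMA_USER.plate_locations', '"My Table".id', 'a.b.c.d' -> $name {
    say $name, ' => ', identifiers($name).join(' | ') || 'no match';
}
```

A name with four parts is rejected because `entity-name` stops after three identifiers and `.parse` insists on reaching the end of the input.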
SQL expressions
CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo
(
project_id
, batch_number
, organism_sample_id
, count_illumina
, seq_id
, date_scheduled
, y_snp_count
, y_snp_total
, gender
, total_snps
, total_calls
, status
, setup_wo_id
)
AS
select distinct
samples.project_id,
samples.batch_number,
samples.organism_sample_id,
count_illumina,
eg.seq_id,
import_pses.date_scheduled,
eg.y_snp_count,
eg.y_snp_total,
samples.gender,
eg.total_snps,
eg.total_calls,
eg.status,
setup_wo_id
from (
select
project_id,
batch_number,
organism_sample_id,
count(illumina_seq_id) as count_illumina,
seq_id,
gender,
setup_wo_id
from (
select distinct
wo.project_id,
batch.batch_number as batch_number,
os.organism_sample_id,
ig.seq_id as illumina_seq_id,
case when eg.seq_id is not null then
eg.seq_id
else
case when eg2.seq_id is not null then
eg2.seq_id
else
eg3.seq_id
end
end as seq_id,
oi.gender as gender,
wo.setup_wo_id as setup_wo_id
from setup_work_order wo
join work_order_item woi on wo.setup_wo_id = woi.setup_wo_id
join organism_sample@dw os on woi.dna_id = os.organism_sample_id
join organism_individual@dw oi on os.source_id = oi.organism_id
left outer join (
select
sa.organism_sample_id,
sa.attribute_value as batch_number
from sample_attribute@dw sa
where sa.attribute_label = 'batch_number'
) batch on os.organism_sample_id = batch.organism_sample_id
left outer join external_genotyping@dw eg on os.organism_sample_id = eg.organism_sample_id
left outer join external_genotyping@dw eg2 on os.default_genotype_seq_id = eg2.seq_id
left outer join organism_sample@dw os2 on os2.source_id = os.source_id
and os2.common_name = os.common_name
and os2.sample_type = os.sample_type
left outer join external_genotyping@dw eg3 on
eg3.organism_sample_id = os2.organism_sample_id
left outer join illumina_genotyping@dw ig on
os.organism_sample_id = ig.organism_sample_id
where wo.pipeline in (
'Genotyping',
'Genotyping and Agilent Exome Sequencing',
'Genotyping and Illumina Whole Genome Sequencing',
'Genotyping and Nimblegen Exome Sequencing',
'Genotyping and Nimblegen Custom Capture Illumina',
'Genotyping and Nimblegen Liquid Phase Targeted
Sequencing',
'Genotyping and Nimblegen Solid Phase Targeted Sequencing',
'Resource Storage'
)
)
group by
project_id,
batch_number,
organism_sample_id,
seq_id,
gender,
setup_wo_id
) samples
left outer join external_genotyping@dw eg on samples.seq_id =
eg.seq_id
left outer join (
select
param.param_value,
param.pse_id,
to_char(pse.date_scheduled, 'MM/DD/YYYY') as
date_scheduled
from process_step_executions pse
join pse_param param on pse.pse_id = param.pse_id
where
pse.ps_ps_id = 7375
and param.param_name = 'organism_sample_id'
) import_pses on import_pses.param_value =
eg.organism_sample_id
left outer join tpp_pse on tpp_pse.pse_id = import_pses.pse_id;
SQL expressions
grammar TranslateOracleDDL::Grammar {
token and-or { :i 'and' | 'or' }
proto rule expr { * }
rule expr:sym<nested> { [ '(' <expr> ')' ]+ % <and-or> }
rule expr:sym<composite> { <expr> <and-or> <expr> }
rule expr:sym<atom> { <entity-name> | <value> }
rule expr:sym<operator> { <left=identifier-or-value>
<expr-op>
<right=expr> }
rule expr:sym<IN> { 'IN' '(' ~ ')' <value> + % ',' }
rule expr:sym<substr-f> { 'substr(' <expr>**2..3 % ',' ')' }
Left-recursive grammar
grammar TranslateOracleDDL::Grammar {
token and-or { :i 'and' | 'or' }
proto rule expr { * }
rule expr:sym<nested> { [ '(' <expr> ')' ]+ % <and-or> }
rule expr:sym<composite> { <expr> <and-or> <expr> }
rule expr:sym<atom> { <entity-name> | <value> }
rule expr:sym<operator> { <left=ident-or-value>
<expr-op>
<right=expr> }
rule expr:sym<IN> { 'IN' '(' ~ ')' <value> + % ',' }
rule expr:sym<substr-f> { 'substr(' <expr>**2..3 % ',' ')' }
Remove left recursion
grammar TranslateOracleDDL::Grammar {
token and-or { :i 'and' | 'or' }
proto rule expr { * }
rule expr:sym<nested> { [ '(' <expr> ')' ]+ % <and-or> }
rule expr:sym<composite> { <expr-part> <and-or> <expr> }
rule expr:sym<part> { <expr-part> }
rule expr:sym<atom> { <entity-name> | <value> }
rule expr-part:sym<operator> { <left=ident-or-value>
<expr-op>
<right=expr> }
rule expr-part:sym<IN> { 'IN' '(' ~ ')' <value> + % ',' }
rule expr-part:sym<substr-f> { 'substr(' <expr>**2..3 % ',' ')' }
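The same rewrite can be seen on a much smaller grammar. In this sketch an expression is a chain of `name = value` comparisons joined by `and`/`or`; the `Expr` grammar, its simplified `expr-part` and the `parts-of` helper are assumptions for illustration. It folds the `composite` and `part` alternatives into one rule with an optional tail, which is a variant of the slide's approach rather than a copy of it.

```raku
grammar Expr {
    rule  TOP       { <expr> }
    # Right-recursive: consume one part first, then optionally recurse.
    rule  expr      { <expr-part> [ <and-or> <expr> ]? }
    token and-or    { :i 'and' | 'or' }
    token expr-part { \w+ \s* '=' \s* \w+ }
}

# Hypothetical helper: walk the nested <expr> matches and collect the parts.
sub parts-of($expr) {
    my @parts = ~$expr<expr-part>;
    @parts.append: parts-of($expr<expr>) if $expr<expr>;
    return @parts;
}

my $m = Expr.parse('a = 1 and b = 2 OR c = 3');
say parts-of($m<expr>).join(' | ');
```

Every recursive call to `expr` happens only after `expr-part` has consumed some input, so the recursion cannot loop at the same position.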
Broken SQL
CREATE UNIQUE INDEX SCHEMA_USER.te_spname_lcname ON
SCHEMA_USER.tp_entry
(
species_name
, tp_id
,;
rule sql-statement:sym<broken-CREATE-INDEX> {
'CREATE' $<unique>=('UNIQUE') 'INDEX' <index-name=entity-name>
'ON' <table-name=entity-name>
'(' <expr>* %% ','
}
Broken SQL
ALTER TABLE SCHEMA_USER. ADD CONSTRAINT
bin$hwvvsoqpce/gu4ocaaqrog==$0
CHECK ()
(
)
INITIALLY
DISABLE;
rule sql-statement:sym<broken-ALTER-TABLE-ADD-CONSTRAINT> {
'ALTER' 'TABLE'
\S+
'ADD' 'CONSTRAINT'
\S+
'CHECK' '(' ')' '(' ')'
\w+ 'DISABLE'
}
Hard to parse SQL
CREATE OR REPLACE VIEW SCHEMA_USER.plate_locations
( sec_sec_id, well_name, pt_pt_id, pl_id )
AS SELECT
SCHEMA_USER.sectors.SEC_ID AS "SEC_SEC_ID",
SCHEMA_USER.dna_location.LOCATION_NAME AS "WELL_NAME",
SCHEMA_USER.plate_types.PT_ID AS "PT_PT_ID",
SCHEMA_USER.dna_location.dl_id AS "PL_ID"
FROM
SCHEMA_USER.sectors, SCHEMA_USER.dna_location,
SCHEMA_USER.plate_types
WHERE
SCHEMA_USER.sectors.SEC_ID
= SCHEMA_USER.dna_location.sec_id
and
SCHEMA_USER.dna_location.LOCATION_TYPE
= to_char(SCHEMA_USER.plate_types.WELL_COUNT)||' well plate'
Hard to parse SQL
rule sql-statement:sym<special-CREATE-VIEW> {
'CREATE' 'OR' 'REPLACE' 'VIEW'
'SCHEMA_USER.plate_locations'
'(' <-[;]>+
}
method sql-statement:sym<special-CREATE-VIEW>($/) {
make ~$/;
}
Translate
use TranslateOracleDDL::Grammar;
use TranslateOracleDDL::ToPostgres;
sub MAIN(Str $filename where *.IO.e) {
my $parsed = TranslateOracleDDL::Grammar.parsefile(
$filename,
actions => TranslateOracleDDL::ToPostgres,
);
say $parsed.made;
}
Debugging
use Grammar::Tracer;
grammar TranslateOracleDDL::Grammar { … }
TranslateOracleDDL::Grammar.parsefile($filename);
TOP
| input-line
| | input-line:sym<sqlplus-directive>
| | | sqlplus-directive
| | | | sqlplus-directive:sym<REM>
| | | | | string-to-end-of-line
| | | | | * MATCH " This DDL was reverse engineered by"
| | | | * MATCH "REM This DDL was reverse engineered by\n"
| | | * MATCH "REM This DDL was reverse engineered by\n"
| | * MATCH "REM This DDL was reverse engineered by\n"
| * MATCH "REM This DDL was reverse engineered by\n"
Debugging
rule select-statement { :ignorecase
'SELECT'
[ $<distinct>=('DISTINCT') ]?
<columns=select-column>+ % ','
'FROM'
<select-from-clause> + % ','
<join-clause>*
<where-clause>?
<group-by-clause>?
}
Debugging
rule select-statement { :ignorecase
'SELECT' { say "saw SELECT" }
[ $<distinct>=('DISTINCT') { say "saw DISTINCT" }]?
<columns=select-column>+ % ',' { say @<columns>.elems, " columns" }
'FROM'
<select-from-clause> + % ',' { say @<select-from-clause>.elems, " from" }
<join-clause>* { say @<join-clause>.elems, " joins" }
<where-clause>?
<group-by-clause>?
}
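The embedded-code-block technique also works on a grammar small enough to run by itself. In this sketch the blocks push onto an array instead of calling `say`, so the order of events can be inspected afterwards; the `MiniSelect` grammar, its tokens and the `@trace` array are hypothetical and far simpler than the real `select-statement` rule.

```raku
my @trace;   # filled in by the code blocks as the parse progresses

grammar MiniSelect {
    rule TOP {
        'SELECT'          { @trace.push: 'saw SELECT' }
        <column-list>
        'FROM' <table>    { @trace.push: 'saw table ' ~ $<table> }
    }
    token column-list {
        <column>+ % [ \s* ',' \s* ]
        { @trace.push: $<column>.elems ~ ' columns' }
    }
    token column { \w+ }
    token table  { \w+ }
}

MiniSelect.parse('SELECT a, b, c FROM t');
.say for @trace;
```

A block only runs once the parser has reached it, so the last entry recorded shows how far a failing parse got before it gave up.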
Grammars to make other grammars
https://perl6advent.wordpress.com/2015/12/08/day-8-grammars-generating-grammars/
grammar Grammar::BNF {
token TOP { \s* <rule>+ \s* }
token rule { … }
token expression { … }
token term { … }
…
}
class GrammarGenerator {
has $.name = 'BNFGrammar';
method TOP($/) {
my $grmr := Metamodel::GrammarHOW.new_type(:$.name);
$grmr.^add_method('TOP', EVAL 'token …');
for $<rule>.map(*.ast) -> $rule {
$grmr.^add_method($rule.key, $rule.value);
}
$grmr.^compose;
make $grmr;
}
...

Parsing with Perl6 Grammars

  • 1. Parsing with Perl 6 Grammars
  • 2. Perl6 is available today ● brew install rakudo-star ● git clone https://github.com/tadzik/rakudobrew ~/.rakudobrew ● docker pull rakudo-star
  • 3. Everything is an object my $obj = Thingy.new(); $obj.method();
  • 4. Everything is an object my $obj = Thingy.new(); $obj.method(); my @array = <one two three>; # like qw(...) @array.elems.say; # 3
  • 5. Everything is an object my $obj = Thingy.new(); $obj.method(); my @array = <one two three>; # like qw(...) @array.elems.say; # 3 4.say; # 4 “this is a string”.words.perl.say; # ("this", "is", "a", "string").Seq
  • 6. Several namespace-y collections ● package { … } ● module { … } ● enum <zero one two three> ● class { … } ● subset PositiveInt of Int where * > 0 ● role { … } ● grammar { … }
  • 7. Grammars are collections of named regexes grammar Answer { regex TOP { <yes> | <no> | <maybe> } }
  • 8. Grammars are collections of named regexes grammar Answer { regex TOP { <yes> | <no> | <maybe> } regex yes { :ignorecase 'y' | 'ye' | 'yes' } regex no { :ignorecase 'n' | 'no' } regex maybe { :ignorecase 'm' | 'ma' | may' | 'mayb' | 'maybe' } }
  • 9. Grammars are collections of named regexes grammar Answer { regex TOP { <yes> | <no> | <maybe> } regex yes { :i 'y' | 'ye' | 'yes' } regex no { :i 'n' | 'no' } regex maybe { :i 'm' | 'ma' | may' | 'mayb' | 'maybe' } }
  • 10. Grammars are collections of named regexes grammar Answer { regex TOP { <yes> | <no> | <maybe> } regex yes { :i 'y' | 'ye' | 'yes' } regex no { :i 'n' | 'no' } regex maybe { :i <{ 'maybe'.comb.produce(&[~]) }> } }
  • 11. grammar Answer { regex TOP { <yes> | <no> | <maybe> } regex yes { … } regex no { … } regex maybe { … } } my $response = prompt “What is your answer? “; my $match = Answer.parse($response); $match.say;
  • 12. grammar Answer { … } my $response = prompt “What is your answer? “; my $match = Answer.parse($response); $match.say; if !$match { say “I didn't understand”; } elsif $match{'yes'} { say “I'll proceed...”; } elsif $match{'no'} { say “Cancelled”; } elsif $match{'maybe'} { say “Come back later”; }
  • 13. grammar Answer { … } my $response = prompt “What is your answer? “; my $match = Answer.parse($response); $match.say; if !$match { say “I didn't understand”; } elsif $match<yes> { say “I'll proceed...”; } elsif $match<no> { say “Cancelled”; } elsif $match<maybe> { say “Come back later”; }
  • 14. grammar Answer { … } my $response = prompt “What is your answer? “; my $match = Answer.parse($response); $match.say; given $match { when $<yes> { say “I'll proceed” }; when $<no> { say “Cancelled” }; when $<maybe> { say “Come back later” }; default { say “I didn't understand” }; }
  • 15. > get-answer What is your answer? y 「 y 」 yes => 「 y 」 I'll proceed
  • 16. > get-answer What is your answer? y 「 y 」 yes => 「 y 」 I'll proceed > get-answer What is your answer? MaY 「 MaY 」 maybe => 「 MaY 」 Come back when you have decided
  • 17. > get-answer What is your answer? y 「 y 」 yes => 「 y 」 I'll proceed > get-answer What is your answer? MaY 「 MaY 」 maybe => 「 MaY 」 Come back when you have decided > get-answer What is your answer? nope
  • 18. > get-answer What is your answer? y 「 y 」 yes => 「 y 」 I'll proceed > get-answer What is your answer? MaY 「 MaY 」 maybe => 「 MaY 」 Come back when you have decided > get-answer What is your answer? nope (Any) I didn't understand...
  • 19. Tokens and Rules grammar Fastq { token TOP { <fastq-rec>+ } rule fastq-rec { '@'<seq-id> <seq> '+'<seq-id>? <qual> } @first sequence ATGCTTGACATGGA + Gje%(wk@~lefb. @Second sequence GGTAAAAT +Second sequence K#38rg@R
  • 20. Tokens and Rules grammar Fastq { token TOP { <fastq-rec>+ } rule fastq-rec { '@'<seq-id> <seq> '+'$<seq-id>? <qual> } @first sequence ATGCTTGACATGGA + Gje%(wk@~lefb. @Second sequence GGTAAAAT +Second sequence K#38rg@R
  • 21. Tokens and Rules grammar Fastq { token TOP { <fastq-rec>+ } rule fastq-rec { '@'<seq-id> <seq> '+'$<seq-id>? <qual> <?{ $<seq>.chars == $<qual>.chars }> }
  • 22. Tokens and Rules grammar Fastq { … token seq-id { .*? $$ } token seq { <base>+ } token base { <[A..Z] + [ - ] + [ * ]> } token qual { <qual-letter>+ } token qual-letter { <[0..9 A..Z a..z ! " # $ % & ' ( ) * + , - . / : ; < = > ?@ [ ] ^ _ ` { | } ~]> }
  • 23. Model class FastqRec { has Str $.id; has Str $.seq; has Str $.qual; } class FastqFile { has FastqRec @.sequences; }
  • 24. Actions class FastqActions { method fastq-rec($/) { make FastqRec.new( seq-id => ~$<seq-id>, seq => ~$<seq>, qual => ~$<qual>, ); } } rule fastq-rec { '@'<seq-id> <seq> '+'$<seq-id>? <qual> }
  • 25. Actions class FastqActions { method fastq-rec($/) { … } method TOP($/) { make FastqFile.new( sequences => @<fastq-rec>>>.made; ); } } rule fastq-rec { … } rule TOP { <fastq-rec>+ }
  • 26. Actions grammar Fastq { … } class FastqRec { … } class FastqFile { … } class FastqActions { method fastq-rec { … } method TOP { … } } my FastqFile $fq = Fastq.parsefile( 'file.fq', actions => FastqActions).made; say “File had “, $fq.sequences.elems, “ sequences”;
  • 27. Translating SQL ● Comments REM vs -- PROMPT vs echo
  • 28. Translating SQL ● Comments ● Data types VARCHAR2 vs VARCHAR NUMBER(2) vs SMALLINT BLOB vs BYTEA 9999999999999999999999999999 vs 9223372036854775807
  • 29. Translating SQL ● Comments ● Data types ● CREATE $thing options CREATE TABLE foo ( … ) ORGANIZATION HEAP; CREATE INDEX bar ON foo ( … ) COMPRESS 1; CREATE CONSTRAINT … DISABLE; CREATE SEQUENCE baz NOCYCLE NOCACHE; vs NO CYCLE;
  • 30. Translating SQL ● Comments ● Data types ● CREATE $thing options ● Functions decode(x, y, z) vs ( CASE x WHEN y ELSE z END ) systimestamp() vs localtimestamp()
  • 31. SQL grammar grammar TranslateOracleDDL::Grammar { rule TOP { <input-line>+ } proto rule input-line { * } rule input-line:sym<sqlplus-directive> { <sqlplus-directive> } rule input-line:sym<sql-statement> { <sql-statement> ';' } }
  • 32. sqlplus comments grammar TranslateOracleDDL::Grammar { rule input-line:sym<sqlplus-directive> { <sqlplus-directive> } token string-to-end-of-line { V+ } rule sqlplus-directive:sym<REM> { ['REM'<?before v>] | ['REM'<string-to-end-of-line> } rule sqlplus-directive:sym<PROMPT> { ['PROMPT'<?before v>] | ['PROMPT'<string-to-end-of-line>] } }
  • 33. Translation Actions grammar TranslateOracleDDL::Grammar { rule sqlplus-directive:sym<REM> { ['REM'<?before v>] | ['REM'<string-to-end-of-line> } } class TranslateOracleDDL::ToPostgres { method sql-statement:sym<REM>($/) { make '--' ~ ($<string-to-end-of-line> || '' ) } }
  • 34. SQL statements grammar TranslateOracleDDL::Grammar { rule input-line:sym<sql-statement> { <sql-statement> ';' } proto rule sql-statement { * } rule sql-statement:sym<CREATE-SEQUENCE> { 'CREATE' 'SEQUENCE' <entity-name> <create-seq-clause>* } proto rule create-seq-clause { * } rule create-seq-clause:sym<START> { 'START' 'WITH' <bigint> } rule create-seq-clause:sym<INC-BY> { 'INCREMENT' 'BY' <bigint> } rule create-seq-clause:sym<CACHE> { 'CACHE' <bigint> } rule create-seq-clause:sym<NOCACHE> { 'NOCACHE' } }
  • 35. Actions grammar TranslateOracleDDL::Grammar { rule sql-statement:sym<CREATE-SEQUENCE> { 'CREATE' 'SEQUENCE' <entity-name> <create-seq-clause>* } } Class TranslateOracleDDL::ToPostgres { method sql-statement:sym<CREATE-SEQUENCE>($/) { if @<create-sequence-clause> { make 'CREATE SEQUENCE' ~ $<entity-name>.made ~ ' ' ~ @<create-sequence-clause>>>.made.join(' '); }
  • 36. Actions grammar TranslateOracleDDL::Grammar { rule create-seq-clause:sym<START> { 'START' 'WITH' <bigint> } } Class TranslateOracleDDL::ToPostgres { method bigint($/) { make($/ > 9223372036854775807 ?? 9223372036854775807 || ~$/); } method create-seq-clause:sym<START>($/) { make 'START WITH' ~ $<bigint>.made; }
  • 37. SQL statements grammar TranslateOracleDDL::Grammar { rule input-line:sym<sql-statement> { <sql-statement> ';' } proto rule sql-statement { * } rule sql-statement:sym<CREATE-SEQUENCE> { … } rule sql-statement:sym<COMMENT-ON> { 'COMMENT' 'ON' ['TABLE' | 'COLUMN'] <entity-name> IS <value> } }
  • 38. SQL statements grammar TranslateOracleDDL::Grammar { rule input-line:sym<sql-statement> { <sql-statement> ';' } proto rule sql-statement { * } rule sql-statement:sym<CREATE-SEQUENCE> { … } rule sql-statement:sym<COMMENT-ON> { … } rule sql-statement:sym<CREATE-TABLE> { 'CREATE' 'TABLE' <entity-name> '(' <create-table-column-def>+? % ',' [ ',' <table-constraint-def> ]* ')' <create-table-extra-oracle-stuff>* }
  • 39. SQL statements grammar TranslateOracleDDL::Grammar { rule input-line:sym<sql-statement> { <sql-statement> ';' } proto rule sql-statement { * } rule sql-statement:sym<CREATE-SEQUENCE> { … } rule sql-statement:sym<COMMENT-ON> { … } rule sql-statement:sym<CREATE-TABLE> { … } rule sql-statement:sym<CREATE-INDEX> { 'CREATE' [ $<unique>=('UNIQUE') ]? 'INDEX' <index-name=entity-name> 'ON' <table-name=entity-name> '(' ~ ')' <columns=expr>+ % ',' <index-option>*
  • 40. SQL statements grammar TranslateOracleDDL::Grammar { rule input-line:sym<sql-statement> { <sql-statement> ';' } proto rule sql-statement { * } rule sql-statement:sym<CREATE-SEQUENCE> { … } rule sql-statement:sym<COMMENT-ON> { … } rule sql-statement:sym<CREATE-TABLE> { … } rule sql-statement:sym<CREATE-INDEX> { … } rule sql-statement:sym<CREATE-VIEW> { 'CREATE' [ 'OR' 'REPLACE' ]? 'VIEW' <view-name=entity-name> '(' ~ ')' <columns=expr> + % ',' 'AS' <select-statement> }
  • 41. Table, Column, Sequence, etc. names grammar TranslateOracleDDL::Grammar { rule entity-name { <identifier>** 1..3 % '.' } proto token identifier { * } token identifier:sym<bareword> { <[$w]>+ } token identifier:sym<qq> { '”' [ <-[“]>+: | '””' ]* '”' }
  • 42. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id )
  • 43. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id ) AS select distinct samples.project_id, samples.batch_number, samples.organism_sample_id, count_illumina, eg.seq_id, import_pses.date_scheduled, eg.y_snp_count, eg.y_snp_total, samples.gender, eg.total_snps, eg.total_calls, eg.status, setup_wo_id
  • 44. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id ) AS select distinct samples.project_id, samples.batch_number, samples.organism_sample_id, count_illumina, eg.seq_id, import_pses.date_scheduled, eg.y_snp_count, eg.y_snp_total, samples.gender, eg.total_snps, eg.total_calls, eg.status, setup_wo_id from ( select project_id, batch_number, organism_sample_id, count(illumina_seq_id) as count_illumina, seq_id, gender, setup_wo_id from (
  • 45. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id ) AS select distinct samples.project_id, samples.batch_number, samples.organism_sample_id, count_illumina, eg.seq_id, import_pses.date_scheduled, eg.y_snp_count, eg.y_snp_total, samples.gender, eg.total_snps, eg.total_calls, eg.status, setup_wo_id from ( select project_id, batch_number, organism_sample_id, count(illumina_seq_id) as count_illumina, seq_id, gender, setup_wo_id from ( select distinct wo.project_id, batch.batch_number as batch_number, os.organism_sample_id, ig.seq_id as illumina_seq_id, case when eg.seq_id is not null then eg.seq_id else case when eg2.seq_id is not null then eg2.seq_id else eg3.seq_id end end as seq_id, oi.gender as gender, wo.setup_wo_id as setup_wo_id
  • 46. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id ) AS select distinct samples.project_id, samples.batch_number, samples.organism_sample_id, count_illumina, eg.seq_id, import_pses.date_scheduled, eg.y_snp_count, eg.y_snp_total, samples.gender, eg.total_snps, eg.total_calls, eg.status, setup_wo_id from ( select project_id, batch_number, organism_sample_id, count(illumina_seq_id) as count_illumina, seq_id, gender, setup_wo_id from ( select distinct wo.project_id, batch.batch_number as batch_number, os.organism_sample_id, ig.seq_id as illumina_seq_id, case when eg.seq_id is not null then eg.seq_id else case when eg2.seq_id is not null then eg2.seq_id else eg3.seq_id end end as seq_id, oi.gender as gender, wo.setup_wo_id as setup_wo_id from setup_work_order wo join work_order_item woi on wo.setup_wo_id = woi.setup_wo_id join organism_sample@dw os on woi.dna_id = os.organism_sample_id join organism_individual@dw oi on os.source_id = oi.organism_id left outer join ( select sa.organism_sample_id, sa.attribute_value as batch_number from sample_attribute@dw sa where sa.attribute_label = 'batch_number' ) batch on os.organism_sample_id = batch.organism_sample_id
  • 47. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id ) AS select distinct samples.project_id, samples.batch_number, samples.organism_sample_id, count_illumina, eg.seq_id, import_pses.date_scheduled, eg.y_snp_count, eg.y_snp_total, samples.gender, eg.total_snps, eg.total_calls, eg.status, setup_wo_id from ( select project_id, batch_number, organism_sample_id, count(illumina_seq_id) as count_illumina, seq_id, gender, setup_wo_id from ( select distinct wo.project_id, batch.batch_number as batch_number, os.organism_sample_id, ig.seq_id as illumina_seq_id, case when eg.seq_id is not null then eg.seq_id else case when eg2.seq_id is not null then eg2.seq_id else eg3.seq_id end end as seq_id, oi.gender as gender, wo.setup_wo_id as setup_wo_id from setup_work_order wo join work_order_item woi on wo.setup_wo_id = woi.setup_wo_id join organism_sample@dw os on woi.dna_id = os.organism_sample_id join organism_individual@dw oi on os.source_id = oi.organism_id left outer join ( select sa.organism_sample_id, sa.attribute_value as batch_number from sample_attribute@dw sa where sa.attribute_label = 'batch_number' ) batch on os.organism_sample_id = batch.organism_sample_id left outer join external_genotyping@dw eg on os.organism_sample_id = eg.organism_sample_id left outer join external_genotyping@dw eg2 on os.default_genotype_seq_id = eg2.seq_id left outer join organism_sample@dw os2 on os2.source_id = os.source_id and os2.common_name = os.common_name and os2.sample_type = os.sample_type
  • 48. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id ) AS select distinct samples.project_id, samples.batch_number, samples.organism_sample_id, count_illumina, eg.seq_id, import_pses.date_scheduled, eg.y_snp_count, eg.y_snp_total, samples.gender, eg.total_snps, eg.total_calls, eg.status, setup_wo_id from ( select project_id, batch_number, organism_sample_id, count(illumina_seq_id) as count_illumina, seq_id, gender, setup_wo_id from ( select distinct wo.project_id, batch.batch_number as batch_number, os.organism_sample_id, ig.seq_id as illumina_seq_id, case when eg.seq_id is not null then eg.seq_id else case when eg2.seq_id is not null then eg2.seq_id else eg3.seq_id end end as seq_id, oi.gender as gender, wo.setup_wo_id as setup_wo_id from setup_work_order wo join work_order_item woi on wo.setup_wo_id = woi.setup_wo_id join organism_sample@dw os on woi.dna_id = os.organism_sample_id join organism_individual@dw oi on os.source_id = oi.organism_id left outer join ( select sa.organism_sample_id, sa.attribute_value as batch_number from sample_attribute@dw sa where sa.attribute_label = 'batch_number' ) batch on os.organism_sample_id = batch.organism_sample_id left outer join external_genotyping@dw eg on os.organism_sample_id = eg.organism_sample_id left outer join external_genotyping@dw eg2 on os.default_genotype_seq_id = eg2.seq_id left outer join organism_sample@dw os2 on os2.source_id = os.source_id and os2.common_name = os.common_name and os2.sample_type = os.sample_type left outer join external_genotyping@dw eg3 on eg3.organism_sample_id = os2.organism_sample_id left outer join illumina_genotyping@dw ig on os.organism_sample_id = ig.organism_sample_id where wo.pipeline in ( 'Genotyping', 'Genotyping and Agilent Exome Sequencing', 'Genotyping and Illumina Whole 
Genome Sequencing', 'Genotyping and Nimblegen Exome Sequencing', 'Genotyping and Nimblegen Custom Capture Illumina', 'Genotyping and Nimblegen Liquid Phase Targeted Sequencing', 'Genotyping and Nimblegen Solid Phase Targeted Sequencing',
  • 49. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id ) AS select distinct samples.project_id, samples.batch_number, samples.organism_sample_id, count_illumina, eg.seq_id, import_pses.date_scheduled, eg.y_snp_count, eg.y_snp_total, samples.gender, eg.total_snps, eg.total_calls, eg.status, setup_wo_id from ( select project_id, batch_number, organism_sample_id, count(illumina_seq_id) as count_illumina, seq_id, gender, setup_wo_id from ( select distinct wo.project_id, batch.batch_number as batch_number, os.organism_sample_id, ig.seq_id as illumina_seq_id, case when eg.seq_id is not null then eg.seq_id else case when eg2.seq_id is not null then eg2.seq_id else eg3.seq_id end end as seq_id, oi.gender as gender, wo.setup_wo_id as setup_wo_id from setup_work_order wo join work_order_item woi on wo.setup_wo_id = woi.setup_wo_id join organism_sample@dw os on woi.dna_id = os.organism_sample_id join organism_individual@dw oi on os.source_id = oi.organism_id left outer join ( select sa.organism_sample_id, sa.attribute_value as batch_number from sample_attribute@dw sa where sa.attribute_label = 'batch_number' ) batch on os.organism_sample_id = batch.organism_sample_id left outer join external_genotyping@dw eg on os.organism_sample_id = eg.organism_sample_id left outer join external_genotyping@dw eg2 on os.default_genotype_seq_id = eg2.seq_id left outer join organism_sample@dw os2 on os2.source_id = os.source_id and os2.common_name = os.common_name and os2.sample_type = os.sample_type left outer join external_genotyping@dw eg3 on eg3.organism_sample_id = os2.organism_sample_id left outer join illumina_genotyping@dw ig on os.organism_sample_id = ig.organism_sample_id where wo.pipeline in ( 'Genotyping', 'Genotyping and Agilent Exome Sequencing', 'Genotyping and Illumina Whole 
Genome Sequencing', 'Genotyping and Nimblegen Exome Sequencing', 'Genotyping and Nimblegen Custom Capture Illumina', 'Genotyping and Nimblegen Liquid Phase Targeted Sequencing', 'Genotyping and Nimblegen Solid Phase Targeted Sequencing', 'Resource Storage' ) ) group by project_id, batch_number, organism_sample_id, seq_id, gender, setup_wo_id ) samples left outer join external_genotyping@dw eg on
  • 50. SQL expressions CREATE OR REPLACE VIEW SCHEMA_USER.genotyping_externalinfo ( project_id , batch_number , organism_sample_id , count_illumina , seq_id , date_scheduled , y_snp_count , y_snp_total , gender , total_snps , total_calls , status , setup_wo_id ) AS select distinct samples.project_id, samples.batch_number, samples.organism_sample_id, count_illumina, eg.seq_id, import_pses.date_scheduled, eg.y_snp_count, eg.y_snp_total, samples.gender, eg.total_snps, eg.total_calls, eg.status, setup_wo_id from ( select project_id, batch_number, organism_sample_id, count(illumina_seq_id) as count_illumina, seq_id, gender, setup_wo_id from ( select distinct wo.project_id, batch.batch_number as batch_number, os.organism_sample_id, ig.seq_id as illumina_seq_id, case when eg.seq_id is not null then eg.seq_id else case when eg2.seq_id is not null then eg2.seq_id else eg3.seq_id end end as seq_id, oi.gender as gender, wo.setup_wo_id as setup_wo_id from setup_work_order wo join work_order_item woi on wo.setup_wo_id = woi.setup_wo_id join organism_sample@dw os on woi.dna_id = os.organism_sample_id join organism_individual@dw oi on os.source_id = oi.organism_id left outer join ( select sa.organism_sample_id, sa.attribute_value as batch_number from sample_attribute@dw sa where sa.attribute_label = 'batch_number' ) batch on os.organism_sample_id = batch.organism_sample_id left outer join external_genotyping@dw eg on os.organism_sample_id = eg.organism_sample_id left outer join external_genotyping@dw eg2 on os.default_genotype_seq_id = eg2.seq_id left outer join organism_sample@dw os2 on os2.source_id = os.source_id and os2.common_name = os.common_name and os2.sample_type = os.sample_type left outer join external_genotyping@dw eg3 on eg3.organism_sample_id = os2.organism_sample_id left outer join illumina_genotyping@dw ig on os.organism_sample_id = ig.organism_sample_id where wo.pipeline in ( 'Genotyping', 'Genotyping and Agilent Exome Sequencing', 'Genotyping and Illumina Whole 
Genome Sequencing', 'Genotyping and Nimblegen Exome Sequencing', 'Genotyping and Nimblegen Custom Capture Illumina', 'Genotyping and Nimblegen Liquid Phase Targeted Sequencing', 'Genotyping and Nimblegen Solid Phase Targeted Sequencing', 'Resource Storage' ) ) group by project_id, batch_number, organism_sample_id, seq_id, gender, setup_wo_id ) samples left outer join external_genotyping@dw eg on samples.seq_id = eg.seq_id left outer join ( select param.param_value, param.pse_id, to_char(pse.date_scheduled, 'MM/DD/YYYY') as date_scheduled from process_step_executions pse join pse_param param on pse.pse_id = param.pse_id where pse.ps_ps_id = 7375 and param.param_name = 'organism_sample_id' ) import_pses on import_pses.param_value = eg.organism_sample_id left outer join tpp_pse on tpp_pse.pse_id = import_pses.pse_id;
  • 52. SQL expressions grammar TranslateOracleDDL::Grammar { token and-or { :i 'and' | 'or' } proto rule expr { * } rule expr:sym<nested> { [ '(' <expr> ')' ]+ % <and-or> } rule expr:sym<composite> { <expr> <and-or> <expr> } rule expr:sym<atom> { <entity-name> | <value> } rule expr:sym<operator> { <left=identifier-or-value> <expr-op> <right=expr> } rule expr:sym<IN> { 'IN' '(' ~ ')' <value> + % ',' } rule expr:sym<substr-f> { 'substr(' <expr>**2..3 % ',' ')' }
  • 53. Left-recursive grammar grammar TranslateOracleDDL::Grammar { token and-or { :i 'and' | 'or' } proto rule expr { * } rule expr:sym<nested> { [ '(' <expr> ')' ]+ % <and-or> } rule expr:sym<composite> { <expr> <and-or> <expr> } rule expr:sym<atom> { <entity-name> | <value> } rule expr:sym<operator> { <left=ident-or-value> <expr-op> <right=expr> } rule expr:sym<IN> { 'IN' '(' ~ ')' <value> + % ',' } rule expr:sym<substr-f> { 'substr(' <expr>**2..3 % ',' ')' }
  • 54. Remove left recursion grammar TranslateOracleDDL::Grammar { token and-or { :i 'and' | 'or' } proto rule expr { * } rule expr:sym<nested> { [ '(' <expr> ')' ]+ % <and-or> } rule expr:sym<composite> { <expr-part> <and-or> <expr> } rule expr:sym<part> { <expr-part> } rule expr:sym<atom> { <entity-name> | <value> } rule expr-part:sym<operator> { <left=ident-or-value> <expr-op> <right=expr> } rule expr-part:sym<IN> { 'IN' '(' ~ ')' <value> + % ',' } rule expr-part:sym<substr-f> { 'substr(' <expr>**2..3 % ',' ')' }
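As a rough, self-contained illustration of the same idea, here is a hypothetical miniature grammar. It is not the talk's `TranslateOracleDDL::Grammar`, and the names `MiniExpr`, `column`, and `number` are invented for this sketch. The `expr` rule always consumes an `<expr-part>` before it recurses into `<expr>`, so it can never call itself again at the same input position:

```raku
# Hypothetical toy grammar; only the right-recursive shape mirrors the slide.
grammar MiniExpr {
    rule  TOP       { <expr> }
    # Right-recursive: an expr-part must match first, then an optional
    # and/or followed by another expr. A left-recursive version would
    # start with <expr> itself and recurse without consuming input.
    rule  expr      { <expr-part> [ <and-or> <expr> ]? }
    rule  expr-part { <column> '=' <number> }
    token and-or    { :i 'and' | 'or' }
    token column    { <[a..z_]>+ }
    token number    { \d+ }
}

say so MiniExpr.parse('a = 1 and b = 2 or c = 3');   # True
say so MiniExpr.parse('a = 1 and');                  # False
```

The second call fails because `.parse` must consume the whole input, and the dangling `and` has no expression after it.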
  • 55. Broken SQL CREATE UNIQUE INDEX SCHEMA_USER.te_spname_lcname ON SCHEMA_USER.tp_entry ( species_name , tp_id ,;
  • 56. Broken SQL CREATE UNIQUE INDEX SCHEMA_USER.te_spname_lcname ON SCHEMA_USER.tp_entry ( species_name , tp_id ,; rule sql-statement:sym<broken-CREATE-INDEX> { 'CREATE' $<unique>=('UNIQUE') 'INDEX' <entity-name> 'ON' <entity-name> '(' <expr>* %% ',' }
  • 57. Broken SQL ALTER TABLE SCHEMA_USER. ADD CONSTRAINT bin$hwvvsoqpce/gu4ocaaqrog==$0 CHECK () ( ) INITIALLY DISABLE;
  • 58. Broken SQL ALTER TABLE SCHEMA_USER. ADD CONSTRAINT bin$hwvvsoqpce/gu4ocaaqrog==$0 CHECK () ( ) INITIALLY DISABLE; rule sql-statement:sym<broken-ALTER-TABLE-ADD-CONSTRAINT> { 'ALTER' 'TABLE' \S+ 'ADD' 'CONSTRAINT' \S+ 'CHECK' '(' ')' '(' ')' \w+ 'DISABLE' }
  • 59. Hard to parse SQL CREATE OR REPLACE VIEW SCHEMA_USER.plate_locations ( sec_sec_id, well_name, pt_pt_id, pl_id ) AS SELECT SCHEMA_USER.sectors.SEC_ID AS "SEC_SEC_ID", SCHEMA_USER.dna_location.LOCATION_NAME AS "WELL_NAME", SCHEMA_USER.plate_types.PT_ID AS "PT_PT_ID", SCHEMA_USER.dna_location.dl_id AS "PL_ID" FROM SCHEMA_USER.sectors, SCHEMA_USER.dna_location, SCHEMA_USER.plate_types WHERE SCHEMA_USER.sectors.SEC_ID = SCHEMA_USER.dna_location.sec_id and SCHEMA_USER.dna_location.LOCATION_TYPE = to_char(SCHEMA_USER.plate_types.WELL_COUNT)||' well plate'
  • 61. Hard to parse SQL rule sql-statement:sym<special-CREATE-VIEW> { 'CREATE' 'OR' 'REPLACE' 'VIEW' 'SCHEMA_USER.plate_locations' '(' <-[;]>+ } method sql-statement:sym<special-CREATE-VIEW>($/) { make ~$/; }
  • 62. Translate use TranslateOracleDDL::Grammar; use TranslateOracleDDL::ToPostgres; sub MAIN(Str $filename where *.IO.e) { my $parsed = TranslateOracleDDL::Grammar.parsefile( $filename, actions => TranslateOracleDDL::ToPostgres, ); say $parsed.made; }
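The grammar-plus-actions pattern on this slide can be tried on a much smaller scale. The sketch below is only an assumption-laden toy: `TinyDDL` and `ToLower` are invented names, not part of the talk's `TranslateOracleDDL` modules. It shows how each action method attaches a value with `make`, and how the caller reads the combined result back with `.made`:

```raku
# Hypothetical toy grammar: one kind of statement, repeated.
grammar TinyDDL {
    rule  TOP        { <statement>+ }
    rule  statement  { 'DROP' 'TABLE' <table-name> ';' }
    token table-name { \w+ }
}

# Hypothetical actions class: method names match the rule names above.
class ToLower {
    # Rewrite one statement and attach the new text to its match.
    method statement($/) {
        make 'drop table ' ~ $<table-name>.lc ~ ';';
    }
    # Collect the rewritten statements into one string.
    method TOP($/) {
        make $<statement>.map(*.made).join("\n");
    }
}

my $parsed = TinyDDL.parse(
    "DROP TABLE Foo;\nDROP TABLE Bar;\n",
    actions => ToLower,
);
say $parsed.made;
# drop table foo;
# drop table bar;
```

Passing the type object `ToLower` works here because the action methods keep no state; an instance (`ToLower.new`) would be needed if they used attributes.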
  • 63. Debugging use Grammar::Tracer; grammar TranslateOracleDDL::Grammar { … } TranslateOracleDDL::Grammar.parsefile($filename); TOP | input-line | | input-line:sym<sqlplus-directive> | | | sqlplus-directive | | | | sqlplus-directive:sym<REM> | | | | | string-to-end-of-line | | | | | * MATCH “ This DDL was reverse engineered by” | | | | * MATCH “REM This DDL was reverse engineered by\n” | | | * MATCH “REM This DDL was reverse engineered by\n” | | * MATCH “REM This DDL was reverse engineered by\n” | * MATCH “REM This DDL was reverse engineered by\n”
  • 64. Debugging rule select-statement { :ignorecase 'SELECT' [ $<distinct>=('DISTINCT') ]? <columns=select-column>+ % ',' 'FROM' <select-from-clause> + % ',' <join-clause>* <where-clause>? <group-by-clause>? }
  • 65. Debugging rule select-statement { :ignorecase 'SELECT' { say "saw SELECT" } [ $<distinct>=('DISTINCT') { say "saw DISTINCT" }]? <columns=select-column>+ % ',' { say @<columns>.elems, " columns" } 'FROM' <select-from-clause> + % ',' { say @<select-from-clause>.elems, " from" } <join-clause>* { say @<join-clause>.elems, " joins" } <where-clause>? <group-by-clause>? }
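The embedded-block technique from this slide can be tried in isolation. The following is a minimal sketch with an invented grammar (`TracedSelect` and its `column` token are hypothetical, not the talk's `select-statement` rule). Each `{ … }` block runs when the parser reaches that point in the rule, so the printed lines show how far the match got:

```raku
# Hypothetical toy grammar with debugging blocks inside the rule.
grammar TracedSelect {
    rule TOP {
        'SELECT'          { say 'saw SELECT' }
        <column>+ % ','   { say $<column>.elems, ' columns' }
    }
    token column { \w+ }
}

TracedSelect.parse('SELECT a,b,c');
# saw SELECT
# 3 columns
```

If the input broke off after `SELECT`, only the first line would print, which points at the part of the rule that failed.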
  • 66. Grammars to make other grammars https://perl6advent.wordpress.com/2015/12/08/day-8-grammars-generating-grammars/ grammar Grammar::BNF { token TOP { \s* <rule>+ \s* } token rule { … } token expression { … } token term { … } … } class GrammarGenerator { has $.name = 'BNFGrammar'; method TOP($/) { my $grmr := Metamodel::GrammarHOW.new_type(:$.name); $grmr.^add_method('TOP', EVAL 'token …'); for $<rule>.map(*.ast) -> $rule { $grmr.^add_method($rule.key, $rule.value); } $grmr.^compose; make $grmr; } ...