Haskell-awk

Haskell text processor for the commandline
Mario Pastorelli
Hawk presentation
Published in: Technology, Education
  1. Haskell-awk
     Haskell text processor for the commandline
     Mario Pastorelli
  2. Introduction
  3. awk
     a generic text processor where “A file is treated as a sequence of records, and by default each line is a record.” - Alfred V. Aho
     developed in 1977 by Alfred Aho, Peter Weinberger, and Brian Kernighan @ Bell Labs
     uses AWK as programming language

         awk 'BEGIN {print "Hello World!"}'

     procedural
     interpreted
     a program is a series of pattern-action pairs
  4. Why another awk?
     “Whenever faced with a problem, some people say `Lets use AWK.' Now, they have two problems.” - D. Tilbrook
     avoid the AWK programming language
     use a generic language, not a DSL

         BEGIN{split("a b c c a",a);for(i in a)b[a[i]]=1;r="";for(i in b)r=r" "i;print r}

         nub $ words "a b c c a"

     procedural (imperative) vs functional programming for stream processing
  5. Haskell-awk (Hawk)
     a generic text processor where “A stream is treated as a sequence of records, and by default each line is a record.”
     the same philosophy as awk!
     developed in 2013 by me and Samuel Gélineau, the name is a tribute to awk
     uses Haskell as programming language

         hawk '"Hello World!"'

     functional
     (incrementally) compiled
     a program is a Haskell expression
  6. Why Haskell
     expressive, clean and concise

         > filter odd [1,2,3,4]
         [1,3]

     functions as composable building blocks

         > let wordCount = sum . map (length . words) . lines
         > :type wordCount
         wordCount :: String -> Int
         > wordCount "1 2 3\n4 5 6\n7 8 9"
         9

     partial application

         > :type map
         map :: (a -> b) -> [a] -> [b]
         > :type not
         not :: Bool -> Bool
         > :type map not
         map not :: [Bool] -> [Bool]
         > map not [True,False]
         [False,True]

     point-free style, laziness ...
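The GHCi session on this slide can also be written as a small source file. The following is a minimal sketch; `negateAll` is a name introduced here for illustration and does not appear in the talk.

```haskell
-- Composition: build a word counter out of three small functions.
-- 'lines' splits the text into lines, 'words' splits each line into words,
-- and 'sum' adds up the per-line counts.
wordCount :: String -> Int
wordCount = sum . map (length . words) . lines

-- Partial application: giving 'map' only its first argument ('not')
-- yields a new function from lists of Bool to lists of Bool.
negateAll :: [Bool] -> [Bool]
negateAll = map not
```

Loaded in GHCi, `wordCount "1 2 3\n4 5 6\n7 8 9"` evaluates to `9` and `negateAll [True,False]` to `[False,True]`, matching the session above.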
  7. Hawk
  8. Modes
     evaluate an expression

         $ hawk '1'
         1
         $ hawk '[1,2]'
         1
         2
         $ hawk '[[1,2],[3,4]]'
         1 2
         3 4

     apply an expression to the input

         $ echo '1\n2\n3' | hawk -a 'L.reverse'
         3
         2
         1

     map an expression to each record of the input

         $ echo '1 2\n3 4' | hawk -m 'L.reverse'
         2 1
         4 3
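The difference between the apply and map modes can be modelled in plain Haskell. This is only a sketch of the idea, not Hawk's implementation: `Record`, `parseInput`, `applyMode` and `mapMode` are hypothetical names, and the input is split with `lines` and `words` (so any whitespace separates fields).

```haskell
import qualified Data.List as L

-- Hypothetical model of the input: a list of records, each a list of fields.
type Record = [String]

-- Split the raw text into records (lines) and fields (whitespace-separated).
parseInput :: String -> [Record]
parseInput = map words . lines

-- "apply" mode: the user expression receives the whole input at once.
applyMode :: ([Record] -> [Record]) -> String -> [Record]
applyMode f = f . parseInput

-- "map" mode: the user expression is applied to each record separately.
mapMode :: (Record -> Record) -> String -> [Record]
mapMode f = map f . parseInput

-- Reverses the order of the records: [["3"],["2"],["1"]]
exampleApply :: [Record]
exampleApply = applyMode L.reverse "1\n2\n3"

-- Reverses the fields inside each record: [["2","1"],["4","3"]]
exampleMap :: [Record]
exampleMap = mapMode L.reverse "1 2\n3 4"
```

The same expression, `L.reverse`, reverses the lines in one mode and the words of each line in the other, because only the value it is applied to changes.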
  9. IO format
     The input is, by default, a list of lists of strings where lines are separated by \n and words by spaces

         $ echo '1 2\n3 4' | hawk -a 'show'
         [["1","2"],["3","4"]]

     Options -d/-D are provided to change delimiters or set them to empty

         $ echo '1,2;3,4' | hawk -a -d',' -D';' 'show'
         [["1","2"],["3","4"]]
         $ echo '1 2\n3 4' | hawk -a -d'' 'show'
         ["1 2","3 4"]
         $ echo '1 2\n3 4' | hawk -a -d'' -D'' 'show'
         "1 2\n3 4\n"

     The output can be any type that instantiates the typeclass Rows

         class (Show a) => Rows a where
           repr :: ByteString -> a -> [ByteString]
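How configurable delimiters shape the input can be sketched as follows. This is an illustration under stated assumptions rather than Hawk's code: `splitOnChar` and `parseWith` are hypothetical helpers, delimiters are single characters, and `Nothing` stands for a delimiter set to empty. Unlike Hawk, which changes the input type when a delimiter is empty, this sketch always returns a list of lists.

```haskell
-- Split a string on a single-character delimiter (hypothetical helper).
splitOnChar :: Char -> String -> [String]
splitOnChar d s = case break (== d) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnChar d rest

-- Parse the input with an optional line delimiter and an optional word
-- delimiter. 'Nothing' means "do not split at this level", so the text is
-- kept as a single element.
parseWith :: Maybe Char -> Maybe Char -> String -> [[String]]
parseWith lineDelim wordDelim = map splitWords . splitLines
  where
    splitLines = maybe (: []) splitOnChar lineDelim
    splitWords = maybe (: []) splitOnChar wordDelim
```

For example, `parseWith (Just ';') (Just ',') "1,2;3,4"` yields `[["1","2"],["3","4"]]`, while `parseWith (Just '\n') Nothing "1 2\n3 4"` keeps each line whole: `[["1 2"],["3 4"]]`.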
  10. Examples
      get all users of a UNIX system

          $ cat /etc/passwd | hawk -d: -m 'L.head'
          root
          daemon
          ...

      select username and userid

          $ cat /etc/passwd | hawk -d: -o'\t' -m '\l -> (l !! 0, l !! 2)'
          root    0
          daemon  1
          ...

      sort by username (instead of userid)

          $ cat /etc/passwd | hawk -d: -a 'L.sortBy (compare `on` L.head)'
          bin:x:2:2:bin:/bin:/bin/sh
          daemon:x:1:1:daemon:/usr/sbin:/bin/sh
          ...

      get the number of users using each shell

          $ cat /etc/passwd | hawk -ad: 'L.map (L.head &&& L.length) . L.group . L.sort . L.map L.last'
          /bin/bash:1
          ...
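The last pipeline is ordinary list processing, so it can be tried outside Hawk. The sketch below assumes each record is one passwd line already split on `:`; `shellCounts` is a name introduced here, and the sample records are made up for illustration.

```haskell
import Control.Arrow ((&&&))
import qualified Data.List as L

-- Count how many records end in each shell:
--   1. take the last field of every record (the login shell),
--   2. sort so equal shells become adjacent,
--   3. group adjacent equal shells,
--   4. pair each group's shell with the group's size.
shellCounts :: [[String]] -> [(String, Int)]
shellCounts = L.map (L.head &&& L.length) . L.group . L.sort . L.map L.last

-- Made-up sample data, not taken from a real system.
sampleRecords :: [[String]]
sampleRecords =
  [ ["root",   "x", "0", "0", "root",   "/root",     "/bin/bash"]
  , ["daemon", "x", "1", "1", "daemon", "/usr/sbin", "/bin/sh"]
  , ["bin",    "x", "2", "2", "bin",    "/bin",      "/bin/sh"]
  ]
```

With this sample, `shellCounts sampleRecords` evaluates to `[("/bin/bash",1),("/bin/sh",2)]`. `L.head` is safe here because `L.group` never produces empty groups.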
  11. Context
      Hawk can be customized using files inside the context directory (by default ~/.hawk)
      The most important file is prelude.hs that contains the "runtime context"

          $ cat ~/.hawk/prelude.hs
          {-# LANGUAGE ExtendedDefaultRules, OverloadedStrings #-}
          import Prelude
          import qualified Data.ByteString.Lazy.Char8 as B
          import qualified Data.List as L

      for instance, we can add a function for taking elements in an interval

          $ echo 'takeBetween s e = L.take (e - s) . L.drop s' >> ~/.hawk/prelude.hs
          $ seq 0 100 | hawk -a 'takeBetween 2 4'
          2
          3
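The function added to the prelude on this slide can be checked on its own. The sketch below repeats its definition with an explicit type signature, which is an addition here and not part of the slide.

```haskell
import qualified Data.List as L

-- Keep the elements whose zero-based index i satisfies s <= i < e:
-- drop the first s elements, then take the next (e - s).
takeBetween :: Int -> Int -> [a] -> [a]
takeBetween s e = L.take (e - s) . L.drop s
```

The interval is half-open, which is why `takeBetween 2 4` on the lines `0`, `1`, `2`, ... returns only `2` and `3`.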
  12. Implementation
  13. Hawk must be fast
      cache the context
      use the timestamp to check if the context has changed since the last run
      compile it with ghc
      use locks to compile only once when multiple Hawk instances are running

          hawk '[1..]' | hawk -a 'L.take 3'

      use ByteString instead of String
      ...
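The timestamp check can be reduced to one pure decision. This is a hedged sketch of the idea, not Hawk's actual code: `needsRecompile` is a hypothetical name, and the timestamp type is left abstract so the function depends only on `base`. In a real program the timestamps could come from `getModificationTime` in `System.Directory`.

```haskell
-- Decide whether the cached, compiled context must be rebuilt.
--   first argument:  when the cache was last built, or Nothing if no cache exists
--   second argument: when the context source was last modified
needsRecompile :: Ord t => Maybe t -> t -> Bool
needsRecompile Nothing _ = True                             -- nothing cached yet
needsRecompile (Just cachedAt) modifiedAt = modifiedAt > cachedAt
```

Keeping the decision pure leaves file access and locking at the edges of the program, where a lock can ensure that only one of several concurrent instances performs the recompilation.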
  14. Parse and interpret Haskell
      Hawk combines two Haskell libraries
      haskell-src-exts to deal with haskell source code

          > import Language.Haskell.Exts.Parser
          > getTopPragmas "{-# LANGUAGE NoImplicitPrelude,OverloadedStrings #-}\n"
          ParseOk [LanguagePragma (SrcLoc {srcFilename = "unknown.hs", srcLine = 1, srcColumn = 1}) [Ident "NoImplicitPrelude",Ident "OverloadedStrings"]]

      hint to interpret the user expression

          > import Language.Haskell.Interpreter
          > runInterpreter $ setImports ["Data.Int"] >> interpret "1" (as :: Int)
          Right 1
          > runInterpreter $ setImports ["Data.Int"] >> interpret "foo" (as :: Int)
          Left (WontCompile [GhcError {errMsg = "Not in scope: `foo'"}])
  15. Thank you!
      https://github.com/gelisam/hawk