record-oriented grep
mlr-grep
ryo1kato
@github
@gmail
@twitter
@facebook
motivation
Want to "grep" multi-line entries in a file
✦ multi-line log files, *.ini files, etc.
✦ semi-structured text such as ifconfig output
for example...
$ cat data.txt
[one]
two
three

[foo]
bar
baz

[hoge]
piyo
huga
Goal: extract every line of each record that contains a pattern, where a record is, in this example, a block of lines delimited by empty lines.
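The goal above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of any mlr-grep implementation; the function name `grep_records` and the choice to treat whitespace-only lines as separators are assumptions made for the example.

```python
import re

DATA = """\
[one]
two
three

[foo]
bar
baz

[hoge]
piyo
huga
"""


def grep_records(text, pattern):
    """Return every blank-line-delimited record that contains `pattern`."""
    # Split on one or more empty (or whitespace-only) lines.
    records = [r.strip("\n") for r in re.split(r"\n\s*\n", text) if r.strip()]
    regex = re.compile(pattern)
    # Keep the whole record if the pattern matches anywhere inside it.
    return [r for r in records if regex.search(r)]


if __name__ == "__main__":
    for record in grep_records(DATA, "ba"):
        print(record)
```

Searching for `ba` returns the whole `[foo]` record rather than only the two matching lines.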
Typical way
✦ grep -A 12 -B 34 -C 56
✦ pcregrep --multiline
✦ awk -v RS='\n\n' "/$re/"
✦ perl -e …
But
✦ pcregrep: you often need a very long regex.
✦ Note that the goal is NOT to find a multiline pattern (a pattern containing '\n'), but to extract each multi-line record that contains a pattern.
✦ AWK: possible using RS (needs gawk).
✦ In practice, it is difficult to get this right with pcregrep or awk.
✦ perl, python: well, if you go that far ...
But do you want to write a one-liner (or a script in language X) that also handles all of these?
✦ zgrep
✦ grep -c (--count)
✦ grep -i (--ignore-case)
✦ grep -v (--invert-match)
✦ grep --color
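To show why a bare one-liner grows quickly once grep-style options are wanted, here is a hedged Python sketch that adds count, ignore-case, and invert-match behaviour to a record filter. The function `select_records` and its keyword arguments are hypothetical names for this example only; they are not the options parser of mlr-grep.

```python
import re

RECORDS = ["[one]\ntwo\nthree", "[foo]\nbar\nbaz", "[hoge]\npiyo\nhuga"]


def select_records(records, pattern, ignore_case=False, invert=False, count=False):
    """Filter records the way grep -i / -v / -c filter lines."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    # A record is selected when "matched" differs from "invert".
    selected = [r for r in records if bool(regex.search(r)) != invert]
    # Like grep -c, return only the number of selected records.
    return len(selected) if count else selected


if __name__ == "__main__":
    print(select_records(RECORDS, "BA", ignore_case=True))
    print(select_records(RECORDS, "ba", invert=True, count=True))
```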
So I wrote it for you!
✦ mlr-grep
  ✦ Multi-Line Record Grep
✦ AWK, Haskell, Python
  ✦ named amlgrep, hmlgrep, and pmlgrep
✦ They have almost identical features.
e.g.
$ amlgrep 'ba' …
[foo]
bar
baz
The output is the whole record containing the pattern.
✦ amlgrep - AWK implementation
  ✦ Needs gawk.
  ✦ Fastest
  ✦ --rs regex is slightly broken on RHEL5.
  ✦ Automatically decompresses *.gz, *.bz2, and *.xz files
  ✦ --color, --count, --invert-match
  ✦ AND / OR of multiple keywords
✦ hmlgrep - Haskell implementation
  ✦ Has almost the same feature set as the AWK version.
  ✦ Sometimes 1.5x to 2x slower, on files with short lines and many matches.
✦ pymlgrep - Python implementation
  ✦ Slowest (takes 4x as long as the AWK version)
  ✦ Doesn't support multiple keywords
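The automatic handling of *.gz, *.bz2, and *.xz files can be approximated in Python by choosing an opener from the file suffix. This is a minimal sketch using the standard `gzip`, `bz2`, and `lzma` modules; the helper name `open_text` and the UTF-8 assumption are mine, and the actual tools may detect compression differently.

```python
import bz2
import gzip
import lzma

# Map a file suffix to the standard-library function that opens it.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_text(path):
    """Open `path` for reading text, decompressing by suffix if needed."""
    for suffix, opener in OPENERS.items():
        if path.endswith(suffix):
            return opener(path, "rt", encoding="utf-8")
    # No known compression suffix: treat as plain text.
    return open(path, "r", encoding="utf-8")
```

A caller can then read records from `open_text("app.log.gz")` exactly as from a plain file (the file name here is only an example).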
Multiple Keywords
$ amlgrep [--or] h t [FILE]
[one]
two
three

[hoge]
piyo
huga
≒ egrep 'h|t' applied per record, but with fewer keystrokes.
$ amlgrep --and h t [FILE]
[one]
two
three
≒ egrep 'h.*t|t.*h' applied per record, but with fewer keystrokes.
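The OR and AND semantics shown on the two slides above can be expressed compactly in Python with `any` and `all`. The function name `grep_multi` and the `mode` parameter are hypothetical; this sketches the matching rule only and is not taken from the amlgrep source.

```python
import re

RECORDS = ["[one]\ntwo\nthree", "[foo]\nbar\nbaz", "[hoge]\npiyo\nhuga"]


def grep_multi(records, patterns, mode="or"):
    """Select records matching any ("or") or all ("and") of the patterns."""
    combine = all if mode == "and" else any
    return [
        r for r in records
        # Each keyword is searched independently anywhere in the record.
        if combine(re.search(p, r) is not None for p in patterns)
    ]


if __name__ == "__main__":
    print(grep_multi(RECORDS, ["h", "t"]))          # OR: [one] and [hoge]
    print(grep_multi(RECORDS, ["h", "t"], "and"))   # AND: [one] only
```

Because each keyword is tested separately, the AND case does not need the `h.*t|t.*h` alternation, which would grow factorially with the number of keywords.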
--timestamp
For multi-line log files in which each entry begins with a timestamp
$ cat datetime.log
2014-01-23 12:34:56 log 1
foo
bar
2014-01-24 12:34:57 log 2
one
two
2014-01-25 12:34:58 log 3
hoge
piyo
$ amlgrep -t 'one' …
2014-01-24 12:34:57 log 2
one
two
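A streaming version of the timestamp idea might look like the following Python sketch: a line that starts with a timestamp opens a new record, and every following line belongs to it. The `TIMESTAMP` regex here is a deliberately simplified stand-in that only recognises ISO-style dates such as 2014-01-24; it is an assumption for the example and much narrower than the regex the tool uses. The names `iter_records` and `grep_log` are likewise hypothetical.

```python
import re

# Simplified assumption: only "20YY-MM-DD " at the start of a line counts.
TIMESTAMP = re.compile(r"^20[0-9]{2}-[01][0-9]-[0-3][0-9] ")

LOG = """\
2014-01-23 12:34:56 log 1
foo
bar
2014-01-24 12:34:57 log 2
one
two
2014-01-25 12:34:58 log 3
hoge
piyo
"""


def iter_records(lines, start=TIMESTAMP):
    """Group lines into records; a line matching `start` opens a new record."""
    record = []
    for line in lines:
        if start.search(line) and record:
            yield "".join(record)
            record = []
        record.append(line)
    if record:
        yield "".join(record)


def grep_log(text, pattern):
    """Return the log entries (as whole records) that contain `pattern`."""
    lines = text.splitlines(keepends=True)
    return [r for r in iter_records(lines) if re.search(pattern, r)]


if __name__ == "__main__":
    print("".join(grep_log(LOG, "one")), end="")
```

Because `iter_records` is a generator over lines, it can also be fed an open file object without reading the whole log into memory.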
$ amlgrep -t --dump foo
gawk -W re-interval -F '\n'
  -v RS='\n(((Mon|Tue|Wed|Thu|Fri|Sat),?[ \t]+)?(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Dec),?[ \t]*[0-9]{1,2},?[ \t][0-2][0-9]:[0-5][0-9](:[0-5][0-9])?(,?[ \t]20[0-9][0-9])?|20[0-9][0-9]-(0[0-9]|11|12)-(0[1-9]|[12][0-9]|3[01]))'
  '-v' 'ORS='
  'oldRT $0 ~ /foo/ {i++;if(substr(oldRT,1,1)=="\n"){h=substr(oldRT,2)}else{h=oldRT};;gsub(/foo/,"&",h);print h;gsub(/foo/,"&");print;if(RT != "")printf "\n"} {oldRT=RT} END{if (i>0){exit 0}else{exit 1}}'
Change the record separator
✦ --rs '^$'
  ✦ Empty lines
✦ --rs '^----'
  ✦ A line beginning with four or more dashes
✦ --rs '^[[:alnum:]]'
  ✦ An alphanumeric character in the first column (for ifconfig-like output)
✦ --rs '^\['
  ✦ A line beginning with '[' (for *.ini files)
✦ --timestamp
  ≒ --rs '^(((Mon|Tue|Wed|Thu|Fri|Sat),?[ \t]+)?(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Dec),?[ \t]*[0-9]{1,2},?[ \t][0-2][0-9]:[0-5][0-9](:[0-5][0-9])?(,?[ \t]20[0-9][0-9])?|20[0-9][0-9]-(0[0-9]|11|12)-(0[1-9]|[12][0-9]|3[01]))'
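The effect of a configurable record separator can be sketched in Python as follows. This assumes that a line matching the separator regex begins a new record and stays part of it, which is how the examples in this deck behave; the function name `split_records` is hypothetical. Python's `re` module has no POSIX classes such as `[[:alnum:]]`, so the ifconfig-style case below uses `[A-Za-z0-9]` instead.

```python
import re


def split_records(text, rs):
    """Split `text` into records; each match of `rs` starts a new record."""
    regex = re.compile(rs, re.MULTILINE)  # let ^ and $ match at every line
    # Record boundaries: start of text plus every position where rs matches.
    starts = sorted({0} | {m.start() for m in regex.finditer(text)})
    bounds = starts + [len(text)]
    chunks = (text[a:b] for a, b in zip(bounds, bounds[1:]))
    # Drop empty chunks and surrounding newlines, keep indentation.
    return [c.strip("\n") for c in chunks if c.strip()]


if __name__ == "__main__":
    ini = "[one]\ntwo\nthree\n[foo]\nbar\nbaz\n"
    print(split_records(ini, r"^\["))

    blocks = "a\nb\n\nc\n"
    print(split_records(blocks, r"^$"))

    ifconfig_like = "eth0 Link\n  inet addr\nlo Link\n  inet addr\n"
    print(split_records(ifconfig_like, r"^[A-Za-z0-9]"))
```

Once the text is split this way, any of the record filters sketched earlier in Python can be applied to the resulting list unchanged.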
http://github.com/ryo1kato/mlr-grep

  • 17. Change the record separator ✦ --rs '^$' ✦ Empty lines ✦ --rs '^----' ✦ Four or more dash ✦ --rs '^[[:alnum]]' ✦ Alphanumeric character on the first column. (For ifconfig like output) ✦ --rs '^[' ✦ A line begins with '[' (For *.ini files) ✦ --timestamp ≒ -rs '^(((Mon|Tue|Wed|Thu|Fri|Sat),?[t]+)?(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Dec),?[ t]*[0-9]{1,2},?[ t][0-2][0-9]:[0-5][0-9](:[0-5][0-9])?(,?[ t]20[0-9][0-9])?|20[0-9][0-9]- (0[0-9]|11|12)-(0[1-9]|[12][0-9]|3[01]))' 17