TiReX: a Tiled Regular eXpression matching architecture

NECST Lab @ Politecnico di Milano

NGC17 Talk @ Oracle - June 7, 2017

TiReX
Davide Conficconi <davide.conficconi@mail.polimi.it>
Alessandro Comodi <alessandro.comodi@mail.polimi.it>
NECST Lab, Politecnico di Milano
@Oracle HQ
June 7, 2017

Application Scenarios
• Signature-based detection
• Genomic data analysis

Issues
• High speed requirements
• Huge amount of data to handle

Our idea
• RE as instructions: the need for a customized processor
• Reconfigurable ISA

Architecture overview
(diagram: Fetch and Decode stage, Execute stage, Control Path)

Trivial example
Regular expression: ACCGTGGA
Input string: ACACCGTGGA

Opcode  Reference
&       ACCG
&       TGGA
NOP     -

Instructions  Clock cycles            Data
              #1  #2  #3  #4  #5  #6
& ACCG        FD  EX                  ACAC
& ACCG            FD  EX              CACC
& ACCG                FD  EX          ACCG
& TGGA                    FD  EX      TGGA
NOP                           FD  EX  -
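
The timing in the trivial example follows from the two-stage pipeline: each instruction spends one cycle in Fetch and Decode (FD) and the next cycle in Execute (EX), so n instructions complete in n + 1 cycles. A minimal Python sketch of that timing, assuming one instruction enters the pipeline per cycle and no stalls occur (the function name is illustrative and not part of TiReX):

```python
def pipeline_schedule(instructions):
    """Return (instruction, fd_cycle, ex_cycle) for an ideal 2-stage pipeline.

    Assumption: one instruction enters FD per cycle and nothing stalls.
    """
    schedule = []
    for index, instruction in enumerate(instructions):
        fd_cycle = index + 1      # cycles are numbered from 1, as on the slide
        ex_cycle = fd_cycle + 1   # EX happens in the cycle right after FD
        schedule.append((instruction, fd_cycle, ex_cycle))
    return schedule


# Instruction rows as listed in the trivial example
program = ["& ACCG", "& ACCG", "& ACCG", "& TGGA", "NOP"]
for instruction, fd, ex in pipeline_schedule(program):
    print(f"{instruction:8s} FD=#{fd} EX=#{ex}")
```

The last row (NOP) is decoded in cycle 5 and executed in cycle 6, matching the six clock cycles shown for five instructions.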

Example: Kleene operators
Regular expression: (TTTT)+CT
Input string: TCTTTTCT

Opcode  Reference
(       -
&)+     TTTT
&       CT
NOP     -

Opcode / Ref  Clock cycles                    Data
              #1  #2  #3  #4  #5  #6  #7  #8
(             FD  EX                          -
&)+ TTTT          FD  EX                      TCTT
&)+ TTTT              FD  EX                  CTTT
&)+ TTTT                  FD  EX              TTTT
&)+ TTTT                      FD  EX          CT--
& CT                              FD  EX      CT--
NOP                                   FD  EX  -
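
As a software cross-check of the two slide examples, the same patterns can be run through Python's standard `re` module. This is only a reference for the expected match positions, not a model of how TiReX executes the instructions.

```python
import re

# (regular expression, input string) pairs taken from the two example slides
examples = [
    ("ACCGTGGA", "ACACCGTGGA"),
    ("(TTTT)+CT", "TCTTTTCT"),
]

for pattern, text in examples:
    match = re.search(pattern, text)
    # span() gives the [start, end) character offsets of the first match
    print(pattern, text, match.span() if match else None)
```

Both patterns match starting at offset 2 of their input, which agrees with the slides: the first two data windows fail and the third one (ACCG, respectively TTTT) succeeds.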

Flex vs TiReX

Regular expression              Time Flex [µs]  Time TiReX [µs]  File size  Speedup factor
ACCGTGGA                        271             39.9             16 kB      x6
(TTTT)+CT                       121             81.35            16 kB      x1.5
(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+  263             173.835          16 kB      x1.5
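
The speedup factor is the Flex time divided by the TiReX time. The short Python sketch below recomputes it from the measured times in the table; the unrounded ratios come out at roughly 6.8, 1.49 and 1.51, which the slide reports as x6, x1.5 and x1.5.

```python
# (regular expression, Flex time in µs, TiReX time in µs), values from the table
measurements = [
    ("ACCGTGGA", 271.0, 39.9),
    ("(TTTT)+CT", 121.0, 81.35),
    ("(CAGT)|(GGGG)|(TTGG)TGCA(C|G)+", 263.0, 173.835),
]


def speedup(flex_us, tirex_us):
    """Speedup of TiReX over Flex: ratio of the two execution times."""
    return flex_us / tirex_us


for regex, flex_us, tirex_us in measurements:
    print(f"{regex}: x{speedup(flex_us, tirex_us):.2f}")
```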

Area Utilization (VC-707 Evaluation Board)

           Slice LUTs    Slice Registers  F7 Muxes     F8 Muxes     Slice        LUT as Logic  LUT as FF Pairs
Available  303600        607200           75900        75900        75900        303600        303600
TiReX      1757 [0.57%]  1590 [0.26%]     265 [0.35%]  114 [0.15%]  890 [1.17%]  1757 [0.57%]  310 [0.1%]
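
The bracketed percentages are the used resources divided by the available ones. A small Python sketch that recomputes them from the counts in the table (the dictionary layout is just a convenient way to hold the numbers):

```python
# Resource counts copied from the area utilization table (VC-707 board)
available = {
    "Slice LUTs": 303600,
    "Slice Registers": 607200,
    "F7 Muxes": 75900,
    "F8 Muxes": 75900,
    "Slice": 75900,
    "LUT as Logic": 303600,
    "LUT as FF Pairs": 303600,
}
used = {
    "Slice LUTs": 1757,
    "Slice Registers": 1590,
    "F7 Muxes": 265,
    "F8 Muxes": 114,
    "Slice": 890,
    "LUT as Logic": 1757,
    "LUT as FF Pairs": 310,
}


def utilization_percent(used_count, available_count):
    """Share of a resource that the design occupies, in percent."""
    return 100.0 * used_count / available_count


for resource, total in available.items():
    print(f"{resource}: {utilization_percent(used[resource], total):.2f}%")
```

Every resource stays below 1.2% of what the board offers, which is the headroom the multicore proposal relies on.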

From single to multicore
• Tiled Architecture
• Two ways to operate:
– Same RE, multiple data streams
– Multiple REs, single data stream

Conclusions & Future work
• Single-core pattern matching architecture
• The current implementation outperforms Flex
• Future work: multicore architecture

Questions?
Davide Conficconi <davide.conficconi@mail.polimi.it>
Alessandro Comodi <alessandro.comodi@mail.polimi.it>
TiReX team <tirexatnecst@gmail.com>

A further step: Multicore
• Dark silicon problem: why FPGA

Editor's Notes

Hi everyone, I'm Davide, a master's student in computer science and engineering at Politecnico di Milano, and I will now present TiReX, a Tiled Regular eXpression matching architecture.
Our focus is on regular expressions, which have several application domains, ranging from signature-based detection for antivirus and network intrusion detection systems to genomic data analysis for personalized medicine and diagnostics. I'd like to introduce two application scenarios that share the common task of finding…
Pattern matching is a compute-intensive task; furthermore, it has high speed requirements and needs to manage huge amounts of data. For example, billions of characters compose the human DNA. Disadvantages with respect to software solutions.
Taking a further step, we can see an RE as a set of instructions over a stream of data. For example, we can see & and | the way we see a plus or a minus. A processor normally has a fixed instruction set, but we don't want a fixed ISA, so we designed the core to have a reconfigurable ISA. Thus, whenever a user wants to match a new RE, they write the RE and pass it to the compiler, which translates it into the machine code that drives the computation of the processor.
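
To make the "RE as instructions" idea concrete, here is a toy Python sketch of the compilation step for the simplest case, a purely literal pattern. It is not the TiReX compiler: the 4-character reference width is an assumption read off the slide examples, and the function name and tuple layout are hypothetical.

```python
REFERENCE_WIDTH = 4  # assumption: slide examples use references of up to 4 characters


def compile_literal(pattern, width=REFERENCE_WIDTH):
    """Split a literal pattern into (opcode, reference) pairs ending with NOP.

    Only plain concatenation is handled; operators such as |, + or
    parentheses are outside the scope of this sketch.
    """
    if not pattern.isalnum():
        raise ValueError("this sketch handles literal patterns only")
    program = [("&", pattern[i:i + width]) for i in range(0, len(pattern), width)]
    program.append(("NOP", "-"))  # terminator, as in the slide listings
    return program


print(compile_literal("ACCGTGGA"))
# [('&', 'ACCG'), ('&', 'TGGA'), ('NOP', '-')]
```

The output mirrors the opcode and reference listing of the trivial example: two & instructions followed by a NOP.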
Our architecture has a two-stage pipeline, composed of the Fetch and Decode (FD) stage and the Execute (EX) stage, plus a control path that synchronizes the computation and keeps track of the RE status.
First, we fetch an instruction from the instruction memory. The decode unit then produces three signals: the opcode of the instruction; the reference, that is, the characters that have to be matched; and the valid reference, that is, the number of characters present in the instruction.
Afterwards comes the execute phase. The data are fetched from the data buffer and passed to the clusters. Each cluster is a set of comparators that produces a result signal. Each cluster takes as input a chunk of data, each one shifted by one position (example?). Each intermediate result is then processed by the engine, which, depending on the opcode and the valid reference, produces a global result.
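
The cluster idea can be illustrated with a small software model. The Python sketch below is an assumption-laden approximation, not the hardware design: the number of clusters, the function names, and the way the "engine" reduces the per-cluster results (first matching offset) are all hypothetical choices made for illustration.

```python
def cluster_results(data, reference, valid_ref, num_clusters=4):
    """One boolean per cluster.

    Cluster k compares the first `valid_ref` reference characters with the
    data chunk that starts at offset k (each chunk shifted by one position).
    """
    ref = reference[:valid_ref]
    results = []
    for offset in range(num_clusters):
        chunk = data[offset:offset + valid_ref]
        # a chunk that runs past the end of the buffer cannot match
        results.append(len(chunk) == valid_ref and chunk == ref)
    return results


def first_match_offset(results):
    """Toy engine for a plain match: offset of the first matching cluster."""
    for offset, matched in enumerate(results):
        if matched:
            return offset
    return None


results = cluster_results("ACACCGT", "ACCG", valid_ref=4)
print(results, first_match_offset(results))
```

With the data buffer ACACCGT and the reference ACCG, only the cluster at offset 2 reports a match, the same position at which the trivial example finds ACCG. The valid reference matters for short references such as CT, where only the first two characters are compared.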
Lastly, we have the control unit, composed of an FSM that synchronizes the pipeline and produces the control signals. We also have a status register and a stack to manage the context switch caused by an open parenthesis.
Example of the Kleene operators; comparison between Flex and our core. Missing.
Even though we have just a simple prototype, we compared our core to Flex. As the table shows, TiReX outperforms Flex in all three of these examples.
The other result I want to show you is the area utilization. We implemented TiReX on a VC707 board, powered by a Virtex-7 FPGA. The table shows that we are underutilizing the FPGA resources.
Considering this factor and the huge amount of data we have to deal with, we are moving from a single-core architecture to a multicore architecture able to manage this brontosaurus-sized amount of data and reach high performance.
This kind of architecture can operate in two different modes. In SIMD mode, every core has the same RE and we divide the stream of data to achieve parallel computation. The other mode is MISD: in the security application field, we have the problem of many REs to be matched over the same stream of data. Thus we let the user decide which mode of operation to use, depending also on the application scenario.
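
The two operating modes can be mimicked in software to show what each one computes. In this Python sketch the standard `re` module stands in for a core, and list or dictionary entries stand in for the cores themselves; the function names are hypothetical and the loop runs sequentially, so the parallelism of the real architecture is not modeled.

```python
import re


def same_re_multiple_streams(pattern, streams):
    """SIMD-like mode: every core gets the same RE and its own data stream."""
    compiled = re.compile(pattern)
    return [compiled.search(stream) is not None for stream in streams]


def multiple_re_single_stream(patterns, stream):
    """MISD-like mode: every core gets a different RE over the same stream."""
    return {pattern: re.search(pattern, stream) is not None for pattern in patterns}


print(same_re_multiple_streams("(TTTT)+CT", ["TCTTTTCT", "ACACCGTGGA"]))
print(multiple_re_single_stream(["ACCGTGGA", "(TTTT)+CT"], "ACACCGTGGA"))
```

The first call returns one match flag per data stream, the second one flag per regular expression, which is the shape of result each mode would hand back to the user.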
In conclusion, I've presented a single-core pattern matching architecture that, in its current implementation, outperforms Flex. As future work, we are working on the multicore architecture to push on the performance side and on the amount of data we can process.