Parallelization of a string-matching algorithm
Advanced Algorithms
Alessandro Liparoti
String-matching: AC algorithm
• String-matching algorithms are a class of algorithms that find occurrences of words (patterns) within a larger string (text)
• The Aho-Corasick (AC) algorithm is a classic solution to exact set matching
• Given
  - a pattern set $P = \{P_1, \dots, P_k\}$
  - a text $T[1 \dots m]$
  - the total length of the patterns $n = \sum_{i=1}^{k} |P_i|$
  the AC algorithm's complexity is $O(n + m + z)$, where $z$ is the number of pattern occurrences in $T$
AC algorithm: finite-state machine
• The AC algorithm builds a finite-state automaton (FSA) that stores the pattern set compactly
• The FSA is stored together with three functions
  - the goto function $g(q, a)$ gives the state entered from the current state $q$ on matching the text character $a$
  - the failure function $f(q)$, defined for $q \neq 0$, gives the state entered on a mismatch
  - the output function $out(q)$ gives the set of patterns recognized on entering state $q$
AC algorithm: FSA example
• $P = \{he, she, his, hers\}$
• In the automaton diagram, dashed arrows are fail transitions
AC algorithm: matching phase
• The AC algorithm uses the FSA to match the text against the keywords
• AC_matching(T[1 … m])
    q := 0;                              // initial state (root)
    for i := 1 to m do
        while g(q, T[i]) = ∅ do          // no goto transition on T[i]
            q := f(q);                   // follow a fail
        q := g(q, T[i]);                 // follow a goto
        if out(q) ≠ ∅ then print i, out(q);
    endfor
• The while loop always terminates because g(0, a) is defined for every character a (the root loops on itself)
• The number of iterations of the for loop equals the length of the text
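The matching loop above can be sketched in C for the example set {he, she, his, hers}. This is only an illustration: the transition tables are written out by hand, the state numbering (0 to 9) is an assumption that follows the usual textbook drawing of this example, and the names go, fail_link, out_count and ac_count are hypothetical. The sketch counts occurrences instead of printing them.

```c
#include <stddef.h>

#define NSTATES 10
#define FAIL (-1)

/* Failure function f(q) for P = {he, she, his, hers}.
   Assumed numbering: 1=h, 2=he, 3=s, 4=sh, 5=she, 6=hi, 7=his, 8=her, 9=hers. */
static const int fail_link[NSTATES] = {0, 0, 0, 0, 1, 2, 0, 3, 0, 3};

/* Size of out(q): how many patterns are recognized on entering state q.
   State 5 (she) also reports he, hence the value 2. */
static const int out_count[NSTATES] = {0, 0, 1, 0, 0, 2, 0, 1, 0, 1};

/* Goto function g(q, a); returns FAIL when state q has no transition on a. */
static int go(int q, char a)
{
    switch (q) {
    case 0: return a == 'h' ? 1 : a == 's' ? 3 : 0; /* root loops on itself */
    case 1: return a == 'e' ? 2 : a == 'i' ? 6 : FAIL;
    case 2: return a == 'r' ? 8 : FAIL;
    case 3: return a == 'h' ? 4 : FAIL;
    case 4: return a == 'e' ? 5 : FAIL;
    case 6: return a == 's' ? 7 : FAIL;
    case 8: return a == 's' ? 9 : FAIL;
    default: return FAIL;
    }
}

/* Run the matching phase on text[0..m) and return the number of matches. */
size_t ac_count(const char *text, size_t m)
{
    size_t total = 0;
    int q = 0;                           /* initial state (root) */
    for (size_t i = 0; i < m; i++) {
        while (go(q, text[i]) == FAIL)
            q = fail_link[q];            /* follow a fail */
        q = go(q, text[i]);              /* follow a goto */
        total += (size_t)out_count[q];   /* every pattern ending at i */
    }
    return total;
}
```

For instance, ac_count("ushers", 6) visits states 0, 3, 4, 5, 8, 9 and returns 3: she and he are both reported when state 5 is entered, and hers when state 9 is entered.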
Parallelization step
• Idea: parallelize the matching phase of the AC algorithm (the FSA needs to be built only once per pattern set)
• The $m$ steps of the loop can be split into $k$ chunks (here $k$ is the number of chunks, not the number of patterns), each of length $l = \lceil m / k \rceil$, and each chunk can then be processed by its own thread
• This is feasible because each chunk can be analyzed independently
• Example: $m = 19$, $k = 3$, $l = 7$
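A minimal sketch of this splitting in C, assuming 0-based positions and half-open ranges; the function names chunk_length and chunk_range are hypothetical and do not come from the slides.

```c
#include <stddef.h>

/* Chunk length l = ceil(m / k), computed with integer arithmetic. */
size_t chunk_length(size_t m, size_t k)
{
    return (m + k - 1) / k;
}

/* Half-open range [*begin, *end) of text positions handled by chunk t
   (t is 0-based). The last chunk may be shorter than l. */
void chunk_range(size_t m, size_t k, size_t t, size_t *begin, size_t *end)
{
    size_t l = chunk_length(m, k);
    *begin = t * l < m ? t * l : m;
    *end = *begin + l < m ? *begin + l : m;
}
```

With m = 19 and k = 3 this gives l = 7 and the ranges [0, 7), [7, 14) and [14, 19), so the third chunk holds only 5 characters.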
Parallelization: problems
• Splitting the text as described above can miss occurrences
• Assume $P = \{adv, orit, ed\}$ and the text "advanced algorithms"
• Each thread runs AC on its own chunk
  - Thread 1: $T$ = "advance"
  - Thread 2: $T$ = "d algor"
  - Thread 3: $T$ = "ithms"
• No thread finds the occurrences of the second and third keywords, because each of them straddles a chunk boundary
• Chunks therefore need some redundancy: text that overlaps two adjacent chunks
Parallelization: solutions
• The maximum overlap needed, $o$, is the length of the longest word in the pattern set minus 1
• Each chunk will contain the last $o$ characters of the previous one
• However: orit is now correctly found by thread 3, but ed is incorrectly matched twice (by threads 1 and 2)
• Correction: every thread after the first starts counting matches only after it has read $o$ characters
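The effect of the overlap and of the correction can be checked on the example text with a small serial sketch. To keep it short, a plain scan with memcmp stands in for the AC matcher, and the chunk length l is passed in by the caller; PATTERNS, overlap_length, count_chunk and count_overlapped are hypothetical names. The sketch assumes l > o.

```c
#include <stddef.h>
#include <string.h>

static const char *const PATTERNS[] = {"adv", "orit", "ed"};
enum { NPATTERNS = 3 };

/* Overlap o = (length of the longest pattern) - 1. */
size_t overlap_length(void)
{
    size_t longest = 0;
    for (int p = 0; p < NPATTERNS; p++)
        if (strlen(PATTERNS[p]) > longest)
            longest = strlen(PATTERNS[p]);
    return longest - 1;
}

/* Count pattern occurrences inside chunk[0..len), ignoring every match
   that ends within the first `skip` characters. A plain scan is used
   here instead of AC. */
size_t count_chunk(const char *chunk, size_t len, size_t skip)
{
    size_t count = 0;
    for (int p = 0; p < NPATTERNS; p++) {
        size_t plen = strlen(PATTERNS[p]);
        for (size_t s = 0; s + plen <= len; s++)
            if (s + plen > skip && memcmp(chunk + s, PATTERNS[p], plen) == 0)
                count++;
    }
    return count;
}

/* Sum the counts of k chunks of length l in which consecutive chunks share
   o characters. With `correct` set, every chunk after the first ignores
   matches that lie entirely inside the shared characters. */
size_t count_overlapped(const char *text, size_t m, size_t l, size_t o,
                        size_t k, int correct)
{
    size_t total = 0;
    for (size_t t = 0; t < k; t++) {
        size_t begin = t * (l - o);
        if (begin >= m)
            break;
        size_t len = m - begin < l ? m - begin : l;
        size_t skip = (correct && t > 0) ? o : 0;
        total += count_chunk(text + begin, len, skip);
    }
    return total;
}
```

On "advanced algorithms" (m = 19) with k = 3, the plain split (l = 7, o = 0) counts 1 match. With o = 3 and l = 9 the chunks are "advanced ", "ed algori" and "orithms": the uncorrected total is 4 because ed is seen twice, and the corrected total is 3.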
Implementation
• AC has been implemented in C using OpenMP; the matching phase is split among threads using the OpenMP for pragma
• Input: text, keywords, number of threads
• Output: number of occurrences
• The chunk size $l$ is computed with the following formula, rounded up to an integer, where $ov$ is the overlap between consecutive chunks and $k$ is the number of threads
  $l = \frac{m + ov \cdot (k - 1)}{k}$
• The output variable is aggregated after the end of the loop (reduction clause)
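The following sketch shows one way these pieces could fit together; it is not the implementation from the slides. It assumes a dense 256-entry goto table, a fixed cap of 64 states (enough for a tiny demo pattern set), a text longer than the overlap, and hypothetical names throughout (ac_build, ac_count_chunk, ac_count_parallel). Only the total count is produced.

```c
#include <stddef.h>
#include <string.h>

#define MAX_STATES 64   /* assumption: enough for a small demo pattern set */
#define ALPHABET 256    /* one transition per byte value */

static int next_state[MAX_STATES][ALPHABET]; /* goto function, -1 = no edge */
static int fail_link[MAX_STATES];            /* failure function */
static int out_count[MAX_STATES];            /* size of out(q) */
static int nstates;

/* Build the automaton once: a trie for goto, then failure links by BFS. */
void ac_build(const char *const *patterns, int npatterns)
{
    memset(next_state, -1, sizeof next_state);
    memset(fail_link, 0, sizeof fail_link);
    memset(out_count, 0, sizeof out_count);
    nstates = 1;
    for (int p = 0; p < npatterns; p++) {
        int q = 0;
        for (const char *c = patterns[p]; *c; c++) {
            unsigned char a = (unsigned char)*c;
            if (next_state[q][a] < 0)
                next_state[q][a] = nstates++;
            q = next_state[q][a];
        }
        out_count[q]++;
    }
    int queue[MAX_STATES], head = 0, tail = 0;
    for (int a = 0; a < ALPHABET; a++) {
        if (next_state[0][a] < 0)
            next_state[0][a] = 0;              /* root loops on itself */
        else
            queue[tail++] = next_state[0][a];  /* depth 1: fail stays 0 */
    }
    while (head < tail) {
        int r = queue[head++];
        for (int a = 0; a < ALPHABET; a++) {
            int s = next_state[r][a];
            if (s < 0)
                continue;
            int f = fail_link[r];
            while (next_state[f][a] < 0)
                f = fail_link[f];
            fail_link[s] = next_state[f][a];
            out_count[s] += out_count[fail_link[s]]; /* inherit suffix matches */
            queue[tail++] = s;
        }
    }
}

/* Match text[begin, end) and count only matches ending at or after
   position first_counted. */
static size_t ac_count_chunk(const char *text, size_t begin, size_t end,
                             size_t first_counted)
{
    size_t count = 0;
    int q = 0;
    for (size_t i = begin; i < end; i++) {
        unsigned char a = (unsigned char)text[i];
        while (next_state[q][a] < 0)
            q = fail_link[q];
        q = next_state[q][a];
        if (i >= first_counted)
            count += (size_t)out_count[q];
    }
    return count;
}

/* One loop iteration per chunk; the reduction sums the partial counts.
   Assumes m > o so that l - o is positive. */
size_t ac_count_parallel(const char *text, size_t m, size_t o, int k)
{
    /* l = ceil((m + o * (k - 1)) / k) */
    size_t l = (m + o * (size_t)(k - 1) + (size_t)k - 1) / (size_t)k;
    size_t count = 0;
    #pragma omp parallel for reduction(+:count) num_threads(k)
    for (int t = 0; t < k; t++) {
        size_t begin = (size_t)t * (l - o);
        size_t end = begin + l < m ? begin + l : m;
        if (begin < end)
            count += ac_count_chunk(text, begin, end, t == 0 ? 0 : begin + o);
    }
    return count;
}
```

After ac_build on {adv, orit, ed}, calling ac_count_parallel("advanced algorithms", 19, 3, 3) uses l = 9 and returns 3. Building with an OpenMP flag such as gcc's -fopenmp enables the pragma; without it the loop runs serially and returns the same count.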
Implementation
• Each character read is converted to its numeric byte value (ASCII code)
• Therefore, the FSA allows 256 different transitions per state
• This makes it possible to use the AC algorithm even on non-textual files
• Binary files must be read bytewise
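A short sketch of what a 256-way transition row and bytewise reading might look like in C. The struct ac_state layout and the helper names state_init, byte_index and read_bytes are illustrative assumptions; only fread and the "rb" open mode come from the C standard library.

```c
#include <stdio.h>

#define ALPHABET 256  /* one transition slot per possible byte value */

/* One automaton state with a dense transition row; -1 marks "no edge". */
struct ac_state {
    int next[ALPHABET];
};

/* Clear all 256 transitions of a state. */
void state_init(struct ac_state *s)
{
    for (int a = 0; a < ALPHABET; a++)
        s->next[a] = -1;
}

/* Map a char to a row index in [0, 255]. The cast matters because plain
   char may be signed, which would turn bytes >= 0x80 into negative indexes. */
int byte_index(char c)
{
    return (unsigned char)c;
}

/* Read up to cap bytes from a stream; for binary input the stream should
   have been opened in binary mode, e.g. fopen(path, "rb"). */
size_t read_bytes(FILE *fp, unsigned char *buf, size_t cap)
{
    return fread(buf, 1, cap, fp);
}
```

Indexing the row through byte_index keeps bytes such as 0x00 and 0xFF valid transitions, which is what allows non-textual input.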
Test
• Very large input files were used to test the algorithm's performance
  - a text file containing the English version of the Bible
  - a dictionary of the 10,000 most common English words
• A single test aggregates the measurements of 10 runs of the algorithm on these inputs, all using the same number of threads
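The aggregation of repeated runs could look like the sketch below, which assumes the aggregate measures are the mean and the minimum of the measured times; summarize_runs is a hypothetical helper, and how each run is timed is left to the caller.

```c
#include <stddef.h>

/* Mean and minimum of n timing samples (n > 0), in the samples' own unit. */
void summarize_runs(const double *seconds, size_t n, double *mean, double *min)
{
    double sum = 0.0;
    *min = seconds[0];
    for (size_t i = 0; i < n; i++) {
        sum += seconds[i];
        if (seconds[i] < *min)
            *min = seconds[i];
    }
    *mean = sum / (double)n;
}
```

The minimum is less sensitive than the mean to interference from other processes, so looking at both gives a rough idea of how noisy the measurements are.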
Test
[Figure: execution time (sec) versus number of threads (1 to 10) on an i7 4700MQ, 4 cores/8 threads; series: mean and minimum]
Test
[Figure: execution time (sec) versus number of threads (1 to 30) on a 12 cores/24 threads machine; series: mean and min]
Conclusion
• This work presented a procedure for parallelizing an algorithm that was designed to run serially
• The more threads are used, the faster the algorithm runs, up to a point beyond which adding threads gives no further improvement
• Parallelization improves performance, but it requires modifications that are not always obvious at the outset and that often introduce overhead
