SlideShare a Scribd company logo
Hardware Acceleration of TEA and XTEA Algorithms on FPGA, GPU and
                                                        Multi-Core Processors
                                                                                  Vivek Venugopal and Devu Manikantan Shila {venugov, manikad}@utrc.utc.com

 Introduction                                                                                              Tiny Encryption Algorithm (TEA) Extended Tiny Encryption Algorithm (XTEA)
                                                                                                                                                                                            half round1                                                                                                                                               half round 2                                                                                          half round1                                                               half round 2
                                                                                                                              v1 32
                                                                                                                                                                                                                                                                              32                                                                                                                                          v1 32            << 4                                                     32
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         << 4
                                                                                                                                                 << 4                                                                                                                                               << 4
                                                                                                                              k0   32                                  +                                                                                            k2 32                                                         +                                                                                        v1   32
                                                                                                                                                                                                                                                                                                                                                                                                                                           >> 5
                                                                                                                                                                                                                                                                                                                                                                                                                                                      XOR
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    32
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         >> 5
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          XOR


                                                                                                                              v1      32                                                                                                                                      32
                                                                                                                                                                                                                                                                                                                                                                                                                           v1   32
                                                                                                                                      32       +                                           XOR
                                                                                                                                                                                                                                                                               32                   +                                          XOR                                                                                                          +                                                                    +
                                                                                                                              sum                                                                                                                                  sum

   Gateway to                                                                                                                         32                                                                                                                                       32                                                                                                                                               32                                                                               32
                                                                                                                              v1                 >> 5                                                                                                                                               >> 5                                                                                                                 sum0                                                                               ky
    Internet
                                 GPU + ARM (NVIDIA CARMA)                                                                     k1      32                               +                             XOR                                                              k3 32                                                     +                  XOR
                                                                                                                                                                                                                                                                                                                                                                                                                          kx    32                     +        XOR
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         sum1    32      +           XOR
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       v1_new
                                                                                                                                                                                                                                                                                                                                                                              v1_new
                                                              Planning                                                                                                                                              32                   +/-                   v0_new                                                                                        32   +/-                                                                                                 32   +/-                                                             32    +/-
                                                                                                                                                                                                     v0                                                                                                                                            v1                                                                                                           v0                              v0_new                               v1
                                                             Computer
                                                                                                                              encrypt/decrypt                                                                                                                                                                                                                                                                            encrypt/decrypt
       Encrypted communication

                                                                                     Flight Control and
                                                                                    Navigation Computer   • TEA uses addition, XOR and shift operations on 32-bit words • The Extended Tiny Encryption Algorithm (XTEA) was introduced after
                                                                                                          and has a very small code footprint.                                                                                                                                                                                                                                                       weaknesses for smaller rounds were found in TEA.
  Smart meter application         FPGA + ARM (Xilinx Zynq)
                                                                Unmanned Autonomous Vehicle               • TEA has security holes and weaknesses for smaller rounds,                                                                                                                                                                                                                                • In XTEA, the key scheduling is modified to reflect different patterns for
                                                                                                          especially the Avalanche Effect seen for 6 rounds                                                                                                                                                                                                                                          mixing the data and key continuously per round.
 • In smart grids, sensitive information such as power
 consumption, price update, or outage awareness is
 exchanged between the meters and the power utility
                                                                                                                                                                                                                        Implementation platforms and Results                                                                                                                                                                                                                                    8000
                                                                                                                                                                                                                                                                                                                                                                                             8000                                                                                                                     Intel Xeon X5650                          Nvidia C2070
 company in real-time over the Internet.                                                                  • Nvidia's Tesla C2070 high-end GPU, 2 hexa-core                                                                                                                                                                                                                                          Intel Xeon X5650
                                                                                                                                                                                                                                                                                                                                                                                                    Nvidia C2070
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Intel Quad core i7                        Nvidia GT650M
 • Unmanned Autonomous Vehicles (UAV) continuously                                                        Intel Xeon processors, Nvidia's GeForce GT 650M                                                                                                                                                                                                                                           Intel Quad core i7
                                                                                                                                                                                                                                                                                                                                                                                                    Nvidia GT650M                                                                               6000
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      Zynq

 exchange dynamic information regarding the urban                                                         notebook GPU consisting of 384 cores, quad-core                                                                                                                                                                                                                                    6000




                                                                                                                                                                                                                                                                                                                                                                                                                                                                           Throughput in Mbps
                                                                                                                                                                                                                                                                                                                                                                                                    Zynq




                                                                                                                                                                                                                                                                                                                                                                        Throughput in Mbps
 environment with a gateway. The gateway also provides                                                    Intel Core i7 CPU.
 feedback regarding the optimization parameters that                                                      • Xilinx's Zynq-7000 SoC ZC702 evaluation board.                                                                                                                                                                                                                                   4000
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                4000

 need to be fed into the UAV's path planning algorithm                                                    The Zynq-7000 platform consists of a dual ARM
 for mapping different routes to reach it's destination                                                   Cortex A-9 processor clocked at 800 MHz and                                                                                                                                                                                                                                                                                                                                           2000
                                                                                                                                                                                                                                                                                                                                                                                             2000
 safely.                                                                                                  Artix-7 FPGA as the programmable logic.                       Streaming Multiprocessor (SMX) Architecture
                                                                                                                                                                        Kepler GK110’s new SMX introduces several architectural innovations that make it not only the most




 • Cyber attacks on such critical and dynamic
                                                                                                                                                                        powerful multiprocessor we’ve built, but also the most programmable and power efficient.



                                                                                                                                                                                                                                                                                                                    Copy input data and
                                                                                                                                                                                                                                                                                                                   keys to GPU memory
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   0
 information can lead to severe losses of                                                                                                                                                                                                                                                                                                                                                       0
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         8 KB                16 KB              8 MB       128 MB        1 GB
                                                                                                                                                                                                                                                                                                                                                                                                    8 KB      16 KB             8 MB              128 MB      1 GB
 resources and finance.                                                                                            SMX

                                                                                                            Control Logic
                                                                                                                                           SMX

                                                                                                                                      Control Logic
                                                                                                                                                                                                                                                                                                                  pre-compute sum values
                                                                                                                                                                                                                                                                                                                  for each round and store
                                                                                                                                                                                                                                                                                                                      in shared memory                                                                                Plaintext size
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   Plaintext size
                                                                                                                                                                                                                                                                                                                                                                                                             Throughput (Mbps) comparison of TEA                                                                       Throughput (Mbps) comparison of XTEA

Motivation                                                                                                                                                                                                                                                                                                          calculate ciphers for
                                                                                                                                                                                                                                                                                                                     blocks in parallel




 • All the information from/to these smart meters need                                                           GT650M: 2 SMX with
                                                                                                                                                                                                                                                                                                                    copy ciphers back to
                                                                                                                                                                                                                                                                                                                            CPU
                                                                                                                                                                                                                                                                                                                                                                                              Conclusion
 to be decrypted/encrypted at the gateway, which in                                                                192 cores each                                                                            Inside SMX                                                                                     GPU Implementation
                                                                                                                                                                                                                                                                                                                                                                                              • GPUs and FPGAs provide better throughput for both TEA and XTEA as
                                                                                                                                                                        SMX: 192 single precision CUDA cores, 64 double precision units, 32 special function units (SFU), and 32 load/store units
                                                                                                                                                                        (LD/ST).




 turn can lead to very large response times. A larger
                                                                                                                                                                                                                                                                                                                                                                                              compared to CPUs.
                                                                                                                                       Flash          DRAM           SRAM



 response time implies poorer performance in terms of
 both throughput and latency.
                                                                                                          GIGe


                                                                                                          USB
                                                                                                                        Processing
                                                                                                                         System
                                                                                                                                                     Memory
                                                                                                                                                    Interfaces                         Custom
                                                                                                                                                                                                                                        Displays


                                                                                                                                                                                                                                           PCIe                      Running on Zynq board                                                 Running in ISIM
                                                                                                                                                                                                                                                                                                                                                                                              • FPGAs perform better for smaller plaintext sizes whereas GPUs are better for
                                                                                                                                                                                                                                                                                                                                                                                              larger plaintext sizes.
 • Continuous transmission of data from UAV regarding                                                     CAN
                                                                                                                                                                                                                                                                                                                        AXI Interconnect




                                                                                                                                                                                                                                                                                                                                                                                              • In terms of development time and cost, GPUs are better suited as embedded
                                                                                                                                               Dual ARM Cortex A-9
                                                                                                                          Fixed                 MPCore (800 MHz)
                                                                                                          I2C                                                                       Peripheral
                                                                                                                        peripherals


 the evidence grid need to be encrypted fast.
                                                                                                                                                                                                                                      SelectIO
                                                                                                                                                                                                                                     Resources
                                                                                                                                                                                                                                                                              Processing                                                             Programmable
                                                                                                          SD                                                                                                                                                                   System                                                                    Logic


                                                                                                                                                                                                                                                                                                                                                                                              cryptography co-processors as compared to FPGAs.
                                                                                                                                                                                                                                                                                                           JTAG


 • FPGAs and GPUs can be used in gateways to speed
                                                                                                          UART
                                                                                                                         2x 12-bit
                                                                                                                                                     Custom          Programmable

                                                                                                                                                                                                                                                                                                                                                                                              • Future research efforts may address the use of Zynq platform as a complete, low-
                                                                                                          GPIO          MSPS ADC                                                                                                        Memory
                                                                                                                                                                         Logic

 up the TEA/XTEA encryption and decryption of bulk
 information for improved throughput and latency.
                                                                                                                                      Analog        Monitors         Analog
                                                                                                                                                                                                                                                                                                                                                                                              cost cryptographic co-processor for more complex cryptographic algorithms
                                                                                                                               Zynq Internal block diagram                                                                                                                                      Hardware in Loop setup




 References
[1] D. J. Wheeler and R. M. Needham. TEA, a tiny encryption algorithm, 1995.
[2] D. J. Wheeler and R. M. Needham. TEA extensions. Technical report, Cambridge University, England, October 1997.
[3] Xilinx Inc. Xilinx Zynq-7000 SoC ZC702 Evaluation kit.
[4] Nvidia Inc. (Last Accessed: February 2012) Nvidia Tesla C2070 GPU Computing Processor, Nvidia GeoForce GT650M Notebook GPU [Available Online]

More Related Content

Similar to Hardware acceleration of TEA and XTEA algorithms on FPGA, GPU and multi-core processors

Crompton Way Traffic Proposal Map
Crompton Way Traffic Proposal MapCrompton Way Traffic Proposal Map
Crompton Way Traffic Proposal Mapguestf8bf20
 
Extending carriers network with fring OTT
Extending carriers network with fring OTTExtending carriers network with fring OTT
Extending carriers network with fring OTT
Roy Timor-Rousso
 
9 18 Part 2
9 18 Part 29 18 Part 2
9 18 Part 2
burgerja
 
Whitehall Framework Plan
Whitehall Framework Plan  Whitehall Framework Plan
Whitehall Framework Plan
ExSite
 
Whitehall Framework Plan
Whitehall Framework Plan  Whitehall Framework Plan
Whitehall Framework Plan
ExSite
 
Jan&rsquo;s Health Bar Proposed Patio Revisions
Jan&rsquo;s Health Bar Proposed Patio RevisionsJan&rsquo;s Health Bar Proposed Patio Revisions
Jan&rsquo;s Health Bar Proposed Patio Revisionswedway
 
La Corda D'Oro: Brand New Breeze for Violin
La Corda D'Oro: Brand New Breeze for ViolinLa Corda D'Oro: Brand New Breeze for Violin
La Corda D'Oro: Brand New Breeze for Violin
sayakahime
 
Fools garden lemon tree
Fools garden   lemon treeFools garden   lemon tree
Fools garden lemon treeSah Ya
 
CambridgeIP: IP Data as a source of Business Intelligence
CambridgeIP: IP Data as a source of Business IntelligenceCambridgeIP: IP Data as a source of Business Intelligence
CambridgeIP: IP Data as a source of Business Intelligence
CambridgeIP Ltd
 
AC/DC highway to hell
AC/DC highway to hellAC/DC highway to hell
AC/DC highway to hell
dhan drummer
 
BOV, Abu Dhabi, U.A.E.
BOV, Abu Dhabi, U.A.E.BOV, Abu Dhabi, U.A.E.
BOV, Abu Dhabi, U.A.E.
Starckn
 
Cafe Life Thumbnail Charts
Cafe Life Thumbnail ChartsCafe Life Thumbnail Charts
Cafe Life Thumbnail Chartsguest7cc3e6
 
Memorias (Juan Pablo Cediel)
Memorias (Juan Pablo Cediel)Memorias (Juan Pablo Cediel)
Memorias (Juan Pablo Cediel)pabloced
 
Architectural Portfolio
Architectural PortfolioArchitectural Portfolio
Architectural Portfolio
Sam Sampoux
 
クラウドコンピューティングと OSS
クラウドコンピューティングと OSSクラウドコンピューティングと OSS
クラウドコンピューティングと OSS
Open Source Software Association of Japan
 

Similar to Hardware acceleration of TEA and XTEA algorithms on FPGA, GPU and multi-core processors (20)

Crompton Way Traffic Proposal Map
Crompton Way Traffic Proposal MapCrompton Way Traffic Proposal Map
Crompton Way Traffic Proposal Map
 
Rain in-spring
Rain in-springRain in-spring
Rain in-spring
 
Or cad
Or cadOr cad
Or cad
 
Extending carriers network with fring OTT
Extending carriers network with fring OTTExtending carriers network with fring OTT
Extending carriers network with fring OTT
 
9 18 Part 2
9 18 Part 29 18 Part 2
9 18 Part 2
 
Whitehall Framework Plan
Whitehall Framework Plan  Whitehall Framework Plan
Whitehall Framework Plan
 
Whitehall Framework Plan
Whitehall Framework Plan  Whitehall Framework Plan
Whitehall Framework Plan
 
Jan&rsquo;s Health Bar Proposed Patio Revisions
Jan&rsquo;s Health Bar Proposed Patio RevisionsJan&rsquo;s Health Bar Proposed Patio Revisions
Jan&rsquo;s Health Bar Proposed Patio Revisions
 
La Corda D'Oro: Brand New Breeze for Violin
La Corda D'Oro: Brand New Breeze for ViolinLa Corda D'Oro: Brand New Breeze for Violin
La Corda D'Oro: Brand New Breeze for Violin
 
Fools garden lemon tree
Fools garden   lemon treeFools garden   lemon tree
Fools garden lemon tree
 
CambridgeIP: IP Data as a source of Business Intelligence
CambridgeIP: IP Data as a source of Business IntelligenceCambridgeIP: IP Data as a source of Business Intelligence
CambridgeIP: IP Data as a source of Business Intelligence
 
Canon in-d
Canon in-dCanon in-d
Canon in-d
 
AC/DC highway to hell
AC/DC highway to hellAC/DC highway to hell
AC/DC highway to hell
 
21 chahd
21 chahd21 chahd
21 chahd
 
21 chahd
21 chahd21 chahd
21 chahd
 
BOV, Abu Dhabi, U.A.E.
BOV, Abu Dhabi, U.A.E.BOV, Abu Dhabi, U.A.E.
BOV, Abu Dhabi, U.A.E.
 
Cafe Life Thumbnail Charts
Cafe Life Thumbnail ChartsCafe Life Thumbnail Charts
Cafe Life Thumbnail Charts
 
Memorias (Juan Pablo Cediel)
Memorias (Juan Pablo Cediel)Memorias (Juan Pablo Cediel)
Memorias (Juan Pablo Cediel)
 
Architectural Portfolio
Architectural PortfolioArchitectural Portfolio
Architectural Portfolio
 
クラウドコンピューティングと OSS
クラウドコンピューティングと OSSクラウドコンピューティングと OSS
クラウドコンピューティングと OSS
 

More from Vivek Venugopalan

xDEFENSE: An Extended DEFENSE for mitigating Next Generation Intrusions
xDEFENSE: An Extended DEFENSE for mitigating Next Generation IntrusionsxDEFENSE: An Extended DEFENSE for mitigating Next Generation Intrusions
xDEFENSE: An Extended DEFENSE for mitigating Next Generation Intrusions
Vivek Venugopalan
 
Design, Implementation and Security Analysis of Hardware Trojan Threats in FPGA
Design, Implementation and Security Analysis of Hardware Trojan Threats in FPGADesign, Implementation and Security Analysis of Hardware Trojan Threats in FPGA
Design, Implementation and Security Analysis of Hardware Trojan Threats in FPGA
Vivek Venugopalan
 
Accelerating Real-Time LiDAR Data Processing Using GPUs
Accelerating Real-Time LiDAR Data Processing Using GPUsAccelerating Real-Time LiDAR Data Processing Using GPUs
Accelerating Real-Time LiDAR Data Processing Using GPUsVivek Venugopalan
 
Real-time processing for ATST
Real-time processing for ATSTReal-time processing for ATST
Real-time processing for ATST
Vivek Venugopalan
 
Accelerating Particle Image Velocimetry using Hybrid Architectures
Accelerating Particle Image Velocimetry using Hybrid ArchitecturesAccelerating Particle Image Velocimetry using Hybrid Architectures
Accelerating Particle Image Velocimetry using Hybrid Architectures
Vivek Venugopalan
 
CISL talk
CISL talkCISL talk

More from Vivek Venugopalan (6)

xDEFENSE: An Extended DEFENSE for mitigating Next Generation Intrusions
xDEFENSE: An Extended DEFENSE for mitigating Next Generation IntrusionsxDEFENSE: An Extended DEFENSE for mitigating Next Generation Intrusions
xDEFENSE: An Extended DEFENSE for mitigating Next Generation Intrusions
 
Design, Implementation and Security Analysis of Hardware Trojan Threats in FPGA
Design, Implementation and Security Analysis of Hardware Trojan Threats in FPGADesign, Implementation and Security Analysis of Hardware Trojan Threats in FPGA
Design, Implementation and Security Analysis of Hardware Trojan Threats in FPGA
 
Accelerating Real-Time LiDAR Data Processing Using GPUs
Accelerating Real-Time LiDAR Data Processing Using GPUsAccelerating Real-Time LiDAR Data Processing Using GPUs
Accelerating Real-Time LiDAR Data Processing Using GPUs
 
Real-time processing for ATST
Real-time processing for ATSTReal-time processing for ATST
Real-time processing for ATST
 
Accelerating Particle Image Velocimetry using Hybrid Architectures
Accelerating Particle Image Velocimetry using Hybrid ArchitecturesAccelerating Particle Image Velocimetry using Hybrid Architectures
Accelerating Particle Image Velocimetry using Hybrid Architectures
 
CISL talk
CISL talkCISL talk
CISL talk
 

Recently uploaded

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 

Recently uploaded (20)

Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 

Hardware acceleration of TEA and XTEA algorithms on FPGA, GPU and multi-core processors

  • 1. Hardware Acceleration of TEA and XTEA Algorithms on FPGA, GPU and Multi-Core Processors Vivek Venugopal and Devu Manikantan Shila {venugov, manikad}@utrc.utc.com Introduction Tiny Encryption Algorithm (TEA) Extended Tiny Encryption Algorithm (XTEA) half round1 half round 2 half round1 half round 2 v1 32 32 v1 32 << 4 32 << 4 << 4 << 4 k0 32 + k2 32 + v1 32 >> 5 XOR 32 >> 5 XOR v1 32 32 v1 32 32 + XOR 32 + XOR + + sum sum Gateway to 32 32 32 32 v1 >> 5 >> 5 sum0 ky Internet GPU + ARM (NVIDIA CARMA) k1 32 + XOR k3 32 + XOR kx 32 + XOR sum1 32 + XOR v1_new v1_new Planning 32 +/- v0_new 32 +/- 32 +/- 32 +/- v0 v1 v0 v0_new v1 Computer encrypt/decrypt encrypt/decrypt Encrypted communication Flight Control and Navigation Computer • TEA uses addition, XOR and shift operations on 32-bit words • The Extended Tiny Encryption Algorithm (XTEA) was introduced after and has a very small code footprint. weaknesses for smaller rounds were found in TEA. Smart meter application FPGA + ARM (Xilinx Zynq) Unmanned Autonomous Vehicle • TEA has security holes and weaknesses for smaller rounds, • In XTEA, the key scheduling is modified to reflect different patterns for especially the Avalanche Effect seen for 6 rounds mixing the data and key continuously per round. • In smart grids, sensitive information such as power consumption, price update, or outage awareness is exchanged between the meters and the power utility Implementation platforms and Results 8000 8000 Intel Xeon X5650 Nvidia C2070 company in real-time over the Internet. • Nvidia's Tesla C2070 high-end GPU, 2 hexa-core Intel Xeon X5650 Nvidia C2070 Intel Quad core i7 Nvidia GT650M • Unmanned Autonomous Vehicles (UAV) continuously Intel Xeon processors, Nvidia's GeForce GT 650M Intel Quad core i7 Nvidia GT650M 6000 Zynq exchange dynamic information regarding the urban notebook GPU consisting of 384 cores, quad-core 6000 Throughput in Mbps Zynq Throughput in Mbps environment with a gateway. The gateway also provides Intel Core i7 CPU. feedback regarding the optimization parameters that • Xilinx's Zynq-7000 SoC ZC702 evaluation board. 4000 4000 need to be fed into the UAV's path planning algorithm The Zynq-7000 platform consists of a dual ARM for mapping different routes to reach it's destination Cortex A-9 processor clocked at 800 MHz and 2000 2000 safely. Artix-7 FPGA as the programmable logic. Streaming Multiprocessor (SMX) Architecture Kepler GK110’s new SMX introduces several architectural innovations that make it not only the most • Cyber attacks on such critical and dynamic powerful multiprocessor we’ve built, but also the most programmable and power efficient. Copy input data and keys to GPU memory 0 information can lead to severe losses of 0 8 KB 16 KB 8 MB 128 MB 1 GB 8 KB 16 KB 8 MB 128 MB 1 GB resources and finance. SMX Control Logic SMX Control Logic pre-compute sum values for each round and store in shared memory Plaintext size Plaintext size Throughput (Mbps) comparison of TEA Throughput (Mbps) comparison of XTEA Motivation calculate ciphers for blocks in parallel • All the information from/to these smart meters need GT650M: 2 SMX with copy ciphers back to CPU Conclusion to be decrypted/encrypted at the gateway, which in 192 cores each Inside SMX GPU Implementation • GPUs and FPGAs provide better throughput for both TEA and XTEA as SMX: 192 single precision CUDA cores, 64 double precision units, 32 special function units (SFU), and 32 load/store units (LD/ST). turn can lead to very large response times. A larger compared to CPUs. Flash DRAM SRAM response time implies poorer performance in terms of both throughput and latency. GIGe USB Processing System Memory Interfaces Custom Displays PCIe Running on Zynq board Running in ISIM • FPGAs perform better for smaller plaintext sizes whereas GPUs are better for larger plaintext sizes. • Continuous transmission of data from UAV regarding CAN AXI Interconnect • In terms of development time and cost, GPUs are better suited as embedded Dual ARM Cortex A-9 Fixed MPCore (800 MHz) I2C Peripheral peripherals the evidence grid need to be encrypted fast. SelectIO Resources Processing Programmable SD System Logic cryptography co-processors as compared to FPGAs. JTAG • FPGAs and GPUs can be used in gateways to speed UART 2x 12-bit Custom Programmable • Future research efforts may address the use of Zynq platform as a complete, low- GPIO MSPS ADC Memory Logic up the TEA/XTEA encryption and decryption of bulk information for improved throughput and latency. Analog Monitors Analog cost cryptographic co-processor for more complex cryptographic algorithms Zynq Internal block diagram Hardware in Loop setup References [1] D. J. Wheeler and R. M. Needham. TEA, a tiny encryption algorithm, 1995. [2] D. J. Wheeler and R. M. Needham. TEA extensions. Technical report, Cambridge University, England, October 1997. [3] Xilinx Inc. Xilinx Zynq-7000 SoC ZC702 Evaluation kit. [4] Nvidia Inc. (Last Accessed: February 2012) Nvidia Tesla C2070 GPU Computing Processor, Nvidia GeoForce GT650M Notebook GPU [Available Online]