International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 06 | June -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 331
Auto conversion of serial C code to CUDA code
1Tejas Gijare, 2Vishal Bafna, 3Chaitanya Subhedar, 4Aniket Ingale
Computer Engineering Department, Zeal College of Engineering and Research Pune, India
-------------------------------------------------------------------------------------***--------------------------------------------------------------------------------
Abstract - Demand from the game industry and from large
scientific computations has driven the growth of GPUs, and
considerable effort has gone into maximizing their use for
computation. NVIDIA GPUs are programmed through CUDA
(Compute Unified Device Architecture) using CUDA C, an
extension of C. However, developers must first learn CUDA C.
Although GPUs are widely used in supercomputers today,
existing code is not portable to them: algorithms have to
be rewritten by hand in CUDA before they can run on a GPU.
Middleware that converts C programs to CUDA would make this
step transparent to the end user. We developed a prototype
compiler, built in Visual Studio, that converts C programs
into CUDA C. The paper describes the pattern approach used
to develop this source-to-source translator.
Keywords: Parallel Computing, Serial Computing, CUDA,
GPU, HPC
Introduction:
Driven by the demands of the game industry, Graphics
Processing Units (GPUs) have evolved from application-
specific units for 3D scene rendering into highly parallel,
programmable, multi-pipelined processors that can satisfy
extremely high computational requirements at low cost. It
is hardly surprising that the performance of today's GPUs
is much higher than that of central processing units
(CPUs) [1]. GPUs were formerly limited to computing
graphic scenes. Over time they became very powerful and
their area of use grew dramatically, which gave rise to
the term General-Purpose GPU (GPGPU) for modern graphics
accelerators. The driving force behind this rapid rise in
performance is the computer game and entertainment
industry, which puts economic pressure on GPU developers
to perform a vast number of floating-point calculations
per unit of time. Research in the field of GPGPU started
in the late 1970s. Today's fastest GPUs can deliver a peak
performance on the order of 500 GFLOPS, more than four
times the performance of the fastest x86 quad-core
processor. This paper introduces a source-to-source
transformation of C programs to the CUDA architecture. The
translator also finds dependencies and performs
optimization for peak performance gain. Its main concerns
are automatic kernel generation, detection of independent
code, loop unrolling, memory coalescing, and thread
scheduling. IR-level optimization and the discovery of
higher-level optimization patterns are important issues
that this work may cover. The paper describes
parallelization patterns and the CUDA C extensions to C in
order to derive transformation rules.
Introduction to ANTLR3:
ANTLR (ANother Tool for Language Recognition) is a
language tool that provides a framework for constructing
recognizers, interpreters, compilers, and translators from
grammatical descriptions containing actions in a variety
of target languages. ANTLR has a sophisticated grammar
development environment called ANTLRWorks. ANTLR
provides an environment for developing a compiler that
parses the input program. Lexer and parser code can be
generated in C#, C, Java, Python, and other languages.
C2CUDATranslator Flow:
The flow of the translator shows its overall functionality
and compiler structure. The input to the translator is a
pre-processed C file. It is passed to the C parser, which
is generated from an ANTLR grammar and consists of two
files, a lexer and a parser. The lexer generates tokens;
through the ITokens interface the parser applies its rules
to them and checks the code. The symbol table is then
generated. The kernel region is outlined in the
pre-processed input with pragma directives. The line
"#pragma kernel start" precedes the kernel region, so the
compiler knows that the statements that follow belong to
the kernel region, which will be ported to the GPU. The
kernel region ends with the line "#pragma kernel end". The
translator uses the symbol table to convert the kernel
region into a kernel function.
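The pragma markers can be illustrated with a small C sketch. It is not the ANTLR-based implementation of the translator; the names KernelRegion, find_kernel_region and example_source are hypothetical, and the sketch assumes the source is already available as an array of lines. It only shows how the statements between "#pragma kernel start" and "#pragma kernel end" could be located.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: line indices of the statements that lie
   between the two pragma lines, or -1/-1 if no region was found. */
typedef struct {
    int first;
    int last;
} KernelRegion;

/* Returns nonzero if the line, ignoring leading blanks, begins with prefix. */
static int starts_with(const char *line, const char *prefix) {
    while (*line == ' ' || *line == '\t')
        line++;
    return strncmp(line, prefix, strlen(prefix)) == 0;
}

/* Scan the source lines for the first marked kernel region. */
KernelRegion find_kernel_region(const char *const lines[], int count) {
    KernelRegion region = { -1, -1 };
    int start = -1;
    assert(lines != NULL && count >= 0);
    for (int i = 0; i < count; i++) {
        if (start < 0 && starts_with(lines[i], "#pragma kernel start")) {
            start = i;
        } else if (start >= 0 && starts_with(lines[i], "#pragma kernel end")) {
            if (i > start + 1) {          /* region contains statements */
                region.first = start + 1;
                region.last = i - 1;
            }
            return region;
        }
    }
    return region;                        /* no complete region found */
}

/* Example input in the form described in the text (assumed program). */
static const char *const example_source[] = {
    "void add(int n, float *a, float *b, float *c) {",
    "#pragma kernel start",
    "    for (int i = 0; i < n; i++)",
    "        c[i] = a[i] + b[i];",
    "#pragma kernel end",
    "}",
};
```

For example_source, the region covers the two lines of the for loop; everything outside it would stay on the host.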
Figure 1: Flow of the C2CUDA Translator
To write the kernel code, the translator performs the five
steps shown in the figure above, which correspond to the
proposed patterns. The translator generates code for
memory allocation on both the host and the device and
automatically generates the required pointers and
variables. It then ports the kernel region into a kernel
function. Once parallelization is added, the kernel
function runs across many threads and blocks.
Finally, the CUDA file is generated; the other functions
in the .c file are copied as they are into the .cu file.
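How a serial loop spreads over threads and blocks can be sketched in plain C. The code below is only an emulation under stated assumptions: block_idx, block_dim and thread_idx stand in for the CUDA built-ins blockIdx.x, blockDim.x and threadIdx.x, the function names are hypothetical, and the vector addition is an assumed example rather than output of the translator.

```c
#include <assert.h>

/* Number of blocks needed so that blocks * threads_per_block >= n. */
int blocks_needed(int n, int threads_per_block) {
    assert(n >= 0 && threads_per_block > 0);
    return (n + threads_per_block - 1) / threads_per_block;
}

/* Body of the original loop, executed by one logical thread. */
static void add_one_element(int block_idx, int block_dim, int thread_idx,
                            int n, const float *a, const float *b, float *c) {
    int i = block_idx * block_dim + thread_idx;   /* global element index */
    if (i < n)                  /* guard for the last, partly filled block */
        c[i] = a[i] + b[i];
}

/* Serial stand-in for a kernel launch: visits every (block, thread) pair. */
void emulate_launch(int n, int threads_per_block,
                    const float *a, const float *b, float *c) {
    int blocks = blocks_needed(n, threads_per_block);
    for (int block = 0; block < blocks; block++)
        for (int thread = 0; thread < threads_per_block; thread++)
            add_one_element(block, threads_per_block, thread, n, a, b, c);
}
```

The bounds check matters because the element count is rarely a multiple of the block size, so the last block usually contains threads with no element to process.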
Compiler Style:
The compiler has a distinctive style for handling the
kernel region. Some projects start but never reach
completion: there are many parallelization algorithms, and
the time available for this work was not enough to add all
of them. We therefore implemented a mechanism through
which more algorithms can be added later. The compiler
generates patterns for the input program. The number of
possible programs is effectively unbounded, and new
programs and algorithms will continue to be ported to CUDA
over time, so the number of patterns will keep growing as
well. Patterns are the heart of the compiler. A pattern
describes a mechanism to the compiler much as a virus
signature does in antivirus software: new viruses keep
appearing, and corresponding signatures are added to the
antivirus. Similarly, new patterns can be added to the
compiler later.
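The signature analogy suggests a table of patterns that can grow without changing the code that consults it. The following C sketch shows one possible shape under loud assumptions: the Pattern type, register_pattern, classify and the two substring-based matchers are all hypothetical, and a real translator would match on the parse tree rather than on raw text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A pattern pairs a name with a predicate that recognizes a code shape. */
typedef int (*PatternMatcher)(const char *kernel_text);

typedef struct {
    const char *name;
    PatternMatcher matches;
} Pattern;

#define MAX_PATTERNS 16
static Pattern registry[MAX_PATTERNS];
static int registry_size = 0;

/* Register a new pattern; returns 0 on success, -1 if the table is full. */
int register_pattern(const char *name, PatternMatcher matches) {
    assert(name != NULL && matches != NULL);
    if (registry_size == MAX_PATTERNS)
        return -1;
    registry[registry_size].name = name;
    registry[registry_size].matches = matches;
    registry_size++;
    return 0;
}

/* Return the name of the first registered pattern that matches, or NULL. */
const char *classify(const char *kernel_text) {
    for (int i = 0; i < registry_size; i++)
        if (registry[i].matches(kernel_text))
            return registry[i].name;
    return NULL;
}

/* Two toy matchers based on substring search (illustrative only). */
static int is_reduction(const char *text) {
    return strstr(text, "+=") != NULL;
}

static int is_simple_loop(const char *text) {
    return strstr(text, "for") != NULL;
}
```

Supporting a new algorithm then means writing one more matcher and one more register_pattern call; classify itself stays unchanged. Registration order decides which pattern wins when several match.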
Figure 2: Compiler Style
Results and Evaluations:
The compiler was tested with the Parboil benchmark suite.
The graph compares the CUDA base programs, which are
handwritten and fully optimized, with the programs
converted by C2CUDATranslator.
Figure 3: Evaluation of C2CUDATranslator
Conclusion:
The C2CUDATranslator saves 95% of the development time
spent on translation. It can also serve as a framework for
other developers building translators in the future.
References:
[1] T. J. Purcell, I. Buck, W. R. Mark, and P. Hanrahan,
"Ray tracing on programmable graphics hardware", ACM
Transactions on Graphics, vol. 21, no. 3, July 2002,
pp. 703-712.
[2] D. Knott and D. K. Pai, "CInDeR: Collision and
interference detection in real-time using graphics
hardware", Proceedings of the 2003 Conference on Graphics
Interface, June 2003, pp. 73-80.
[3] S. A. Manavski, "CUDA compatible GPU as an efficient
hardware accelerator for AES cryptography", Proc. IEEE
International Conference on Signal Processing and
Communication, ICSPC 2007, Dubai, United Arab Emirates,
November 2007, pp. 65-68.
[4] T. D. Han and T. S. Abdelrahman, "hiCUDA: High-Level
GPGPU Programming", IEEE Transactions on Parallel and
Distributed Systems, vol. 22, no. 1, Jan. 2011, pp. 78-90.
[5] Yu Liu, M. Huang, B. Huang, H.-L. A. Huang, and T. Lee,
"GPU-Accelerated Longwave Radiation Scheme of the Rapid
Radiative Transfer Model for General Circulation Models
(RRTMG)", IEEE J. Sel. Top. Appl. Earth Observ. Remote
Sens., vol. 7, pp. 3660-3667, Aug. 2014.

OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and usesDevarapalliHaritha
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 

Recently uploaded (20)

What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 
power system scada applications and uses
power system scada applications and usespower system scada applications and uses
power system scada applications and uses
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 

Auto conversion of serial C code to CUDA code

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 06 | June -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 331 Auto conversion of serial C code to CUDA code 1Tejas Gijare, 2Vishal Bafna, 3Chaitanya Subhedar, 4Aniket Ingale Computer Engineering Department, Zeal College of Engineering and Research Pune, India -------------------------------------------------------------------------------------***-------------------------------------------------------------------------------- Abstract - As the GPU is growing demand of the Game Industry and large scientific computations , efforts have been made to take advantages to gain maximum utilization of GPUs in computation. GPUs follow architecture named CUDA- Compute Unified Device Architecture. And to use GPUs there is a language CUDA C which is extension to C. But CUDA C needs to be learned by the developers. Though GPUs are widely used in Supercomputers today , they are not portable because one has to sit and code the algorithms in CUDA to run them on GPU. So if we can have some middleware that converts the C programs to CUDA, the end user gets transparency. We tried to develop a prototype compiler that is in visual studio and converts the C programs in CUDA C language. The paper describes the Pattern approach to develop a translator for source code to source code translation. Keywords: Parallel Computing, Serial Computing, CUDA, GPU, HPC Introduction: Because of the demands of game industry, Graphics Processing Units (GPUs) have evolved from application- specific units for 3D scene rendering into highly parallel and programmable multi pipelined processors that can satisfy extremely high computational requirements at low cost. The fact that the performance of graphic processing units (GPUs) is much bigger than the central processing units (CPUs) of now-a-days [1] is hardly surprising. 
GPUs were formerly focused on such limited field of computing graphic scenes. Within the course of time, GPUs became very powerful and the area of use dramatically grew. So, we can come together on the term General Purpose GPU (GPGPU) denoting modern graphic accelerators. The driving force of rapid rising of the performance is computer games and the entertainment industry that evolves economic pressure on the developers of GPUs to perform a vast number of floating-point calculations within the unit of time. The research in the field of GPGPU started in late 70's. Today's fastest GPUs can deliver a peak performance in the order of 500 GFLOPS , more than four times the performance of the fastest x86 quad-core processor. This thesis introduces a source to source transformation of c programs to CUDA Architecture. It also finds out dependencies and performs optimization for peak performance gain. Automatic evolution of kernels, independent code finding, Loop unrolling, Memory coalescing and thread scheduling are main part of concerns. IR level optimization and higher level optimizations patterns finding is important issue that may be covered by this thesis. Thesis describes parallelization patterns and CUDA C extensions from C to find out transformation rules. Introduction to ANTLR3: ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a vari-ety of target languages. ANTLR has a sophisticated grammar development environment called ANTLRWorks. ANTLR provides environment to develop a compiler that parses the input program. Lexer and Parser code can be generated in C#, C, Java, Python etc. C2CUDATranslator Flow: The flow of the translator shows the overall functionality and proper compiler structure. The input is C file which is pre-processed input to translator. 
The input is given to C Parser which is generated using ANTLR grammar and contains two files lexer and parser. Lexer generates tokens and using ITokens interface the parser rules are parsed and the code is checked using the parser. The symbol table is generated. The preprocessor outlines the kernel by pragma pack. Before starting the kernel region the line with "#pragma kernel start" comes. So the compiler can know that the next statements are of kernel region which will be ported on the GPU. The kernel is finished with the statement "#pragma kernel end". The translator uses
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 06 | June -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 332 symbol table and converts the kernel region with kernel function. Figure 1: Flow of the C2CUDA Translator For kernel code writing the translator performs 5 steps shown in figure above. They are nothing but the proposed patterns in previous sections. The translator generates codes for memory allocation on both host and devices and auto generates required pointers and variables. After that it ports the kernel region in kernel function. After parallelization is added it will be ported on many threads and blocks. Finally, the CUDA File code is generated and extra functions in .c file are as it is in .cu file. Compiler Style: For kernel region compiler has a unique style. There are some projects that start but never reach to the end. The algorithms for parallelization are more and time to add all of them are not enough for thesis. So I tried to implement a mechanism in which one can add more algorithms if one wants to. Here compiler generates patterns for the input program. By the time we can see there are N numbers of programs and infinite. New programs and algorithms will be introduced to the CUDA by the time. So there will be N number of patterns. Patterns are the heart of the compiler. Pattern describes the mechanism for the compiler like virus signature does in Antivirus software. New virus are always generated and corresponding signatures are also made in antivirus. Similarly new patterns can be added later and so on. Figure 2: Compiler Style Results and Evaluations: The compiler is tested with parboil benchmark suite. The graph shows the comparison between CUDA BASE programs, which are handwritten and fully optimized and C2CUDATranslator converted programs. Figure 3: Evaluation of C2CUDATranslator
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 06 | June -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 333 Conclusion: The C2CUDATranslator saves 95% of the development translation time. This can also be used as a framework for future translator development for other developers. References: [1] Purcell T. J., Buck I., Mark W. R., Hanrahan P., Ray tracing on programmable graphics hardware, ACM Transactions on Graphics 21, 3 (July 2002), pp 703712. [2] Knott D., Pai D. K., CInDeR: Collision and interference detection in real-time using graphics hardware, Proceedings of the 2003 Conference on Graphics Interface, June 2003, pp. 7380. [3] Svetlin A. Manavski, "Cuda compatible GPU as an efficient hardware accelerator for AES cryptography" Proc. IEEE International Conference on Signal Processing and Communication, ICSPC 2007, (Dubai, United Arab Emirates), November 2007, pp.65-68. [4] T. D. Han and T. S. Abdelrahman, "hiCUDA: High-Level GPGPU Programming", IEEE Transactions on Parallel and Distributed Systems, Jan.2011, vol. 22, no. 1, pp. 78-90. [5]Yu Liu, M. Huang, B. Huang, H.-L. A Huang, and T.Lee, "GPU-Accelerated Longwave Radiation Scheme of the Rapid 1508 Radiative Transfer Model for General Circulation Models (RRTMG)" IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens., vol. 7, pp. 3660-3667, Aug, 2014.