SPEED-UP IMPROVEMENT USING PARALLEL APPROACH IN IMAGE STEGANOGRAPHY (csandit)
This paper presents a parallel approach to the time-complexity problem associated with sequential algorithms. An image steganography algorithm in the transform domain is considered for implementation. Image steganography is a technique for hiding a secret message in an image. With the parallel implementation, a large message can be hidden in a large image without excessive processing time. The algorithm is implemented on GPU systems; parallel programming is done using OpenCL on CUDA cores from NVIDIA. The speed-up obtained is very good, with reasonably good output signal quality, when a large amount of data is processed.
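The embed-and-recover round trip the abstract describes can be sketched compactly. Note the paper works in the transform domain; as a simplified illustration only, the sketch below uses spatial-domain least-significant-bit (LSB) embedding on a toy grayscale "image" (a named substitution, not the paper's method):

```python
# Simplified illustration: least-significant-bit (LSB) embedding in the
# spatial domain (the paper itself embeds in the transform domain).

def embed(pixels, message_bits):
    """Hide one message bit per pixel in the least significant bit."""
    assert len(message_bits) <= len(pixels)
    stego = pixels[:]
    for i, bit in enumerate(message_bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to the bit
    return stego

def extract(pixels, n_bits):
    """Recover the first n_bits hidden bits."""
    return [p & 1 for p in pixels[:n_bits]]

image = [120, 131, 97, 200, 55, 18, 240, 77]   # toy 8-pixel "image"
secret = [1, 0, 1, 1, 0, 1]                    # message bits to hide
stego = embed(image, secret)
assert extract(stego, len(secret)) == secret
# Each pixel changes by at most 1, so the distortion stays small.
assert all(abs(a - b) <= 1 for a, b in zip(image, stego))
```

Because every pixel is processed independently, this embedding loop is exactly the kind of work that maps one-thread-per-pixel onto a GPU.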
Performance Optimization of Clustering On GPU (ijsrd.com)
In today's digital world, data sets are growing exponentially. Statistical analysis using clustering in various scientific and engineering applications becomes a very challenging issue for such large data sets. Clustering of huge data sets and its performance are two major factors that demand optimization. Parallelization is a well-known approach to optimizing performance, and recent research shows that GPU-based parallelization helps achieve a high degree of performance. Hence, this thesis focuses on optimizing hierarchical clustering algorithms using parallelization. It covers implementation of the optimized algorithm in various parallel environments, using OpenMP on multi-core architectures and CUDA on many-core architectures.
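A minimal sequential sketch of the hierarchical clustering being optimized (single-linkage agglomerative clustering on 1-D points, a toy assumption for brevity) shows where the parallel work lives: the pairwise distance scan in each merge step is what OpenMP threads or CUDA kernels accelerate.

```python
# Naive single-linkage agglomerative clustering: repeatedly merge the two
# closest clusters until only k remain. The O(n^2) distance scan per merge
# is the hot loop that GPU parallelization targets.

def single_linkage(points, k):
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)   # merge the closest pair
    return [sorted(c) for c in clusters]

data = [1.0, 1.2, 1.1, 9.0, 9.3, 20.0]
result = single_linkage(data, 3)
assert sorted(map(tuple, result)) == [(1.0, 1.1, 1.2), (9.0, 9.3), (20.0,)]
```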
The objective of this paper is to present a hybrid approach for edge detection. Under this technique, edge detection is performed in two phases. In the first phase, the Canny algorithm is applied for image smoothing, and in the second phase a neural network detects the actual edges. A neural network is a useful tool for edge detection, as it is a non-linear network with built-in thresholding capability. A neural network can be trained with the back-propagation technique using a few training patterns, but the most important and difficult part is to identify a correct and proper training set.
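The core intuition behind the detection phase can be sketched with a much simpler stand-in: a 1-D gradient detector with a fixed threshold. This is not the paper's Canny-plus-network pipeline (which adds Gaussian smoothing, non-maximum suppression, and a trained threshold); it only illustrates what "detecting edges by thresholding intensity jumps" means.

```python
# Simplified 1-D edge detector: an edge is a position where the intensity
# jump between neighbouring pixels exceeds a threshold. The hybrid method
# in the text replaces this fixed cut with a trained neural network.

def edges_1d(row, threshold):
    grad = [abs(row[i + 1] - row[i]) for i in range(len(row) - 1)]
    return [i for i, g in enumerate(grad) if g > threshold]

scanline = [10, 11, 10, 10, 200, 201, 200, 10, 10]  # two sharp transitions
assert edges_1d(scanline, threshold=50) == [3, 6]
```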
DYNAMIC TASK PARTITIONING MODEL IN PARALLEL COMPUTING (cscpconf)
Parallel computing systems compose task partitioning strategies in a true multiprocessing manner. Such systems share the algorithm and processing units as computing resources, which leads to heavy inter-process communication. The main part of the proposed algorithm is a resource management unit which performs task partitioning and co-scheduling. In this paper, we present a technique for integrated task partitioning and co-scheduling on a privately owned network. We focus on real-time and non-preemptive systems. A large variety of experiments have been conducted on the proposed algorithm using synthetic and real tasks. The goal of the computation model is to provide a realistic representation of the costs of programming. The results show the benefit of the task partitioning. The main characteristics of our method are optimal scheduling and a strong link between partitioning, scheduling, and communication. Some important models for task partitioning are also discussed in the paper. We target an algorithm for task partitioning which improves inter-process communication between the tasks and uses the resources of the system in an efficient manner. The proposed algorithm contributes to minimizing the inter-process communication cost amongst the executing processes.
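One classic heuristic from the task-partitioning literature the abstract touches on is longest-processing-time-first (LPT) assignment. The sketch below is a generic illustration of that heuristic only; the paper's resource management unit additionally handles co-scheduling and communication costs, which this toy version ignores.

```python
# LPT partitioning: hand the next-longest task to the currently
# least-loaded processing unit, tracked with a min-heap of loads.

import heapq

def lpt_partition(task_costs, n_units):
    loads = [(0, u) for u in range(n_units)]   # (load, unit id) min-heap
    heapq.heapify(loads)
    assignment = {}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, unit = heapq.heappop(loads)
        assignment[task] = unit
        heapq.heappush(loads, (load + cost, unit))
    return sorted(load for load, _ in loads), assignment

loads, assign = lpt_partition({"t1": 7, "t2": 5, "t3": 4, "t4": 4}, 2)
assert loads == [9, 11]    # makespan 11, versus 20 on a single unit
```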
Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over... (Ilham Amezzane)
Support Vector Machines (SVMs) have proven to yield high accuracy and have been used widely in recent years. However, the standard versions of the SVM algorithm are very time-consuming and computationally intensive, which challenges engineers to explore hardware architectures other than the CPU that can perform real-time training and classification while maintaining low power consumption in embedded systems. This paper proposes an overview of works based on the two most popular parallel processing devices, GPU and FPGA, with a focus on the multiclass training process. Since the different techniques have been evaluated using different experimentation platforms and methodologies, we focus only on the improvements realized in each study.
NETWORK-AWARE DATA PREFETCHING OPTIMIZATION OF COMPUTATIONS IN A HETEROGENEOU... (IJCNCJournal)
The rapid development of diverse computer architectures and hardware accelerators means that the design of parallel systems faces new problems resulting from their heterogeneity. Our implementation of a parallel system called KernelHive allows applications to run efficiently in a heterogeneous environment consisting of multiple collections of nodes with different types of computing devices. The execution engine of the system is open to optimizer implementations focusing on various criteria. In this paper, we propose a new optimizer for KernelHive that utilizes distributed databases and performs data prefetching to optimize the execution time of applications which process large input data. Employing a versatile data management scheme, which allows combining various distributed data providers, we propose using NoSQL databases for our purposes. We support our solution with results of experiments with real executions of our OpenCL implementation of a regular expression matching application in various hardware configurations. Additionally, we propose a network-aware scheduling scheme for selecting hardware for the proposed optimizer and present simulations that demonstrate its advantages.
HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardin... (Sunny Kr)
Cardinality estimation has a wide range of applications and is of particular importance in database systems. Various algorithms have been proposed in the past, and the HyperLogLog algorithm is one of them.
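The core of HyperLogLog fits in a few lines: hash each item, use the first p bits to pick a register, and keep the maximum "leading-zero rank" seen per register. The sketch below implements that core (with p = 8 and SHA-1 as a stand-in hash, both assumptions for the example); the engineering work the title refers to adds bias correction and sparse representations on top of this.

```python
# Minimal HyperLogLog cardinality estimator with m = 256 registers.

import hashlib

P = 8                       # register-index bits
M = 1 << P                  # number of registers
ALPHA = 0.7213 / (1 + 1.079 / M)   # standard bias-correction constant

def _hash64(item):
    return int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")

def estimate(items):
    regs = [0] * M
    for item in items:
        h = _hash64(item)
        idx = h >> (64 - P)                        # first p bits pick a register
        rest = h & ((1 << (64 - P)) - 1)
        rank = (64 - P) - rest.bit_length() + 1    # leading zeros + 1
        regs[idx] = max(regs[idx], rank)
    return ALPHA * M * M / sum(2.0 ** -r for r in regs)

true_n = 20000
est = estimate(f"user-{i}" for i in range(true_n))
# Typical relative error is about 1.04/sqrt(256) ≈ 6.5%.
assert abs(est - true_n) / true_n < 0.2
```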
International Refereed Journal of Engineering and Science (IRJES) (irjes)
International Refereed Journal of Engineering and Science (IRJES) is a leading international journal for the publication of new ideas, state-of-the-art research results, and fundamental advances in all aspects of engineering and science. IRJES is an open-access, peer-reviewed international journal whose primary objective is to provide the academic community and industry with a venue for the submission of original research and applications.
Simulation of Single and Multilayer of Artificial Neural Network using Verilog (ijsrd.com)
Artificial neural networks play an important role in VLSI circuits for finding and diagnosing multiple faults in digital circuits. In this paper, examples of single-layer and multi-layer neural networks are discussed; these structures are then implemented in Verilog code, and the same design is implemented in MATLAB to obtain the number of iterations, while the Verilog code gives the time taken to adjust the weights until the error becomes almost zero. The proposed approach aims at reducing resource requirements, without much compromise on speed, so that the neural network can be realized on a single chip at lower cost.
HARDWARE SOFTWARE CO-SIMULATION FOR TRAFFIC LOAD COMPUTATION USING MATLAB SIM... (ijcsity)
Due to the increase in the number of vehicles, traffic is a major problem faced in urban areas throughout the world. This document presents a newly developed MATLAB Simulink model to compute traffic load for real-time traffic signal control. Signal processing, video and image processing, and the Xilinx Blockset have been used extensively for traffic load computation. The approach used is an edge detection operation, wherein edges are extracted to identify the number of vehicles. The developed model computes the results with a high degree of accuracy and is capable of being used to set the green signal duration so as to release the traffic dynamically at traffic junctions.
Xilinx System Generator (XSG) provides a Simulink Blockset for several hardware operations that can be implemented on various Xilinx field-programmable gate arrays (FPGAs). The method described in this paper involves object feature identification and detection. Xilinx System Generator provides blocks to transfer data from the software side of the simulation environment to the hardware side; in our case these are the MATLAB Simulink to System Generator blocks. This is an important concept to understand in the design process using Xilinx System Generator. The Xilinx System Generator, embedded in MATLAB Simulink, is used to program the model, which is then tested on the FPGA board using hardware co-simulation tools.
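The counting step implied by "edges are extracted to identify the number of vehicles" reduces, in its simplest form, to counting connected blobs in a binary mask. The toy version below (plain Python flood fill, standing in for the Xilinx Blockset operations the model actually uses) shows that reduction:

```python
# After edge extraction and thresholding, each vehicle shows up as a
# connected blob of 1s, so counting vehicles = counting 4-connected
# components in the binary mask.

def count_blobs(mask):
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                blobs += 1
                stack = [(r, c)]                     # flood-fill one component
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols and mask[y][x] and not seen[y][x]:
                        seen[y][x] = True
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return blobs

frame = [[1, 1, 0, 0, 0],
         [1, 1, 0, 1, 1],
         [0, 0, 0, 1, 1],
         [1, 0, 0, 0, 0]]
assert count_blobs(frame) == 3   # two larger blobs plus one stray pixel
```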
A brief explanation of the tau-leaping process, parallel processing, and NVIDIA's CUDA architecture, and of the use of cuTauLeaping for the simulation of biological systems.
Implementation of p-PIC algorithm in MapReduce to handle big data (eSAT Publishing House)
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of engineering and technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of engineering and technology. We bring together scientists, academicians, field engineers, scholars, and students of related fields of engineering and technology.
High Performance Parallel Computing with Clouds and Cloud Technologies (jaliyae)
Infrastructure services (Infrastructure-as-a-Service), provided by cloud vendors, allow any user to provision a large number of compute instances fairly easily. Whether leased from public clouds or allocated from private clouds, utilizing these virtual resources to perform data- or compute-intensive analyses requires employing different parallel runtimes to implement such applications. Among many parallelizable problems, most "pleasingly parallel" applications can be handled fairly easily using MapReduce technologies such as Hadoop, CGL-MapReduce, and Dryad. However, many scientific applications, which have complex communication patterns, still require low-latency communication mechanisms and the rich set of communication constructs offered by runtimes such as MPI. In this paper, we first discuss large-scale data analysis using different MapReduce implementations and then present a performance analysis of high-performance parallel applications on virtualized resources.
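The MapReduce shape that makes "pleasingly parallel" applications easy can be sketched in plain Python: the same three stages (map, shuffle/group, reduce) that Hadoop, CGL-MapReduce, and Dryad distribute across many machines, shown here sequentially on a word-count example.

```python
# The three stages of the MapReduce programming model, in miniature.

from collections import defaultdict

def map_phase(documents):
    for doc in documents:
        for word in doc.split():
            yield (word, 1)                  # emit one (key, value) pair per word

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)            # group all values by key
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

docs = ["cloud cloud mpi", "mpi runtime", "cloud"]
counts = reduce_phase(shuffle(map_phase(docs)))
assert counts == {"cloud": 3, "mpi": 2, "runtime": 1}
```

Because each map call and each reduce call is independent, the framework can run them on any node; the shuffle is the only stage that requires communication.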
Accelerating Real Time Applications on Heterogeneous Platforms (IJMER)
In this paper we describe novel implementations of depth estimation from stereo images using feature extraction algorithms that run on the graphics processing unit (GPU), which is suitable for real-time applications like analyzing video in real-time vision systems. Modern graphics cards contain a large number of parallel processors and high-bandwidth memory for accelerating data computation operations. In this paper we give a general idea of how to accelerate real-time applications using heterogeneous platforms. We propose to use the added resources to apply more computationally involved optimization methods. This approach will indirectly accelerate a database by producing better plan quality.
Comparative Study of Neural Networks Algorithms for Cloud Computing CPU Sched... (IJECEIAES)
Cloud computing is the most powerful computing model of our time. While the major IT providers and consumers are competing to exploit the benefits of this computing model in order to grow their profits, most cloud computing platforms are still built on operating systems that use basic CPU (Central Processing Unit) scheduling algorithms which lack the intelligence needed for such an innovative computing model. Correspondingly, this paper presents the benefits of applying Artificial Neural Network algorithms to enhancing CPU scheduling for the cloud computing model. Furthermore, a set of characteristics and theoretical metrics are proposed for comparing the different Artificial Neural Network algorithms and finding the most accurate algorithm for cloud computing CPU scheduling.
Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP (IJCSEIT Journal)
Current multi-core architectures have become popular due to their performance and efficient simultaneous processing of multiple tasks. Today's parallel algorithms therefore focus on multi-core systems. The design of parallel algorithms and their performance measurement are the major issues in a multi-core environment. If one wishes to execute a single application faster, the application must be divided into subtasks or threads to deliver the desired result. Numerical problems, especially the solution of linear systems of equations, have many applications in science and engineering. This paper describes and analyzes parallel algorithms for computing the solution of a dense system of linear equations, and for approximately computing the value of π, using the OpenMP interface. The performance (speedup) of the parallel algorithms on a multi-core system is presented. The experimental results on a multi-core processor show that the proposed parallel algorithms achieve good performance (speedup) compared to the sequential ones.
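The π computation that OpenMP parallelizes is typically a reduction over the midpoint rule for ∫₀¹ 4/(1+x²) dx = π. The sketch below shows the same decomposition (assuming four chunks and a thread pool; in C with OpenMP the loop would instead carry `#pragma omp parallel for reduction(+:sum)`):

```python
# Midpoint-rule estimate of pi, split into independent partial sums that
# are computed concurrently and then reduced, mirroring an OpenMP
# reduction over the integration loop.

from concurrent.futures import ThreadPoolExecutor
import math

N = 100_000                                   # number of rectangles

def partial_sum(chunk):
    lo, hi = chunk
    return sum(4.0 / (1.0 + ((i + 0.5) / N) ** 2) for i in range(lo, hi))

chunks = [(i * N // 4, (i + 1) * N // 4) for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    pi_estimate = sum(pool.map(partial_sum, chunks)) / N

assert abs(pi_estimate - math.pi) < 1e-8
```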
A COMPARISON BETWEEN PARALLEL AND SEGMENTATION METHODS USED FOR IMAGE ENCRYPT... (ijcsit)
Preserving the confidentiality, integrity, and authenticity of images is becoming very important. There are many different encryption techniques to protect images from unauthorized access. Matrix multiplication can be successfully used to encrypt and decrypt digital images. In this paper we present a comparison study between two image encryption techniques based on matrix multiplication, namely the segmentation and parallel methods.
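The underlying idea can be sketched on one block: multiply the image (as a matrix) by an invertible key matrix to encrypt, and by its inverse to decrypt. The 2×2 integer key below is a hypothetical example (det = 1, so the inverse is also integral); the compared methods differ in whether such blocks are processed in segments or in parallel.

```python
# Encrypt one pixel block by matrix multiplication with a key matrix,
# then recover it by multiplying with the key's inverse.

def matmul(a, b):
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

key = [[2.0, 1.0],
       [1.0, 1.0]]           # hypothetical key, det = 1
key_inv = [[1.0, -1.0],
           [-1.0, 2.0]]      # exact inverse of key

block = [[52.0, 61.0],
         [70.0, 85.0]]       # one 2x2 block of pixel intensities

cipher = matmul(key, block)
plain = matmul(key_inv, cipher)
assert plain == block        # decryption recovers the original block
```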
Image Processing Application on Graphics processors (CSCJournals)
In this work, we introduce real-time image processing techniques using modern programmable Graphics Processing Units (GPUs). GPUs are SIMD (Single Instruction, Multiple Data) devices that are inherently data-parallel. By utilizing NVIDIA's GPU programming framework, Compute Unified Device Architecture (CUDA), as a computational resource, we realize significant acceleration in image processing algorithm computations. We show that a range of computer vision algorithms map readily to CUDA with significant performance gains. Specifically, we demonstrate the efficiency of our approach through the parallelization and optimization of image processing, morphology applications, and the image integral.
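One of the kernels mentioned, the image integral (summed-area table), is easy to show sequentially: each entry holds the sum of all pixels above and to the left. On the GPU this is typically computed with parallel prefix sums over rows and then columns; the sketch below is the sequential reference version.

```python
# Summed-area table: out[r][c] = sum of img[0..r][0..c], built with the
# standard inclusion-exclusion recurrence.

def integral_image(img):
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = (img[r][c]
                         + (out[r - 1][c] if r else 0)
                         + (out[r][c - 1] if c else 0)
                         - (out[r - 1][c - 1] if r and c else 0))
    return out

img = [[1, 2, 3],
       [4, 5, 6]]
sat = integral_image(img)
assert sat == [[1, 3, 6],
               [5, 12, 21]]
# Any rectangle sum now takes O(1) lookups, e.g. the whole image:
assert sat[1][2] == sum(sum(row) for row in img)
```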
Cuda Based Performance Evaluation Of The Computational Efficiency Of The Dct ... (acijjournal)
Recent advances in computing, such as massively parallel GPUs (Graphics Processing Units), coupled with the need to store and deliver large quantities of digital data, especially images, have brought a number of challenges for computer scientists, the research community, and other stakeholders. These challenges, such as the prohibitively large cost of manipulating digital data, have been the focus of the research community in recent years and have led to the investigation of image compression techniques that can achieve excellent results. One such technique is the Discrete Cosine Transform (DCT), which helps separate an image into parts of differing frequencies and has the advantage of excellent energy compaction.
This paper investigates the use of the Compute Unified Device Architecture (CUDA) programming model to implement the Cordic-based Loeffler DCT algorithm for efficient image compression. The computational efficiency is analyzed and evaluated on both the CPU and the GPU. The PSNR (Peak Signal to Noise Ratio) is used to evaluate image reconstruction quality. The results are presented and discussed.
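The two quantities the paper combines can be sketched together: a 1-D DCT-II (the separable building block of the 2-D image transform; the Cordic-based Loeffler variant computes the same coefficients with fewer multiplications) and PSNR between an original and a reconstructed signal. The crude "drop the high frequencies" compression below is an illustrative assumption, not the paper's quantization scheme.

```python
# Unnormalized 1-D DCT-II, its inverse, and PSNR, demonstrating
# energy compaction on one 8-sample block.

import math

def dct(x):
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for n in range(N)) for k in range(N)]

def idct(X):
    N = len(X)
    return [(X[0] / 2 + sum(X[k] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                            for k in range(1, N))) * 2 / N for n in range(N)]

def psnr(orig, recon, peak=255.0):
    mse = sum((a - b) ** 2 for a, b in zip(orig, recon)) / len(orig)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

block = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
coeffs = dct(block)
compressed = coeffs[:4] + [0.0] * 4       # drop the 4 highest frequencies
recon = idct(compressed)
assert psnr(block, block) == float("inf") # identical signals
assert psnr(block, recon) > 30            # energy compaction keeps quality high
```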
Hybrid Model Based Testing Tool Architecture for Exascale Computing System (CSCJournals)
Exascale computing refers to a computing system capable of at least one exaflop, expected within the next few years. Many new programming models, architectures, and algorithms have been introduced to attain the objective of an exascale computing system; the primary objective is to enhance system performance. In modern supercomputers, the GPU is used to attain high computing performance, and the objective of the proposed technologies and programming models is much the same: to make the GPU more powerful. However, these technologies still face a number of challenges, including parallelism, scale, and complexity, that must be addressed to make the computing system more powerful and efficient. In this paper, we present a testing tool architecture for a parallel programming approach using two programming models, CUDA and OpenMP. Both CUDA and OpenMP can be used to program shared memory and GPU cores. The objective of this architecture is to identify static errors in a program that arise while writing the code and cause an absence of parallelism. Our architecture encourages developers to write feasible code through which essential errors in the program can be avoided so that it runs successfully.
An Investigation towards Effectiveness in Image Enhancement Process in MPSoC (IJECEIAES)
Image enhancement has a primitive role in the vision-based applications. It involves the processing of the input image by boosting its visualization for various applications. The primary objective is to filter the unwanted noises, clutters, sharpening or blur. The characteristics such as resolution and contrast are constructively altered to obtain an outcome of an enhanced image in the bio-medical field. The paper highlights the different techniques proposed for the digital enhancement of images. After surveying these methods that utilize Multiprocessor System-on-Chip (MPSoC), it is concluded that these methodologies have little accuracy and hence none of them are efficiently capable of enhancing the digital biomedical images.
A SURVEY OF NEURAL NETWORK HARDWARE ACCELERATORS IN MACHINE LEARNING mlaij
The use of Machine Learning in Artificial Intelligence is the inspiration that shaped technology as it is today. Machine Learning has the power to greatly simplify our lives. Improvement in speech recognition and language understanding help the community interact more naturally with technology. The popularity of machine learning opens up the opportunities for optimizing the design of computing platforms using welldefined hardware accelerators. In the upcoming few years, cameras will be utilised as sensors for several applications. For ease of use and privacy restrictions, the requested image processing should be limited to a local embedded computer platform and with a high accuracy. Furthermore, less energy should be consumed. Dedicated acceleration of Convolutional Neural Networks can achieve these targets with high flexibility to perform multiple vision tasks. However, due to the exponential growth in technology constraints (especially in terms of energy) which could lead to heterogeneous multicores, and increasing number of defects, the strategy of defect-tolerant accelerators for heterogeneous multi-cores may become a main micro-architecture research issue. The up to date accelerators used still face some performance issues such as memory limitations, bandwidth, speed etc. This literature summarizes (in terms of a survey) recent work of accelerators including their advantages and disadvantages to make it easier for developers with neural network interests to further improve what has already been established.
High Performance Computing for Satellite Image Processing and Analyzing – A ...Editor IJCATR
High Performance Computing (HPC) is the recently developed technology in the field of computer science, which evolved
due to meet increasing demands for processing speed and analysing/processing huge size of data sets. HPC brings together several
technologies such as computer architecture, algorithm, programs and system software under one canopy to solve/handle advanced
complex problems quickly and effectively. It is a crucial element today to gather and process large amount of satellite (remote sensing)
data which is the need of an hour. In this paper, we review recent development in HPC technology (Parallel, Distributed and Cluster
Computing) for satellite data processing and analysing. We attempt to discuss the fundamentals of High Performance Computing
(HPC) for satellite data processing and analysing, in a way which is easy to understand without much previous background. We sketch
the various HPC approach such as Parallel, Distributed & Cluster Computing and subsequent satellite data processing & analysing
methods like geo-referencing, image mosaicking, image classification, image fusion and Morphological/neural approach for hyperspectral satellite data. Collective, these works deliver a snapshot, tables and algorithms of the recent developments in those sectors and
offer a thoughtful perspective of the potential and promising challenges of satellite data processing and analysing using HPC
paradigms.
First phase report on "ANALYZING THE EFFECTIVENESS OF THE ADVANCED ENCRYPTION...Nikhil Jain
To implement and improve the performance of Advanced Encryption Standard algorithm by using multicore systems and OpenMP API extracting as much parallelism as possible from the algorithm in parallel implementation approach.
Real-Time Implementation and Performance Optimization of Local Derivative Pat...IJECEIAES
Pattern based texture descriptors are widely used in Content Based Image Retrieval (CBIR) for efficient retrieval of matching images. Local Derivative Pattern (LDP), a higher order local pattern operator, originally proposed for face recognition, encodes the distinctive spatial relationships contained in a local region of an image as the feature vector. LDP efficiently extracts finer details and provides efficient retrieval however, it was proposed for images of limited resolution. Over the period of time the development in the digital image sensors had paid way for capturing images at a very high resolution. LDP algorithm though very efficient in content-based image retrieval did not scale well when capturing features from such high-resolution images as it becomes computationally very expensive. This paper proposes how to efficiently extract parallelism from the LDP algorithm and strategies for optimally implementing it by exploiting some inherent General-Purpose Graphics Processing Unit (GPGPU) characteristics. By optimally configuring the GPGPU kernels, image retrieval was performed at a much faster rate. The LDP algorithm was ported on to Compute Unified Device Architecture (CUDA) supported GPGPU and a maximum speed up of around 240x was achieved as compared to its sequential counterpart.
Performance Comparison between Pytorch and Mindsporeijdms
Deep learning has been well used in many fields. However, there is a large amount of data when training neural networks, which makes many deep learning frameworks appear to serve deep learning practitioners, providing services that are more convenient to use and perform better. MindSpore and PyTorch are both deep learning frameworks. MindSpore is owned by HUAWEI, while PyTorch is owned by Facebook. Some people think that HUAWEI's MindSpore has better performance than FaceBook's PyTorch, which makes deep learning practitioners confused about the choice between the two. In this paper, we perform analytical and experimental analysis to reveal the comparison of training speed of MIndSpore and PyTorch on a single GPU. To ensure that our survey is as comprehensive as possible, we carefully selected neural networks in 2 main domains, which cover computer vision and natural language processing (NLP). The contribution of this work is twofold. First, we conduct detailed benchmarking experiments on MindSpore and PyTorch to analyze the reasons for their performance differences. This work provides guidance for end users to choose between these two frameworks.
DEVELOPMENT AND PERFORMANCE EVALUATION OF A LAN-BASED EDGE-DETECTION TOOL ijsc
This paper presents a description and performance evaluation of an efficient and reliable edge-detection tool that utilize the growing computational power of local area networks (LANs). It is therefore referred to as LAN-based edge detection (LANED) tool. The processor-farm methodology is used in porting the
sequential edge-detection calculations to run efficiently on the LAN. In this methodology, each computer on the LAN executes the same program independently from other computers, each operating on different part of the total data. It requires no data communication other than that involves in forwarding input
data/results between the LAN computers. LANED uses the Java parallel virtual machine (JPVM) data communication library to exchange data between computers. For equivalent calculations, the computation times on a single computer and a LAN of various number of computers, are estimated, and the resulting
speedup and parallelization efficiency, are computed. The estimated results demonstrated that parallelization efficiencies achieved vary between 87% to 60% when the number of computers on the LAN varies between 2 to 5 computers connected through 10/100 Mbps Ethernet switch.
Development and Performance Evaluation Of A Lan-Based EDGE-Detection Tool ijsc
This paper presents a description and performance evaluation of an efficient and reliable edge-detection tool that utilize the growing computational power of local area networks (LANs). It is therefore referred to as LAN-based edge detection (LANED) tool. The processor-farm methodology is used in porting the sequential edge-detection calculations to run efficiently on the LAN. In this methodology, each computer on the LAN executes the same program independently from other computers, each operating on different part of the total data. It requires no data communication other than that involves in forwarding input data/results between the LAN computers. LANED uses the Java parallel virtual machine (JPVM) data communication library to exchange data between computers. For equivalent calculations, the computation times on a single computer and a LAN of various number of computers, are estimated, and the resulting speedup and parallelization efficiency, are computed. The estimated results demonstrated that parallelization efficiencies achieved vary between 87% to 60% when the number of computers on the LAN varies between 2 to 5 computers connected through 10/100 Mbps Ethernet switch.
126 Computer Science & Information Technology (CS & IT)
There are several different forms of parallelism: bit-level, instruction-level, data, and task parallelism. GPUs are very efficient for parallel algorithms in which large blocks of data are processed in parallel. The GPU architecture is suitable not only for graphics-rendering algorithms but also for general parallel algorithms in a wide variety of application domains. General-purpose computing on graphics processing units (GPGPU) is the use of a GPU, which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU).
1.2 OPENCL FRAMEWORK
Open Computing Language (OpenCL) is a framework for writing programs that execute across
heterogeneous platforms consisting of central processing units (CPUs), GPUs, digital signal
processors (DSPs) and other processors. It includes a language for writing kernels plus application
programming interfaces (APIs) that are used to define and then control the platforms. It provides
parallel computing using task-based and data-based parallelism. It is an open standard maintained
by the non-profit technology consortium Khronos Group and supported by AMD, IBM, Intel, NVIDIA, and others [1]. NVIDIA's Compute Unified Device Architecture (CUDA) platform is the earliest widely adopted programming model for GPU computing.
1.2.1 OpenCL Execution Model
• The platform model presents an abstract device architecture that programmers target when writing OpenCL code. The platform model is shown in Fig. 1. A host is connected to one or more OpenCL devices.
• A device consists of one or more compute units.
• A compute unit consists of one or more processing elements.
Figure 1. The platform model
In OpenCL, a context is an abstract container that exists on the host. A context coordinates the
mechanisms for host–device interaction, manages the memory objects that are available to the
devices, and keeps track of the programs and kernels that are created for each device.
The following are associated with a context.
• Devices: The things doing the execution.
• Program Objects: The program source that implements the kernels.
• Kernels: Functions that run on OpenCL devices.
• Memory Objects: Data that are operated on by the device.
• Command Queues: Mechanisms for interaction with the devices.
The unit of concurrent execution in OpenCL is a work-item. Scalability can be achieved by dividing the work-items of an NDRange into smaller, equally sized workgroups, as shown in Fig. 2. An index space with N dimensions requires workgroups to be specified using the same N dimensions. Work-items within a workgroup have a special relationship with one another: they can perform barrier operations to synchronize, and they have access to a shared memory address space [2].
Figure 2. Work-items are created as an NDRange and grouped in workgroups
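The mapping from global work-item ids to workgroup and local ids can be sketched with plain index arithmetic. The following is a Python illustration of the 1-D case, not OpenCL code itself:

```python
def workitem_ids(global_id, local_size):
    """Return (workgroup id, local id) for a work-item, mirroring what
    get_group_id(0) and get_local_id(0) yield inside an OpenCL kernel."""
    return global_id // local_size, global_id % local_size

# A 1-D NDRange of 8 work-items split into workgroups of size 4:
ids = [workitem_ids(g, 4) for g in range(8)]
# work-items 0-3 form workgroup 0; work-items 4-7 form workgroup 1
```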
In this paper, an image steganography algorithm that hides a text message in an image is implemented in parallel. The factor considered for parallel program performance is execution time. The image used to hide the message is called the cover image, the secret message is also called the payload, and the cover object together with the payload is called the stego object.
2. RELATED WORK
Parallel algorithms in scientific applications can be programmed on modern Graphics Processing Units (GPUs) with high performance at low cost. A GPU parallel implementation of an iterative image reconstruction algorithm using CUDA is proposed by Belzunce et al. [3]. They analyzed the image quality on each platform and showed that the execution time is slowed down by a factor of four; due to race-condition problems, a higher level of noise is found in the GPU results.
Cheong Ghil Kim & Yong Soo Choi [4] presented a parallel implementation of the 2D DCT in a heterogeneous environment using a multi-core CPU and many-core GPUs. They discussed two parallel programming techniques, Intel TBB (Threading Building Blocks) and OpenCL: TBB is used to exploit multi-core parallelism and OpenCL to exploit many-core parallelism. The OpenCL implementation on the GPU shows linear speed-up as the size of the 2D data set increases.
An image steganography algorithm is implemented in parallel by Silvana Greca and Edlira Martiri [5]. The analysis is done in terms of speed-up and efficiency: for speed-up, the numbers of instructions executed sequentially and in parallel are considered, and efficiency is the ratio of speed-up to the number of processors. It is shown that the optimal number of processors is 600 for that particular algorithm; the implementation details are not specified in the paper.
Using a heterogeneous parallel platform, Sangmin Seo et al. [6] characterized the performance of an OpenCL implementation of the NAS Parallel Benchmark suite (NPB) and compared it to the OpenMP version. The OpenCL version showed different characteristics on different OpenCL compute devices. They showed that, although OpenCL provides source-code portability, an application needs to be rewritten or re-optimized to obtain the best performance on a different compute device.
Anastasia Ioannidou et al. [7] presented a method to hide a large amount of data using a technique based on the edges present in an image. They produced a new steganographic algorithm by combining a method that embeds stego bits in edge pixels derived by a hybrid edge detector with a high-payload technique for colour images. The high-payload technique specifies the number of bits to be embedded in a pixel; one more bit is included for each RGB channel if the current pixel examined by the hybrid edge detector is an edge pixel.
Based on an integer transform and adaptive embedding, Fei Peng et al. [8] presented a new reversible data-hiding algorithm suitable for high capacity. A pre-estimated distortion is used to classify the image block type, and the parameter of the integer transform is selected adaptively in different blocks, so that more data bits are embedded in these blocks while large distortions are avoided.
Lee et al. [9] presented a method where one or more images are hidden inside a large cover image. The secret image is compressed using JPEG2000 and embedded in the cover image using tri-way pixel-value differencing. Because of the high compression ratio of JPEG2000, the quality of the extracted secret image may be reduced; to compensate for this loss, residual value coding is proposed. The hidden images are extracted from the stego images without requiring the original cover image.
3. PROPOSED METHOD
The image steganography algorithm, which hides a text message in the transform domain, is implemented both sequentially and in parallel. The text to be hidden is encrypted before being embedded in the image.
Then integer wavelet transform is performed on the cover image so as to generate a coefficient
matrix and then the encrypted text is embedded in the least significant bits of the coefficient
matrix. Inverse integer wavelet transform is then performed on the resultant coefficient matrix so
as to generate the stego image. This process is reversed to extract the encrypted text from the
stego image and then decrypted to get the original text message that was hidden in the cover
image. The embedding and extracting processes are shown in Fig. 3 and Fig. 4 respectively.
Figure 3. Embedding Process
Figure 4. Extracting Process
The sequential and parallel algorithms are explained in the following section.
The sequential algorithm is as follows:
Algorithm to hide text message in a grey scale cover image: Input: Cover image C and Secret
message T, Output: Stego image G.
Step 1: Read cover image C and secret message T
Step 2: (Obtain IWT of cover image)
LS = liftwave( 'cdf2.2', 'Int2Int' )
[CLL, CHL, CLH, CHH] = lwt2 (C, LS)
Step 3: Count the number of characters in T. Let this count be N.
Step 4: Count the number of ones in N. Let this count be the Key K1.
Step 5: For each character in T do
Begin
Encrypt each character using the Caesar cipher with key K1 and store the result in E
End
Step 6: (Embed the count N)
Store the count N in LSB bit planes of CHL sequentially
Step 7: (Embed the encrypted characters)
For each character in E do
Begin
Store each character in LSB bit planes of CHL, CLH and CHH sequentially
End
Step 8: (Obtain the stego image)
G = ilwt2 (CLL, CHL, CLH, CHH, LS)
Step 9: End
The pictorial representation of this algorithm is shown in Fig. 5.
Figure 5. Sequential representation for embedding
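As a rough illustration of Steps 3-7, the following Python sketch performs the Caesar encryption and LSB embedding on a flat list of integer coefficients. It is only a stand-in for the real CHL/CLH/CHH subbands; the lifting-wavelet steps (`liftwave`/`lwt2`) are not reproduced here.

```python
def embed(coeffs, message):
    """Sketch of Steps 3-7: Caesar-encrypt the text with key K1 =
    number of ones in the character count N, then write a 32-bit
    length header and 8 bits per character into coefficient LSBs."""
    n = len(message)                                      # Step 3
    key = bin(n).count("1")                               # Step 4
    encrypted = [(ord(c) + key) % 256 for c in message]   # Step 5
    bits = [(n >> i) & 1 for i in range(32)]              # Step 6: embed N
    for ch in encrypted:                                  # Step 7: embed text
        bits += [(ch >> i) & 1 for i in range(8)]
    stego = list(coeffs)
    for i, b in enumerate(bits):                          # overwrite 1st LSB plane
        stego[i] = (stego[i] & ~1) | b
    return stego
```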
Algorithm to extract the secret message from the grey scale stego image: input: stego image G,
output: secret message T’
Step 1: (Obtain IWT of the stego image)
LS = liftwave( 'cdf2.2', 'Int2Int' )
[GLL, GHL, GLH, GHH] = lwt2 (G, LS)
Step 2: (Obtain the total number of characters N)
Get the LSB bit plane of GHL to obtain the character count N.
Step 3: Count the number of ones in N to get key K1'.
Step 4: (Obtain Encrypted characters)
For i= 1 to N do
Begin
Extract each bit from the LSB bit planes of GHL, GLH and GHH sequentially to obtain the encrypted character. Store the character in E'.
End.
Step 5: (Decrypt the character)
For each character in E’ do
Begin
Decrypt the character using the Caesar cipher with key K1'. Store it in T'
End
Step 6: T’ contains the extracted secret message
Step 7: End
The pictorial representation of this algorithm is shown in Fig. 6.
Figure 6. Sequential representation for extracting
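The extraction steps invert that process. A matching Python sketch over the same kind of flat coefficient list (again only an illustration, not the wavelet-domain implementation):

```python
def extract(coeffs):
    """Sketch of Steps 1-5: read the 32-bit count N from the LSBs,
    derive key K1' from the ones in N, then recover and decrypt
    each 8-bit character."""
    n = sum((coeffs[i] & 1) << i for i in range(32))      # Step 2
    key = bin(n).count("1")                               # Step 3
    text = []
    for k in range(n):                                    # Step 4
        base = 32 + 8 * k
        ch = sum((coeffs[base + i] & 1) << i for i in range(8))
        text.append(chr((ch - key) % 256))                # Step 5: decrypt
    return "".join(text)
```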
These embedding and extracting algorithms are implemented in parallel as follows:
Algorithm to hide text message in a grey scale cover image: Input: Cover image C and Secret
message T, Output: Stego image G.
Step 1: Read cover image C and secret message T
Step 2: (Obtain IWT of cover image)
LS = liftwave( 'cdf2.2', 'Int2Int' )
[CLL, CHL, CLH, CHH] = lwt2 (C, LS)
Step 3: Count the number of characters in T. Let this count be N.
Step 4: Count the number of ones in the count N. Let this count be the key K1.
Step 5: For each character in T do in parallel
Begin
Encrypt each character using the Caesar cipher with key K1 and store the result in E.
End
Step 6: (Embed the Count N)
Store the count N in LSB bit planes of CHL sequentially
Step 7: (Embed the encrypted characters)
For each character in E do in parallel
Begin
Store each character in LSB bit planes of CHL, CLH and CHH sequentially
End
Step 8: (Obtain the stego image)
G = ilwt2 (CLL, CHL, CLH, CHH, LS)
Step 9: End
The pictorial representation of this algorithm is shown in Fig. 7.
Figure 7. Parallel representation for embedding
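In the paper the per-character loops are offloaded to OpenCL kernels running on the GPU. As a stand-in only, the "do in parallel" structure of Step 5 can be sketched in Python with a thread pool, since each character is an independent work-item:

```python
from concurrent.futures import ThreadPoolExecutor

def encrypt_parallel(message):
    """Step 5 in parallel: every character is encrypted independently
    with the same key, so the map is embarrassingly parallel."""
    key = bin(len(message)).count("1")
    with ThreadPoolExecutor() as pool:
        return "".join(pool.map(lambda c: chr((ord(c) + key) % 256), message))
```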
Algorithm to extract the secret message from the grey scale stego image: input: stego image G,
output: secret message T’
Step 1: (Obtain IWT of the stego image)
LS = liftwave( 'cdf2.2', 'Int2Int' )
[GLL, GHL, GLH, GHH] = lwt2 (G, LS)
Step 2: (Obtain the total number of characters N)
Get the LSB bit plane of GHL to obtain the character count N.
Step 3: Count the number of ones in N to get key K1'.
Step 4: (Obtain Encrypted characters)
For i= 1 to N do in parallel
Begin
Extract each bit from the LSB bit planes of GHL, GLH and GHH sequentially to obtain the encrypted character. Store the character in E'.
End
Step 5: (Decrypt the character)
For each character in E’ do in parallel
Begin
Decrypt the character using the Caesar cipher with key K1'
Store it in T’
End
Step 6: T’ contains the extracted secret message
Step 7: End
The pictorial representation of this algorithm is shown in Fig. 8.
Figure 8. Parallel representation for extracting
4. EXPERIMENTAL RESULTS
The proposed steganography technique is implemented using MATLAB 7.11 and Visual Studio for the sequential case. The parallel implementation is done using OpenCL on an AMD-based system with an NVIDIA GT640 GPU (DDR3, 128-bit memory interface, 384 CUDA cores, processor clock 797 MHz, graphics clock 797 MHz). The cover image considered is a grey-scale image of size 1024x1024 pixels. The CHL (512x512), CLH (512x512) and CHH (512x512) subbands together provide 786,432 coefficients, and 8 coefficients are required to store each character, giving 98,304 character slots; 32 coefficients are reserved to store the size of the secret text message, so a maximum of about 98,300 characters can be hidden. The cover image is shown in Fig. 9. Secret text messages of different sizes are hidden in this cover image using the 1st and 3rd LSB planes. The resulting stego images are shown in Fig. 10(a), 10(b) and 10(c).
Figure 9. Cover Image
(a) 98,000 characters (b) 45,000 characters (c) 1,500 characters
Figure 10. Stego images with different numbers of secret message characters
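The capacity figures quoted above follow from simple arithmetic over the subband sizes; a quick Python check:

```python
coefficients = 3 * 512 * 512        # CHL + CLH + CHH, each 512x512
char_slots = coefficients // 8      # one character needs 8 coefficient LSBs
print(coefficients, char_slots)     # 786432 98304
```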
Table 1 shows the execution times for the sequential and parallel approaches and the speed-up obtained. As observed from Table 1, the execution time taken by the parallel approach is less than that of the sequential approach. When the text size is small, the parallel execution time can exceed the sequential time because of the overhead involved in creating the OpenCL environment, but as the secret data size increases, a good speed-up is obtained.
Table 1. Execution time (cover image: 1024x1024 pixels)

Secret Text Size | Execution Time in ms                          | Speedup
(characters)     | 1st LSB               | 3rd LSB               | 1st LSB | 3rd LSB
                 | Sequential | Parallel | Sequential | Parallel |         |
1,500            | 1,062.11   | 1,062.08 | 1,078.11   | 1,045.08 | 1.00    | 1.03
45,000           | 1,333.30   | 1,226.83 | 1,318.30   | 1,235.83 | 1.09    | 1.06
98,000           | 1,692.57   | 1,566.14 | 1,676.59   | 1,488.14 | 1.08    | 1.13
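The speed-up column of Table 1 is simply the ratio of sequential to parallel time; for the 1st-LSB case it can be reproduced directly:

```python
# (sequential ms, parallel ms) for the 1st-LSB rows of Table 1
timings = {
    1500:  (1062.11, 1062.08),
    45000: (1333.30, 1226.83),
    98000: (1692.57, 1566.14),
}
speedups = {n: round(seq / par, 2) for n, (seq, par) in timings.items()}
print(speedups)   # {1500: 1.0, 45000: 1.09, 98000: 1.08}
```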
The quality of the stego image is tested by computing the Peak Signal-to-Noise Ratio (PSNR) between the cover and the stego image. It is calculated using equation (1):

PSNR = 10 log10 ( L^2 / MSE ) dB (1)

where L = maximum pixel value (= 255) and MSE is the Mean Square Error:

MSE = (1/N) Σ (X − X′)^2

where X = original pixel value, X′ = stego pixel value and N = number of pixels.
A high PSNR value indicates that the MSE is low, which means the stego image is very close to the cover image and the presence of the hidden message is therefore hard to detect.
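Equation (1) can be computed directly; a small Python helper, illustrative only, operating on flattened pixel sequences:

```python
import math

def psnr(cover, stego, peak=255):
    """PSNR in dB between two equal-length pixel sequences,
    following equation (1): 10 * log10(L^2 / MSE)."""
    mse = sum((x - y) ** 2 for x, y in zip(cover, stego)) / len(cover)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Flipping every 1st-LSB (the worst case for 1st-LSB embedding) gives
# MSE = 1, hence roughly 48.13 dB, consistent with Table 2's 1st-LSB values.
```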
Table 2 shows the PSNR values obtained when the different secret messages are hidden in the cover image. As shown in Table 2, the PSNR values for the 1st LSB plane are very good. As data is hidden in higher bit planes the PSNR decreases, which means the quality degrades, but PSNR values of 30 dB and above are considered acceptable.
Table 2. PSNR values (cover image: 1024x1024 pixels)

Secret Text Size | PSNR in dB
(characters)     | 1st LSB               | 3rd LSB
                 | Sequential | Parallel | Sequential | Parallel
1,500            | 48.62      | 48.62    | 45.87      | 45.87
45,000           | 41.60      | 41.60    | 40.80      | 40.80
98,000           | 39.57      | 39.57    | 38.42      | 38.42
5. CONCLUSIONS
In this paper it is shown that steganography algorithms can be parallelized to improve execution speed, so that large images can be used to hide large amounts of secret data. For small data sizes the parallel algorithm is not efficient, because the parallel-processing overhead dominates. The algorithm can be further optimized using standard optimization techniques for parallel programs, so that the resources of the system are utilized efficiently. This paper takes an initial step towards improving execution speed by parallelizing certain parts of the sequential algorithms.
REFERENCES
[1] Rasit O. Topaloglu and Benedict Gaster, "GPU Programming for EDA with OpenCL", IEEE, 2011, pp 63-66.
[2] Benedict Gaster, et al., "Heterogeneous Computing with OpenCL", Morgan Kaufmann, 2011.
[3] Belzunce MA, et al., "CUDA Parallel Implementation of Image Reconstruction Algorithm for Positron Emission Tomography", The Open Medical Imaging Journal, 6, 2012, pp 108-118.
[4] Cheong Ghil Kim & Yong Soo Choi, "A high performance parallel DCT with OpenCL on heterogeneous computing environment", Multimedia Tools and Applications, 64, 2013, pp 475-489.
[5] Silvana Greca, Edlira Martiri, "Wu-Lee Steganographic algorithm on Binary Images processed in Parallel", IJVIPNS, 2012.
[6] Sangmin Seo, Gangwon Jo and Jaejin Lee, "Performance Characterization of the NAS Parallel Benchmarks in OpenCL", IEEE conference, 2011, pp 137-148.
[7] Anastasia Ioannidou, et al., "A novel technique for image steganography based on a high payload method and edge detection", Expert Systems with Applications, 39, 2012, pp 11517-11524.
[8] Fei Peng, Xiaolong Li and Bin Yang, "Adaptive reversible data hiding scheme based on integer transform", Signal Processing, 92, 2012, pp 54-62.
[9] Yen-Po Lee, Jen-Chun Lee, et al., "High-payload image hiding with quality recovery using tri-way pixel-value differencing", Information Sciences, 191, 2012, pp 214-225.