SlideShare a Scribd company logo
1 of 7
Download to read offline
Jan Zizka (Eds) : CCSIT, SIPP, AISC, PDCTA - 2013
pp. 451–457, 2013. © CS & IT-CSCP 2013 DOI : 10.5121/csit.2013.3651
THE EFFECTIVE WAY OF PROCESSOR
PERFORMANCE ENHANCEMENT BY
PROPER BRANCH HANDLING
Jisha P Abraham1
and Dr.Sheena Mathew2
1
M A College of Engineering,Kothamangalam,Kerala,India
jishaanil@gmail.com
2
Cochin University of Science and Technology,Cochin-22,Kerala,India
ABSTRACT
The processor performance is highly dependent on the regular supply of correct instruction at
the right time. To reduce instruction cache misses, one of the proposed mechanism is the
instruction prefetching, which in turn will increase instructions supply to the processor. The
technology developments in these fields indicates that in future the gap between processing
speeds of processor and data transfer speed of memory is likely to be increased. Memory
bandwidth can be increased significantly using the prefetching, but unsuccessful prefetches will
pollute the primary cache. Prefetching can be done either with software or hardware. In
software prefetching the compiler will insert a prefetch code in the program. In this case as
actual memory capacity is not known to the compiler and it will lead to some harmful
prefetches. In hardware prefetching instead of inserting prefetch code it will make use of extra
hardware and which is utilized during the execution. The most significant source of lost
performance when the process waiting for the availability of the next instruction. All the
prefetching methods are giving stress only to the fetching of the instruction for the execution,
not to the overall performance of the processor. This paper is an attempt to study the branch
handling in a uniprocessing environment, where, when ever branching is identified the proper
cache memory management is enabled inside the memory management unit.
KEYWORDS
Uniprocessing, Prefetch, Branch handling, Processor performance.
1. INTRODUCTION
The performance of superscalar processors and high speed sequential machines are degraded by
the instruction cache misses. Instruction prefetch algorithms attempt to reduce the performance
degradation by bringing prefetch code into the instruction cache before they are needed by the
CPU fetch unit. The technology developments in these fields indicates that in future the gap
between processing speeds of processor and data transfer speed of memory is likely to be
increased. A traditional solution for this problem is to introduce multi-level cache hierarchies. To
reduces memory access latency by fetching lines into cache before a demand reference is called
452 Computer Science & Information Technology (CS & IT)
prefetching. Each missing cache line should be prefetched so that it arrives in the cache just
before its next reference. Each prefetching techniques should be able to predict the prefetch
addresses. This will lead to following issues; Some addresses which cannot accurately predict the
limits will limit the effectiveness of speculative prefetching. With accurate predictions the
prefetch issued early enough to cover the nominal latency, the full memory latency may not be
hidden due to additional delays caused by limited available memory bandwidth. Prefetched lines,
whether correctly predicted or not, may be issued too early will pollute the cache by replacing the
desirable There are two main techniques to attack memory latency: (a) the first set of techniques
attempt to reduce memory latency and (b) the second set attempt to hide it. These prefetching can
be either done with software or with hardware technology. Save memory bandwidth due to
useless prefetch, improve prefetch performance. Our study on the branch prediction will narrow
to the following conclusion. Domain-fetching is not an effective mechanism for filling the
required instruction to the cache memory. Conventional hardware prefetches are not useful over
the time intervals in which performance loss is the most dire. Bulk transferring the private cache
is surprisingly ineffective, and in many cases is worse than doing nothing. The only over head in
this case is the over head caused while identifying the branch instruction.
2. BACK GROUND AND RELATED WORKS
Sair, et,al[14] survey several prefetchers, and introduce a method for classifying memory access
behaviors in hardware: the branch access stream is matched against the behavior specific table in
parallel. We utilize such a similar technique in the listing the branches in the programming
environment. Mahmut,Yuanrui [15] study shows that timing and scheduling of prefetch
instructions is a critical issue in software data prefetching and prefetch instructions must be issued
in a timely manner for them to be useful. If a prefetch is issued too early, there is a chance that
the prefetched data will be replaced from the cache before its use or it may also lead to
replacement of other useful data from the higher levels of the memory hierarchy. If the prefetch is
issued too late, the requested data may not arrive before the actual memory reference is made,
thereby introducing processor stall cycles. We make use of this concept will the pre-empting the
current running sequence form the processor. I.K. Chen[6] Next-line prefetching tries to prefetch
sequential cache lines before they are needed by the CPUs fetch unit. The next line is not resident
in the cache; it will be prefetched when an instruction located some distance into the current line
is accessed. This specified distance is measured from the end of the cache line and is called the
fetch ahead distance. Next-line prefetching predicts that execution will fall-through any
conditional branches in the current line and continue along the sequential path. Next line guess
will be incorrect except in the case of short branches and the correct execution path will not be
prefetched. Performance of the scheme is dependent upon the choice of fetch ahead distance. For
the calculation of the wastage of the processor cycles can be calculated using this fetch ahead
distance. Run ahead Execution [16]prefetches by speculatively executing the application, but
ignoring dependences on long latency misses. This seems well-suited to our need to prefetch the
branch instruction where the branching is identified.
3. BASELINE ARCHITECTURE
On a single processor system, if the processor executes only one instruction at a time, then
multiprogramming in itself does not involve true parallelism. Of course, if the single main
processor in the system happens to be pipelined, then that system feature leads to pseudo-
parallelism. Pseudo-parallelism share the some technical issues in common, related to the need
Computer Science & Information Technology (CS & IT) 453
for synchronization between running programs. Figure 1 is somewhat detailed view of the
running of the four programs- labeled P1, P2, P3, and P4- in multiprogramming mode. The top
part of the figure, all the four programs seems to be running in parallel. With time, each program
makes progress in terms of both instruction and input/output performed. Due to single processor,
in terms of actual execution of program execution, there is no parallelism in this system. Running
program is such a crucial entity in a computer system that it is given a name of its own and
referred to as a process. The behavior of a running program, in terms it spend in compute phase
and I/O phase.
4. MOTIVATION
Parallelism in a uniprocessor environment can be achieved with pipelined concept. Today,
pipelining is the key technique used to make fast CPUs. The time required between moving an
instruction one step down the pipeline is a processor cycle. The length of processor cycle is
determined by the time required for the slowest pipe stage. If the stages are perfectly balanced,
then the time per instruction on the pipelined processor is equal to Time per instruction on
unpipelined machine/ Number of pipe stages Pipelining yields a reduction in the average
execution time per instruction. It is an implementation technique that exploits parallelism among
the instruction in the sequential instruction stream. In he case of a RISC processor ,it basically
consisting of five pipeline stages. Figure 2 deals with the various stages associated with it.
Pipeline overhead arises from the combination of pipeline register delay and clock skew. The
pipeline registers add setup time plus propagation delay to the clock cycle. The major hurdle of
pipelining is structural, data and control hazards. We are mainly concentrated on the control
hazards; arise from the pipelining of branches and other instructions that changes the PC. The
structure of a standard program [2] shows that around 56% of instruction are conditional
branching,10% of instructions are unconditional and 8% of the instruction are comes under the
call return pattern. Hence during the design of pipeline structure we should take care of the non-
sequential execution of the program. Control hazards can cause a greater performance loss for
MIPS pipeline than do data hazards. When branch (conditional/unconditional) is executed, it may
or may not change the PC to something other than its current value. The simplest scheme to
handle branches is to freeze or flush the pipeline, holding or detecting any instructions after the
branch until the branch destination is known. It will destroy the entire parallelism achieved in the
system. If the branching can be predicted early to some extent we can overcome the delay. That
is, the required instruction can be brought to the cache region early. Prefetching can be done
either with software or hardware . In software prefetching the compiler will insert a prefetch code
in the program. In this case as actual memory capacity is not known to the compiler and it will
lead to some harmful prefetches. In hardware prefetching instead of inserting prefetch code it will
make use of extra hardware and which is utilized during the execution. The most significant
source of lost performance when the process waiting for the availability of the next instruction. In
both the cases we have to flush some instruction out from the pipe. Combining the software
prefetching with the branch handling, we can overcome this problem. Whenever branching
encountered the system will go for a context switching, pipe will be filled with instruction for the
next process.
454 Computer Science & Information Technology (CS & IT)
Fig. 1. The pipeline can be thought of as a series of data paths shifted in time
5. SYSTEM DESIGN
5.1 User Modeling Process
User Model life cycle is defined by a sequence of actions performed by an adaptive web-based
system during user’s interaction. The process is depicted in the Fig.3. We separate the user
modelling process into three distinct stages, as defined below.
1.Data collection, when an adaptive system collects various user-related data such as forms and
questionnaires filled-in by the user, logs of user activity within the system, feedback given by the
user to a particular domain item, information about user’s context(device, browser, bandwidth
,etc.). All these information represent observable facts and are stored in an evidence layer of the
user model.
Fig. 2. User Model Life Cycle
The facts can be supplied also from external sources, for instance we can acquire information
about user relationships with other users from an external application for social networking.
2. User model inference, when an adaptive system processes the data from the evidence layer into
higher level user characteristics such as interests, preferences or personal traits. It is important to
mention that these are only estimates of the real user characteristics.
3. Adaptation and personalization is the actual use of a user model (both evidence and higher
level characteristics layer) to provide a user with personalized experience when using an adaptive
system. The success rate of the supplied adaptation effect gives us a feedback about an
accurateness of the underlying user model. The process has a cyclic character, as an adaptive
system continuously acquires new data about the user (influenced by already performed
adaptation and thus by a previous version of the user model) and continuously refines the user
model to better reflect reality and thus to serve as a better base for personalization. An adaptive
Computer Science & Information Technology (CS & IT) 455
system must handle specific problems related to each step of the process. In data collection we
must find a balance between user privacy and amount and nature of data which is required in
order to deliver a successful personalization. We must find such sources of data, which do not
pose additional extra burden on the user and design data collection process in such a way, which
is unobtrusive and runs automatically in background. The main challenge of the actual user model
inference is (apart from actual transformation of user action history into user characteristics)
maintenance of user characteristics, which should keep pace with user personal development,
changing interests and knowledge.
6. ARCHITECTURAL SUPPORT
The miss stream is not sufficient to cover many branch related misses; we need to characterize the
access stream. For that we construct a working set predictor which works in different stages.
First, we observe the access stream of a process and capture the patterns and behavior. Second
,we summarize this behavior and transfer the summery data to the stack region. Third we apply
the summery via a prefetch engine to the target process, to rapidly fill the caches in advance of
the migrated process. The following set of capture engines are required to perform the migration
process. For the processing of each process it will enter to the pipeline, once it enter to the
decoding stage the processor is able to find the branching condition. By that time the next
instruction will be in the fetch cycle.
6.1 Memory locker
For each process we add a memory locker, a specialized unit which records the selected details of
the each committed memory instruction. This unit passively observes the current processes it
executes, but does not directly affect execution. If a conservatively implemented memory locker
becomes swamped with input records may safely be discarded; this will degrade the accuracy of
later prefetching. memory locker can be implemented with small associative memory tables each
with control logic. Each table with in the memory locker targets a specific type of access patterns.
Next-block : these detect sequential block accesses; entries are advanced by one cache block on a
hit. Target PC: this tracks individual instructions which walk through memory in fixed sized
steps. Same object: this capturer accesses to ranges of memory from a common base address, as is
common for structure and object access code. This takes advantage of the common base+offset
method, tracking minimum and maximum offsets for each base address, while ignoring accesses
relative to the global pointer or stack pointer. Blocked BTR: these capture taken branches and
their targets, recording the most recent inbound branch for each target. Block BTR is a block
aligned, allowing a greater amount of the instruction working set to be characterized at a given
table size. Return stack: This maintains a shadow copy of the processor return stack, prefetching
blocks of instruction near the top few control frames.
6.2 Target generator
The target generator activates when a core is signaled to branching. Our baseline core design
assumes hardware support for branching; at halt-time, the core collect and stores the register state
of the process being halted. While register state is being transferred, the target generator reads
through the tables populated by the memory locker and prepare a compact summery of the
process which is moving towards the running state. This summery is transmitted after the
architected process state, and is used to prefetch the ready working set when it resume on the
456 Computer Science & Information Technology (CS & IT)
running state. During summarization each table entry is inspected to determine its usefulness by
observing whether the process comes to ready state is whether from start or waiting state. Table
entries are summarized for transfer by generating a sequence of block addresses from each,
following the behavior pattern captured by that table We encode each sequence with a simple
linear-range encoding, which tells the prefetcher to fetch length cache blocks, starting at start-
address.
6.3 Target-driven prefetcher
Rounding out our working-set switching hardware is the target-driven prefetcher. When a
previously pre-empted branch is activated on a core, its summary records are read by the
prefetcher. Each record is expanded to a sequence of cache block addresses, which are submitted
for prefetching as bandwidth allows. While the main execution pipeline reloads register values
and resumes execution, the prefetcher independently begins to prefetch a likely working set.
Prefetches search the entire memory hierarchy, and contend for the same resources as demand
requests. Once the register state has been loaded, the process resumes execution and competes
with the prefetchers for memory resources. Communication overlap between the process are
modelled among transfers of register state, table summaries, prefetches, and the service of
demand-misses. We model the prefetch engine as submitting virtually addressed memory requests
at the existing ports of the caches, utilizing the existing memory hierarchy for service. While
these requests compete with the process itself, we find that most prefetching immediately after
switching occurs while the process would otherwise be stalled for memory access; thus we do not
require additional cache porting.
7. ANALYSIS AND RESULT
The baselines for comparison in our simulation result are the prediction rate of each benchmark
when executed as a single process and run to completion with no intervening process. The
program is executed for different size of instruction for experiment our experiment has the effects
on the size of the cache line provided along with the number of cache blocks considered in the
cache memory. The cache management is done with the first in first out method.
Fig. 3. Lines of code Vs. Time Graph
Computer Science & Information Technology (CS & IT) 457
8. CONCLUSIONS
The report here sheds light on the applicability of instruction cache prefetching schemes in
current and next-generation microprocessor designs where memory latencies are likely to be
longer. When ever, a new prefetching algorithm is examined that was inspired by previous studies
of the effect of speculation down miss predicted not correct paths. Data prefetching has been used
in the past as one of the mechanism to hide memory access latencies. But prefetching requires
accurate timing to be effective in practice. This timing problem will affect more in the case of
multiple processor. Which is handled with the help of target-driven prefetcher. The designing of
an effective prefetching algorithm which will minimize the prefetching overhead, and it is a big
challenge and needs more thoughts and effort on it.
REFERENCES
[1] A.J. Smith, Cache memories, ACM Computing ,Surveys, pp. 473-530, Sep. 1982
[2] ECE 4100/6100 Advanced Computer Architecture ,Lecture 5 Branch Prediction,Prof. Hsien-Hsin
Sean Lee School of Electrical and Computer Engineering Georgia Institute of Technology
[3] A. Brown and D. M. Tullsen. The shared-thread multiprocessor. In 21st InternationalConference on
Supercomputing, pages 73-82, June 2008.
[4] Abdel-Shafi,H.Hall,J. Adve.”An Evaluation of Fine Grain producer-initiated communication in
Cache-Coherent proceeding of the 3rd HPCA,IEEEcomputer Society press
[5] Aleksardar Milenkovic, Veljko Milutinovic Lazy prefetching proceedings of the IEEE hicss 1998
[6] I.K. Chen, C.C. Lee, and T.N. Mudge. Instruction prefetching us- ing branch prediction information.
In International Conference on Computer Design, pages 593-601, October 1997.
[7] Mahmut Kandemir, Yuanrui zhang,Ozcan OzturkAdaptive prefetching for shared cache based chip
multiprocessorsEDAA 2009
[8] Jim Pierce Intel Corporation,Trevor Mudge University of Michigan Wrong-Path Instruction
Prefetching IEEE 1996
[9] P. Hsu, Design of the TFP microprocessor, IEEE Micro, 1993.
[10] J. Pierce, Cache Behavior in the Presence of Speculative ExecutionThe Benefits of Misprediction,
Ph.D. Thesis, The University of Michigan, 1995.
[11] G. Reinman, T. Austin, and B. Calder. A scalable front-end architecture for fast instruction delivery.
In 26th Annual International Symposium on Computer Architecture, May 1999.
[12] Lawrence Spracklen, Yuan chou and Snathosh G Abraham effective instruction prefetching in chip
multiprocessor for modern commersal application proceedings of 11th Intl. Symbosium on
HPCA,2005
[13] D Kongetira,K Aingraran and K olukotun NiagraA 32-way multithreaded SPARC processor IEEE
micro, 2005.
[14] J. D. Collins, S. Sair, B. Calder, and D. M. Tullsen. Pointer cache assisted prefetching. In 35th
International Symposium on Microarchitecture, pages 62-73, Nov. 2002.
[15] R. S. Chappell, J. Stark, S. P. Kim, S. K. Reinhardt, and Y. N. Patt. Simultaneous subordinate
microthreading (SSMT). In 26th International Symposium on Computer Architecture, pages 186-195,
M

More Related Content

What's hot

Robust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory InterfaceRobust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory InterfaceIOSRJVSP
 
Cache performance-x86-2009
Cache performance-x86-2009Cache performance-x86-2009
Cache performance-x86-2009Léia de Sousa
 
Chrome server2 print_http_www_uni_mannheim_de_acm97_papers_soderquist_m_13736...
Chrome server2 print_http_www_uni_mannheim_de_acm97_papers_soderquist_m_13736...Chrome server2 print_http_www_uni_mannheim_de_acm97_papers_soderquist_m_13736...
Chrome server2 print_http_www_uni_mannheim_de_acm97_papers_soderquist_m_13736...Léia de Sousa
 
Hyper threading technology
Hyper threading technologyHyper threading technology
Hyper threading technologydeepakmarndi
 
Counter Matrix Code for SRAM Based FPGA to Correct Multi Bit Upset Error
Counter Matrix Code for SRAM Based FPGA to Correct Multi Bit Upset ErrorCounter Matrix Code for SRAM Based FPGA to Correct Multi Bit Upset Error
Counter Matrix Code for SRAM Based FPGA to Correct Multi Bit Upset ErrorIRJET Journal
 
Fault Injection Approach for Network on Chip
Fault Injection Approach for Network on ChipFault Injection Approach for Network on Chip
Fault Injection Approach for Network on Chipijsrd.com
 

What's hot (12)

Robust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory InterfaceRobust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory Interface
 
43
4343
43
 
Bm34399403
Bm34399403Bm34399403
Bm34399403
 
Cache performance-x86-2009
Cache performance-x86-2009Cache performance-x86-2009
Cache performance-x86-2009
 
Chrome server2 print_http_www_uni_mannheim_de_acm97_papers_soderquist_m_13736...
Chrome server2 print_http_www_uni_mannheim_de_acm97_papers_soderquist_m_13736...Chrome server2 print_http_www_uni_mannheim_de_acm97_papers_soderquist_m_13736...
Chrome server2 print_http_www_uni_mannheim_de_acm97_papers_soderquist_m_13736...
 
New lima latw03
New lima latw03New lima latw03
New lima latw03
 
eet_NPU_file.PDF
eet_NPU_file.PDFeet_NPU_file.PDF
eet_NPU_file.PDF
 
Y4301127131
Y4301127131Y4301127131
Y4301127131
 
Hyper threading technology
Hyper threading technologyHyper threading technology
Hyper threading technology
 
Counter Matrix Code for SRAM Based FPGA to Correct Multi Bit Upset Error
Counter Matrix Code for SRAM Based FPGA to Correct Multi Bit Upset ErrorCounter Matrix Code for SRAM Based FPGA to Correct Multi Bit Upset Error
Counter Matrix Code for SRAM Based FPGA to Correct Multi Bit Upset Error
 
IEEExeonmem
IEEExeonmemIEEExeonmem
IEEExeonmem
 
Fault Injection Approach for Network on Chip
Fault Injection Approach for Network on ChipFault Injection Approach for Network on Chip
Fault Injection Approach for Network on Chip
 

Similar to THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDLING

Tdt4260 miniproject report_group_3
Tdt4260 miniproject report_group_3Tdt4260 miniproject report_group_3
Tdt4260 miniproject report_group_3Yulong Bai
 
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORSAFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORScscpconf
 
Affect of parallel computing on multicore processors
Affect of parallel computing on multicore processorsAffect of parallel computing on multicore processors
Affect of parallel computing on multicore processorscsandit
 
Aca lab project (rohit malav)
Aca lab project (rohit malav) Aca lab project (rohit malav)
Aca lab project (rohit malav) Rohit malav
 
Hardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmpHardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmpeSAT Publishing House
 
Enery efficient data prefetching
Enery efficient data prefetchingEnery efficient data prefetching
Enery efficient data prefetchingHimanshu Koli
 
A Taxonomy of Data Prefetching Mechanisms
A Taxonomy of Data Prefetching MechanismsA Taxonomy of Data Prefetching Mechanisms
A Taxonomy of Data Prefetching Mechanismsijtsrd
 
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTIONLEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTIONcscpconf
 
Learning scheduler parameters for adaptive preemption
Learning scheduler parameters for adaptive preemptionLearning scheduler parameters for adaptive preemption
Learning scheduler parameters for adaptive preemptioncsandit
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxfaithxdunce63732
 
Pipeline Mechanism
Pipeline MechanismPipeline Mechanism
Pipeline MechanismAshik Iqbal
 
Computer Applications: An International Journal (CAIJ)
Computer Applications: An International Journal (CAIJ)Computer Applications: An International Journal (CAIJ)
Computer Applications: An International Journal (CAIJ)caijjournal
 
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...caijjournal
 
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...IDES Editor
 

Similar to THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDLING (20)

Tdt4260 miniproject report_group_3
Tdt4260 miniproject report_group_3Tdt4260 miniproject report_group_3
Tdt4260 miniproject report_group_3
 
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORSAFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
AFFECT OF PARALLEL COMPUTING ON MULTICORE PROCESSORS
 
Affect of parallel computing on multicore processors
Affect of parallel computing on multicore processorsAffect of parallel computing on multicore processors
Affect of parallel computing on multicore processors
 
Computer architecture
Computer architectureComputer architecture
Computer architecture
 
Oversimplified CA
Oversimplified CAOversimplified CA
Oversimplified CA
 
Aca lab project (rohit malav)
Aca lab project (rohit malav) Aca lab project (rohit malav)
Aca lab project (rohit malav)
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
Cad report
Cad reportCad report
Cad report
 
Hardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmpHardback solution to accelerate multimedia computation through mgp in cmp
Hardback solution to accelerate multimedia computation through mgp in cmp
 
Chap2 slides
Chap2 slidesChap2 slides
Chap2 slides
 
Enery efficient data prefetching
Enery efficient data prefetchingEnery efficient data prefetching
Enery efficient data prefetching
 
A Taxonomy of Data Prefetching Mechanisms
A Taxonomy of Data Prefetching MechanismsA Taxonomy of Data Prefetching Mechanisms
A Taxonomy of Data Prefetching Mechanisms
 
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTIONLEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION
LEARNING SCHEDULER PARAMETERS FOR ADAPTIVE PREEMPTION
 
Learning scheduler parameters for adaptive preemption
Learning scheduler parameters for adaptive preemptionLearning scheduler parameters for adaptive preemption
Learning scheduler parameters for adaptive preemption
 
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docxCS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
CS 301 Computer ArchitectureStudent # 1 EID 09Kingdom of .docx
 
Pipeline Mechanism
Pipeline MechanismPipeline Mechanism
Pipeline Mechanism
 
Computer Applications: An International Journal (CAIJ)
Computer Applications: An International Journal (CAIJ)Computer Applications: An International Journal (CAIJ)
Computer Applications: An International Journal (CAIJ)
 
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
PERFORMANCE ENHANCEMENT WITH SPECULATIVE-TRACE CAPPING AT DIFFERENT PIPELINE ...
 
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
Implementing True Zero Cycle Branching in Scalar and Superscalar Pipelined Pr...
 

More from cscpconf

ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR cscpconf
 
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATIONcscpconf
 
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...cscpconf
 
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIESPROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIEScscpconf
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICcscpconf
 
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS cscpconf
 
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS cscpconf
 
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICTWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICcscpconf
 
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINDETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINcscpconf
 
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...cscpconf
 
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMIMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMcscpconf
 
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...cscpconf
 
AUTOMATED PENETRATION TESTING: AN OVERVIEW
AUTOMATED PENETRATION TESTING: AN OVERVIEWAUTOMATED PENETRATION TESTING: AN OVERVIEW
AUTOMATED PENETRATION TESTING: AN OVERVIEWcscpconf
 
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKCLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKcscpconf
 
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...cscpconf
 
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAPROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAcscpconf
 
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHCHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHcscpconf
 
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...cscpconf
 
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGESOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGEcscpconf
 
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTGENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTcscpconf
 

More from cscpconf (20)

ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
ANALYSIS OF LAND SURFACE DEFORMATION GRADIENT BY DINSAR
 
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
4D AUTOMATIC LIP-READING FOR SPEAKER'S FACE IDENTIFCATION
 
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
MOVING FROM WATERFALL TO AGILE PROCESS IN SOFTWARE ENGINEERING CAPSTONE PROJE...
 
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIESPROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
PROMOTING STUDENT ENGAGEMENT USING SOCIAL MEDIA TECHNOLOGIES
 
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGICA SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
A SURVEY ON QUESTION ANSWERING SYSTEMS: THE ADVANCES OF FUZZY LOGIC
 
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS
 
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
INTELLIGENT ELECTRONIC ASSESSMENT FOR SUBJECTIVE EXAMS
 
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTICTWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
TWO DISCRETE BINARY VERSIONS OF AFRICAN BUFFALO OPTIMIZATION METAHEURISTIC
 
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAINDETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
DETECTION OF ALGORITHMICALLY GENERATED MALICIOUS DOMAIN
 
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
GLOBAL MUSIC ASSET ASSURANCE DIGITAL CURRENCY: A DRM SOLUTION FOR STREAMING C...
 
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEMIMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
IMPORTANCE OF VERB SUFFIX MAPPING IN DISCOURSE TRANSLATION SYSTEM
 
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
EXACT SOLUTIONS OF A FAMILY OF HIGHER-DIMENSIONAL SPACE-TIME FRACTIONAL KDV-T...
 
AUTOMATED PENETRATION TESTING: AN OVERVIEW
AUTOMATED PENETRATION TESTING: AN OVERVIEWAUTOMATED PENETRATION TESTING: AN OVERVIEW
AUTOMATED PENETRATION TESTING: AN OVERVIEW
 
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORKCLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
CLASSIFICATION OF ALZHEIMER USING fMRI DATA AND BRAIN NETWORK
 
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
VALIDATION METHOD OF FUZZY ASSOCIATION RULES BASED ON FUZZY FORMAL CONCEPT AN...
 
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATAPROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
PROBABILITY BASED CLUSTER EXPANSION OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA
 
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCHCHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
CHARACTER AND IMAGE RECOGNITION FOR DATA CATALOGING IN ECOLOGICAL RESEARCH
 
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
SOCIAL MEDIA ANALYTICS FOR SENTIMENT ANALYSIS AND EVENT DETECTION IN SMART CI...
 
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGESOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
SOCIAL NETWORK HATE SPEECH DETECTION FOR AMHARIC LANGUAGE
 
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXTGENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
GENERAL REGRESSION NEURAL NETWORK BASED POS TAGGING FOR NEPALI TEXT
 

Recently uploaded

18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 

Recently uploaded (20)

18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 

THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDLING

  • 1. Jan Zizka (Eds) : CCSIT, SIPP, AISC, PDCTA - 2013 pp. 451–457, 2013. © CS & IT-CSCP 2013 DOI : 10.5121/csit.2013.3651 THE EFFECTIVE WAY OF PROCESSOR PERFORMANCE ENHANCEMENT BY PROPER BRANCH HANDLING Jisha P Abraham1 and Dr.Sheena Mathew2 1 M A College of Engineering,Kothamangalam,Kerala,India jishaanil@gmail.com 2 Cochin University of Science and Technology,Cochin-22,Kerala,India ABSTRACT The processor performance is highly dependent on the regular supply of correct instruction at the right time. To reduce instruction cache misses, one of the proposed mechanism is the instruction prefetching, which in turn will increase instructions supply to the processor. The technology developments in these fields indicates that in future the gap between processing speeds of processor and data transfer speed of memory is likely to be increased. Memory bandwidth can be increased significantly using the prefetching, but unsuccessful prefetches will pollute the primary cache. Prefetching can be done either with software or hardware. In software prefetching the compiler will insert a prefetch code in the program. In this case as actual memory capacity is not known to the compiler and it will lead to some harmful prefetches. In hardware prefetching instead of inserting prefetch code it will make use of extra hardware and which is utilized during the execution. The most significant source of lost performance when the process waiting for the availability of the next instruction. All the prefetching methods are giving stress only to the fetching of the instruction for the execution, not to the overall performance of the processor. This paper is an attempt to study the branch handling in a uniprocessing environment, where, when ever branching is identified the proper cache memory management is enabled inside the memory management unit. KEYWORDS Uniprocessing, Prefetch, Branch handling, Processor performance. 1. INTRODUCTION The performance of superscalar processors and high speed sequential machines are degraded by the instruction cache misses. Instruction prefetch algorithms attempt to reduce the performance degradation by bringing prefetch code into the instruction cache before they are needed by the CPU fetch unit. The technology developments in these fields indicates that in future the gap between processing speeds of processor and data transfer speed of memory is likely to be increased. A traditional solution for this problem is to introduce multi-level cache hierarchies. To reduces memory access latency by fetching lines into cache before a demand reference is called
  • 2. 452 Computer Science & Information Technology (CS & IT) prefetching. Each missing cache line should be prefetched so that it arrives in the cache just before its next reference. Each prefetching techniques should be able to predict the prefetch addresses. This will lead to following issues; Some addresses which cannot accurately predict the limits will limit the effectiveness of speculative prefetching. With accurate predictions the prefetch issued early enough to cover the nominal latency, the full memory latency may not be hidden due to additional delays caused by limited available memory bandwidth. Prefetched lines, whether correctly predicted or not, may be issued too early will pollute the cache by replacing the desirable There are two main techniques to attack memory latency: (a) the first set of techniques attempt to reduce memory latency and (b) the second set attempt to hide it. These prefetching can be either done with software or with hardware technology. Save memory bandwidth due to useless prefetch, improve prefetch performance. Our study on the branch prediction will narrow to the following conclusion. Domain-fetching is not an effective mechanism for filling the required instruction to the cache memory. Conventional hardware prefetches are not useful over the time intervals in which performance loss is the most dire. Bulk transferring the private cache is surprisingly ineffective, and in many cases is worse than doing nothing. The only over head in this case is the over head caused while identifying the branch instruction. 2. BACK GROUND AND RELATED WORKS Sair, et,al[14] survey several prefetchers, and introduce a method for classifying memory access behaviors in hardware: the branch access stream is matched against the behavior specific table in parallel. We utilize such a similar technique in the listing the branches in the programming environment. Mahmut,Yuanrui [15] study shows that timing and scheduling of prefetch instructions is a critical issue in software data prefetching and prefetch instructions must be issued in a timely manner for them to be useful. If a prefetch is issued too early, there is a chance that the prefetched data will be replaced from the cache before its use or it may also lead to replacement of other useful data from the higher levels of the memory hierarchy. If the prefetch is issued too late, the requested data may not arrive before the actual memory reference is made, thereby introducing processor stall cycles. We make use of this concept will the pre-empting the current running sequence form the processor. I.K. Chen[6] Next-line prefetching tries to prefetch sequential cache lines before they are needed by the CPUs fetch unit. The next line is not resident in the cache; it will be prefetched when an instruction located some distance into the current line is accessed. This specified distance is measured from the end of the cache line and is called the fetch ahead distance. Next-line prefetching predicts that execution will fall-through any conditional branches in the current line and continue along the sequential path. Next line guess will be incorrect except in the case of short branches and the correct execution path will not be prefetched. Performance of the scheme is dependent upon the choice of fetch ahead distance. For the calculation of the wastage of the processor cycles can be calculated using this fetch ahead distance. Run ahead Execution [16]prefetches by speculatively executing the application, but ignoring dependences on long latency misses. This seems well-suited to our need to prefetch the branch instruction where the branching is identified. 3. BASELINE ARCHITECTURE On a single processor system, if the processor executes only one instruction at a time, then multiprogramming in itself does not involve true parallelism. Of course, if the single main processor in the system happens to be pipelined, then that system feature leads to pseudo- parallelism. Pseudo-parallelism share the some technical issues in common, related to the need
  • 3. Computer Science & Information Technology (CS & IT) 453 for synchronization between running programs. Figure 1 is somewhat detailed view of the running of the four programs- labeled P1, P2, P3, and P4- in multiprogramming mode. The top part of the figure, all the four programs seems to be running in parallel. With time, each program makes progress in terms of both instruction and input/output performed. Due to single processor, in terms of actual execution of program execution, there is no parallelism in this system. Running program is such a crucial entity in a computer system that it is given a name of its own and referred to as a process. The behavior of a running program, in terms it spend in compute phase and I/O phase. 4. MOTIVATION Parallelism in a uniprocessor environment can be achieved with pipelined concept. Today, pipelining is the key technique used to make fast CPUs. The time required between moving an instruction one step down the pipeline is a processor cycle. The length of processor cycle is determined by the time required for the slowest pipe stage. If the stages are perfectly balanced, then the time per instruction on the pipelined processor is equal to Time per instruction on unpipelined machine/ Number of pipe stages Pipelining yields a reduction in the average execution time per instruction. It is an implementation technique that exploits parallelism among the instruction in the sequential instruction stream. In he case of a RISC processor ,it basically consisting of five pipeline stages. Figure 2 deals with the various stages associated with it. Pipeline overhead arises from the combination of pipeline register delay and clock skew. The pipeline registers add setup time plus propagation delay to the clock cycle. The major hurdle of pipelining is structural, data and control hazards. We are mainly concentrated on the control hazards; arise from the pipelining of branches and other instructions that changes the PC. The structure of a standard program [2] shows that around 56% of instruction are conditional branching,10% of instructions are unconditional and 8% of the instruction are comes under the call return pattern. Hence during the design of pipeline structure we should take care of the non- sequential execution of the program. Control hazards can cause a greater performance loss for MIPS pipeline than do data hazards. When branch (conditional/unconditional) is executed, it may or may not change the PC to something other than its current value. The simplest scheme to handle branches is to freeze or flush the pipeline, holding or detecting any instructions after the branch until the branch destination is known. It will destroy the entire parallelism achieved in the system. If the branching can be predicted early to some extent we can overcome the delay. That is, the required instruction can be brought to the cache region early. Prefetching can be done either with software or hardware . In software prefetching the compiler will insert a prefetch code in the program. In this case as actual memory capacity is not known to the compiler and it will lead to some harmful prefetches. In hardware prefetching instead of inserting prefetch code it will make use of extra hardware and which is utilized during the execution. The most significant source of lost performance when the process waiting for the availability of the next instruction. In both the cases we have to flush some instruction out from the pipe. Combining the software prefetching with the branch handling, we can overcome this problem. Whenever branching encountered the system will go for a context switching, pipe will be filled with instruction for the next process.
  • 4. 454 Computer Science & Information Technology (CS & IT) Fig. 1. The pipeline can be thought of as a series of data paths shifted in time 5. SYSTEM DESIGN 5.1 User Modeling Process User Model life cycle is defined by a sequence of actions performed by an adaptive web-based system during user’s interaction. The process is depicted in the Fig.3. We separate the user modelling process into three distinct stages, as defined below. 1.Data collection, when an adaptive system collects various user-related data such as forms and questionnaires filled-in by the user, logs of user activity within the system, feedback given by the user to a particular domain item, information about user’s context(device, browser, bandwidth ,etc.). All these information represent observable facts and are stored in an evidence layer of the user model. Fig. 2. User Model Life Cycle The facts can be supplied also from external sources, for instance we can acquire information about user relationships with other users from an external application for social networking. 2. User model inference, when an adaptive system processes the data from the evidence layer into higher level user characteristics such as interests, preferences or personal traits. It is important to mention that these are only estimates of the real user characteristics. 3. Adaptation and personalization is the actual use of a user model (both evidence and higher level characteristics layer) to provide a user with personalized experience when using an adaptive system. The success rate of the supplied adaptation effect gives us a feedback about an accurateness of the underlying user model. The process has a cyclic character, as an adaptive system continuously acquires new data about the user (influenced by already performed adaptation and thus by a previous version of the user model) and continuously refines the user model to better reflect reality and thus to serve as a better base for personalization. An adaptive
  • 5. Computer Science & Information Technology (CS & IT) 455 system must handle specific problems related to each step of the process. In data collection we must find a balance between user privacy and amount and nature of data which is required in order to deliver a successful personalization. We must find such sources of data, which do not pose additional extra burden on the user and design data collection process in such a way, which is unobtrusive and runs automatically in background. The main challenge of the actual user model inference is (apart from actual transformation of user action history into user characteristics) maintenance of user characteristics, which should keep pace with user personal development, changing interests and knowledge. 6. ARCHITECTURAL SUPPORT The miss stream is not sufficient to cover many branch related misses; we need to characterize the access stream. For that we construct a working set predictor which works in different stages. First, we observe the access stream of a process and capture the patterns and behavior. Second ,we summarize this behavior and transfer the summery data to the stack region. Third we apply the summery via a prefetch engine to the target process, to rapidly fill the caches in advance of the migrated process. The following set of capture engines are required to perform the migration process. For the processing of each process it will enter to the pipeline, once it enter to the decoding stage the processor is able to find the branching condition. By that time the next instruction will be in the fetch cycle. 6.1 Memory locker For each process we add a memory locker, a specialized unit which records the selected details of the each committed memory instruction. This unit passively observes the current processes it executes, but does not directly affect execution. If a conservatively implemented memory locker becomes swamped with input records may safely be discarded; this will degrade the accuracy of later prefetching. memory locker can be implemented with small associative memory tables each with control logic. Each table with in the memory locker targets a specific type of access patterns. Next-block : these detect sequential block accesses; entries are advanced by one cache block on a hit. Target PC: this tracks individual instructions which walk through memory in fixed sized steps. Same object: this capturer accesses to ranges of memory from a common base address, as is common for structure and object access code. This takes advantage of the common base+offset method, tracking minimum and maximum offsets for each base address, while ignoring accesses relative to the global pointer or stack pointer. Blocked BTR: these capture taken branches and their targets, recording the most recent inbound branch for each target. Block BTR is a block aligned, allowing a greater amount of the instruction working set to be characterized at a given table size. Return stack: This maintains a shadow copy of the processor return stack, prefetching blocks of instruction near the top few control frames. 6.2 Target generator The target generator activates when a core is signaled to branching. Our baseline core design assumes hardware support for branching; at halt-time, the core collect and stores the register state of the process being halted. While register state is being transferred, the target generator reads through the tables populated by the memory locker and prepare a compact summery of the process which is moving towards the running state. This summery is transmitted after the architected process state, and is used to prefetch the ready working set when it resume on the
  • 6. 456 Computer Science & Information Technology (CS & IT) running state. During summarization each table entry is inspected to determine its usefulness by observing whether the process comes to ready state is whether from start or waiting state. Table entries are summarized for transfer by generating a sequence of block addresses from each, following the behavior pattern captured by that table We encode each sequence with a simple linear-range encoding, which tells the prefetcher to fetch length cache blocks, starting at start- address. 6.3 Target-driven prefetcher Rounding out our working-set switching hardware is the target-driven prefetcher. When a previously pre-empted branch is activated on a core, its summary records are read by the prefetcher. Each record is expanded to a sequence of cache block addresses, which are submitted for prefetching as bandwidth allows. While the main execution pipeline reloads register values and resumes execution, the prefetcher independently begins to prefetch a likely working set. Prefetches search the entire memory hierarchy, and contend for the same resources as demand requests. Once the register state has been loaded, the process resumes execution and competes with the prefetchers for memory resources. Communication overlap between the process are modelled among transfers of register state, table summaries, prefetches, and the service of demand-misses. We model the prefetch engine as submitting virtually addressed memory requests at the existing ports of the caches, utilizing the existing memory hierarchy for service. While these requests compete with the process itself, we find that most prefetching immediately after switching occurs while the process would otherwise be stalled for memory access; thus we do not require additional cache porting. 7. ANALYSIS AND RESULT The baselines for comparison in our simulation result are the prediction rate of each benchmark when executed as a single process and run to completion with no intervening process. The program is executed for different size of instruction for experiment our experiment has the effects on the size of the cache line provided along with the number of cache blocks considered in the cache memory. The cache management is done with the first in first out method. Fig. 3. Lines of code Vs. Time Graph
  • 7. Computer Science & Information Technology (CS & IT) 457 8. CONCLUSIONS The report here sheds light on the applicability of instruction cache prefetching schemes in current and next-generation microprocessor designs where memory latencies are likely to be longer. When ever, a new prefetching algorithm is examined that was inspired by previous studies of the effect of speculation down miss predicted not correct paths. Data prefetching has been used in the past as one of the mechanism to hide memory access latencies. But prefetching requires accurate timing to be effective in practice. This timing problem will affect more in the case of multiple processor. Which is handled with the help of target-driven prefetcher. The designing of an effective prefetching algorithm which will minimize the prefetching overhead, and it is a big challenge and needs more thoughts and effort on it. REFERENCES [1] A.J. Smith, Cache memories, ACM Computing ,Surveys, pp. 473-530, Sep. 1982 [2] ECE 4100/6100 Advanced Computer Architecture ,Lecture 5 Branch Prediction,Prof. Hsien-Hsin Sean Lee School of Electrical and Computer Engineering Georgia Institute of Technology [3] A. Brown and D. M. Tullsen. The shared-thread multiprocessor. In 21st InternationalConference on Supercomputing, pages 73-82, June 2008. [4] Abdel-Shafi,H.Hall,J. Adve.”An Evaluation of Fine Grain producer-initiated communication in Cache-Coherent proceeding of the 3rd HPCA,IEEEcomputer Society press [5] Aleksardar Milenkovic, Veljko Milutinovic Lazy prefetching proceedings of the IEEE hicss 1998 [6] I.K. Chen, C.C. Lee, and T.N. Mudge. Instruction prefetching us- ing branch prediction information. In International Conference on Computer Design, pages 593-601, October 1997. [7] Mahmut Kandemir, Yuanrui zhang,Ozcan OzturkAdaptive prefetching for shared cache based chip multiprocessorsEDAA 2009 [8] Jim Pierce Intel Corporation,Trevor Mudge University of Michigan Wrong-Path Instruction Prefetching IEEE 1996 [9] P. Hsu, Design of the TFP microprocessor, IEEE Micro, 1993. [10] J. Pierce, Cache Behavior in the Presence of Speculative ExecutionThe Benefits of Misprediction, Ph.D. Thesis, The University of Michigan, 1995. [11] G. Reinman, T. Austin, and B. Calder. A scalable front-end architecture for fast instruction delivery. In 26th Annual International Symposium on Computer Architecture, May 1999. [12] Lawrence Spracklen, Yuan chou and Snathosh G Abraham effective instruction prefetching in chip multiprocessor for modern commersal application proceedings of 11th Intl. Symbosium on HPCA,2005 [13] D Kongetira,K Aingraran and K olukotun NiagraA 32-way multithreaded SPARC processor IEEE micro, 2005. [14] J. D. Collins, S. Sair, B. Calder, and D. M. Tullsen. Pointer cache assisted prefetching. In 35th International Symposium on Microarchitecture, pages 62-73, Nov. 2002. [15] R. S. Chappell, J. Stark, S. P. Kim, S. K. Reinhardt, and Y. N. Patt. Simultaneous subordinate microthreading (SSMT). In 26th International Symposium on Computer Architecture, pages 186-195, M