
# Tute


### Tute

Deptt. of Comp. Sc. and Engg., M.M.M. Engg. College, Gorakhpur-273010
Session: 2007-08 | Course: B Tech | Subject: Advanced Computer Architecture | Code: TCS-802

#### Tutorial-1

Q1. Give various architectural classification schemes. Also discuss Flynn's and Shore's classifications in detail.

Q2. Explain how instruction set and compiler technology affect CPU performance, and justify the effects in terms of program length, clock rate, and effective CPI.
#### Tutorial-2

Q1. A workstation uses a 15-MHz processor with a claimed 10-MIPS rating to execute a given program mix. Assume a one-cycle delay for each memory access.

- (a) What is the effective CPI of this computer?
- (b) Suppose the processor is being upgraded with a 30-MHz clock. However, the speed of the memory subsystem remains unchanged, and consequently two clock cycles are needed per memory access. If 30% of the instructions require one memory access and another 5% require two memory accesses per instruction, what is the performance of the upgraded processor with a compatible instruction set and equal instruction counts in the given program mix?

Q2. Consider the execution of an object code with 200,000 instructions on a 40-MHz processor. The program consists of four major types of instructions. The instruction mix and the number of cycles (CPI) needed for each instruction type are given below, based on the result of a program trace experiment:

| Instruction type | CPI | Instruction mix |
|---|---|---|
| Arithmetic and logic | 1 | 60% |
| Load/store with cache hit | 2 | 18% |
| Branch | 4 | 12% |
| Memory reference with cache miss | 8 | 10% |

- (a) Calculate the average CPI when the program is executed on a uniprocessor with the above trace results.
- (b) Calculate the corresponding MIPS rate based on the CPI obtained in part (a).
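The arithmetic behind both questions can be sketched as follows. This is a minimal illustration, not a model answer: the function names are made up for this sketch, and the upgraded CPI in Q1(b) assumes that every memory access costs exactly one extra cycle compared with the original machine.

```python
def effective_cpi(clock_mhz, mips):
    """CPI = clock rate (MHz) / MIPS rating."""
    return clock_mhz / mips


def average_cpi(mix):
    """Weighted average CPI over (cpi, fraction) pairs."""
    return sum(cpi * fraction for cpi, fraction in mix)


def mips_rate(clock_mhz, cpi):
    """MIPS = clock rate (MHz) / CPI."""
    return clock_mhz / cpi


# Q1(a): 15-MHz clock, 10-MIPS rating.
old_cpi = effective_cpi(15, 10)

# Q1(b): ASSUMPTION: each memory access adds one extra cycle, so
# 30% of instructions pay 1 extra cycle and 5% pay 2 extra cycles.
new_cpi = old_cpi + 0.30 * 1 + 0.05 * 2
new_mips = mips_rate(30, new_cpi)

# Q2: instruction mix taken from the table in the question.
trace_mix = [(1, 0.60), (2, 0.18), (4, 0.12), (8, 0.10)]
trace_cpi = average_cpi(trace_mix)
trace_mips = mips_rate(40, trace_cpi)

print(old_cpi, new_cpi, new_mips)
print(trace_cpi, trace_mips)
```

Under these assumptions the trace mix gives an average CPI of about 2.24 and roughly 17.86 MIPS at 40 MHz.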
#### Tutorial-3

Q1. Define the following terms related to parallelization and dependence relations:

- (a) Computational granularity
- (b) Communication latency
- (c) Flow dependence
- (d) Antidependence
- (e) Output dependence
- (f) I/O dependence
- (g) Control dependence
- (h) Resource dependence
- (i) Bernstein's conditions
- (j) Degree of parallelism

Q2. Analyze the data dependences among the following statements in a given program:

| Statement | Instruction | Meaning |
|---|---|---|
| S1 | Load R1, 1024 | R1 ← 1024 |
| S2 | Load R2, M(10) | R2 ← Memory(10) |
| S3 | Add R1, R2 | R1 ← (R1) + (R2) |
| S4 | Store M(1024), R1 | Memory(1024) ← (R1) |
| S5 | Store M((R2)), 1024 | Memory(64) ← 1024 |

where (Ri) means the content of register Ri, and Memory(10) contains 64 initially.

- (a) Draw a dependence graph to show all the dependences.
- (b) Are there any resource dependences if only one copy of each functional unit is available in the CPU?
- (c) Repeat the above for the following program statements:

| Statement | Instruction | Meaning |
|---|---|---|
| S1 | Load R1, M(100) | R1 ← Memory(100) |
| S2 | Move R2, R1 | R2 ← (R1) |
| S3 | Inc R1 | R1 ← (R1) + 1 |
| S4 | Add R2, R1 | R2 ← (R2) + (R1) |
| S5 | Store M(100), R1 | Memory(100) ← (R1) |
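One way to check a hand-drawn dependence graph for Q2 is to list, for each statement, the locations it reads and writes, and then compare the sets pairwise. The sketch below does this for the first program. The read/write sets and the location names (`"M[10]"`, `"M[64]"`, and so on) are this sketch's own encoding of the statements, and S5's target address is resolved to 64 using the stated initial content of Memory(10). A dependence is reported only if no statement in between overwrites the location.

```python
def dependences(stmts):
    """Return a set of (source, sink, kind, location) tuples for
    straight-line code given as (name, reads, writes) triples."""
    deps = set()
    for i, (name_i, reads_i, writes_i) in enumerate(stmts):
        for j in range(i + 1, len(stmts)):
            name_j, reads_j, writes_j = stmts[j]
            # Locations overwritten strictly between statements i and j.
            killed = set()
            for _, _, writes_k in stmts[i + 1:j]:
                killed |= writes_k
            for loc in (writes_i & reads_j) - killed:
                deps.add((name_i, name_j, "flow", loc))
            for loc in (reads_i & writes_j) - killed:
                deps.add((name_i, name_j, "anti", loc))
            for loc in (writes_i & writes_j) - killed:
                deps.add((name_i, name_j, "output", loc))
    return deps


# First program of Q2; read/write sets are an assumed encoding.
program1 = [
    ("S1", set(), {"R1"}),                # R1 <- 1024
    ("S2", {"M[10]"}, {"R2"}),            # R2 <- Memory(10)
    ("S3", {"R1", "R2"}, {"R1"}),         # R1 <- (R1) + (R2)
    ("S4", {"R1"}, {"M[1024]"}),          # Memory(1024) <- (R1)
    ("S5", {"R2"}, {"M[64]"}),            # Memory(64) <- 1024
]

for dep in sorted(dependences(program1)):
    print(dep)
```

The same function can be applied to the second program by writing out its read and write sets in the same form. It covers data dependences only; resource dependences in part (b) need the functional-unit usage of each instruction, which is not modeled here.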
#### Tutorial-4

Q1. Consider the execution of a program of 15,000 instructions by a linear pipeline processor with a clock rate of 25 MHz. Assume that the instruction pipeline has five stages, that one instruction is issued per clock cycle, and that out-of-sequence executions are ignored.

- (a) Calculate the speedup factor in using this pipeline to execute the program as compared with the use of an equivalent nonpipelined processor with an equal amount of flow-through delay.
- (b) What are the efficiency and throughput of this pipelined processor?

Q2. Consider the following reservation table for a four-stage pipeline with a clock cycle τ = 20 ns.

| Stage | Marks (X) across cycles 1 to 6 |
|---|---|
| S1 | X X |
| S2 | X X |
| S3 | X |
| S4 | X |

- (a) What are the forbidden latencies and the initial collision vector?
- (b) Draw the state transition diagram for scheduling the pipeline.
- (c) Determine the MAL associated with the shortest greedy cycle.
- (d) Determine the pipeline throughput corresponding to the MAL and the given τ.
- (e) Determine the lower bound on the MAL for this pipeline. Have you obtained the optimal latency from the above state diagram?
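For Q1, the standard linear-pipeline formulas can be evaluated directly. The sketch below assumes an ideal pipeline with k stages and n instructions: the pipelined run takes k + n - 1 cycles and the equivalent nonpipelined processor takes n·k cycles. The function name is illustrative only.

```python
def linear_pipeline_metrics(n, k, clock_mhz):
    """Speedup, efficiency, and throughput (in MIPS) of an ideal
    k-stage linear pipeline executing n instructions."""
    pipelined_cycles = k + n - 1      # fill the pipe, then one result per cycle
    nonpipelined_cycles = n * k       # each instruction takes k cycles
    speedup = nonpipelined_cycles / pipelined_cycles
    efficiency = speedup / k
    throughput_mips = n * clock_mhz / pipelined_cycles
    return speedup, efficiency, throughput_mips


speedup, efficiency, throughput = linear_pipeline_metrics(15_000, 5, 25)
print(f"speedup={speedup:.4f} efficiency={efficiency:.4f} "
      f"throughput={throughput:.2f} MIPS")
```

With 15,000 instructions the fill time is negligible, so the speedup is close to the stage count of 5 and the throughput is close to the 25-MHz clock rate.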
#### Tutorial-5

Q1. Consider the five-stage pipelined processor specified by the following reservation table:

| Stage | Marks (X) across cycles 1 to 6 |
|---|---|
| S1 | X X |
| S2 | X X |
| S3 | X |
| S4 | X |
| S5 | X X |

- (a) List the set of forbidden latencies and the collision vector.
- (b) Draw a state transition diagram showing all possible initial sequences (cycles) without causing a collision in the pipeline.
- (c) List all the simple cycles from the state diagram.
- (d) Identify the greedy cycles among the simple cycles.
- (e) What is the minimum average latency (MAL) of this pipeline?
- (f) What is the minimum constant cycle of this pipeline?
- (g) What will be the maximum throughput of this pipeline?
- (h) What will be the throughput if the minimum constant cycle is used?

Q2. Consider a four-stage floating-point adder with a 10-ns delay per stage, which equals the pipeline clock period.

- (a) Name the appropriate functions to be performed by the four stages.
- (b) Find the minimum number of periods required to add 100 floating-point numbers A1 + A2 + … + A100 using this pipeline adder, assuming that the output Z of stage S4 can be routed back to either of the two inputs X or Y of the pipeline with delays equal to a multiple of the clock period.
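The first steps of a reservation-table analysis (forbidden latencies, initial collision vector, and the lower bound on the MAL) are mechanical and can be sketched in a few lines. The table used below is hypothetical: the column positions of the marks are an assumption made for illustration and should be replaced with the positions from the actual reservation table.

```python
def forbidden_latencies(table):
    """Distances between any two marks in the same row of the table."""
    forbidden = set()
    for cycles in table.values():
        for a in cycles:
            for b in cycles:
                if b > a:
                    forbidden.add(b - a)
    return forbidden


def collision_vector(table):
    """Initial collision vector (C_m ... C_1) as a bit string, where
    C_i = 1 if latency i is forbidden and m is the largest forbidden latency."""
    forbidden = forbidden_latencies(table)
    m = max(forbidden) if forbidden else 0
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


def mal_lower_bound(table):
    """The MAL is bounded below by the maximum number of marks in any row."""
    return max(len(cycles) for cycles in table.values())


# HYPOTHETICAL table: stage -> clock cycles in which the stage is used.
example_table = {"S1": [1, 6], "S2": [2, 4], "S3": [3], "S4": [5]}

print(sorted(forbidden_latencies(example_table)))
print(collision_vector(example_table))
print(mal_lower_bound(example_table))
```

The state transition diagram then follows by shifting the collision vector right by each permitted latency and OR-ing the result with the initial vector; that step is left out of this sketch.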
#### Tutorial-6

Q1. Explain superscalar and superpipelined execution. Also give the performance of such processors.

Q2. Explain multithreading. Differentiate among blocked, interleaved, and simultaneous multithreading.
#### Tutorial-7

Q1. Explain the PRAM model. Also describe its variants, classified by how simultaneous memory accesses are handled.

Q2. Give the parallel algorithm that uses N × N processors arranged in a mesh for matrix multiplication.
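One well-known mesh algorithm for Q2 is Cannon's algorithm, which assumes a mesh with wraparound connections (a torus) and one matrix element per processor. The sketch below simulates it sequentially: nested lists stand in for the N × N processor grid, and each list rebuild stands in for one synchronous shift step. It is offered as one possible answer under those assumptions, not necessarily the algorithm the course intends.

```python
def cannon_matmul(A, B):
    """Sequential simulation of Cannon's algorithm on an n x n torus.
    Processor (i, j) holds one element of A, one of B, and accumulates C[i][j]."""
    n = len(A)
    # Initial alignment: shift row i of A left by i, column j of B up by j.
    a = [[A[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b = [[B[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Every processor multiplies its local pair and accumulates.
        for i in range(n):
            for j in range(n):
                c[i][j] += a[i][j] * b[i][j]
        # Shift A one step left and B one step up, with wraparound.
        a = [[a[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b = [[b[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c


print(cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

After the alignment, processor (i, j) sees the pair A[i][k], B[k][j] for a different k at each of the n steps, so the parallel running time is proportional to n instead of n³.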
#### Tutorial-8

Q1. Explain Bidirectional Gaussian Elimination for solving a set of linear algebraic equations.

Q2. Explain the following loop transformations:

- Loop reversal
- Loop tiling
- Loop skewing
- Loop permutation
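The four transformations in Q2 can be compared by looking at the order in which they visit the points of a two-level loop nest. The sketch below returns the visiting order as a list of (i, j) pairs for an n × m iteration space. The function names, the tile size `t`, and the skew factor `f` are illustrative choices. Whether a transformation is legal for a real loop depends on its dependences, which this sketch does not check.

```python
def original(n, m):
    """Baseline nest: for i in 0..n-1, for j in 0..m-1."""
    return [(i, j) for i in range(n) for j in range(m)]


def reversal(n, m):
    """Loop reversal: the inner loop runs from m-1 down to 0."""
    return [(i, j) for i in range(n) for j in range(m - 1, -1, -1)]


def permutation(n, m):
    """Loop permutation (interchange): j becomes the outer loop."""
    return [(i, j) for j in range(m) for i in range(n)]


def tiling(n, m, t):
    """Loop tiling: visit the space in t x t blocks."""
    order = []
    for ii in range(0, n, t):
        for jj in range(0, m, t):
            for i in range(ii, min(ii + t, n)):
                for j in range(jj, min(jj + t, m)):
                    order.append((i, j))
    return order


def skewing(n, m, f=1):
    """Loop skewing: the inner index is shifted by f*i. The visiting order
    is unchanged; only the shape of the iteration space changes."""
    return [(i, js - f * i) for i in range(n) for js in range(f * i, f * i + m)]


print(tiling(4, 4, 2)[:4])
print(permutation(2, 3))
```

Skewing by itself does not reorder anything; it is usually applied so that a later permutation or tiling becomes legal.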
#### Tutorial-9

Q3. Explain the following terms associated with fast and efficient synchronization schemes on a shared-memory multiprocessor:

- Busy-wait versus sleep-wait protocols for sole access to a critical section.
- Lock mechanisms for presynchronization to achieve sole access to a critical section.
- Postsynchronization method.

Q4. Discuss run-time library routines.