(Slides) Efficient Evaluation Methods of Elementary Functions Suitable for SIMD Computation

Efficient Evaluation Methods of Elementary Functions Suitable for SIMD Computation Naoki Shibata Shiga University

Overview ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Overview (contd.) ,[object Object],[object Object],[object Object],[object Object],x y SIMD operation SIMD operation SIMD operation SIMD operation sin x sin y Apply exactly same series of SIMD operations to x and y , and we get sin x and sin y .

Outline ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Background ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Background (contd.) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Cares needed in SIMD optimization ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Gathering and scattering operations are slow ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Table look-up is slow ,[object Object],[object Object],x y x y Scatter Look-up Look-up x' y' x y Gather Table This part must be repeated for each element

Division and SQRT are not too slow ,[object Object],[object Object],[object Object],[object Object],[object Object],[*] Intel 64 and IA-32 Architectures Optimization Reference Manual

Challenge ,[object Object],[object Object],[object Object],[object Object],[object Object],So, we cannot tolerate error in each step

Related Works ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Trigonometric functions ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

[object Object],[object Object],[object Object],[object Object],[object Object],Reducing argument within 0 and  /4

Cancellation error ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Solution for this problem ,[object Object],[object Object],[object Object],[object Object]

Solution for this problem (contd.) ,[object Object],[object Object],[object Object],[object Object],[object Object]

[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Step 2 in evaluating trigonometric functions

[object Object],[object Object],[object Object],[object Object],Step 2 (contd.)

Step 2 (contd.) ,[object Object],[object Object],[object Object],[object Object],[object Object]

Inverse trigonometric functions ,[object Object],[object Object],[object Object],[object Object]

New argument reduction for evaluating arc tangent (1/3) ,[object Object],[object Object],[object Object],[object Object]

New argument reduction for evaluating arc tangent (2/3) ,[object Object],[object Object],[object Object],[object Object],[object Object],(13) (14)

New argument reduction for evaluating arc tangent (3/3) ,[object Object],[object Object],[object Object],[object Object],[object Object],(13) (14) After argument reduction, we calculate Taylor series of atan

asin and acos functions ,[object Object],[object Object],[object Object]

Exponential function ,[object Object],[object Object],[object Object],[object Object]

Step 1 ,[object Object],[object Object],[object Object]

Step 2 ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Logarithmic function ,[object Object],[object Object],[object Object],[object Object]

Evaluation ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Accuracy Accuracy results for sin, cos and tan Accuracy results for atan Accuracy results for asin and acos

Accuracy (contd.) Accuracy results for exp Accuracy results for log In all cases, error does not exceed 6 ulps

Speed Proposed method is about two times as fast as FPU calculation

Code size ,[object Object],[object Object],[object Object]

Conclusion ,[object Object],[object Object],[object Object],[object Object]

Thank you ,[object Object],[object Object],http://freshmeat.net/projects/sleef

(Slides) Efficient Evaluation Methods of Elementary Functions Suitable for SIMD Computation

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to (Slides) Efficient Evaluation Methods of Elementary Functions Suitable for SIMD Computation

Similar to (Slides) Efficient Evaluation Methods of Elementary Functions Suitable for SIMD Computation (20)

More from Naoki Shibata

More from Naoki Shibata (20)

Recently uploaded

Recently uploaded (20)

(Slides) Efficient Evaluation Methods of Elementary Functions Suitable for SIMD Computation