Hierarchical Multipartite Function Evaluation
Advisor : Prof. Shen-Fu Hsiao (蕭勝夫)
Student : Yi-Hau Chen (陳奕豪)
Outline
•Motivation
•Related Work
•Proposed
•Results & Comparison
•Conclusion
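Motivation
• Special function units (SFUs) are widely used in digital signal processing and multimedia applications, for example in graphics processing units (GPUs).
• Special function unit (SFU)
• Evaluates trigonometric, reciprocal, exponential, and logarithm functions.
• Built from lookup tables (LUTs) and a few simple arithmetic units.
• Designs fall into two main categories:
• piecewise polynomial approximation (PPA)
• table-lookup-and-addition (TA)
• This thesis studies how to reduce the table area of TA methods effectively while keeping their advantage of fast evaluation.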
Outline
• Motivation
• Related Work
• Category
• Piecewise Polynomial Approximation (PPA)
• Table-Lookup-and-Addition (TA)
• Bipartite Table Methods (BP)
• Symmetric Bipartite Table Methods (SBTM)
• Symmetric Table Addition Methods (STAM)
• Multipartite Table Methods (MP)
• Proposed
• Results & Comparison
• Conclusion
Category
Piecewise Polynomial Approximation (PPA)-(1/2)
$$f(x) \cong a_0(x_m) + a_1(x_m) \cdot x_l$$
PPA-(2/2) deg-2 Architecture
$$f(x) \cong a_0(x_m) + a_1(x_m) \cdot x_l + a_2(x_m) \cdot x_l^2$$
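The degree-2 PPA formula can be illustrated with a minimal Python sketch. Several things here are assumptions for illustration and are not stated on the slides: x_m is taken to be the top `m_bits` of an n-bit input in [0, 1), x_l is the remaining low part, and the coefficients a_0, a_1, a_2 come from a Taylor expansion at the start of each segment (practical designs often fit the coefficients differently). The function names are hypothetical.

```python
import math

def build_ppa_tables(f, df, d2f, m_bits):
    """Per-segment coefficients (a0, a1, a2), assumed here to be Taylor
    coefficients at the segment start x_m."""
    tables = []
    for k in range(1 << m_bits):
        xm = k / (1 << m_bits)
        tables.append((f(xm), df(xm), d2f(xm) / 2.0))
    return tables

def ppa_eval(tables, x_int, n_bits, m_bits):
    """Evaluate a0(x_m) + a1(x_m)*x_l + a2(x_m)*x_l^2 for an n-bit input."""
    l_bits = n_bits - m_bits
    k = x_int >> l_bits                                  # table index from x_m
    xl = (x_int & ((1 << l_bits) - 1)) / (1 << n_bits)   # value of the low part x_l
    a0, a1, a2 = tables[k]
    return a0 + a1 * xl + a2 * xl * xl

N_BITS, M_BITS = 12, 4
ppa_tables = build_ppa_tables(math.sin, math.cos, lambda x: -math.sin(x), M_BITS)
ppa_max_err = max(
    abs(ppa_eval(ppa_tables, xi, N_BITS, M_BITS) - math.sin(xi / (1 << N_BITS)))
    for xi in range(1 << N_BITS)
)
print("max |error| for sin on [0,1):", ppa_max_err)
```

With only 16 segments the three coefficient tables stay small, but each evaluation needs multiplications, which is the cost that TA methods avoid.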
Table-Lookup-and-Addition (TA)
• TA methods fall into two main groups: add-table-add (ATA) methods and bipartite/multipartite methods.
• The bipartite/multipartite group includes:
• bipartite table methods (BP) [16]
• symmetric bipartite table methods (SBTM) [17]
• symmetric table addition methods (STAM) [18]
• multipartite table methods (MP) [1,19]
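Bipartite Table Methods (BP)-(1/5) Bit Partition
To approximate a function f(x), the n-bit input x is split into two parts, x_0 and x_1, with bit widths α and β, where α + β = n. The input interval is assumed to be 0 ≤ x < 1, so that
$$x = x_0 + x_1$$
$$0 \le x_0 \le 1 - 2^{-\alpha}$$
$$0 \le x_1 \le 2^{-\alpha} - 2^{-n}$$
$$0 \le x_{0,1} \le 1 - 2^{-\gamma}$$
Here x_{0,1} consists of the γ most significant bits of x_0.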
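The bit partition can be sketched in Python as follows. This is an illustrative sketch only: the function name `split_input` is hypothetical, the input is assumed to be an unsigned n-bit integer representing x = x_int / 2^n, and x_{0,1} is assumed to be the top γ bits of x_0 (which is consistent with the bound 0 ≤ x_{0,1} ≤ 1 − 2^{−γ}).

```python
def split_input(x_int, n, alpha, gamma):
    """Split an n-bit fixed-point x in [0, 1) into (x0, x1, x0_1).

    x0   : value of the top alpha bits
    x1   : value of the low beta = n - alpha bits
    x0_1 : value of the top gamma bits of x0 (assumed definition)
    """
    assert 0 <= x_int < (1 << n) and 0 < gamma <= alpha <= n
    beta = n - alpha
    x0_int = x_int >> beta
    x1_int = x_int & ((1 << beta) - 1)
    x01_int = x0_int >> (alpha - gamma)
    return x0_int / (1 << alpha), x1_int / (1 << n), x01_int / (1 << gamma)

# Largest 8-bit input: every field is at its upper bound.
x0, x1, x01 = split_input(0xFF, n=8, alpha=5, gamma=3)
print(x0, x1, x01)
```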
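BP-(2/5) Taylor Expansion
The function f(x) can therefore be approximated by the first two terms of its Taylor expansion:
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} \cdot (x - a)^n \qquad (a = x_0 \text{ and } x = x_0 + x_1)$$
$$f(x) = f(x_0) + f'(x_0) \cdot x_1 + \varepsilon_{lin}$$
$$f(x) = f(x_0) + f'(x_{0,1}) \cdot x_1 + \varepsilon_{lin} + \varepsilon_{slp}$$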
BP-(3/5) Magnified View of the Curve
BP-(4/5) Architecture
$$f(x) \cong f(x_0) + f'(x_{0,1}) \cdot x_1 \cong TI(x_0) + TO(x_{0,1}, x_1)$$
Table of Initial Values:
$$TI(x_0) \cong Q[f(x_0)]$$
Table of Offset:
$$TO(x_{0,1}, x_1) \cong Q[f'(x_{0,1}) \cdot x_1]$$
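The two-table structure can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the design from the thesis: Q[·] is modelled as round-to-nearest with `frac_bits` fractional bits, x_{0,1} is taken as the top γ bits of x_0, and the function names and parameter values are hypothetical.

```python
import math

def build_bipartite(f, df, n, alpha, gamma, frac_bits):
    """Build TI (indexed by x0) and TO (indexed by x0_1 and x1) as integers."""
    beta = n - alpha
    scale = 1 << frac_bits
    ti = [round(f(i / (1 << alpha)) * scale) for i in range(1 << alpha)]
    to = [
        [round(df(j / (1 << gamma)) * (k / (1 << n)) * scale) for k in range(1 << beta)]
        for j in range(1 << gamma)
    ]
    return ti, to

def eval_bipartite(ti, to, x_int, n, alpha, gamma, frac_bits):
    """f(x) ~ TI(x0) + TO(x0_1, x1): two lookups and one addition."""
    beta = n - alpha
    i = x_int >> beta                  # index of x0
    k = x_int & ((1 << beta) - 1)      # index of x1
    j = i >> (alpha - gamma)           # index of x0_1
    return (ti[i] + to[j][k]) / (1 << frac_bits)

N, ALPHA, GAMMA, FRAC = 12, 8, 4, 16
bp_ti, bp_to = build_bipartite(math.sin, math.cos, N, ALPHA, GAMMA, FRAC)
bp_entries = len(bp_ti) + sum(len(row) for row in bp_to)
bp_max_err = max(
    abs(eval_bipartite(bp_ti, bp_to, xi, N, ALPHA, GAMMA, FRAC) - math.sin(xi / (1 << N)))
    for xi in range(1 << N)
)
print("table entries:", bp_entries, "vs direct lookup:", 1 << N)
print("max |error|:", bp_max_err)
```

In this configuration the two tables hold 512 entries in total, compared with 4096 for a single direct lookup table, at the price of the slope error ε_slp introduced by indexing the offset table with x_{0,1} instead of x_0.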
BP-(5/5) Table Decomposition
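Symmetric Bipartite Table Methods (SBTM)
$$0 \le x_0 \le 1 - 2^{-\alpha}$$
$$0 \le x_1 \le 2^{-\alpha} - 2^{-n}$$
$$0 \le x_{0,1} \le 1 - 2^{-\gamma}$$
$$\delta_1 = 2^{-\alpha} - 2^{-n}$$
$$\delta_0 = 2^{-\gamma} - 2^{-\alpha}$$
$$f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_0 + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$$
$$f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$$
$$TI(x_0) = Q\left[f\left(x_0 + \frac{\delta_1}{2}\right)\right]$$
$$TO(x_{0,1}, x_1) = Q\left[f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)\right]$$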
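The SBTM offset term is antisymmetric about the midpoint δ_1/2 of the x_1 range, so only half of each offset row needs to be stored. The Python sketch below illustrates this; it is an assumption-laden illustration rather than the referenced design: table entries are kept as unquantized floats to isolate the approximation error, x_{0,1} is taken as the top γ bits of x_0, and all names are hypothetical.

```python
import math

def build_sbtm(f, df, n, alpha, gamma):
    """TI(x0) = f(x0 + d1/2); TO stores only the lower half of each x1 row."""
    beta = n - alpha
    d1 = 2.0**-alpha - 2.0**-n
    d0 = 2.0**-gamma - 2.0**-alpha
    ti = [f(i / 2**alpha + d1 / 2) for i in range(2**alpha)]
    half = 2**(beta - 1)
    to = [
        [df(j / 2**gamma + d0 / 2 + d1 / 2) * (k / 2**n - d1 / 2) for k in range(half)]
        for j in range(2**gamma)
    ]
    return ti, to

def eval_sbtm(ti, to, x_int, n, alpha, gamma):
    """Look up TI and the folded TO; negate the offset for the upper half of x1."""
    beta = n - alpha
    i = x_int >> beta
    k = x_int & (2**beta - 1)
    j = i >> (alpha - gamma)
    half = 2**(beta - 1)
    if k < half:
        offset = to[j][k]
    else:
        offset = -to[j][2**beta - 1 - k]   # mirror index, opposite sign
    return ti[i] + offset

SN, SALPHA, SGAMMA = 12, 8, 4
sb_ti, sb_to = build_sbtm(math.sin, math.cos, SN, SALPHA, SGAMMA)
sb_max_err = max(
    abs(eval_sbtm(sb_ti, sb_to, xi, SN, SALPHA, SGAMMA) - math.sin(xi / 2**SN))
    for xi in range(2**SN)
)
print("offset entries stored per row:", len(sb_to[0]), "of", 2**(SN - SALPHA))
print("max |error| (unquantized):", sb_max_err)
```

Centering both the expansion point and the slope sample also reduces the approximation error compared with expanding at the left edge of each interval.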
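Symmetric Table Addition Methods (STAM)
$$\delta_1 = \sum_{i=1}^{m} \delta_{1,i}, \qquad \delta_{1,i} = 2^{-p_{i-1}} - 2^{-p_i}, \quad i = 1, 2, \ldots, m$$
with
$$p_0 = \alpha, \qquad p_i = p_{i-1} + \beta_i, \quad i = 1, 2, \ldots, m$$
$$x_1 = \sum_{i=1}^{m} x_{1,i}$$
$$f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$$
$$f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(\sum_{i=1}^{m} x_{1,i} - \sum_{i=1}^{m} \frac{\delta_{1,i}}{2}\right)$$
$$f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \sum_{i=1}^{m} \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)$$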
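The STAM decomposition of x_1 and δ_1 can be checked numerically with a short Python sketch. The function names and the example widths (α = 8, β_i = 2, 3, 3) are hypothetical; the sketch assumes the fields x_{1,i} are taken from the low part most-significant field first, so that field i has weight 2^{−p_i}.

```python
def stam_partition(alpha, betas):
    """Return (p, deltas) with p_0 = alpha, p_i = p_{i-1} + beta_i and
    delta_{1,i} = 2^-p_{i-1} - 2^-p_i."""
    p = [alpha]
    for b in betas:
        p.append(p[-1] + b)
    deltas = [2.0**-p[i - 1] - 2.0**-p[i] for i in range(1, len(p))]
    return p, deltas

def split_x1(x1_int, alpha, betas):
    """Split the low-part integer into m fields and return their values x_{1,i}."""
    parts = []
    shift = sum(betas)
    p = alpha
    for b in betas:
        shift -= b
        p += b
        field = (x1_int >> shift) & ((1 << b) - 1)
        parts.append(field / (1 << p))   # field weight is 2^-p_i
    return parts

ALPHA_S, BETAS = 8, [2, 3, 3]            # n = 16
p_list, delta_list = stam_partition(ALPHA_S, BETAS)
x1_parts = split_x1(0b10101011, ALPHA_S, BETAS)
print("p:", p_list)
print("sum of delta_1,i:", sum(delta_list))
print("x_1,i:", x1_parts)
```

Because each x_{1,i} ranges over [0, δ_{1,i}], every term (x_{1,i} − δ_{1,i}/2) is again symmetric about zero, so each of the m offset tables can be folded in the same way as in SBTM.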
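Multipartite Table Methods (MP[1])-(1/5) Bit Partition
Multipartite Table Methods (MP[1])-(2/5) Different Ways of Generating Initial Values and Slopes
MP[1]:
$$TI(x_0) = Q\left[\frac{f(x_0) + f(x_0 + \delta_1)}{2}\right]$$
$$TO(x_{0,i}, x_{1,i}) = Q\left[s(x_{0,i}) \cdot \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)\right]$$
STAM:
$$TI(x_0) = Q\left[f\left(x_0 + \frac{\delta_1}{2}\right)\right]$$
$$TO(x_{0,1}, x_{1,i}) = Q\left[f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)\right]$$
Multipartite Table Methods (MP[1])-(3/5) Computing the Slope s
$$s(x_{0,i}) = \frac{f(\varphi_2) - f(\varphi_1) + f(\varphi_4) - f(\varphi_3)}{2 \cdot \delta_{1,i}}$$
Multipartite Table Methods (MP[1])-(4/5) Architecture
Multipartite Table Methods (MP[1])-(5/5) Table Decomposition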
Outline
• Motivation
• Related Work
• Proposed
• Domain and range of the function
• Sampling method and error budget
• Overview of the HMP method
• Lossless ROM Compression with Low Cost
• Combined error and exhaustive search
• Speeding up the search
• Results & Comparison
• Conclusion
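Domain and Range of the Function
Sampling Method and Error Budget
$$TI(x_0) = Q\left[\frac{f(x_0) + f(x_0 + \delta_1)}{2}\right]$$
$$\varepsilon_q = (m + 1) \cdot 2^{-n-g-1}$$
$$\varepsilon_{rnd} = 0.5 \cdot (2^{-n} - 2^{-n-g})$$
$$\varepsilon_{apx} + \varepsilon_q + \varepsilon_{rnd} = \varepsilon_{total} < 2^{-n}$$
$$|\varepsilon_1| = |\varepsilon_2| = |\varepsilon_3| = |\varepsilon_4|$$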
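The error budget can be turned into a small Python calculation of how much error is left for the approximation itself. This is a sketch under stated assumptions: m + 1 tables are each stored with n + g fractional bits (g guard bits), and the final rounding term is taken as 0.5 · (2^{−n} − 2^{−n−g}), that is, the worst case of rounding an (n + g)-bit sum to n bits. The function names are hypothetical.

```python
def approximation_budget(n, m, g):
    """Upper limit on eps_apx so that eps_apx + eps_q + eps_rnd < 2^-n."""
    eps_total = 2.0**-n
    eps_q = (m + 1) * 2.0**(-n - g - 1)           # table quantization, m + 1 tables
    eps_rnd = 0.5 * (2.0**-n - 2.0**(-n - g))     # final rounding (assumed form)
    return eps_total - eps_q - eps_rnd

def min_guard_bits(m):
    """Smallest g for which the remaining approximation budget is positive."""
    g = 0
    while approximation_budget(16, m, g) <= 0:    # sign does not depend on n
        g += 1
    return g

budget = approximation_budget(n=16, m=3, g=3)
print("eps_apx must stay below:", budget)
print("minimum guard bits for m = 3:", min_guard_bits(3))
```

Under these assumptions the remaining budget simplifies to 2^{−n−1} · (1 − m · 2^{−g}), so adding guard bits moves the approximation budget toward, but never beyond, half an output unit in the last place.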
Overview of the HMP Method
Comparison of MP and HMP
Lossless ROM Compression: Principle (s = 3)
Lossless ROM Compression with Low Cost: Table Decomposition
Combined Error and Exhaustive Search
Speeding Up the Search: Flowchart
Speeding Up the Search: Verification Diagram
Results & Comparison
Table 4.2: Table decomposition of the 24-bit SIN function using MP [1] and HMP
Comparison of MP, HMP, and HMP_TI
Tool version: ISE 10.1
Device: Xilinx Virtex-II XC2V1000-fg456-5
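Conclusion
• The proposed HMP effectively reduces the table area of MP [1].
• The proposed Lossless ROM Compression not only reduces table area effectively but also adds very little delay.
• The combined error analysis and exhaustive search proposed alongside can be accelerated to finish within a practical amount of time, a substantial improvement over earlier approaches.
• Future work: extend these methods to higher precision (e.g., 32 bits).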
References
1) F. de Dinechin and A. Tisserand, “Multipartite table methods,” IEEE Transactions on Computers, vol. 54, pp. 319–330, March 2005.
2) Y. J. Kim, H. E. Kim, S. H. Kim, J. S. Park, S. Paek, and L. S. Kim, “Homogeneous stream processors with embedded special function units for high-utilization
programmable shaders,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, pp. 1691–1704, Sept 2012.
3) D. D. Caro, N. Petra, and A. G. M. Strollo, “Reducing lookup-table size in direct digital frequency synthesizers using optimized multipartite table method,” IEEE
Transactions on Circuits and Systems I: Regular Papers, vol. 55, pp. 2116–2127, Aug 2008.
4) B. G. Nam, H. Kim, and H. J. Yoo, “Power and area-efficient unified computation of vector and elementary functions for handheld 3d graphics systems,” IEEE
Transactions on Computers, vol. 57, pp. 490–504, April 2008.
5) D. D. Caro, N. Petra, and A. G. M. Strollo, “High-performance special function unit for programmable 3-d graphics processors,” IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 56, pp. 1968–1978, Sept 2009.
6) D. D. Caro, N. Petra, and A. G. M. Strollo, “Direct digital frequency synthesizer using nonuniform piecewise-linear approximation,” IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 58, pp. 2409–2419, Oct 2011.
7) J. A. Pineiro, S. F. Oberman, J. M. Muller, and J. D. Bruguera, “High-speed function approximation using a minimax quadratic interpolator,” IEEE Transactions on
Computers, vol. 54, pp. 304–318, March 2005.
8) D. U. Lee, R. Cheung, W. Luk, and J. Villasenor, “Hardware implementation tradeoffs of polynomial approximations and interpolations,” IEEE Transactions on
Computers, vol. 57, pp. 686–701, May 2008.
9) D. U. Lee and J. D. Villasenor, “Optimized custom precision function evaluation for embedded processors,” IEEE Transactions on Computers, vol. 58, pp. 46–59, Jan 2009.
10) D. U. Lee, R. C. C. Cheung, W. Luk, and J. D. Villasenor, “Hierarchical segmentation for hardware function evaluation,” IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, vol. 17, pp. 103–116, Jan 2009.
11) T. Sasao, S. Nagayama, and J. T. Butler, “Numerical function generators using lut cascades,” IEEE Transactions on Computers, vol. 56, pp. 826–838, June 2007.
12) S. F. Hsiao, H. J. Ko, Y. L. Tseng, W. L. Huang, S. H. Lin, and C. S. Wen, “Design of hardware function evaluators using low-overhead nonuniform segmentation with
address remapping,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, pp. 875–886, May 2013.
13) A. G. M. Strollo, D. D. Caro, and N. Petra, “Elementary functions hardware implementation using constrained piecewise-polynomial approximations,” IEEE Transactions
on Computers, vol. 60, pp. 418–432, March 2011.
14) S. F. Hsiao, H. J. Ko, and C. S. Wen, “Two-level hardware function evaluation based on correction of normalized piecewise difference functions,” IEEE Transactions on
Circuits and Systems II: Express Briefs, vol. 59, pp. 292–296, May 2012.
15) M. Chaudhary and P. Lee, “An improved two-step binary logarithmic converter for fpgas,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp.
476–480, May 2015.
16) D. D. Sarma and D. W. Matula, “Faithful bipartite rom reciprocal tables,” in Computer Arithmetic, 1995., Proceedings of the 12th Symposium on, pp. 17–28, Jul 1995.
17) M. J. Schulte and J. E. Stine, “Approximating elementary functions with symmetric bipartite tables,” IEEE Transactions on Computers, vol. 48, pp. 842–847, Aug 1999.
18) J. E. Stine and M. J. Schulte, “The symmetric table addition method for accurate function approximation,” Journal of VLSI signal processing systems for signal, image and
video technology, vol. 21, no. 2, pp. 167–177, 1999.
19) J.-M. Muller, “A few results on table-based methods,” Reliable Computing, vol. 5, no. 3, pp. 279–288, 1999.
20) P. K. Meher, “Lut optimization for memory-based computation,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 57, pp. 285–289, April 2010.
21) W. F. Wong and E. Goto, “Fast evaluation of the elementary functions in single precision,” IEEE Transactions on Computers, vol. 44, pp. 453–457, Mar 1995.
22) J. Y. L. Low and C. C. Jong, “A memory-efficient tables-and-additions method for accurate computation of elementary functions,” IEEE Transactions on Computers, vol.
62, pp. 858–872, May 2013.
23) D. Wang, J. M. Muller, N. Brisebarre, and M. D. Ercegovac, “(m,p,k) –friendly points: A table-based method to evaluate trigonometric function,” IEEE Transactions on
Circuits and Systems II: Express Briefs, vol. 61, pp. 711–715, Sept 2014.
24) S. F. Hsiao, P. H. Wu, C. S. Wen, and P. K. Meher, “Table size reduction methods for faithfully rounded lookup-table-based multiplierless function evaluation,” IEEE
Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp. 466–470, May 2015.
25) J.-M. Muller, Elementary Functions: Algorithms and Implementation, 2nd ed. Birkhauser, 2006.
26) M. D. Ercegovac and T. Lang, Digital Arithmetic. Morgan Kaufmann Pub, 2004.
27) B. Parhami, Algorithms and Design Methods for Digital Computer Arithmetic, International 2nd ed. Oxford University Press, 2012.
28) S.-F. Hsiao, P.-C. Wei, and C.-P. Lin, “An automatic hardware generator for special arithmetic functions using various rom-based approximation approaches,” in Circuits
and Systems, 2008. ISCAS 2008. IEEE International Symposium on, pp. 468–471, May 2008.
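29) 曾于玲, “Design of an automatic generator for table-lookup function evaluation units using bit truncation,” Master's thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2011.
30) 吳柏翰, “Table reduction and optimization for multiplierless table-lookup function evaluation designs,” Master's thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2013.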
31) S. F. Hsiao, C. S. Wen, Y. H. Chen, and K. C. Huang, “Hierarchical multipartite function evaluation,” IEEE Transactions on Computers, vol. PP, no. 99, pp. 1–1, 2016.
51

More Related Content

What's hot

131111台大論文口試
131111台大論文口試131111台大論文口試
131111台大論文口試貽得 廖
 
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰台灣資料科學年會
 
為什麼醫療需要社會學?淺談質性研究與敘事醫學.pdf
為什麼醫療需要社會學?淺談質性研究與敘事醫學.pdf為什麼醫療需要社會學?淺談質性研究與敘事醫學.pdf
為什麼醫療需要社會學?淺談質性研究與敘事醫學.pdfChing-wen Lu
 
環境教育人員認證說明簡報1051202
環境教育人員認證說明簡報1051202環境教育人員認證說明簡報1051202
環境教育人員認證說明簡報1051202TIEC
 
福星國小家長會志工專刊
福星國小家長會志工專刊福星國小家長會志工專刊
福星國小家長會志工專刊BRightTravelLog
 
MONEY, TOKENS, AND GAMES:Blockchain’s Next Billion Users and Trillions in Value
MONEY, TOKENS, AND GAMES:Blockchain’s Next Billion Users and Trillions in ValueMONEY, TOKENS, AND GAMES:Blockchain’s Next Billion Users and Trillions in Value
MONEY, TOKENS, AND GAMES:Blockchain’s Next Billion Users and Trillions in Valuedigitalinasia
 
崇越論文競賽簡報
崇越論文競賽簡報崇越論文競賽簡報
崇越論文競賽簡報Ricky Lee
 
YOLO V1 論文導讀
YOLO V1 論文導讀YOLO V1 論文導讀
YOLO V1 論文導讀Ko Ko
 
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Melissa Moody
 
Cryptocurrencies and the Banking Sector
Cryptocurrencies and the Banking SectorCryptocurrencies and the Banking Sector
Cryptocurrencies and the Banking SectorIlan Alon
 
ResearchGate簡單玩
ResearchGate簡單玩ResearchGate簡單玩
ResearchGate簡單玩皓仁 柯
 
121203論文計劃書簡報
121203論文計劃書簡報121203論文計劃書簡報
121203論文計劃書簡報貽得 廖
 
[系列活動] 使用 R 語言建立自己的演算法交易事業
[系列活動] 使用 R 語言建立自己的演算法交易事業[系列活動] 使用 R 語言建立自己的演算法交易事業
[系列活動] 使用 R 語言建立自己的演算法交易事業台灣資料科學年會
 
環球左營異業合作宣傳提案
環球左營異業合作宣傳提案環球左營異業合作宣傳提案
環球左營異業合作宣傳提案俊吉 施
 

What's hot (20)

131111台大論文口試
131111台大論文口試131111台大論文口試
131111台大論文口試
 
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
[台灣人工智慧學校] 工業 4.0 與智慧製造的發展趨勢與挑戰
 
Crypto Economics Crash Course
Crypto Economics Crash CourseCrypto Economics Crash Course
Crypto Economics Crash Course
 
為什麼醫療需要社會學?淺談質性研究與敘事醫學.pdf
為什麼醫療需要社會學?淺談質性研究與敘事醫學.pdf為什麼醫療需要社會學?淺談質性研究與敘事醫學.pdf
為什麼醫療需要社會學?淺談質性研究與敘事醫學.pdf
 
環境教育人員認證說明簡報1051202
環境教育人員認證說明簡報1051202環境教育人員認證說明簡報1051202
環境教育人員認證說明簡報1051202
 
Introduction Bitcoin
Introduction BitcoinIntroduction Bitcoin
Introduction Bitcoin
 
福星國小家長會志工專刊
福星國小家長會志工專刊福星國小家長會志工專刊
福星國小家長會志工專刊
 
MONEY, TOKENS, AND GAMES:Blockchain’s Next Billion Users and Trillions in Value
MONEY, TOKENS, AND GAMES:Blockchain’s Next Billion Users and Trillions in ValueMONEY, TOKENS, AND GAMES:Blockchain’s Next Billion Users and Trillions in Value
MONEY, TOKENS, AND GAMES:Blockchain’s Next Billion Users and Trillions in Value
 
崇越論文競賽簡報
崇越論文競賽簡報崇越論文競賽簡報
崇越論文競賽簡報
 
YOLO V1 論文導讀
YOLO V1 論文導讀YOLO V1 論文導讀
YOLO V1 論文導讀
 
星巴克組織文化
星巴克組織文化星巴克組織文化
星巴克組織文化
 
人工智慧03_關聯規則
人工智慧03_關聯規則人工智慧03_關聯規則
人工智慧03_關聯規則
 
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ...
 
論文口試
論文口試論文口試
論文口試
 
Cryptocurrencies and the Banking Sector
Cryptocurrencies and the Banking SectorCryptocurrencies and the Banking Sector
Cryptocurrencies and the Banking Sector
 
ResearchGate簡單玩
ResearchGate簡單玩ResearchGate簡單玩
ResearchGate簡單玩
 
121203論文計劃書簡報
121203論文計劃書簡報121203論文計劃書簡報
121203論文計劃書簡報
 
[系列活動] 使用 R 語言建立自己的演算法交易事業
[系列活動] 使用 R 語言建立自己的演算法交易事業[系列活動] 使用 R 語言建立自己的演算法交易事業
[系列活動] 使用 R 語言建立自己的演算法交易事業
 
環球左營異業合作宣傳提案
環球左營異業合作宣傳提案環球左營異業合作宣傳提案
環球左營異業合作宣傳提案
 
mbot2.0教學-光感測器與LED應用.pdf
mbot2.0教學-光感測器與LED應用.pdfmbot2.0教學-光感測器與LED應用.pdf
mbot2.0教學-光感測器與LED應用.pdf
 

Similar to 碩士論文投影片

Positioning Error Analysis and Compensation of Differential Precision Workbench
Positioning Error Analysis and Compensation of Differential Precision WorkbenchPositioning Error Analysis and Compensation of Differential Precision Workbench
Positioning Error Analysis and Compensation of Differential Precision WorkbenchIJRES Journal
 
An Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
An Algebraic Method to Check the Singularity-Free Paths for Parallel RobotsAn Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
An Algebraic Method to Check the Singularity-Free Paths for Parallel RobotsDr. Ranjan Jha
 
Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...ijfcstjournal
 
An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...IJECEIAES
 
New Directions in Mahout's Recommenders
New Directions in Mahout's RecommendersNew Directions in Mahout's Recommenders
New Directions in Mahout's Recommenderssscdotopen
 
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...IRJET Journal
 
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...IJMTST Journal
 
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleDataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleHakka Labs
 
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using MatpackIRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using MatpackIRJET Journal
 
An Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video CodingAn Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video CodingCSCJournals
 
Palmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestPalmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestDakshina Kisku
 
Palmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestPalmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestDakshina Kisku
 
Bayesian Estimation for Missing Values in Latin Square Design
Bayesian Estimation for Missing Values in Latin Square DesignBayesian Estimation for Missing Values in Latin Square Design
Bayesian Estimation for Missing Values in Latin Square Designinventionjournals
 
Fault detection based on novel fuzzy modelling
Fault detection based on novel fuzzy modelling Fault detection based on novel fuzzy modelling
Fault detection based on novel fuzzy modelling csijjournal
 
Multiple Ant Colony Optimizations for Stereo Matching
Multiple Ant Colony Optimizations for Stereo MatchingMultiple Ant Colony Optimizations for Stereo Matching
Multiple Ant Colony Optimizations for Stereo MatchingCSCJournals
 
Implementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderImplementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderVLSICS Design
 
Design of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic MultiplierDesign of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic MultiplierVLSICS Design
 

Similar to 碩士論文投影片 (20)

Positioning Error Analysis and Compensation of Differential Precision Workbench
Positioning Error Analysis and Compensation of Differential Precision WorkbenchPositioning Error Analysis and Compensation of Differential Precision Workbench
Positioning Error Analysis and Compensation of Differential Precision Workbench
 
An Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
An Algebraic Method to Check the Singularity-Free Paths for Parallel RobotsAn Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
An Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
 
Unit1 pg math model
Unit1 pg math modelUnit1 pg math model
Unit1 pg math model
 
Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...
 
An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...
 
New Directions in Mahout's Recommenders
New Directions in Mahout's RecommendersNew Directions in Mahout's Recommenders
New Directions in Mahout's Recommenders
 
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
 
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
 
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleDataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
 
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using MatpackIRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
 
An Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video CodingAn Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video Coding
 
Medial axis transformation based skeletonzation of image patterns using image...
Medial axis transformation based skeletonzation of image patterns using image...Medial axis transformation based skeletonzation of image patterns using image...
Medial axis transformation based skeletonzation of image patterns using image...
 
9.venkata naga vamsi. a
9.venkata naga vamsi. a9.venkata naga vamsi. a
9.venkata naga vamsi. a
 
Palmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestPalmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interest
 
Palmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestPalmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interest
 
Bayesian Estimation for Missing Values in Latin Square Design
Bayesian Estimation for Missing Values in Latin Square DesignBayesian Estimation for Missing Values in Latin Square Design
Bayesian Estimation for Missing Values in Latin Square Design
 
Fault detection based on novel fuzzy modelling
Fault detection based on novel fuzzy modelling Fault detection based on novel fuzzy modelling
Fault detection based on novel fuzzy modelling
 
Multiple Ant Colony Optimizations for Stereo Matching
Multiple Ant Colony Optimizations for Stereo MatchingMultiple Ant Colony Optimizations for Stereo Matching
Multiple Ant Colony Optimizations for Stereo Matching
 
Implementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderImplementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adder
 
Design of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic MultiplierDesign of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic Multiplier
 

Recently uploaded

VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxPoojaBan
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Introduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxIntroduction to Microprocesso programming and interfacing.pptx
Introduction to Microprocesso programming and interfacing.pptxvipinkmenon1
 

Recently uploaded (20)

VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
VICTOR MAESTRE RAMIREZ - Planetary Defender on NASA's Double Asteroid Redirec...
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130

Master's Thesis Slides

  • 1. Hierarchical Table-Based Function Approximation (Hierarchical Multipartite Function Evaluation)
      Advisor: Prof. Shen-Fu Hsiao (蕭勝夫)
      Student: Yi-Hau Chen (陳奕豪)
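  • 4. Motivation
      • Special function units are widely used in digital signal processing and multimedia applications, for example in graphics processing units (GPUs).
      • A special function unit evaluates trigonometric, reciprocal, exponential, and logarithm functions.
      • It is built from lookup tables (LUTs) and a few simple arithmetic units.
      • Designs fall into two main categories:
          • piecewise polynomial approximation (PPA)
          • table-lookup-and-addition (TA)
      • This thesis investigates how to effectively reduce the table area of TA while keeping TA's advantage in speed.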
  • 5. Outline
      • Motivation
      • Related Work
          • Category
          • Piecewise Polynomial Approximation (PPA)
          • Table-Lookup-and-Addition (TA)
              • Bipartite Table Methods (BP)
              • Symmetric Bipartite Table Methods (SBTM)
              • Symmetric Table Addition Methods (STAM)
              • Multipartite Table Methods (MP)
      • Proposed
      • Results & Comparison
      • Conclusion
  • 7. Piecewise Polynomial Approximation (PPA) - (1/2)
      $f(x) \approx a_0(x_m) + a_1(x_m) \cdot x_l$
  • 8. PPA - (2/2) Degree-2 architecture
      $f(x) \approx a_0(x_m) + a_1(x_m) \cdot x_l + a_2(x_m) \cdot x_l^2$
  • 9. Table-Lookup-and-Addition (TA)
      • TA methods fall into two main categories: add-table-add (ATA) methods and bipartite/multipartite methods.
      • The bipartite/multipartite family includes:
          • bipartite table methods (BP) [16]
          • symmetric bipartite table methods (SBTM) [17]
          • symmetric table addition methods (STAM) [18]
          • multipartite table methods (MP) [1,19]
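  • 10. Bipartite Table Methods (BP) - (1/5) Bit partition
      To approximate a function $f(x)$, the $n$-bit input $x$ is split into two parts, $x_0$ and $x_1$, with bit widths $\alpha$ and $\beta$, where $\alpha + \beta = n$. Assuming the input interval $0 \le x < 1$:
      $x = x_0 + x_1$
      $0 \le x_0 \le 1 - 2^{-\alpha}$
      $0 \le x_1 \le 2^{-\alpha} - 2^{-n}$
      $0 \le x_{0,1} \le 1 - 2^{-\gamma}$
      (The last range indicates that $x_{0,1}$ is a $\gamma$-bit field; in bipartite methods it is the $\gamma$ most significant bits of $x_0$.)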
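The bit partition can be illustrated with a small sketch. The function name `partition`, the 8-bit example word, and the choice of $\alpha = 5$, $\gamma = 3$ are illustrative assumptions, not values from the slides; the sketch also assumes that $x_{0,1}$ is the top $\gamma$ bits of $x_0$.

```python
from fractions import Fraction

def partition(x_int, n, alpha, gamma):
    """Split the n-bit word x_int (x = x_int / 2**n, 0 <= x < 1).

    x0  : value of the top alpha bits of x
    x1  : value of the remaining beta = n - alpha bits
    x01 : value of the top gamma bits of x0 (assumed reading of x_{0,1})
    """
    assert 0 <= x_int < 2**n and 0 < gamma <= alpha < n
    beta = n - alpha
    hi = x_int >> beta                 # integer formed by the top alpha bits
    lo = x_int & ((1 << beta) - 1)     # integer formed by the low beta bits
    x0 = Fraction(hi, 2**alpha)
    x1 = Fraction(lo, 2**n)
    x01 = Fraction(hi >> (alpha - gamma), 2**gamma)
    return x0, x1, x01

# Hypothetical example: n = 8, alpha = 5 (so beta = 3), gamma = 3
x0, x1, x01 = partition(0b10110111, n=8, alpha=5, gamma=3)
print(x0, x1, x01, x0 + x1)
```

Exact fractions are used so that the identity $x = x_0 + x_1$ and the stated ranges can be checked without floating-point rounding.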
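  • 11. BP - (2/5) Taylor expansion
      The function $f(x)$ can therefore be approximated by the first two terms of its Taylor expansion:
      $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} \cdot (x - a)^n$  (with $a = x_0$ and $x = x_0 + x_1$)
      $f(x) = f(x_0) + f'(x_0) \cdot x_1 + \varepsilon_{lin}$
      $f(x) = f(x_0) + f'(x_{0,1}) \cdot x_1 + \varepsilon_{lin} + \varepsilon_{slp}$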
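The sizes of the two error terms can be estimated with standard calculus. The bounds below are a sketch and are not taken from the slides; they assume that $f$ is twice continuously differentiable on $[0, 1)$, that $x_{0,1}$ is the top $\gamma$ bits of $x_0$, and that the maximum of $|f''|$ is taken over the input interval.

```latex
% Linearization error: Lagrange remainder of the first-order expansion at x_0
\varepsilon_{lin} = \frac{f''(\xi)}{2}\, x_1^{2}, \quad \xi \in [x_0,\, x_0 + x_1]
\;\Longrightarrow\;
|\varepsilon_{lin}| \le \frac{1}{2} \max |f''| \cdot 2^{-2\alpha}

% Slope error: difference between the two expressions for f(x)
\varepsilon_{slp} = \bigl(f'(x_0) - f'(x_{0,1})\bigr) \cdot x_1
\;\Longrightarrow\;
|\varepsilon_{slp}| \le \max |f''| \cdot \bigl(2^{-\gamma} - 2^{-\alpha}\bigr) \cdot 2^{-\alpha}
```

Both bounds use $x_1 < 2^{-\alpha}$; the second also uses $x_0 - x_{0,1} \le 2^{-\gamma} - 2^{-\alpha}$ together with the mean value theorem applied to $f'$.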
  • 13. BP - (4/5) Architecture
      $f(x) \approx f(x_0) + f'(x_{0,1}) \cdot x_1 \approx TI(x_0) + TO(x_{0,1}, x_1)$
      $TI(x_0) \approx Q[f(x_0)]$
      $TO(x_{0,1}, x_1) \approx Q[f'(x_{0,1}) \cdot x_1]$
      TI: Table of Initial Values; TO: Table of Offsets
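A minimal software model of the two-table architecture, assuming $f = \sin$ on $[0, 1)$, round-to-nearest for the quantizer $Q[\cdot]$, and the hypothetical parameters $n = 12$, $\alpha = 8$, $\gamma = 4$ with 16 fractional output bits. The helper names `build_tables` and `evaluate` are illustrative; none of these choices come from the slides.

```python
import math

def build_tables(f, df, n, alpha, gamma, out_bits):
    """Build TI (2**alpha entries) and TO (2**gamma x 2**beta entries)."""
    beta = n - alpha
    scale = 1 << out_bits

    def q(v):
        # Q[.]: round to out_bits fractional bits (assumed quantizer)
        return round(v * scale) / scale

    ti = [q(f(a / 2**alpha)) for a in range(2**alpha)]
    to = [[q(df(c / 2**gamma) * (b / 2**n)) for b in range(2**beta)]
          for c in range(2**gamma)]
    return ti, to

def evaluate(ti, to, x_int, n, alpha, gamma):
    """Approximate f(x_int / 2**n) with two lookups and one addition."""
    beta = n - alpha
    a = x_int >> beta                  # index of x0
    b = x_int & ((1 << beta) - 1)      # index of x1
    c = a >> (alpha - gamma)           # index of x_{0,1}
    return ti[a] + to[c][b]

n, alpha, gamma = 12, 8, 4
ti, to = build_tables(math.sin, math.cos, n, alpha, gamma, out_bits=16)
max_err = max(abs(evaluate(ti, to, x, n, alpha, gamma) - math.sin(x / 2**n))
              for x in range(2**n))
print(f"TI entries: {len(ti)}, TO entries: {len(to) * len(to[0])}, "
      f"max error: {max_err:.2e}")
```

With these parameters the two tables hold 256 entries each, compared with 4096 entries for a single direct lookup table over the same 12-bit input.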
  • 15. Symmetric Bipartite Table Methods (SBTM)
      $0 \le x_0 \le 1 - 2^{-\alpha}$
      $0 \le x_1 \le 2^{-\alpha} - 2^{-n}$
      $0 \le x_{0,1} \le 1 - 2^{-\gamma}$
      $\delta_1 = 2^{-\alpha} - 2^{-n}$
      $\delta_0 = 2^{-\gamma} - 2^{-\alpha}$
  • 16.
      $f(x) \approx f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_0 + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$
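  • 17.
      $f(x) \approx f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$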
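  • 18. Symmetric Bipartite Table Methods (SBTM)
      $f(x) \approx f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$
      $TI(x_0) = Q\left[f\left(x_0 + \frac{\delta_1}{2}\right)\right]$
      $TO(x_{0,1}, x_1) = Q\left[f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)\right]$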
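One property that can be read off the SBTM offset formula is that, before quantization, the offset changes sign when the bits of $x_1$ are complemented, because complementing replaces $x_1$ with $\delta_1 - x_1$. The sketch below checks this with exact fractions. The slope value, the parameters $n = 8$, $\alpha = 5$, and the helper name `sbtm_offset` are illustrative assumptions; the slides in this excerpt do not state how the property is used in hardware.

```python
from fractions import Fraction

def sbtm_offset(slope, x1_int, n, alpha):
    """Unquantized SBTM offset: slope * (x1 - delta1 / 2)."""
    delta1 = Fraction(1, 2**alpha) - Fraction(1, 2**n)
    x1 = Fraction(x1_int, 2**n)
    return slope * (x1 - delta1 / 2)

n, alpha = 8, 5
beta = n - alpha
mask = (1 << beta) - 1        # all-ones pattern of the x1 field

# Arbitrary stand-in for f'(x01 + delta0/2 + delta1/2) at one table row
slope = Fraction(3, 7)

# Pair each x1 index b with its bitwise complement b ^ mask
pairs = [(sbtm_offset(slope, b, n, alpha),
          sbtm_offset(slope, b ^ mask, n, alpha))
         for b in range(2**beta)]
antisymmetric = all(u == -v for u, v in pairs)
print(antisymmetric)
```

Whether the quantized table entries keep this exact antisymmetry depends on how $Q[\cdot]$ rounds negative values, which the sketch does not model.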
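  • 19. Symmetric Table Addition Methods (STAM)
      $f(x) \approx f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$
      $x_1 = \sum_{i=1}^{m} x_{1,i}$
      $\delta_1 = \sum_{i=1}^{m} \delta_{1,i}, \quad \delta_{1,i} = 2^{-p_{i-1}} - 2^{-p_i}, \quad i = 1, 2, \ldots, m$
      with $p_0 = \alpha$, $p_i = p_{i-1} + \beta_i$, $i = 1, 2, \ldots, m$
      $f(x) \approx f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(\sum_{i=1}^{m} x_{1,i} - \sum_{i=1}^{m} \frac{\delta_{1,i}}{2}\right)$
      $f(x) \approx f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \sum_{i=1}^{m} \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)$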
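The STAM decomposition relies on the sum of the $\delta_{1,i}$ telescoping to $\delta_1 = 2^{-\alpha} - 2^{-n}$ and on the fields $x_{1,i}$ adding up to $x_1$. A small check with exact fractions, assuming the hypothetical split $\alpha = 4$, $\beta_1 = 2$, $\beta_2 = 3$ (so $n = 9$); the helper names `stam_deltas` and `split_x1` are illustrative.

```python
from fractions import Fraction

def stam_deltas(alpha, betas):
    """Return [delta_{1,1}, ..., delta_{1,m}] for field widths betas."""
    p = [alpha]                        # p_0 = alpha
    for b in betas:
        p.append(p[-1] + b)            # p_i = p_{i-1} + beta_i
    return [Fraction(1, 2**p[i - 1]) - Fraction(1, 2**p[i])
            for i in range(1, len(p))]

def split_x1(x1_int, alpha, betas):
    """Split the beta-bit integer x1_int into field values x_{1,i}, MSB first."""
    parts, p = [], alpha
    remaining = sum(betas)
    for b in betas:
        remaining -= b
        field = (x1_int >> remaining) & ((1 << b) - 1)
        p += b
        parts.append(Fraction(field, 2**p))   # field is weighted by 2**-p_i
    return parts

alpha, betas = 4, [2, 3]
n = alpha + sum(betas)
deltas = stam_deltas(alpha, betas)
parts = split_x1(0b10110, alpha, betas)
print(deltas, sum(deltas))
print(parts, sum(parts))
```

Each field satisfies $0 \le x_{1,i} \le \delta_{1,i}$, so every term $x_{1,i} - \delta_{1,i}/2$ is centered on zero in the same way as the single SBTM offset.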
  • 20. Multipartite Table Methods (MP [1]) - (1/5) Bit partition
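  • 21. Multipartite Table Methods (MP [1]) - (2/5) Different ways of generating the initial values and slopes
      MP [1]:
      $TI(x_0) = Q\left[\frac{f(x_0) + f(x_0 + \delta_1)}{2}\right]$
      $TO(x_{0,1}, x_1) = Q\left[s(x_{0,i}) \cdot \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)\right]$
      STAM:
      $TI(x_0) = Q\left[f\left(x_0 + \frac{\delta_1}{2}\right)\right]$
      $TO(x_{0,1}, x_{1,i}) = Q\left[f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)\right]$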
  • 22. Multipartite Table Methods (MP [1]) - (3/5) Computing the slope $s$
      $s(x_{0,i}) = \frac{f(\varphi_2) - f(\varphi_1) + f(\varphi_4) - f(\varphi_3)}{2 \cdot \delta_{1,i}}$
  • 23. Multipartite Table Methods (MP [1]) - (4/5) Architecture
  • 24. Multipartite Table Methods (MP [1]) - (5/5) Table decomposition
  • 25. Outline
      • Motivation
      • Related Work
      • Proposed
          • Domain and range of the function
          • Sampling method and error budget
          • Overview of the HMP method
          • Lossless ROM Compression with Low Cost
          • Combined error and exhaustive search
          • Speeding up the search
      • Results & Comparison
      • Conclusion
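  • 27. Sampling method and error budget
      $TI(x_0) = Q\left[\frac{f(x_0) + f(x_0 + \delta_1)}{2}\right]$
      $\varepsilon_q = (m + 1) \cdot 2^{-n-g-1}$
      $\varepsilon_{rnd} = 0.5 \cdot (2^{-n} - 2^{-g})$
      $\varepsilon_{apx} + \varepsilon_q + \varepsilon_{rnd} = \varepsilon_{total} < 2^{-n}$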
  • 35. Lossless ROM Compression with Low Cost - Table decomposition
  • 41. Table 4.2: Table decomposition of the 24-bit SIN function using MP [1] and HMP
  • 44. Comparison of MP, HMP, and HMP_TI
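  • 49. Conclusion
      • The proposed HMP effectively reduces the table area of MP [1].
      • The proposed Lossless ROM Compression not only reduces the table area effectively but also adds very little delay.
      • The combined error and exhaustive search proposed alongside them can be accelerated to finish within a practical amount of time, a substantial improvement over earlier approaches.
      • Future work: extend these methods to higher precision (e.g., 32 bits).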
  • 50. References
      1) F. de Dinechin and A. Tisserand, “Multipartite table methods,” IEEE Transactions on Computers, vol. 54, pp. 319–330, March 2005.
      2) Y. J. Kim, H. E. Kim, S. H. Kim, J. S. Park, S. Paek, and L. S. Kim, “Homogeneous stream processors with embedded special function units for high-utilization programmable shaders,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, pp. 1691–1704, Sept 2012.
      3) D. D. Caro, N. Petra, and A. G. M. Strollo, “Reducing lookup-table size in direct digital frequency synthesizers using optimized multipartite table method,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 55, pp. 2116–2127, Aug 2008.
      4) B. G. Nam, H. Kim, and H. J. Yoo, “Power and area-efficient unified computation of vector and elementary functions for handheld 3D graphics systems,” IEEE Transactions on Computers, vol. 57, pp. 490–504, April 2008.
      5) D. D. Caro, N. Petra, and A. G. M. Strollo, “High-performance special function unit for programmable 3-D graphics processors,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 56, pp. 1968–1978, Sept 2009.
      6) D. D. Caro, N. Petra, and A. G. M. Strollo, “Direct digital frequency synthesizer using nonuniform piecewise-linear approximation,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 58, pp. 2409–2419, Oct 2011.
      7) J. A. Pineiro, S. F. Oberman, J. M. Muller, and J. D. Bruguera, “High-speed function approximation using a minimax quadratic interpolator,” IEEE Transactions on Computers, vol. 54, pp. 304–318, March 2005.
      8) D. U. Lee, R. Cheung, W. Luk, and J. Villasenor, “Hardware implementation tradeoffs of polynomial approximations and interpolations,” IEEE Transactions on Computers, vol. 57, pp. 686–701, May 2008.
      9) D. U. Lee and J. D. Villasenor, “Optimized custom precision function evaluation for embedded processors,” IEEE Transactions on Computers, vol. 58, pp. 46–59, Jan 2009.
      10) D. U. Lee, R. C. C. Cheung, W. Luk, and J. D. Villasenor, “Hierarchical segmentation for hardware function evaluation,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 17, pp. 103–116, Jan 2009.
      11) T. Sasao, S. Nagayama, and J. T. Butler, “Numerical function generators using LUT cascades,” IEEE Transactions on Computers, vol. 56, pp. 826–838, June 2007.
      12) S. F. Hsiao, H. J. Ko, Y. L. Tseng, W. L. Huang, S. H. Lin, and C. S. Wen, “Design of hardware function evaluators using low-overhead nonuniform segmentation with address remapping,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, pp. 875–886, May 2013.
      13) A. G. M. Strollo, D. D. Caro, and N. Petra, “Elementary functions hardware implementation using constrained piecewise-polynomial approximations,” IEEE Transactions on Computers, vol. 60, pp. 418–432, March 2011.
      14) S. F. Hsiao, H. J. Ko, and C. S. Wen, “Two-level hardware function evaluation based on correction of normalized piecewise difference functions,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 59, pp. 292–296, May 2012.
      15) M. Chaudhary and P. Lee, “An improved two-step binary logarithmic converter for FPGAs,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp. 476–480, May 2015.
  • 51. References
      16) D. D. Sarma and D. W. Matula, “Faithful bipartite ROM reciprocal tables,” in Proceedings of the 12th Symposium on Computer Arithmetic, pp. 17–28, Jul 1995.
      17) M. J. Schulte and J. E. Stine, “Approximating elementary functions with symmetric bipartite tables,” IEEE Transactions on Computers, vol. 48, pp. 842–847, Aug 1999.
      18) J. E. Stine and M. J. Schulte, “The symmetric table addition method for accurate function approximation,” Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, vol. 21, no. 2, pp. 167–177, 1999.
      19) J.-M. Muller, “A few results on table-based methods,” Reliable Computing, vol. 5, no. 3, pp. 279–288, 1999.
      20) P. K. Meher, “LUT optimization for memory-based computation,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 57, pp. 285–289, April 2010.
      21) W. F. Wong and E. Goto, “Fast evaluation of the elementary functions in single precision,” IEEE Transactions on Computers, vol. 44, pp. 453–457, Mar 1995.
      22) J. Y. L. Low and C. C. Jong, “A memory-efficient tables-and-additions method for accurate computation of elementary functions,” IEEE Transactions on Computers, vol. 62, pp. 858–872, May 2013.
      23) D. Wang, J. M. Muller, N. Brisebarre, and M. D. Ercegovac, “(m,p,k)-friendly points: A table-based method to evaluate trigonometric function,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 61, pp. 711–715, Sept 2014.
      24) S. F. Hsiao, P. H. Wu, C. S. Wen, and P. K. Meher, “Table size reduction methods for faithfully rounded lookup-table-based multiplierless function evaluation,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp. 466–470, May 2015.
      25) J.-M. Muller, Elementary Functions: Algorithms and Implementation, 2nd ed. Birkhauser, 2006.
      26) M. D. Ercegovac and T. Lang, Digital Arithmetic. Morgan Kaufmann Pub, 2004.
      27) B. Parhami, Algorithms and Design Methods for Digital Computer Arithmetic, International 2nd ed. Oxford University Press, 2012.
      28) S.-F. Hsiao, P.-C. Wei, and C.-P. Lin, “An automatic hardware generator for special arithmetic functions using various ROM-based approximation approaches,” in IEEE International Symposium on Circuits and Systems (ISCAS 2008), pp. 468–471, May 2008.
      29) 曾于玲, “Design of an automatic generator for table-lookup function evaluation units using bit truncation” (使用位元截斷法之查表式函數求值單元自動產生器設計), Master's thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2011.
      30) 吳柏翰, “Table reduction and optimization for multiplierless table-lookup function evaluation designs” (無乘法器查表法函數運算設計之表格縮減和最佳化), Master's thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2013.
      31) S. F. Hsiao, C. S. Wen, Y. H. Chen, and K. C. Huang, “Hierarchical multipartite function evaluation,” IEEE Transactions on Computers, vol. PP, no. 99, pp. 1–1, 2016.