
Hadoop con 2016_9_10_王經篤(Jing-Doo Wang)



Using the MapReduce computation model, this talk presents a novel approach that speeds up the extraction of maximal repeats from tagged sequences while also computing the class frequency distribution of those repeats. A US patent application based on this approach has been filed: "Wang, Ching-Tu. Method for Extracting Maximal Repeat Patterns and Computing Frequency Distribution Tables. Patent Application Serial Number 15/208,994. 13 July 2016."

Published in: Engineering


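  1. 1. Potential Applications Using the Class Frequency Distribution of Maximal Repeats from Tagged Sequential Data. Jing-Doo Wang (王經篤), Associate Professor, Asia University, Taiwan. The 8th Taiwan Hadoop Community Annual Conference, HadoopCon 2016, Humanities and Social Sciences Building, Academia Sinica (2016.9.10)
  2. 2. About the speaker • Current position: – Associate Professor, Department of Computer Science and Information Engineering, Asia University • Highest degree: – Ph.D., Institute of Computer Science and Information Engineering, National Chung Cheng University • Expertise: – Text Mining (Maximal Repeat Extraction) – Bioinformatics – Cloud Computing – Class Structure Ambiguity Evaluation
  3. 3. Asia Museum of Modern Art (亞洲現代美術館) http://www.asia.edu.tw/main_1.php?yard/yard_01_09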
  4. 4. Asia University Hospital 亞洲大學附屬醫院(2016.8.1) http://www.auh.org.tw/web/index.php
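  5. 5. Precision Medicine (精準醫療) • Clinical medicine (China Medical University Hospital + Asia University Hospital) • Genome sequencing (Next Generation Sequencing) • Bioinformatics (Asia University) http://newjust.masterlink.com.tw/HotProduct/HTML/img/GetImg.xdjpng?A=PA314-1b.png
  6. 6. Big Data Research Center (1) Build the university's big data research facilities. (2) Integrate databases related to precision medicine. (3) Carry out academic exchange, research and development, and industry-academia collaboration related to big data. (4) Carry out outreach and services related to big data.
  7. 7. Post-baccalaureate second-specialty bachelor's degree program. Applying school: Asia University. New program: Post-baccalaureate Program in Big Data Processing and Analysis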
  8. 8. Outline • Introduction • Pattern History For Trend Analysis • Product Traceability for Quality Monitoring • Mining for Distinctive Pattern (Biomarker) from Genomic Sequences • Future Works
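  9. 9. “xabcyiiizabcqabcyrxar” • ab: not a maximal repeat pattern (every occurrence is followed by c) • bc: not a maximal repeat pattern (every occurrence is preceded by a) • abc: maximal repeat pattern • abcy: maximal repeat pattern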
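The classification on this slide can be checked with a brute-force sketch. The function name `maximal_repeats` and the convention that a string boundary counts as a distinct neighbor are assumptions for illustration, not the speaker's implementation; the code enumerates every substring, so it only suits short strings.

```python
from collections import defaultdict

def maximal_repeats(text, min_count=2):
    """Return {substring: occurrence count} for every maximal repeat in text."""
    # Record the start positions of every substring (overlapping occurrences included).
    positions = defaultdict(list)
    n = len(text)
    for i in range(n):
        for j in range(i + 1, n + 1):
            positions[text[i:j]].append(i)

    result = {}
    for sub, starts in positions.items():
        if len(starts) < min_count:
            continue
        # Neighbors of each occurrence; None marks the start or end of the string.
        left = {text[s - 1] if s > 0 else None for s in starts}
        right = {text[s + len(sub)] if s + len(sub) < n else None for s in starts}
        # Maximal: extending to the left or to the right loses at least one occurrence.
        if len(left) > 1 and len(right) > 1:
            result[sub] = len(starts)
    return result

repeats = maximal_repeats("xabcyiiizabcqabcyrxar")
print("abc" in repeats, "abcy" in repeats, "ab" in repeats, "bc" in repeats)
```

Shorter repeats such as single characters can also qualify under this definition, which is why practical use usually adds a minimum length or frequency threshold.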
  10. 10. Why use “Maximal Repeats” as features? • Dictionary – How to identify new words or phrases? – e.g. “just do it”, “洪荒之力” ("primordial power"). • N-gram – 2-gram, 3-gram, …, 5-gram (Google Ngram Viewer). – The value of “N” is limited. • Maximal Repeat – The length of a maximal repeat is variable.
  11. 11. Journal of Supercomputing, 72(8), pp. 3236-3260,April 2016
  12. 12. Patent concept diagram: a method for extracting maximal repeat patterns and computing frequency distribution tables
  13. 13. Patent Application Serial Number (US 15/208,994) (pending) • Wang, Ching-Tu. Method for Extracting Maximal Repeat Patterns and Computing Frequency Distribution Tables. Patent Application Serial Number 15/208,994. 13 July 2016. • US invention patent application – Owner: 王經篤 – Inventor: 王經篤
  14. 14. The flowchart of Maximal Repeat Extraction via MapReduce
  15. 15. Right Boundary Verification
  16. 16. Left Boundary Verification
  17. 17. Boundary Verification (Left&Right)
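The flowchart and boundary-verification slides are figures only, so the following is a minimal single-process sketch of how such a pipeline could look, not the patented method. It assumes a map step that emits each substring with its left neighbor, right neighbor, and class tag, a grouping step standing in for the shuffle, and a reduce step that verifies both boundaries before producing the class frequency distribution. All names (`map_phase`, `reduce_phase`, `extract`, `max_len`) are hypothetical.

```python
from collections import Counter, defaultdict

def map_phase(seq_id, tag, seq, max_len):
    """Emit (substring, (left, right, tag)) for every substring of length <= max_len."""
    n = len(seq)
    for i in range(n):
        # A sequence boundary is encoded as a neighbor unique to that sequence.
        left = seq[i - 1] if i > 0 else ("start", seq_id)
        for j in range(i + 1, min(i + max_len, n) + 1):
            right = seq[j] if j < n else ("end", seq_id)
            yield seq[i:j], (left, right, tag)

def reduce_phase(values):
    """Return per-class occurrence counts if both boundaries verify, else None."""
    lefts = {left for left, _, _ in values}
    rights = {right for _, right, _ in values}
    # Left/right boundary verification: neighbors must differ on each side.
    if len(values) >= 2 and len(lefts) > 1 and len(rights) > 1:
        return Counter(tag for _, _, tag in values)
    return None

def extract(tagged_sequences, max_len=8):
    """tagged_sequences: list of (tag, sequence). Returns {repeat: Counter by tag}."""
    groups = defaultdict(list)  # stands in for the MapReduce shuffle step
    for seq_id, (tag, seq) in enumerate(tagged_sequences):
        for key, value in map_phase(seq_id, tag, seq, max_len):
            groups[key].append(value)
    table = {}
    for key, values in groups.items():
        freq = reduce_phase(values)
        if freq is not None:
            table[key] = freq
    return table

table = extract([("A", "xabcy"), ("A", "zabcq"), ("B", "qabcy")], max_len=5)
print(dict(table["abc"]))
```

Carrying the neighbor characters with each emitted substring lets the reducer decide maximality locally, without comparing against longer substrings held by other reducers.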
  18. 18. Outline • Introduction • Pattern History For Trend Analysis • Product Traceability for Quality Monitoring • Mining for Distinctive Pattern (Biomarker) from Genomic Sequences • Future Works
  19. 19. Pattern History for Trend Analysis Jing-Doo Wang (王經篤) Associate Professor Asia University, Taiwan. 2016/9/12 19FSKD 20'11 Sequential Data + Timestamp
  21. 21. Experimental Results • Downloaded the abstracts of PubMed articles from 1990 to 2014 (25 years).
  22. 22. PubMed (1990~2014) (25 years) The abstracts and titles of 14,473,242 articles => about 12 GB
  23. 23. The Abstracts and Titles of PubMed Articles (1990~2014)(12GB) 6 PCs=> 5 hours
  24. 24. The History of a Significant Pattern (顯要樣式歷史) The history of a significant pattern is the frequency distribution of that pattern over equally spaced time intervals.
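The definition above can be sketched in a few lines. This is an illustrative assumption of how a history might be tabulated, not the talk's code: `pattern_history`, the `(year, text)` input shape, and the `step` parameter are hypothetical, and `str.count` counts non-overlapping occurrences.

```python
from collections import Counter

def pattern_history(docs, pattern, start, end, step=1):
    """docs: iterable of (year, text). Returns [(interval_start, frequency), ...]."""
    counts = Counter()
    for year, text in docs:
        if start <= year <= end:
            # Assign the year to the equally spaced interval that contains it.
            bucket = start + ((year - start) // step) * step
            counts[bucket] += text.count(pattern)
    # Intervals with no occurrence are reported with frequency 0.
    return [(b, counts[b]) for b in range(start, end + 1, step)]

docs = [
    (2003, "SARS outbreak; SARS cases"),
    (2009, "H1N1 pandemic"),
    (2010, "H1N1 vaccine and SARS"),
]
print(pattern_history(docs, "SARS", 2003, 2010, step=4))
```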
  25. 25. Significant Pattern (顯要樣式) • A significant pattern is a maximal repeat of consecutive words within texts. • (Length=1) TDP-43 • (Length=1) SARS • (Length=1) H1N1 • (Length=5) non-small cell lung cancer (NSCLC) • (Length=6) 75 g oral glucose tolerance test • (Length=6) 4 x 4 Latin square design • (Length=7) 2 x 2 factorial arrangement of treatments • (Length=9) the National Institute of Child Health and Human Development • (Length=10) patients with squamous cell carcinoma of the head and neck • (Length=11) anomalous origin of the left coronary artery from the pulmonary artery • (Length=12) Pregnancy and Childbirth Group trials register and the Cochrane Controlled Trials Regist • (Length=13) the European Organization for Research and Treatment of Cancer Quality of Life Questi
  26. 26. The History of “H1N1”
  27. 27. (50,75,100) g Oral Glucose Tolerance Test
  28. 28. The (goal, aim, purpose) of this study [plot legend: aim, goal, purpose]
  29. 29. Archaeology (考古學) https://zh.wikipedia.org/wiki/%E8%80%83%E5%8F%A4%E5%AD%A6
  30. 30. Text Archaeology (文件考古學) • On-line: – PubMed (medical articles) (1990-2014) – CNA (Central News Agency, 中央社) news (1990-1996) – Republic of China patent documents (1950~2009) • Potential work: – Court judgments (judicial systems) (studies that allow closer examination of legal texts) – National master's and doctoral theses – National Central Library – Republic of China patents – Novels: the complete novels of 金庸, Harry Potter, Shakespeare, 九把刀 – Blogs – News. Figure: https://www.google.com.tw/search
  31. 31. Personalized learning: pattern history cloud http://www.tutortristar.com/topic/Internet/seminar-big-data-marketing-beckett-20151005.html?utm_source=facebook&utm_medium=banner&utm_campaign=seminar-bigdata-marketing
  32. 32. Language-based cognitive screening for the elderly (repetitive speech): “你吃飽了沒!” ("Have you eaten yet?") https://www.google.com.tw/search
  33. 33. Detecting newly adopted vocabulary (catchphrases) https://www.google.com.tw/search
  34. 34. How to collect spoken language? How to convert it to text? Google’s Speech Recognition API Version 2 https://www.google.com.tw/search
  35. 35. Time-stamped text + geographic location https://www.google.com.tw/search
  36. 36. Outline • Introduction • Pattern History For Trend Analysis • Product Traceability for Quality Monitoring • Mining for Distinctive Pattern (Biomarker) from Genomic Sequences • Future Works
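  37. 37. Agricultural Product Traceability (產銷履歷) • Agricultural product traceability system = implementation and certification of Taiwan Good Agricultural Practice + a traceability tracking system • Traceability: safe, sustainable, traceable agricultural products with open information From: http://taft.coa.gov.tw/ct.asp?xItem=4&CtNode=206&role=C
  38. 38. [Diagram] Product Traceability (生產履歷): Numeric Value => Symbolization => Symbol (χ α β ϒ θ φ) => Serialization. Data sources (Huizhou / Suzhou, mainland China): paper records, automatic measuring machines, X-R control charts, visual inspection, microscope measurement, photo files. Icon: http://www.freepik.com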
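The symbolization step in this diagram (numeric value to symbol) is not specified on the slide. One simple possibility, shown here purely as an assumption, is threshold binning: each measurement is mapped to the symbol of the interval it falls into. The cut points and the symbols `L`, `M`, `H` are hypothetical.

```python
from bisect import bisect_right

def symbolize(values, cut_points, symbols):
    """Map each numeric value to the symbol of the interval that contains it.

    cut_points must be sorted; len(symbols) must equal len(cut_points) + 1.
    A value equal to a cut point falls into the upper interval.
    """
    if len(symbols) != len(cut_points) + 1:
        raise ValueError("need exactly one more symbol than cut points")
    return "".join(symbols[bisect_right(cut_points, v)] for v in values)

# Four measurements binned into low / medium / high.
print(symbolize([4.9, 5.0, 5.3, 6.1], [5.0, 6.0], "LMH"))
```

Once measurements are symbols, a product's record becomes a string, and the same repeat extraction used for text applies to it.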
  39. 39. Traceability records => ********************$*********** ********************$*********** *****&*******%****************** ******************************** *****&*******%****************** ******************************** *********#****?************@**** ******************************** *********#****?************@**** *****&*******%****************** *********#****?***********@***** Products (產品) Steps (步驟) Expert opinions
  40. 40. Special event (signal) sequences (sequences that recur within a class) ******************************** ******************************** ******************************** *********#****?************@**** *********#****?************@**** *********#****?************@**** *****&*******%****************** *****&*******%****************** *****&*******%****************** ********************$*********** ********************$*********** Expert opinions
  41. 41. Special event (signal) sequences: #****? @**** &*******% $********** Event (signal) sequences: *****
  42. 42. Industry 4.0 tw.digiwin.biz
  43. 43. Internet of Things (IoT) http://www.indetail.com.tw/archives/2494
  44. 44. Internet of Things (IoT) + Industry 4.0 = ? http://portal.stpi.narl.org.tw/index/article/10095
  45. 45. Internet of Things (IoT) + Industry 4.0 data processing => Big Data! www.slideshare.net
  46. 46. Monitoring and anomaly analysis http://innobic.blogspot.tw/
  47. 47. Single point of failure (單點故障) www.ontargetpartners.com
  48. 48. Single point of failure => an experienced person can probably resolve it! www.benison.com.tw www.youtube.com
  49. 49. Statistical analysis newgenerationresearcher.blogspot.com
  50. 50. The statistical analysis perspective • Single point of failure => univariate analysis • Multiple points of failure => multivariate analysis www.slideshare.net
  51. 51. Accident ?= Multiple Points of Failure (多點故障) ops.fhwa.dot.gov
  52. 52. Movie • Final Destination (絕命終結站) g333773.pixnet.net maizizi.pixnet.net
  53. 53. (A combination of a chain of events!) => accident • Unpredictable? www.frillo.co.uk
  54. 54. Wu Gui says "There are no accidents" - https://www.youtube.com/watch?v=Q04LPj99ZPc
  55. 55. Manual assembly production line jdzol.com.cn www.cfea.org.cn
  56. 56. Robotic assembly production line kaifangzhansb.mofcom.gov.cn
  57. 57. From a data-processing point of view, what do these represent? http://superbest.typepad.com/.a/6a00d83451b39269e20147e25f0551970b-pi superbest.typepad.com
  58. 58. Behind every [item]: a series of numbers (symbols) www.36dsj.com big5.xinhuanet.com
  59. 59. "Industry 4.0" sounds the horn: accelerating the fusion of informatization and intelligence to raise "smart manufacturing" competitiveness http://tw.digiwin.biz/newsListDetail_6828.html
  60. 60. Of all the martial arts under heaven, none is unbreakable; only speed cannot be beaten! chuansong.me dannylun.blogspot.com
  61. 61. Internet of Things (IoT) vs. Traceability (產銷履歷) ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ***********^******************** ******************************** Products (產品) Steps (步驟)
  62. 62. Internet of Things (IoT) vs. Traceability (產銷履歷) ********************$*********** ********************$*********** *****&*******%****************** ******************************** *****&*******%****************** ******************************** *********#****?************@**** ******************************** *********#****?************@**** *****&*******%****************** *********#****?***********@***** Products (產品) Steps (步驟) blog.tianya.cn news.hxsd.com
  63. 63. Expert opinions => tags
  64. 64. Traceability (產銷履歷) P1 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P2 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P3 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P4 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P5 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P6 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 … … … P99 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P100=> S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 100 Products (產品) 10 Steps(步驟)
  65. 65. Traceability (產銷履歷)(1) …[S3] [S4] [S5]… https://www.google.com.tw/search
  66. 66. Traceability (產銷履歷)(1) P1 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P2 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P3 => S1, S2, S3, S4 P4 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P5 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P6 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 … … … P99 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P100=> S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 Detect Abnormal (發現異常) www.publicdomainpictures.net
  67. 67. Traceability (產銷履歷)(2) https://www.google.com.tw/search …[S3] [S4] [S5]…
  68. 68. Traceability (產銷履歷)(2) P1 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P2 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P3 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P4 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P5 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P6 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 … … … P99 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P100=> S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 Detect abnormality (發現異常) => when the product (or semi-finished product) is completed
  69. 69. Inspect each production process one by one? • P3 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 • P99 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 Figure: www.yuhing.edu.tw
  70. 70. techpinions.com giphy.com
  71. 71. Traceability (產銷履歷)(3) P1 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P2 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P3 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P4 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P5 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P6 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 … … … P99 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P100=> S1, S2, S3, S4, S5, S6, S7, S8, S9, S10
  72. 72. Traceability (產銷履歷)(3) P3 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P99 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P2 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P4 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P6 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P1 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P4 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 P5 => S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 … … P100=> S1, S2, S3, S4, S5, S6, S7, S8, S9, S10
  73. 73. Special event (signal) sequences / event (signal) sequences: S5, S6, S7 S1, S2, S3, S4 S1, S2, S3 S8, S9, S10 S1, S2, S3, S4, S5, S6, S7, S8, S9, S10 S9, S10 https://www.google.com.tw/search
  74. 74. Traceability (產銷履歷)(2) Ma_1 Ma_2 Mb_1 Mb_2 Ma_3 Mc_1 Mc_2 https://www.google.com.tw/search …-[S3,Ma_1]-[S4,Mb_2]-[S5,Mc_1]-…
  75. 75. Traceability (產銷履歷)(4) [Step,Machine] P1 =>[S1,M]-[S2,M]-[S3,M]-…-[S9,M] -[S10,M] P2 =>[S1,M]-[S2,M]-[S3,M]-…-[S9,M] -[S10,M] P3 =>[S1,M]-[S2,M]-[S3,M]-…-[S9,M] -[S10,M] P4 =>[S1,M]-[S2,M]-[S3,M]-…-[S9,M] -[S10,M] P5 =>[S1,M]-[S2,M]-[S3,M]-…-[S9,M] -[S10,M] P6 =>[S1,M]-[S2,M]-[S3,M]-…-[S9,M] -[S10,Mj] … … … P99 =>[S1,M]-[S2,M]-[S3,M]-…-[S9,M] -[S10,M] P100 =>[S1,M]-[S2,M]-[S3,M]-…-[S9,M] -[S10,M] 10 Steps(步驟) 100 Products (產品)
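To feed such records into a sequence-mining step, each product's route has to be serialized into tokens. The sketch below reproduces the bracket notation used on these slides; the function name `encode_route`, the `(step, machine, time_slot)` input tuple, and the `with_time` switch are assumptions for illustration.

```python
def encode_route(records, with_time=False):
    """Serialize (step, machine, time_slot) records into the slide's bracket notation."""
    tokens = []
    for step, machine, time_slot in records:
        parts = [step, machine]
        if with_time:
            # Optionally keep the time slot, giving [Step,Machine,Time] tokens.
            parts.append(time_slot)
        tokens.append("[" + ",".join(parts) + "]")
    return "-".join(tokens)

route = [("S3", "Ma_1", "T3"), ("S4", "Mb_2", "T4"), ("S5", "Mc_1", "T5")]
print(encode_route(route))
print(encode_route(route, with_time=True))
```

How finely time is bucketed changes which tokens can repeat across products, so the time slot would normally be a coarse interval rather than a raw timestamp.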
  76. 76. Traceability (產銷履歷)(2) Ma_1 Ma_2 Mb_1 Mb_2 Ma_3 Mc_1 Mc_2 https://www.google.com.tw/search …-[S3,Ma_1]-[S4,Mb_2]-[S5,Mc_1]-…
  77. 77. …-[S3,Ma_2]-[S4,Mb_1]-[S5,Mc_2]-… …-[S3,Ma_1]-[S4,Mb_2]-[S5,Mc_1]-… …-[S3,Ma_1]-[S4,Mb_3]-[S5,Mc_1]-… …-[S3,Ma_2]-[S4,Mb_3]-[S5,Mc_1]-… …-[S3,Ma_2]-[S4,Mb_2]-[S5,Mc_2]-… …-[S3,Ma_1]-[S4,Mb_2]-[S5,Mc_2]-… …-[S3,Ma_2]-[S4,Mb_1]-[S5,Mc_2]-… …-[S3,Ma_2]-[S4,Mb_2]-[S5,Mc_1]-…
  78. 78. …-[S3,Ma_2]-[S4,Mb_2]-[S5,Mc_1]-… …-[S3,Ma_2]-[S4,Mb_2]-[S5,Mc_2]-… …-[S3,Ma_1]-[S4,Mb_2]-[S5,Mc_2]-… …-[S3,Ma_2]-[S4,Mb_1]-[S5,Mc_2]-… …-[S3,Ma_1]-[S4,Mb_3]-[S5,Mc_1]-… …-[S3,Ma_2]-[S4,Mb_3]-[S5,Mc_1]-… …-[S3,Ma_1]-[S4,Mb_2]-[S5,Mc_1]-… …-[S3,Ma_2]-[S4,Mb_1]-[S5,Mc_2]-…
  79. 79. Special event (signal) sequences: …-[S3,Ma_1] [S4,Mb_2]-… [S4,Mb_2]-… …-[S3,Ma_2]-[S4,Mb_1]-[S5,Mc_2]-… -[S5,Mc_1]-… Event (signal) sequences https://www.google.com.tw/search
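As a rough illustration of how tags turn routes into "special" sequences, the sketch below counts, per class tag, how many routes contain each contiguous run of [Step,Machine] tokens, then keeps the runs seen only in one class. It uses plain bounded-length runs as a simplified stand-in for maximal repeat extraction; the tags `defect` and `ok`, the toy routes, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

def class_frequency(tagged_routes, max_len=3):
    """Count, per class tag, how many routes contain each contiguous token run."""
    table = defaultdict(Counter)
    for tag, route in tagged_routes:
        seen = set()  # count each run at most once per route (document frequency)
        for i in range(len(route)):
            for j in range(i + 1, min(i + max_len, len(route)) + 1):
                seen.add(tuple(route[i:j]))
        for run in seen:
            table[run][tag] += 1
    return table

def distinctive_runs(table, target, min_df=2):
    """Runs found in at least min_df routes of `target` and in no other class."""
    return sorted(
        run for run, freq in table.items()
        if freq[target] >= min_df and set(freq) == {target}
    )

routes = [
    ("defect", ["S3,Ma_1", "S4,Mb_2", "S5,Mc_1"]),
    ("defect", ["S3,Ma_1", "S4,Mb_2", "S5,Mc_2"]),
    ("ok", ["S3,Ma_2", "S4,Mb_2", "S5,Mc_1"]),
    ("ok", ["S3,Ma_1", "S4,Mb_3", "S5,Mc_1"]),
]
print(distinctive_runs(class_frequency(routes), "defect"))
```

In this toy data neither machine alone separates the classes, but the pair used in consecutive steps does, which is the multi-point situation the talk contrasts with single-point failures.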
  80. 80. Traceability (產銷履歷)(5) [Step,Machine,Time] P1 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P2 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P3 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P4 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P5 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P6 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] … … … P99 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P100 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10]
  81. 81. Traceability (產銷履歷)(6) [Step,Machine,Time] P1 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P2 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P3 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P4 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P5 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P6 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] … … … P99 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10] P100 =>[S1,Ma,T1]-[S2,Mb,T2]-[S3,Mc,T3]-…-[S9,Mi,T9] -[S10,Mj,T10]
  82. 82. Reverse thinking • Find out from experiments why a product turns out {good}? frank727.pixnet.net
  83. 83. City Hunter, Desert Eagle (one in ten thousand) detail.chiebukuro.yahoo.co.jp www.aigouboke.com
  84. 84. Special event (signal) sequences (sequences that recur within a class) ************#*@**&************* ************#*@**&************* ************#*@**&************* ***@*****#****?******#*****@*** * *********#****?************@**** *********#****?********&***@*** *****&***@***%******#********** ******&*******%****************** *****&*******%****************** ***@****************$******@**** **********#*********$***@******* Expert opinions
  85. 85. Capacitor Production Traceability: Tags + Sequential Data • Sequential data – automated data collection during production (material lot number, machine, time, etc.) • Expert opinions (tags) – (before shipment) capacitor quality (tags: ok, defect A, defect B, defect C, …) – (after shipment) capacitor failure during the warranty period (tags: ok, defect D, defect E, defect F, …)
  86. 86. Advantech Co., Ltd. (研華股份有限公司) From: http://www.advantech.com.tw/
  87. 87. Building the enterprise data lake (ETU) From: http://etusolution.com/index.php/tw/solution-tw/etu-datalake Production traceability analysis (jdwang: Maximal Repeat Extraction)
  88. 88. Industry 4.0: traceability analysis for manufacturing • Factory • Embedded systems industry • Cloud computing service platform • Traceability analysis (Maximal Repeat Extraction) • Quality department • Management decisions
  89. 89. http://d1zlh37f1ep3tj.cloudfront.net/wp/wblob/54592E651337D2/33A8/5D7B11/lCIOuvczuB3U81XGqm9nag/903_1438243768yq2L.jpg
  90. 90. Do I understand the [semiconductor industry]? http://technews.tw/2016/04/11/tsmc-and-largan/ http://image.slidesharecdn.com/random-09062 phpapp02/95/-9-728.jpg?cb=1245484455 http://www.slideshare.net/5045033/ss-1002323
  91. 91. It will be hard work! http://previews.123rf.com/images/dirkercken/dirkercken1208/dirkercken120800053/14852048-hard-work-ahead-tough-job-be-ambitious-even-if-you-have-a-difficult-challenging-task-with-impact-to--Stock-Photo.jpg
  92. 92. New Direction & Thinking! http://switchandshift.com/11-trademarks-of-rebellious-leadership
  93. 93. Chip production traceability **************************************** www.iconarchive.com http://technews.tw/2016/04/11/tsmc-and-largan/ http://www.slideshare.net/5045033/ss-1002323
  94. 94. TSMC and National Tsing Hua University again co-host the "2nd Semiconductor Big Data Analytics Competition" http://www.appledaily.com.tw/realtimenews/article/new/20150713/647073/
  95. 95. ************************************* ************************************* ************************************* ************************************* ************************************* ************************************* ************************************* Expert opinions (Tags) Time, machine, raw material (brand), temperature, pressure, operator, etc. Sequential Data
  96. 96. ********************$*********** ********************$*********** *****&*******%****************** ******************************** *****&*******%****************** ******************************** *********#****?************@**** ******************************** *********#****?************@**** *****&*******%****************** *********#****?***********@***** Products (產品) Sequential Data Expert opinions (Tags) Tags + Sequential Data
  97. 97. Special event (signal) sequences (sequences that recur within a class) ******************************** ******************************** ******************************** *********#****?************@**** *********#****?************@**** *********#****?************@**** *****&*******%****************** *****&*******%****************** *****&*******%****************** ********************$*********** ********************$*********** Expert opinions
  98. 98. Special event (signal) sequences: #****? @**** &*******% $********** Event (signal) sequences: *****
  99. 99. Traceability Granularity **************************************** ########## %%%%%%%%%% @ © ᴂ$
  100. 100. Obstacles • Numerical Values => Symbols • Expert Opinions – (Class or Tag) • Interactive Interfaces • Computing Environment • Chip Traceability http://sponsorshipgreen.com/wp-content/uploads/2015/05/overcoming-obstacles-web-1200x725.jpg
  101. 101. Status of promoting industry-academia collaboration! • Vendors and companies: "If there is a chance, we will look into it!?" https://includedbygrace.files.wordpress.com/2014/01/upset-character.jpg
  102. 102. Make a wish or hope!? http://ncmissionofhope.org/updates/wp-content/uploads/2016/03/Hope.jpg
  103. 103. Back to square one! http://books.cw.com.tw/sites/default/files/blog/images/%E6%88%91%E7%9A%84%E5%8F%B0%E6%9D%B1%E5%A4%A2.jpg
  104. 104. Outline • Introduction • Pattern History For Trend Analysis • Product Traceability for Quality Monitoring • Mining for Distinctive Pattern (Biomarker) from Genomic Sequences • Future Works
  105. 105. From: http://image.slidesharecdn.com/a-systematic-approach-to-genotypephenotype-correlations-1203626174948644-4/95/a-systematic-approach-to-genotypephenotype-correlations-4-728.jpg?cb=1203597376 From: Nucleic Acids Res. 2007 Aug; 35(16): 5625–5633.
  106. 106. Seeking commonality among differences (comparison across different species)
  107. 107. Seeking commonality among differences (comparison across different species)
  108. 108. Seeking commonality among differences (comparison across different species)
  109. 109. Seeking differences among commonality (comparison within the same species)
  110. 110. The 23 Human Chromosome Pairs https://zh.wikipedia.org/wiki/%E4%BA%BA%E9%A1%9E%E5%9F%BA%E5%9B%A0%E7%B5%84#/media/File:Karyotype.png
  111. 111. Coding Genes https://zh.wikipedia.org/wiki/%E4%BA%BA%E9%A1%9E%E5%9F%BA%E5%9B%A0%E7%B5%84#/media/File:Human_genome_to_genes_zh.png
  112. 112. 49,267 Human Genes (upstream 5000 bp) • Class Type | # of Genes | Percent • Coding Genes | 38,645 | 78% • NonCoding Genes | 10,622 | 22%
  113. 113. https://en.wikipedia.org/wiki/Gene#/media/File:DNA_to_protein_or_ncRNA.svg
  114. 114. Upstream & Downstream ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** ******************************** *** *** *** *** *** *** *** *** *** *** *** 5000 bp 500 bp Coding Genes NonCoding Genes Classes
  115. 115. Finding Possibility http://engage.riggspartners.com/r-blog/3business/finding-possibility
  116. 116. OncoGene & TumorSuppressor Genes • ClassID | Class Type | # of Genes | Percent • C1 | Non-Cancer Gene (Coding) | 37,914 | 76.96% • C2 | OncoGene (Coding) | 345 | 0.70% • C3 | TumorSuppressor Gene (Coding) | 386 | 0.78% • C4 | NonCoding Gene | 10,622 | 21.56%
  117. 117. Human Genes (49,267): Non-Cancer Gene 37,914 • OncoGene 345 • TumorSuppressor Gene 386 • NonCoding Gene 10,622
  118. 118. Experiments (1: Existence, 0: Non-Existence) • Condition | C1 | C2 | C3 | C4 | DF (≧) | Length (≧) | # of Maximal Repeats • C1C2C3C4 | 1 | 1 | 1 | 1 | 1000 | 36 | 143 • C2C3C4 | 0 | 1 | 1 | 1 | 10 | 10 | 1 • C2C3 | 0 | 1 | 1 | 0 | 2 | 10 | 1788 • C2C4 | 0 | 1 | 0 | 1 | 10 | 10 | 35 • C3C4 | 0 | 0 | 1 | 1 | 10 | 10 | 53
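The experiment conditions can be read as a filter over the class frequency table: a repeat is kept when it exists in exactly the classes marked 1, is absent from those marked 0, and meets the DF and length thresholds. The sketch below assumes that reading; whether the slide's DF threshold applies to the total or to each class is not stated, so the total is used here as an assumption, and the toy sequences are invented.

```python
def select_repeats(repeats, condition, min_df, min_len):
    """Filter maximal repeats by class existence, document frequency, and length.

    repeats: dict pattern -> dict class_id -> document frequency
    condition: dict class_id -> 1 (must exist) or 0 (must not exist)
    """
    selected = []
    for pattern, df in repeats.items():
        if len(pattern) < min_len:
            continue
        # Assumption: the DF threshold is applied to the total over all classes.
        if sum(df.values()) < min_df:
            continue
        if all((df.get(c, 0) > 0) == bool(flag) for c, flag in condition.items()):
            selected.append(pattern)
    return sorted(selected)

toy = {
    "ACGTACGTAC": {"C2": 1, "C3": 1},   # only in OncoGene and TumorSuppressor classes
    "TTTTGGGGCC": {"C1": 5, "C2": 1},   # also in Non-Cancer genes
    "ACGT": {"C2": 3, "C3": 2},         # too short
}
c2c3 = {"C1": 0, "C2": 1, "C3": 1, "C4": 0}
print(select_repeats(toy, c2c3, min_df=2, min_len=10))
```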
  119. 119. Acknowledgements • Prof. 張建國 – China Medical University • Asst. Prof. 詹雯玲 – (Asia University) • Asst. Prof. 王昭能 – (Asia University) 1,000,000 (100 萬)
  120. 120. Next Generation Sequencing (NGS) http://www.slideshare.net/ueb52/introduction-to-next-generation-sequencing-v2
  121. 121. The US President declares war on cancer! A full push for the Cancer Moonshot http://i2.wp.com/geneonline.news/wp-content/uploads/2016/01/Obama_precision_medicine_0130151-e1453785268417.jpg?fit=1292%2C665
  122. 122. A petabyte of genome sequencing completed: a major milestone in cancer research • "The BC Cancer centre in Canada (加拿大卑詩癌症中心) has sequenced 1,000 terabytes (TB) of genetic sequence, equivalent to one petabyte (PB), more than 33,000 times the DNA sequenced by the international Human Genome Project; scientists have also identified mutated genes in stomach cancer, the world's fourth-deadliest cancer." http://geneonline.news/index.php/2016/05/30/canada-bioinformatics/
  123. 123. The 23 Human Chromosome Pairs https://zh.wikipedia.org/wiki/%E4%BA%BA%E9%A1%9E%E5%9F%BA%E5%9B%A0%E7%B5%84#/media/File:Karyotype.png
  124. 124. Maximal Repeats appearing in all of 24 human chromosomes. • Length |Maximal Repeats| <= 500 bp – Ok! • Length |Maximal Repeats| <= 1000 bp – Disk Space Full!
  125. 125. Acknowledgements • 王耀聰 (Jazz Wang)
  126. 126. Outline • Introduction • Pattern History For Trend Analysis • Product Traceability for Quality Monitoring • Mining for Distinctive Pattern (Biomarker) from Genomic Sequences • Future Works
  127. 127. SEMICON Taiwan (國際半導體展) (2016.9.9, Taipei Nangang Exhibition Center)
  128. 128. Future Works • Hadoop (OffLine) => Spark (OnLine) • Next Generation Sequencing • Product Traceability • 立隆電子 and Asia University (industry-academia collaboration project) • Web Log Analysis (hacker behavior?) • User Behavior Analysis
  129. 129. Asia University: industry-academia collaboration opportunities • Testing production traceability (Traceability) storage space • Big data computing platform
  130. 130. Thank you for listening! Comments and suggestions are welcome! www.flickr.com www.slideshare.net http://www.pptschool.com/250.html
