漢語間統計式機器翻譯語料處理-用臺灣閩南語示範
Corpus Preprocessing for Statistical Machine Translation between the Chinese Languages - Using Taiwan Southern Min as Examples
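Taiwan is a multi-ethnic, multilingual country.
Speaking and using one's mother tongue is a most basic right,
yet computer applications that support these languages remain very scarce,
so both natural language processing research and the collection and curation of corpora need to be strengthened.
Taiwan has a great many native languages;
this thesis focuses on Southern Min
and studies the characteristics of its translation corpora.
Beyond Southern Min itself,
we hope the results will also benefit other native languages.
This thesis proposes a method for automatically organizing corpora of the Chinese languages,
which fills in the missing information in incompletely annotated corpora
so that they deliver their greatest value,
raising the BLEU score from 9.30 to 13.82.
Experiments further show that when the parallel corpus contains fewer than 100,000 sentences,
adding data has a very large effect on translation quality:
after the corpus grew from 64,121 to 99,147 sentences,
the BLEU score rose from 13.82 to 19.33.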
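
To put the reported figures in proportion, the short sketch below works out the absolute and relative gains implied by the numbers in the abstract. It is only arithmetic on the quoted values; the helper name `bleu_gain` and the step labels are illustrative assumptions and do not come from the thesis.

```python
def bleu_gain(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in BLEU points, relative gain in percent)."""
    absolute = after - before
    relative = 100.0 * absolute / before
    return absolute, relative


# Figures quoted in the abstract; the labels are descriptive only.
steps = [
    ("automatic corpus organization", 9.30, 13.82),
    ("corpus grown from 64,121 to 99,147 sentences", 13.82, 19.33),
]

for label, before, after in steps:
    absolute, relative = bleu_gain(before, after)
    print(f"{label}: +{absolute:.2f} BLEU ({relative:.1f}% relative)")

# Growth of the parallel corpus between the two experiments.
added = 99147 - 64121
growth = 100.0 * added / 64121
print(f"corpus growth: +{added} sentences ({growth:.1f}%)")
```

Read this way, the corpus grew by roughly half while staying under 100,000 sentences, and the second BLEU gain is slightly larger in absolute terms than the gain from corpus organization.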
28. Client types: "Unsure", "Hesitant", "Successful", "Hopeless", "Drowning", "Struggling". Column labels: Client type, Need, Project type. Entries: Ensure survival in short term; Restructuring or search for partners; Actively initiate structural change; Growing and positioning project; Secure future; Strategy project; Start out toward the future; Specific implementation project.
32. Columns and tables (template table: one column of items and columns of numbers, each headed "Column Title 1")
56. Cost breakdown chart (template chart: six placeholder values; labels "Unit Gross Margin Cost" and six "Component (%)" entries)
124. Organizational and flow chart Pantene Jiehua New Brand Rejoice Head & Shoulders Ulan Whisper Ariel Tide JV Brands Crest Professional Marketing Safeguard Zest Camay Assistant Brand Manager, 2-5 per Brand Media & Commercial Production Marketing Manager Dimitri Panayatapolous School Program Group Nurses for School Program Randall Beard Marketing Director Rene Co Marketing Director Ken Doi Marketing Director Robin Oxendine Marketing Manager Peter Foyston Marketing Manager Laundry Haircare Toothpaste/Soap Skincare/Paper Vacant General Manager Brad Casper General Manager* Vacant General Manager Virginia Lee Vice President* Marketing Organization
201. Booz-Ball tables should be clean and organized, all of the same color, and individual circles should not be too large. Criteria: Breadth and Depth of Functionality; Strength of Architecture; Ability to Support; Ease of Implementation; Overall Strength of Solution. Vendors: ACNielsen; Information Resources, Inc.; Kenosia; Interactive Edge; Demantra; Mercari; RW3 Tech. Key: High to Low.
202. Stacked bar graph (template chart; sample values 21, 35, 40, 84, 54)
203. 100% stacked bar graph (template chart; Columns 1-5 with key)
204. Stacked column graph (template chart; sample values 21, 35, 40, 84, 54; Datasets 1-5 with CAGR 1995-1999 placeholders)
206. Waterfall chart, double step (template chart; Datasets 1-2)
207. Tornado chart: back-to-back bar graphs (template chart; Categories A-F)
208. Tornado chart: horizontal stacked chart (template chart; Categories A-H)
209. Combination line and column graph: dual-axis (template chart; Datasets 1-2)
210. Area graph: data labels inside (template chart)
211. Pie graph: labels on outside. Order the pieces from largest to smallest, unless some other order is logically more appropriate. In general, the angle of the first slice should be set to zero degrees.
214. Porter’s five forces: Threat of New Entrants; Threat of Substitutes; Bargaining Power of Suppliers; Bargaining Power of Buyers; Industry Rivalry.
215. Three interlocking circles (Labels 1, 2, 3 and Labels A, B, C). Note: You can check “Semi-transparent” in the “More Fill Colors” dialog box to create the overlapping color effect (used on the Label A, B, C section).
219. We believe that success in the new economy lies in a seamless combination of capabilities across the extended supply chain. The Supply Chain Continuum: Step 1 (Integrated): integrate functions of the existing supply chain. Step 2 (Collaborative): improve collaboration and control with vendors and customers. Step 3 (eSynchronization): virtually synchronize the supply chain across players into one logical enterprise. Axis labels: Relationships along the Supply Chain; Scope of Impact; Increasing Capabilities, Increasing Benefits. Other labels: Within Business Activities; Between Business Functions; With Customers & Suppliers; Across Alliance Partners; Traditional Optimization; Integration; Collaboration; Synchronization; Web-Based Entrants; Leader in the New Economy.
220. Our vision of winning new business models in supply chain leverage the emerging exchange space, integrated with innovative supply chain planning and execution capabilities. Diagram labels: Procurement; Supply Chain Planning; 3rd Party Partners/Alliances/Ventures; eFulfillment; Collaborative Manufacturing; eCRM; Virtual Synchronization; eSupport; Service/Support/Maintenance; Material/procurement exchanges/auctions; IP/product development exchanges; Capability/service exchanges; Supply Chain Ecosystem; eDesign; eCommerce Capabilities.
222. Successful business models are driven by two key concepts: revenue and profitability. Business Model Design Concepts: Think Global, Act Local; Customer Revenue and Profitability; Customer Service Requirements; Channel Asset Leverage; Collaborative Value Creation. Diagram labels: Revenue/Profitability; Service; Global Optima; Customer (1); Customer (2); Customer (3).
224. SOME ARGUMENT PATTERNS. Pattern 1: Success requires X; you are not equipped to do X; therefore, develop capability for X. Pattern 2: Success requires X; you are not focusing on X; therefore, shift focus to X. Pattern 3: You are pursuing X; Y would be better; therefore, change direction to Y. Pattern 4: You thought X was a problem; further investigation shows it is Y; therefore, shift focus to Y. Pattern 5: Performance is not as expected; the underlying cause is X; therefore, take steps to fix X.
254. Understanding buyer values helps prove or disprove current hypotheses as well as generate strategy solutions. Identify Buyer Value Segments activities: Situation Assessment; Hypothesis List Development; Survey Collection; Data Coding and Utility Calculation; Instrument Design and Testing; Field Administration Preparation; Sample Quota Design and List Pull; Data Analysis. Project phases: Market Research & Visioning; Conceptual Design; Detailed Design & Pilot; Implementation (phased); Systems Development/Enhancement.
255. Organizational beliefs and strategy alternatives identified in the situation assessment are translated into hypotheses for testing. Activities: Situation Assessment; Hypothesis List Development; Survey Collection; Data Coding and Utility Calculation; Instrument Design and Testing; Field Administration Preparation; Sample Quota Design and List Pull; Data Analysis.
257. Value-based segment strategies can produce incremental revenues of $700 million and reduce costs up to $150 million. Identify Buyer Value Segments: Quantifiable Benefits (illustrative). Branch and City Administration annual operating costs will be reduced by nearly $200 million. These cost savings will be offset by the $52 million increase in central delivery unit costs. By 1999, even after absorbing significant implementation/infrastructure costs, project can contribute over $700 million pre-tax annually. The shareholder value effect can be significant. Chart values ($MM, by year): $493, $670, $309.
259. With traditional research, when you ask how important any particular feature is individually, consumers tend to say each is very important. (Chart: Styling, Price, Speed, and Reliability rated on a scale from 1, "Not at all important", to 10, "Very important".)
261. These groups or segments of consumers not only have different profit potentials... Case study: Investment Buyer Value Segments, shown by % of market (consumers), % of market ($), and average investment balance per value segment. Value segments: Channelists; Speed; Liquidity / Access Sensitive; Rate / Speed Sensitive. Percentages shown: 34%, 16%. Avg. investment balances: $77,725; $92,264; $96,335; $68,212. Source: Andersen Consulting National Buyer Values Study for Retail Financial Services Consumers.
265. Consolidated results for the markets from a market attractiveness perspective, i.e. looking at market size, growth, competition, and market entry (distribution opportunity) from a current and future perspective. Criteria: Market Size; Market Growth; Competition; Distribution Opportunity; Strategic View. Entry priority scale: High; Medium-High; Medium; Medium-Low; Low. Markets: Beijing; Nanjing; Hangzhou; Jinan; Changsha; Chongqing. Source: AC market investigations.
266. From the sales offices and warehouses established in these several markets, strategic nodes can be established to allow access to nearby markets. Legend: High Priority; Medium to High Priority; Medium Priority; Medium to Low Priority; Cities to be covered; Cities Requiring Investigation; Year One; Beyond Year One. Cities: Beijing; Jinan; Nanjing; Shanghai; Hangzhou; Changsha; Shenzhen; Chongqing; Tianjin; Shijiazhuang (Hebei); Zhengzhou (Henan); Qingdao; Hefei (Anhui); Suzhou; Chengdu; Shantou; Xiamen; Wuhan; Xi’an (Shaanxi); Zhanjiang; Guiyang; Kunming. Source: AC market investigations.