1. Language Model
Natural Language Processing Series 4: Machine Translation (自然言語処理シリーズ 4 機械翻訳), pp. 62-80
Koichi Akabe
MT study
NAIST
2014-05-08
2014-05-08 Koichi Akabe (NAIST MT) 1 / 20
6. Fluency of Machine Translation
Machine Translation: f → e
Which translation e is correct?
▶ e1 = he is big
▶ e2 = is big he → the syntax is broken
▶ e3 = this is a purple dog → we have never seen such a sentence
We can tell the answer without seeing f
8. Language model (LM)
A language model gives a score P(e) for each sentence without using f:
▶ P(e = he is big)
▶ P(e = is big he)
▶ P(e = this is a purple dog)
Using this, we can compare sentences:
P(e = e1) > P(e = e3) > P(e = e2) ?
MT uses an LM to increase translation accuracy.
We call P(e) the “language model probability”.
12. How to calculate P(e)?
We want to calculate the probability of a sentence:
P(e = he is big)
Direct method: count the frequency of whole sentences in the training data:
P_ML(e) = c_train(e) / Σ_{e′} c_train(e′)
Almost all possible sentences are not contained in the training data
(→ P_ML(e) = 0 for almost all sentences)
We focus on words to solve this problem
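The direct method can be sketched in a few lines of Python (the toy corpus below is hypothetical, not from the slides); it makes the zero-probability problem concrete:

```python
from collections import Counter

# Hypothetical toy training corpus: sentences as plain strings.
train = [
    "he is big",
    "he is big",
    "she is small",
    "he is small",
]

counts = Counter(train)
total = sum(counts.values())  # 4 sentences in total

def p_ml(sentence):
    """Maximum-likelihood probability of a whole sentence:
    P_ML(e) = c_train(e) / sum over e' of c_train(e')."""
    return counts[sentence] / total

print(p_ml("he is big"))   # 0.5
print(p_ml("he is tall"))  # 0.0 -- any unseen sentence gets zero probability
```

Even this four-sentence corpus shows the problem: every sentence not seen verbatim in training scores exactly zero.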
16. Rewrite P using words
P(e = he is big)
First, we split the variable e into words and the sentence length I:
P(I = 3, e1 = he, e2 = is, e3 = big)
To use a uniform variable type, we replace I with a sentence-end symbol e_{I+1} = ⟨/s⟩:
P(e1 = he, e2 = is, e3 = big, e4 = ⟨/s⟩)
We also add a prefix symbol ⟨s⟩ for contexts (described later):
P(e0 = ⟨s⟩, e1 = he, e2 = is, e3 = big, e4 = ⟨/s⟩)
20. Rewrite P using the conditional probability P(word|context)
Chain rule:
P(e0 = ⟨s⟩, e1 = he, e2 = is, e3 = big, e4 = ⟨/s⟩)
= P(e4 = ⟨/s⟩ | e0 = ⟨s⟩, e1 = he, e2 = is, e3 = big)
× P(e3 = big | e0 = ⟨s⟩, e1 = he, e2 = is)
× P(e2 = is | e0 = ⟨s⟩, e1 = he)
× P(e1 = he | e0 = ⟨s⟩) × P(e0 = ⟨s⟩)
Generalize:
P(e_1^I) = ∏_{i=1}^{I+1} P_ML(e_i | e_0^{i−1}) = ∏_{i=1}^{I+1} c_train(e_0^i) / c_train(e_0^{i−1})
where e_i^j = e_i e_{i+1} ⋯ e_j is a subsequence of the word sequence e_0 e_1 ⋯ e_{I+1}.
However, c_train(e_0^i) becomes 0 for large i.
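A minimal sketch of this chain-rule computation, assuming a hypothetical toy corpus; the product of prefix-count ratios telescopes, so one long unseen prefix is enough to make the whole probability zero:

```python
from collections import Counter

# Hypothetical toy corpus; each sentence is wrapped in <s> ... </s>.
corpus = [
    ["<s>", "he", "is", "big", "</s>"],
    ["<s>", "he", "is", "small", "</s>"],
    ["<s>", "she", "is", "big", "</s>"],
]

# Count every prefix e_0^i of every training sentence.
prefix_counts = Counter()
for sent in corpus:
    for i in range(1, len(sent) + 1):
        prefix_counts[tuple(sent[:i])] += 1

def p_chain(sentence):
    """Chain rule with full histories:
    P(e_1^I) = prod over i of c_train(e_0^i) / c_train(e_0^{i-1})."""
    p = 1.0
    for i in range(2, len(sentence) + 1):
        denom = prefix_counts[tuple(sentence[:i - 1])]
        if denom == 0:
            return 0.0
        p *= prefix_counts[tuple(sentence[:i])] / denom
    return p

print(p_chain(["<s>", "he", "is", "big", "</s>"]))     # ≈ 0.3333 (= 1/3)
print(p_chain(["<s>", "she", "is", "small", "</s>"]))  # 0.0 -- unseen full prefix
```

The counts cancel pairwise, so the result equals c_train(e_0^{I+1}) / c_train(e_0): with full histories this is just the whole-sentence count in disguise, which is exactly why long contexts fail.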
22. n-gram language model
So, we do not use long word sequences!
The n-gram model uses only the previous n − 1 words as context:
P(e_1^I) ≈ ∏_{i=1}^{I+1} P_ML(e_i | e_{i−n+1}^{i−1}) = ∏_{i=1}^{I+1} c_train(e_{i−n+1}^i) / c_train(e_{i−n+1}^{i−1})
The n-gram model eases the zero-probability problem.
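A 2-gram version of the same computation (same hypothetical corpus as before); a sentence never seen as a whole can now receive nonzero probability as long as each of its bigrams was seen:

```python
from collections import Counter

# Hypothetical toy corpus with sentence-boundary markers.
corpus = [
    ["<s>", "he", "is", "big", "</s>"],
    ["<s>", "he", "is", "small", "</s>"],
    ["<s>", "she", "is", "big", "</s>"],
]

bigram = Counter()
context = Counter()
for sent in corpus:
    for i in range(1, len(sent)):
        bigram[(sent[i - 1], sent[i])] += 1
        context[sent[i - 1]] += 1

def p_bigram(sentence):
    """2-gram approximation: P(e_1^I) ~ prod over i of P_ML(e_i | e_{i-1})."""
    p = 1.0
    for i in range(1, len(sentence)):
        h = sentence[i - 1]
        if context[h] == 0:
            return 0.0
        p *= bigram[(h, sentence[i])] / context[h]
    return p

# "she is small" never occurred as a whole sentence, but all of its
# bigrams did, so it now gets nonzero probability.
print(p_bigram(["<s>", "she", "is", "small", "</s>"]))  # ≈ 0.1111 (= 1/9)
```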
23. Example of strict / 2-gram probabilities
Strict probability:
P(e = he is big) = P_ML(⟨/s⟩ | ⟨s⟩ he is big) × P_ML(big | ⟨s⟩ he is) × P_ML(is | ⟨s⟩ he) × P_ML(he | ⟨s⟩)
2-gram probability:
P(e = he is big) ≈ P_ML(⟨/s⟩ | big) × P_ML(big | is) × P_ML(is | he) × P_ML(he | ⟨s⟩)
25. Smoothing
Smoothing makes the LM robust against unknown linguistic phenomena.
Basically, we calculate the n-gram LM probability with (n − 1)-gram or shorter contexts.
28. Linear interpolation
Interpolate the probability with shorter n-grams:
P(e_i | e_{i−n+1}^{i−1}) = (1 − α) P_ML(e_i | e_{i−n+1}^{i−1}) + α P(e_i | e_{i−n+2}^{i−1})
[Bar charts contrasting the interpolated distribution for large α vs. small α]
Give a constant probability to unknown words:
P(e_i) = (1 − α) P_ML(e_i) + α · 1/|V|
where |V| is the vocabulary size.
How do we choose α?
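The unigram case with the uniform unknown-word term can be sketched as follows (the counts, α, and |V| are assumed for illustration, not from the slides):

```python
from collections import Counter

# Hypothetical unigram counts and vocabulary size.
unigram = Counter({"he": 4, "is": 4, "big": 2, "small": 2})
total = sum(unigram.values())  # 12 tokens
V = 10_000                     # assumed vocabulary size |V|
alpha = 0.1

def p_interp_unigram(word):
    """P(e_i) = (1 - alpha) * P_ML(e_i) + alpha * 1/|V|:
    every word, seen or unseen, gets at least alpha/|V|."""
    return (1 - alpha) * unigram[word] / total + alpha / V

print(p_interp_unigram("he"))      # ≈ 0.30001
print(p_interp_unigram("reagan"))  # unseen word: alpha/|V| = 1e-05, not zero
```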
30. Idea of the Witten-Bell method
Table: comparison of two n-gram contexts
  president was:    elected 5, the 3, in 3, first 3, ⋯  (52 unique words, 110 occurrences)
  president ronald: reagan 38, caza 1, venetiaan 1      (3 unique words, 40 occurrences)
▶ “president was” may be followed by unknown words
→ We cannot trust P(·|president was)
▶ “president ronald” is almost always followed by “reagan”
→ We can trust P(·|president ronald)
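One common presentation of Witten-Bell (not quoted from the slides) turns this intuition into a number: the probability mass reserved for unseen continuations of a context h is T(h) / (c(h) + T(h)), where T(h) is the number of unique word types seen after h. Applied to the two contexts in the table:

```python
def wb_unseen_mass(unique_types, total_count):
    """Witten-Bell: probability mass reserved for words never seen
    after this context, T(h) / (c(h) + T(h))."""
    return unique_types / (total_count + unique_types)

# Numbers from the table on this slide.
print(wb_unseen_mass(52, 110))  # "president was": ≈ 0.321 -- little trust
print(wb_unseen_mass(3, 40))    # "president ronald": ≈ 0.070 -- much trust
```

The many-types context reserves far more mass for the unknown, matching the "cannot trust / can trust" judgement above.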
35. Absolute discounting method
Subtract a discount d from each count and give the freed mass to shorter n-grams:
P_d(e_i | e_{i−n+1}^{i−1}) = (c_train(e_{i−n+1}^i) − d) / c_train(e_{i−n+1}^{i−1})
α_{e_{i−n+1}^{i−1}} = 1 − Σ_{e_i} P_d(e_i | e_{i−n+1}^{i−1})
P(e_i | e_{i−n+1}^{i−1}) = P_d(e_i | e_{i−n+1}^{i−1}) + α_{e_{i−n+1}^{i−1}} P(e_i | e_{i−n+2}^{i−1})
e.g. d := 0.5 (normally chosen to maximize the likelihood of a dev set):
P_d(reagan | president ronald) = (38 − 0.5) / 40 = 0.9375
P_d(caza | president ronald) = (1 − 0.5) / 40 = 0.0125
P_d(venetiaan | president ronald) = (1 − 0.5) / 40 = 0.0125
α_{president ronald} = 1 − Σ_{e_i} P_d(e_i | president ronald) = 0.0375
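The discounting arithmetic on this slide can be checked directly (d = 0.5, counts taken from the earlier table):

```python
# Counts after the context "president ronald" (from the Witten-Bell table).
counts = {"reagan": 38, "caza": 1, "venetiaan": 1}
total = sum(counts.values())  # 40
d = 0.5                       # discount

def p_d(word):
    """Discounted probability P_d(e_i | president ronald) = (c - d) / total."""
    return (counts[word] - d) / total

# Mass freed by discounting, handed to the shorter-context model:
alpha = 1 - sum(p_d(w) for w in counts)

print(p_d("reagan"))    # 0.9375
print(p_d("caza"))      # 0.0125
print(round(alpha, 4))  # 0.0375
```

Each of the three seen words gives up d/total = 0.0125 of mass, and the 3 × 0.0125 = 0.0375 total is exactly the backoff weight α on the slide.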
37. Kneser-Ney method
Idea: “ronald reagan” and “president reagan” appear frequently in corpora, so normal smoothing methods give a large probability to “ronald” and “reagan”. However, “reagan” is not used in other contexts.
Kneser and Ney used the unique-context count u(·) in absolute discounting:
P_kn(e_i | e_{i−n+1}^{i−1}) = max(u(·, e_{i−n+1}^i) − d, 0) / u(·, e_{i−n+1}^{i−1}, ·)
α_{e_{i−n+1}^{i−1}} = 1 − Σ_{e_i} P_kn(e_i | e_{i−n+1}^{i−1})
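The unigram case of the continuation-count idea can be sketched as follows (the tiny bigram corpus is hypothetical): a word's score is the number of unique left contexts it appears in, divided by the total number of unique bigram types, so a frequent-but-context-bound word like "reagan" is penalized:

```python
# Hypothetical bigram corpus: "reagan" is frequent but only ever follows
# "ronald" or "president", while "door" follows many different words.
bigrams = [
    ("ronald", "reagan"), ("ronald", "reagan"), ("president", "reagan"),
    ("the", "door"), ("a", "door"), ("her", "door"),
]

# u(., w): number of *unique* left contexts observed for w.
left_contexts = {}
for h, w in bigrams:
    left_contexts.setdefault(w, set()).add(h)

def p_continuation(word):
    """Kneser-Ney continuation unigram: unique contexts of the word
    divided by the total number of unique bigram types."""
    n_bigram_types = len(set(bigrams))
    return len(left_contexts.get(word, set())) / n_bigram_types

print(p_continuation("reagan"))  # 2/5: frequent, but seen in few contexts
print(p_continuation("door"))    # 3/5: appears after many different words
```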
40. Other methods
Good-Turing (“Good” is a person: the statistician I. J. Good)
The Turing estimator replaces the raw count of a word with a revised value:
r* = (r + 1) N_{r+1} / N_r
where N_r is the number of words occurring exactly r times.
If N_r = 0, r* becomes an indeterminate form.
The Good-Turing estimator uses linear regression with Zipf’s law to solve this problem:
Z_{r_i} := 2 N_{r_i} / (r_{i+1} − r_{i−1})
where r_i is the i-th non-zero count (r_1 < r_2 < r_3 < ⋯).
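The Turing estimator itself is a one-liner; the count-of-counts table below is hypothetical, for illustration only:

```python
# Count-of-counts: N[r] = number of distinct words occurring exactly r times.
# Hypothetical values.
N = {1: 10, 2: 5, 3: 3}

def turing_estimate(r):
    """Revised count r* = (r + 1) * N_{r+1} / N_r (undefined when N_r = 0)."""
    if N.get(r, 0) == 0:
        raise ValueError("N_r = 0: fall back to the Good-Turing regression")
    return (r + 1) * N.get(r + 1, 0) / N[r]

print(turing_estimate(1))  # 2 * 5 / 10 = 1.0
print(turing_estimate(2))  # 3 * 3 / 5  = 1.8
```

Note how singletons (r = 1) are discounted below 1; for the largest observed r, N_{r+1} = 0 (or missing), which is exactly the gap the Zipf-law regression over Z_{r_i} is meant to smooth.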