Natural language processing in iOS / OSX
- 2. NLP Tools in iOS/OSX: Topics
• CFStringTransform: transliteration, normalization
• CFStringTokenizer: string tokenization, language identification
• UITextChecker: spell check
• NSLinguisticTagger: parts of speech tagging, named entity recognition, lemmatization, language/script identification
• NSDataDetector: data detection
- 3. NLP Tools in iOS/OSX: CFStringTransform
The CFStringTransform Function
- 4. NLP Tools in iOS/OSX: CFStringTransform
Transliterate Thai to Latin
Original: สวัสดี; Transformed: s̄wạs̄dī
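The code on this slide did not survive in the transcript. A minimal sketch of how such a transliteration could be written, assuming the predefined `kCFStringTransformToLatin` identifier (the slide may have used a different one). `transliterated(_:using:reverse:)` is a hypothetical helper name, not part of Core Foundation:

```swift
import Foundation

// Hypothetical helper: CFStringTransform mutates its argument in place,
// so the input is copied into a mutable string first.
func transliterated(_ input: String, using transform: CFString, reverse: Bool = false) -> String? {
    let buffer = NSMutableString(string: input)
    // A nil range applies the transform to the whole string.
    guard CFStringTransform(buffer, nil, transform, reverse) else { return nil }
    return buffer as String
}

let original = "สวัสดี"
if let latin = transliterated(original, using: kCFStringTransformToLatin) {
    print("Original: \(original); Transformed: \(latin)")
}
```

The function returns `false` when a transform cannot be applied, which is why the helper returns an optional.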
- 6. NLP Tools in iOS/OSX: CFStringTransform
Transliterate Latin to Gujarati
Original: Gujarātī; Transformed: ગુજરાતી
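Core Foundation has no predefined constant for a Latin-to-Gujarati transform, so this sketch assumes the slide passed an ICU transform ID. The ID string `"Latin-Gujarati"` and the helper name `gujarati(fromLatin:)` are assumptions:

```swift
import Foundation

// Hypothetical helper. "Latin-Gujarati" is assumed to be a valid ICU
// transform ID; CFStringTransform accepts ICU IDs as well as the
// predefined kCFStringTransform... constants.
func gujarati(fromLatin input: String) -> String? {
    let buffer = NSMutableString(string: input)
    let icuTransform = "Latin-Gujarati" as CFString
    guard CFStringTransform(buffer, nil, icuTransform, false) else { return nil }
    return buffer as String
}

if let result = gujarati(fromLatin: "Gujarātī") {
    print("Original: Gujarātī; Transformed: \(result)")
}
```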
- 7. NLP Tools in iOS/OSX: CFStringTransform
Remove Diacritics and Accents
Original: s̄wạs̄dī; Transformed: swasdi
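One way to get this result, assuming the slide used `kCFStringTransformStripCombiningMarks` (the helper name `strippingDiacritics(_:)` is hypothetical):

```swift
import Foundation

// Hypothetical helper: removes combining marks such as accents and
// diacritics, leaving the base letters.
func strippingDiacritics(_ input: String) -> String? {
    let buffer = NSMutableString(string: input)
    guard CFStringTransform(buffer, nil, kCFStringTransformStripCombiningMarks, false) else {
        return nil
    }
    return buffer as String
}

// Input taken from the Thai transliteration shown earlier in the deck.
print(strippingDiacritics("s̄wạs̄dī") ?? "transform failed")
```

Chaining this after a to-Latin transform is a common way to produce plain ASCII search keys from non-Latin text.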
- 8. NLP Tools in iOS/OSX: CFStringTransform
Describe Unicode Characters
Original: 👍; Transformed: \N{THUMBS UP SIGN}
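A sketch of the Unicode-name transform, assuming `kCFStringTransformToUnicodeName` is what the slide used; `unicodeNames(for:)` is a hypothetical helper name:

```swift
import Foundation

// Hypothetical helper: replaces each character with its Unicode name.
func unicodeNames(for input: String) -> String? {
    let buffer = NSMutableString(string: input)
    guard CFStringTransform(buffer, nil, kCFStringTransformToUnicodeName, false) else {
        return nil
    }
    return buffer as String
}

if let described = unicodeNames(for: "👍") {
    print("Original: 👍; Transformed: \(described)")
}
```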
- 10. NLP Tools in iOS/OSX: CFStringTokenizer
Tokenize Into Words: Simplified Chinese
Tokens: [人, 人生, 而, 自由, 在, 尊严, 和, 权利, 上, 一律, 平等, 他们,
赋有, 理性, 和, 良心, 并, 应, 以, 兄弟, 关系, 的, 精神, 互相, 对待]
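The tokenizer code is missing from the transcript. A minimal sketch of word tokenization with `CFStringTokenizer`, under these assumptions: the helper name `wordTokens(in:localeIdentifier:)` is hypothetical, the locale `zh_Hans` is a guess, and the sample sentence is reconstructed from the token list above with assumed punctuation:

```swift
import Foundation

// Hypothetical helper: splits text into word tokens for the given locale.
func wordTokens(in text: String, localeIdentifier: String) -> [String] {
    let nsText = text as NSString
    let range = CFRangeMake(0, nsText.length)
    let locale = Locale(identifier: localeIdentifier) as CFLocale
    let tokenizer = CFStringTokenizerCreate(
        kCFAllocatorDefault, text as CFString, range, kCFStringTokenizerUnitWord, locale)

    var tokens: [String] = []
    // An empty token type means the tokenizer has reached the end.
    while !CFStringTokenizerAdvanceToNextToken(tokenizer).isEmpty {
        let r = CFStringTokenizerGetCurrentTokenRange(tokenizer)
        tokens.append(nsText.substring(with: NSRange(location: r.location, length: r.length)))
    }
    return tokens
}

// Sentence reconstructed from the slide's tokens; punctuation is assumed.
let sentence = "人人生而自由,在尊严和权利上一律平等。他们赋有理性和良心,并应以兄弟关系的精神互相对待。"
print("Tokens: \(wordTokens(in: sentence, localeIdentifier: "zh_Hans"))")
```

Ranges are measured in UTF-16 units, which is why the sketch slices an `NSString` rather than a Swift `String`.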
- 11. NLP Tools in iOS/OSX: CFStringTokenizer
Transliterate Tokens: Simplified Chinese
Tokens: [rén, rénshēng, ér, zìyóu, zài, zūnyán, hé, quánlì, shàng, yīlv̀, píngděng, tāmén,
fùyǒu, lǐxìng, hé, liángxīn, bìng, yìng, yǐ, xiōngdì, guānxī, de, jīngshén, hùxiāng, duìdài]
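A sketch of how per-token transliteration might be obtained, assuming the slide asked the tokenizer for the `kCFStringTokenizerAttributeLatinTranscription` attribute. The helper name `latinTranscriptions(in:localeIdentifier:)` and the `zh_Hans` locale are assumptions:

```swift
import Foundation

// Hypothetical helper: returns the Latin transcription of each word token.
func latinTranscriptions(in text: String, localeIdentifier: String) -> [String] {
    let cfText = text as CFString
    let range = CFRangeMake(0, CFStringGetLength(cfText))
    let locale = Locale(identifier: localeIdentifier) as CFLocale
    let tokenizer = CFStringTokenizerCreate(
        kCFAllocatorDefault, cfText, range, kCFStringTokenizerUnitWord, locale)

    var transcriptions: [String] = []
    while !CFStringTokenizerAdvanceToNextToken(tokenizer).isEmpty {
        // The attribute is nil for tokens that have no transcription.
        if let latin = CFStringTokenizerCopyCurrentTokenAttribute(
            tokenizer, kCFStringTokenizerAttributeLatinTranscription) as? String {
            transcriptions.append(latin)
        }
    }
    return transcriptions
}

print("Tokens: \(latinTranscriptions(in: "人人生而自由", localeIdentifier: "zh_Hans"))")
```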
- 12. NLP Tools in iOS/OSX: CFStringTokenizer
Language Identification: Icelandic
Language Code: is
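Language identification is likely done with `CFStringTokenizerCopyBestStringLanguage`. A minimal sketch: the helper name `dominantLanguage(of:)` is hypothetical, and the Icelandic sentence is a sample chosen for illustration, not the text from the slide:

```swift
import Foundation

// Hypothetical helper: returns a language code such as "is" or "en",
// or nil when the language cannot be determined.
func dominantLanguage(of text: String) -> String? {
    let cfText = text as CFString
    let range = CFRangeMake(0, CFStringGetLength(cfText))
    guard let code = CFStringTokenizerCopyBestStringLanguage(cfText, range) else {
        return nil
    }
    return code as String
}

// Sample sentence (assumption): longer inputs tend to give more reliable guesses.
let sample = "Hver maður er borinn frjáls og jafn öðrum að virðingu og réttindum."
print("Language Code: \(dominantLanguage(of: sample) ?? "unknown")")
```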
- 15. NLP Tools in iOS/OSX: UITextChecker
Spell Check
Misspelled Range: (7,4); Guesses: Optional([ice, Bice, bide, nice, vice, bike, bile, bite, bace, bbce, bcce, bdce, bece, bfce, dice, lice, mice, pice, rice, brice, bicep])
Misspelled Range: (12,3); Guesses: Optional([ay, cay, day, say])
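Output in this shape could come from a loop like the sketch below. It requires UIKit, so it builds for iOS only. The helper name `misspellings(in:language:)`, the `en_US` language, and the sample sentence are assumptions, since the slide's input text is not in the transcript:

```swift
import UIKit

// Hypothetical helper: collects each misspelled range with its guesses.
func misspellings(in text: String, language: String = "en_US") -> [(range: NSRange, guesses: [String])] {
    let checker = UITextChecker()
    let length = (text as NSString).length
    let fullRange = NSRange(location: 0, length: length)
    var results: [(range: NSRange, guesses: [String])] = []
    var offset = 0

    while offset < length {
        let range = checker.rangeOfMisspelledWord(
            in: text, range: fullRange, startingAt: offset, wrap: false, language: language)
        if range.location == NSNotFound { break }  // no further misspellings
        let guesses = checker.guesses(forWordRange: range, in: text, language: language) ?? []
        results.append((range: range, guesses: guesses))
        offset = range.location + range.length      // continue after this word
    }
    return results
}

// Sample text (assumption), not the sentence used on the slide.
for item in misspellings(in: "This is a mispelled sentance.") {
    print("Misspelled Range: (\(item.range.location),\(item.range.length)); Guesses: \(item.guesses)")
}
```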
- 17. NLP Tools in iOS/OSX: NSLinguisticTagger
Parts of Speech Tagging and Named Entity Recognition
- 18. NLP Tools in iOS/OSX: NSLinguisticTagger
NSLinguisticTagger Schemes
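The list of schemes from this slide is not in the transcript. One way to see which schemes the system offers for a language is to ask the class directly; the choice of `"en"` here is an assumption:

```swift
import Foundation

// Schemes available for English on the current OS version.
let schemes = NSLinguisticTagger.availableTagSchemes(forLanguage: "en")
for scheme in schemes {
    print(scheme.rawValue)
}
```

The result varies by language and OS version, so it is safer to query it than to hard-code a list.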
- 19. NLP Tools in iOS/OSX: NSLinguisticTagger
Parts of Speech Tagging and Named Entity Recognition
- 20. NLP Tools in iOS/OSX: NSLinguisticTagger
Parts of Speech Tagging and Named Entity Recognition
Token: What; Tag: Pronoun
Token: is; Tag: Verb
Token: the; Tag: Determiner
Token: capital; Tag: Noun
Token: of; Tag: Preposition
Token: New York; Tag: PlaceName
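A sketch that could produce output like the above. It assumes the `nameTypeOrLexicalClass` scheme with the `joinNames` option, which would explain why "New York" appears as a single `PlaceName` token. The helper name `tagWords(in:)` is hypothetical, and newer SDKs mark `NSLinguisticTagger` as deprecated in favor of the NaturalLanguage framework:

```swift
import Foundation

// Hypothetical helper: pairs each token with its lexical class or name type.
func tagWords(in text: String) -> [(token: String, tag: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass], options: 0)
    tagger.string = text
    let nsText = text as NSString
    let fullRange = NSRange(location: 0, length: nsText.length)
    // joinNames merges multi-word names into one token.
    let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation, .joinNames]

    var results: [(token: String, tag: String)] = []
    tagger.enumerateTags(in: fullRange, scheme: .nameTypeOrLexicalClass, options: options) {
        tag, tokenRange, _, _ in
        guard let tag = tag else { return }
        results.append((token: nsText.substring(with: tokenRange), tag: tag.rawValue))
    }
    return results
}

for item in tagWords(in: "What is the capital of New York?") {
    print("Token: \(item.token); Tag: \(item.tag)")
}
```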
- 21. NLP Tools in iOS/OSX: NSLinguisticTagger
Script Identification
- 22. NLP Tools in iOS/OSX: NSLinguisticTagger
Script Identification
Token: hello; Tag: Latn
Token: สวัสดี; Tag: Thai
Token: bonjour; Tag: Latn
Token: 你; Tag: Hani
Token: 好; Tag: Hani
Token: હેલો; Tag: Gujr
Token: привет; Tag: Cyrl
Token: नमस्ते; Tag: Deva
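Script identification uses the same tagger with the `script` scheme. A minimal sketch: the helper name `scriptTags(in:)` is hypothetical and the input string is assembled from some of the tokens listed above:

```swift
import Foundation

// Hypothetical helper: pairs each word with its ISO 15924 script code.
func scriptTags(in text: String) -> [(token: String, tag: String)] {
    let tagger = NSLinguisticTagger(tagSchemes: [.script], options: 0)
    tagger.string = text
    let nsText = text as NSString
    let fullRange = NSRange(location: 0, length: nsText.length)
    let options: NSLinguisticTagger.Options = [.omitWhitespace, .omitPunctuation]

    var results: [(token: String, tag: String)] = []
    tagger.enumerateTags(in: fullRange, scheme: .script, options: options) {
        tag, tokenRange, _, _ in
        guard let tag = tag else { return }
        results.append((token: nsText.substring(with: tokenRange), tag: tag.rawValue))
    }
    return results
}

for item in scriptTags(in: "hello สวัสดี bonjour 你好 привет नमस्ते") {
    print("Token: \(item.token); Tag: \(item.tag)")
}
```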
- 24. NLP Tools in iOS/OSX: NSDataDetector
Extracting Structured Data
- 25. NLP Tools in iOS/OSX: NSDataDetector
Extracting Structured Data
Match: Lunch tomorrow at 12:30PM;
  - Date: Optional(2014-11-20 20:30:00 +0000)
Match: 1600 Pennsylvania Ave. NW, Washington, D.C. 20500;
  - Street: Optional(1600 Pennsylvania Ave.);
  - Zip: Optional(20500)
Match: 202-456-1414
Match: 2:15PM;
  - Date: Optional(2014-11-19 22:15:00 +0000)
Match: Southwest Airlines Flight 737
Match: www.southwest.com
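A sketch of a detector that could report matches like these. The helper name `describeMatches(in:)` is hypothetical, the set of checking types is inferred from the kinds of matches shown, and the sample text is made up for illustration because the slide's input is not in the transcript:

```swift
import Foundation

// Hypothetical helper: returns one description line per detected item.
func describeMatches(in text: String) throws -> [String] {
    let types: NSTextCheckingResult.CheckingType =
        [.date, .address, .phoneNumber, .link, .transitInformation]
    let detector = try NSDataDetector(types: types.rawValue)
    let nsText = text as NSString
    let fullRange = NSRange(location: 0, length: nsText.length)

    var lines: [String] = []
    for match in detector.matches(in: text, options: [], range: fullRange) {
        var line = "Match: \(nsText.substring(with: match.range))"
        if match.resultType == .date, let date = match.date {
            line += "; Date: \(date)"
        } else if match.resultType == .address, let parts = match.addressComponents {
            if let street = parts[.street] { line += "; Street: \(street)" }
            if let zip = parts[.zip] { line += "; Zip: \(zip)" }
        }
        lines.append(line)
    }
    return lines
}

// Sample text (assumption), loosely modelled on the matches listed above.
let sample = "Lunch tomorrow at 12:30PM at 1600 Pennsylvania Ave. NW, Washington, D.C. 20500. "
    + "Call 202-456-1414. See www.southwest.com for details."
for line in (try? describeMatches(in: sample)) ?? [] {
    print(line)
}
```

Relative dates such as "tomorrow" are resolved against the current date, so the printed dates depend on when the code runs.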