Analyzing Japanese Art with Node.js and Computer Vision
John Resig
EmpireJS: Hacking Art with Node js and Image Analysis
Talk at EmpireJS, May 6th, 2014.

Transcript of "EmpireJS: Hacking Art with Node js and Image Analysis"

  1. Analyzing Japanese Art with Node.js and Computer Vision, by John Resig
  2. Lot 55: 20 Japanese Woodblock Prints. Each depicting a female/Geisha figure with calligraphy throughout each print. Prints measure 13.75" H x 9.375" W. Toning to each print, some losses around edges. Estimated Price: $400 - $600
  3. Step 1: Acquire and read tons of expensive books.
  4. Step 2: Learn to read Japanese. * Japanese from the 17th to 19th century. * You’re not going to learn this from Rosetta Stone.
  5. Step 3: Learn to read Japanese calligraphy.
  6. Solution: A fast-loading, responsive, i18ned web site: Ukiyo-e.org
  7. Node i18n 2 (npm install i18n-2): https://github.com/jeresig/i18n-node-2
       var greeting = i18n.__('Hello %s, how are you today?', 'Marcus');
       i18n.__n('%s cat', '%s cats', 3);
       setLocaleFromSubdomain([request])
  8. Node i18n 2 (npm install i18n-2): https://github.com/jeresig/i18n-node-2
       {
         "Hello": "Hello",
         "Hello %s, how are you today?": "Hello %s, how are you today?",
         "weekend": "weekend",
         "Hello %s, how are you today? How was your %s.": "Hello %s, how are you today? How was your %s.",
         "Hi": "Hi",
         "Howdy": "Howdy",
         "%s cat": {
           "one": "%s cat",
           "other": "%s cats"
         },
         "There is one monkey in the %%s": {
           "one": "There is one monkey in the %%s",
           "other": "There are %d monkeys in the %%s"
         },
         "tree": "tree"
       }
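The catalog format above (plain strings, plus `one`/`other` objects for plurals) can be illustrated with a small lookup sketch. This is a simplified stand-in written for illustration, not the i18n-2 implementation: the `translate` and `translatePlural` helpers and the two-entry `catalog` are hypothetical, and only Node's built-in `util.format` is used for the `%s` substitution.

```javascript
const util = require('util');

// Hypothetical catalog using the same shape as the locale file on the slide.
const catalog = {
  'Hello %s, how are you today?': 'Hello %s, how are you today?',
  '%s cat': { one: '%s cat', other: '%s cats' }
};

// Look up a plain string entry; fall back to the key itself when missing.
function translate(key, ...args) {
  const entry = catalog[key];
  const template = typeof entry === 'string' ? entry : key;
  return util.format(template, ...args);
}

// Pick the "one" or "other" form based on the count, then substitute the count.
function translatePlural(singular, plural, count) {
  const entry = catalog[singular];
  const forms = entry && typeof entry === 'object'
    ? entry
    : { one: singular, other: plural };
  return util.format(count === 1 ? forms.one : forms.other, count);
}

console.log(translate('Hello %s, how are you today?', 'Marcus'));
console.log(translatePlural('%s cat', '%s cats', 3));
```

Falling back to the key when an entry is missing keeps untranslated strings readable, which is why the English catalog can map each string to itself.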
  9. [Architecture diagram. Labels: Digital Ocean; Amazon S3; Amazon Cloudfront; Images; Data (HTML, XML, JSON); JS, CSS; nginx (w/ cache); node.js + express (two instances); naught; mongodb; Elastic Search; Scraper]
  10. https://github.com/jeresig/jquery-imgscrubber
  11. Collecting Tons of Woodblock Print Data
  12. Queue-based Crawling using PhantomJS. [Diagram labels: Search; Page (x3); HTML and Image for each page; Processing Queue. The Search/Page/HTML/Image group appears twice.]
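The queue-based crawl named on this slide can be sketched as a breadth-first walk: take a URL off the queue, record it, and enqueue any links not yet seen. This is a minimal sketch under stated assumptions, not the talk's crawler: the `site` object is a hypothetical in-memory stand-in for real pages, and `fetchLinks` stands in for whatever PhantomJS would return after loading a page.

```javascript
// Hypothetical in-memory "site": each URL maps to the links found on that page.
const site = {
  '/search?page=1': ['/print/1', '/print/2', '/search?page=2'],
  '/search?page=2': ['/print/3'],
  '/print/1': [],
  '/print/2': [],
  '/print/3': []
};

// Visit pages in first-in, first-out order, never queuing the same URL twice.
function crawl(startUrl, fetchLinks) {
  const queue = [startUrl];
  const seen = new Set(queue);
  const visited = [];
  while (queue.length > 0) {
    const url = queue.shift();
    visited.push(url);
    for (const link of fetchLinks(url)) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return visited;
}

const order = crawl('/search?page=1', (url) => site[url] || []);
console.log(order);
```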
  13. [Pipeline diagram. Labels, in order: Some Website; WebKit; PhantomJS; CasperJS; SpookyJS; Save Data; XML Files; Mongo Log; libxml (+ xpath); MongoDB; Extract Data; Process Data; Artists; Images; Correct Artist and Date; Add to Site!]
  14. module.exports = function() {
          return {
              scrape: [
                  {
                      start: "http://ukiyo-e.org/search",
                      visit: "//a[@class='img']",
                      next: "//a[contains(@rel,'next')]"
                  },
                  {
                      extract: {
                          "title": "//p[contains(@class, 'title')]//span",
                          "dateCreated": "//p[contains(@class, 'date')]//span",
                          "artists[]": "//p[contains(@class, 'artist')]//a",
                          "images[]": "//div[contains(@class,'imageholder')]//a/@href"
                      }
                  }
              ]
          };
      };
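One way to read the `extract` block above is that each key names an output field and each value is an XPath expression, with a trailing `[]` on the key marking a field that collects every match. That reading is an assumption made for this sketch, not something the slide states. The `extractFields` helper is hypothetical, and `matches` is a placeholder for a real XPath engine such as libxml.

```javascript
// Hypothetical stand-in for an XPath engine: expression -> matched strings.
const matches = {
  "//p[contains(@class, 'title')]//span": ['Title A'],
  "//p[contains(@class, 'artist')]//a": ['Artist A', 'Artist B']
};

// Build one record from an extract map.
// Assumption: keys ending in "[]" keep all matches, other keys keep the first.
function extractFields(extract, evaluate) {
  const record = {};
  for (const [key, xpath] of Object.entries(extract)) {
    const found = evaluate(xpath);
    if (key.endsWith('[]')) {
      record[key.slice(0, -2)] = found;
    } else {
      record[key] = found.length > 0 ? found[0] : null;
    }
  }
  return record;
}

const record = extractFields(
  {
    'title': "//p[contains(@class, 'title')]//span",
    'artists[]': "//p[contains(@class, 'artist')]//a"
  },
  (xpath) => matches[xpath] || []
);
console.log(record);
```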
  15.         "surname" : "Hashimoto",
              "surname_kana" : "はしもと",
              "name" : "Hashimoto Okiie",
              "ascii" : "Hashimoto Okiie",
              "plain" : "Hashimoto Okiie",
              "kana" : "はしもとおきいえ",
              "_id" : ObjectId("530c0825d9a80976b2000437")
            }
          ],
          "names" : [
            {
              "original" : "Hashimoto Okiie (橋本興家)",
              "locale" : "ja",
              "kanji" : "橋本興家",
              "given" : "Okiie",
              "given_kana" : "おきいえ",
              "surname" : "Hashimoto",
              "surname_kana" : "はしもと",
              "given_kanji" : "興家",
              "surname_kanji" : "橋本",
              "name" : "Hashimoto Okiie",
              "ascii" : "Hashimoto Okiie",
              "plain" : "Hashimoto Okiie",
              "kana" : "はしもとおきいえ",
              "_id" : ObjectId("530c0825d9a80976b2000439")
            }
          ],
          "extract" : [ "53dfc997cbf9fa7501d78e4820b24a9c" ],
          "created" : ISODate("2014-02-25T03:04:05Z"),
          "__v" : 0
        }
  16. “Stack Scraper”: https://github.com/jeresig/stack-scraper and https://github.com/jeresig/ukiyoe-scrapers
  17. Image Similarity
  18. https://github.com/jeresig/node-matchengine
  19. Image Similarity Search
  20. Idyll: Offline Image Cropping
       • https://github.com/jeresig/idyll
       • Crop images offline and on a mobile device.
       • Saves the selections back to a server.
       • Data is synced and saved using HTML 5 appcache.
       • https://github.com/jeresig/node-appcache-glob
  21. By David Chester at Shutterstock: https://github.com/dchester/perl-image-crop-calibration-target
  22. http://www.ersatzlabs.com/
  23. Aiding Woodblock Print Studies with Image Analysis
  24. Correcting Print Data
  25. Japanese Names
       • Utagawa Hiroshige
       • Ando Hiroshige
       • Andō Hiroshige
       • Hiroshige
       • 歌川広重
       • 広重
  26. Andō can be written as: 安土, 安堂, 安島, 安東, 安籐, 安藤, 安道, 安達, 阿藤
  27. 安藤 can be read as: andō, antō, anzō, yasuzuka. A many-to-many mapping!
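The many-to-many relationship on the last two slides can be held in a pair of indexes built from the same list of (kanji, reading) pairs. The sketch below uses only the spellings and readings shown on the slides. The `buildIndexes` helper and the lowercase normalization of readings are choices made for this illustration.

```javascript
// (kanji, reading) pairs taken from the two slides above.
const pairs = [
  ['安土', 'andō'], ['安堂', 'andō'], ['安島', 'andō'],
  ['安東', 'andō'], ['安籐', 'andō'], ['安藤', 'andō'],
  ['安道', 'andō'], ['安達', 'andō'], ['阿藤', 'andō'],
  ['安藤', 'antō'], ['安藤', 'anzō'], ['安藤', 'yasuzuka']
];

// Index the same pairs in both directions so either side can be looked up.
function buildIndexes(list) {
  const byKanji = new Map();
  const byReading = new Map();
  for (const [kanji, reading] of list) {
    if (!byKanji.has(kanji)) byKanji.set(kanji, new Set());
    if (!byReading.has(reading)) byReading.set(reading, new Set());
    byKanji.get(kanji).add(reading);
    byReading.get(reading).add(kanji);
  }
  return { byKanji, byReading };
}

const { byKanji, byReading } = buildIndexes(pairs);
console.log([...byKanji.get('安藤')]);  // every reading listed for one spelling
console.log(byReading.get('andō').size); // number of spellings for one reading
```

Neither direction is a function, so correcting a name generally needs extra evidence (for example a known artist list) to choose among the candidates.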
  28. Sharaku Toshusai 東洲斎写楽
  29. Sharaku Toshusai 東洲斎写楽
       • Is this the family name?
       • Where are the stress marks?
       • How do you “split” this name?
       • Which name parts correlate?
  30. Tools (all are Node modules!)
       • https://github.com/lovell/hepburn
       • https://github.com/jeresig/node-enamdict
       • https://github.com/jeresig/node-ndlna
       • https://github.com/jeresig/node-romaji-name
       [Diagram labels: ndlna, hepburn, enamdict, romaji-name]
  31. Hepburn
       • https://github.com/lovell/hepburn
       • Takes in the English form of a Japanese word.
       • Returns it written in Hiragana or Katakana (phonetic Japanese alphabets).
       • Example: Utagawa Hiroshige becomes うたがわひろしげ
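The romaji-to-kana step described above can be illustrated with a toy converter that matches the longest known syllable at each position. This is a sketch for illustration only and is not the hepburn module: the `KANA` table is deliberately tiny (just the syllables needed for the slide's example) and `toHiraganaSketch` is a hypothetical name.

```javascript
// Toy syllable table: only what is needed for "Utagawa Hiroshige".
const KANA = {
  u: 'う', ta: 'た', ga: 'が', wa: 'わ',
  hi: 'ひ', ro: 'ろ', shi: 'し', ge: 'げ'
};

// Greedy longest-match conversion; throws on syllables missing from the table.
function toHiraganaSketch(romaji) {
  const text = romaji.toLowerCase().replace(/\s+/g, '');
  let out = '';
  let i = 0;
  while (i < text.length) {
    let matched = false;
    for (let len = 3; len >= 1; len--) {
      const chunk = text.slice(i, i + len);
      if (Object.prototype.hasOwnProperty.call(KANA, chunk)) {
        out += KANA[chunk];
        i += chunk.length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error('No kana in the toy table for: ' + text.slice(i));
    }
  }
  return out;
}

console.log(toHiraganaSketch('Utagawa Hiroshige')); // うたがわひろしげ
```

A real converter also has to handle long vowels, doubled consonants, and the syllabic "n", which this table ignores.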
  32. Enamdict
       • https://github.com/jeresig/node-enamdict
       • Downloads and queries the ENAMDICT database
       • (A mapping of Japanese proper names to Hiragana and English.)
       • Used to correct typos and figure out surname/given name.
  33. NDLNA
       • https://github.com/jeresig/node-ndlna
       • Queries the NDLNA database
       • Finds the correct Kanji for an English name.
       • Or the correct English for a Kanji name.
  34. [Diagram labels: ndlna, hepburn, enamdict, romaji-name]
  35. {
        "original" : "Sharaku Toshusai (東洲斎写楽 )",
        "locale" : "ja",
        "kanji" : "東洲斎写楽",
        "given" : "Sharaku",
        "given_kana" : "しゃらく",
        "surname" : "Tōshūsai",
        "surname_kana" : "とおしゅうさい",
        "surname_kanji" : "東洲斎",
        "given_kanji" : "写楽",
        "name" : "Tōshūsai Sharaku",
        "ascii" : "Tooshuusai Sharaku",
        "plain" : "Toshusai Sharaku",
        "kana" : "とおしゅうさいしゃらく"
      }
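The `name`, `ascii`, and `plain` fields in the record above differ only in how long vowels are written. The sketch below reproduces that relationship for the slide's example: a macron vowel is doubled for the ASCII form and the macron is simply dropped for the plain form. The `toAscii` and `toPlain` helpers are hypothetical, and romaji-name's actual rules may cover more cases than this.

```javascript
// Double each macron vowel: "ō" -> "oo", "ū" -> "uu".
// NFD splits a macron letter into base letter + U+0304 (combining macron).
function toAscii(name) {
  return name
    .normalize('NFD')
    .replace(/([aiueoAIUEO])\u0304/g, (match, vowel) => vowel + vowel.toLowerCase());
}

// Drop the macron entirely: "ō" -> "o", "ū" -> "u".
function toPlain(name) {
  return name.normalize('NFD').replace(/\u0304/g, '');
}

console.log(toAscii('Tōshūsai Sharaku')); // Tooshuusai Sharaku
console.log(toPlain('Tōshūsai Sharaku')); // Toshusai Sharaku
```

Keeping both forms lets a search for either "Toshusai" or "Tooshuusai" find the same artist.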
  36. Dates
       • https://github.com/jeresig/node-yearrange
       var yr = require("yearrange");

       yr.parse("1877")
       // {"start": 1877, "end": 1877}

       yr.parse("1847-48")
       // {"start": 1847, "end": 1848}

       yr.parse("ca. 1810-20s")
       // {"start": 1810, "end": 1829, "circa": true}

       yr.parse("18th–19th century")
       // {"start": 1700, "end": 1899}

       yr.parse("Meiji era")
       // {"start": 1868, "end": 1912}
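The numeric cases above (a single year, an abbreviated end year, and a "ca." decade range) can be sketched with one regular expression. This is an independent illustration, not the yearrange implementation: `parseYearRange` is a hypothetical name, and the century and era forms from the slide are not handled and return `null`.

```javascript
// Parse "1877", "1847-48", and "ca. 1810-20s" style strings.
// Returns null for anything else (centuries and era names are out of scope here).
function parseYearRange(text) {
  const circa = /^ca\.\s*/i.test(text);
  const body = text.replace(/^ca\.\s*/i, '').trim();
  const m = /^(\d{4})(?:\s*[-–]\s*(\d{2,4})(s)?)?$/.exec(body);
  if (!m) return null;

  const start = Number(m[1]);
  let end = start;
  if (m[2]) {
    // Expand an abbreviated end year using the leading digits of the start year.
    end = Number(m[1].slice(0, 4 - m[2].length) + m[2]);
    // A trailing "s" means the whole decade, so "20s" runs through 1829.
    if (m[3]) end += 9;
  }

  const result = { start: start, end: end };
  if (circa) result.circa = true;
  return result;
}

console.log(parseYearRange('1877'));
console.log(parseYearRange('1847-48'));
console.log(parseYearRange('ca. 1810-20s'));
```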
  37. Artist Rectification
  38. Miyagawa Shuntei. Printed in 1897. Sold for: $550. Prints sell for $100-$400 individually. True Estimate: $2100 - $8400 * (* You just have to find someone willing to buy them!)
  39. Links:
       • http://ejohn.org/research/
       • http://ukiyo-e.org/
       • https://github.com/jeresig