EmpireJS: Hacking Art with Node.js and Image Analysis

Talk at EmpireJS, May 6th, 2014.

Transcript

  • 1. Analyzing Japanese Art with Node.js and Computer Vision John Resig
  • 2. Lot 55: 20 Japanese Woodblock Prints Each depicting a female/Geisha figure with calligraphy throughout each print. Prints measure 13.75" H x 9.375" W. Toning to each print, some losses around edges. Estimated Price: $400 - $600
  • 3. Step 1: Acquire and read tons of expensive books.
  • 4. Step 2: Learn to read Japanese. * Japanese from the 17th to 19th century. * You’re not going to learn this from Rosetta Stone.
  • 5. Step 3: Learn to read Japanese calligraphy.
  • 6. Solution: A fast-loading, responsive, i18ned, web site: Ukiyo-e.org
  • 7. Node i18n 2 (npm install i18n-2) https://github.com/jeresig/i18n-node-2
       var greeting = i18n.__('Hello %s, how are you today?', 'Marcus');
       i18n.__n('%s cat', '%s cats', 3);
       setLocaleFromSubdomain([request])
  • 8. Node i18n 2 (npm install i18n-2) https://github.com/jeresig/i18n-node-2
       {
         "Hello": "Hello",
         "Hello %s, how are you today?": "Hello %s, how are you today?",
         "weekend": "weekend",
         "Hello %s, how are you today? How was your %s.": "Hello %s, how are you today? How was your %s.",
         "Hi": "Hi",
         "Howdy": "Howdy",
         "%s cat": {
           "one": "%s cat",
           "other": "%s cats"
         },
         "There is one monkey in the %%s": {
           "one": "There is one monkey in the %%s",
           "other": "There are %d monkeys in the %%s"
         },
         "tree": "tree"
       }
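
The catalog format on this slide suggests how singular/plural lookup might work. The following is a minimal sketch, not the i18n-2 implementation: `catalog`, `translate`, and `translatePlural` are hypothetical names introduced here, and `%s` substitution is delegated to Node's built-in `util.format`.

```javascript
const util = require("util");

// Hypothetical in-memory catalog, in the same shape as the JSON on the slide.
const catalog = {
  "Hello %s, how are you today?": "Hello %s, how are you today?",
  "%s cat": { one: "%s cat", other: "%s cats" }
};

// Look up a phrase and substitute arguments; fall back to the key itself
// when the catalog has no plain-string entry for it.
function translate(phrase, ...args) {
  const entry = catalog[phrase];
  const template = typeof entry === "string" ? entry : phrase;
  return util.format(template, ...args);
}

// Pick the "one" or "other" form based on the count, then substitute it.
function translatePlural(singular, plural, count) {
  const entry = catalog[singular];
  const forms =
    entry && typeof entry === "object" ? entry : { one: singular, other: plural };
  const template = count === 1 ? forms.one : forms.other;
  return util.format(template, count);
}

console.log(translate("Hello %s, how are you today?", "Marcus"));
console.log(translatePlural("%s cat", "%s cats", 3));
```

Using the untranslated phrase as the lookup key means a missing translation degrades to the source-language string rather than an error.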
  • 9. [Architecture diagram. Labels: Digital Ocean; Amazon S3; Amazon Cloudfront; nginx (w/ cache); node.js + express (two instances); naught; mongodb; Elastic Search; Scraper. Traffic labels: Images; Data (HTML, XML, JSON); JS, CSS.]
  • 10. https://github.com/jeresig/jquery-imgscrubber
  • 11. Collecting Tons of Woodblock Print Data
  • 12. Queue-based Crawling using PhantomJS [Diagram. Labels: Search; Page (three per search); HTML and Image (one pair per page); Processing Queue.]
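
The queue-based crawl named on this slide can be illustrated without a browser. This is only a sketch of the general technique, not the talk's PhantomJS crawler: the `site` object is a hypothetical in-memory stand-in for real pages, and `crawl` is a name introduced here.

```javascript
// Hypothetical in-memory "site": search pages list print pages and a next
// search page; print pages carry an image. No network access is involved.
const site = {
  "/search?page=1": { links: ["/print/1", "/print/2"], next: "/search?page=2" },
  "/search?page=2": { links: ["/print/3"], next: null },
  "/print/1": { image: "/img/1.jpg" },
  "/print/2": { image: "/img/2.jpg" },
  "/print/3": { image: "/img/3.jpg" }
};

// Breadth-first crawl driven by a FIFO queue. The `seen` set ensures that
// each URL is queued at most once.
function crawl(startUrl) {
  const queue = [startUrl];
  const seen = new Set(queue);
  const results = [];

  while (queue.length > 0) {
    const url = queue.shift();
    const page = site[url];
    if (!page) continue; // unknown URL: skip it

    if (page.image) {
      // A print page: record what would be saved for later processing.
      results.push({ url, image: page.image });
      continue;
    }

    // A search page: enqueue its print links and the next search page.
    const found = page.links.concat(page.next ? [page.next] : []);
    for (const link of found) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return results;
}

console.log(crawl("/search?page=1"));
```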
  • 13. [Pipeline diagram. Labels: Some Website; WebKit; PhantomJS; CasperJS; SpookyJS; Save Data; XML Files; Mongo Log; libxml (+ xpath); MongoDB; Extract Data; Process Data; Artists; Images; Correct Artist and Date; Add to Site!]
  • 14. module.exports = function() {
         return {
           scrape: [
             {
               start: "http://ukiyo-e.org/search",
               visit: "//a[@class='img']",
               next: "//a[contains(@rel,'next')]"
             },
             {
               extract: {
                 "title": "//p[contains(@class, 'title')]//span",
                 "dateCreated": "//p[contains(@class, 'date')]//span",
                 "artists[]": "//p[contains(@class, 'artist')]//a",
                 "images[]": "//div[contains(@class,'imageholder')]//a/@href"
               }
             }
           ]
         };
       };
  • 15. "surname" : "Hashimoto", "surname_kana" : "はしもと", "name" : "Hashimoto Okiie", "ascii" : "Hashimoto Okiie", "plain" : "Hashimoto Okiie", "kana" : "はしもとおきいえ", "_id" : ObjectId("530c0825d9a80976b2000437") } ], "names" : [ { "original" : "Hashimoto Okiie (橋本興家)", "locale" : "ja", "kanji" : "橋本興家", "given" : "Okiie", "given_kana" : "おきいえ", "surname" : "Hashimoto", "surname_kana" : "はしもと", "given_kanji" : "興家", "surname_kanji" : "橋本", "name" : "Hashimoto Okiie", "ascii" : "Hashimoto Okiie", "plain" : "Hashimoto Okiie", "kana" : "はしもとおきいえ", "_id" : ObjectId("530c0825d9a80976b2000439") } ], "extract" : [ "53dfc997cbf9fa7501d78e4820b24a9c" ], "created" : ISODate("2014-02-25T03:04:05Z"), "__v" : 0 }
  • 16. “Stack Scraper” https://github.com/jeresig/stack-scraper https://github.com/jeresig/ukiyoe-scrapers
  • 17. Image Similarity
  • 18. https://github.com/jeresig/node-matchengine
  • 19. Image Similarity Search
  • 20. Idyll: Offline Image Cropping • https://github.com/jeresig/idyll • Crop images offline and on a mobile device. • Saves the selections back to a server. • Data is synced and saved using HTML 5 appcache. • https://github.com/jeresig/node-appcache-glob
  • 21. by David Chester
 at Shutterstock https://github.com/dchester/perl-image-crop-calibration-target
  • 22. http://www.ersatzlabs.com/
  • 23. Aiding Woodblock Print Studies with Image Analysis
  • 24. Correcting Print Data
  • 25. Japanese Names • Utagawa Hiroshige • Ando Hiroshige • Andō Hiroshige • Hiroshige • 歌川広重 • 広重
  • 26. 安土 安堂 安島 安東 安籐 安藤 安道 安達 阿藤 Andō
  • 27. 安藤 andō antō anzō yasuzuka A many-to-many mapping!
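
The many-to-many relationship between kanji and readings can be made concrete with two lookup tables. This is a minimal sketch using only the pairs shown on the two slides above; `pairs` and `buildIndexes` are hypothetical names introduced here, not part of any of the talk's modules.

```javascript
// (kanji, reading) pairs taken from the slides: nine kanji spellings read
// "andō", plus the other readings listed for 安藤.
const pairs = [
  ["安土", "andō"], ["安堂", "andō"], ["安島", "andō"],
  ["安東", "andō"], ["安籐", "andō"], ["安藤", "andō"],
  ["安道", "andō"], ["安達", "andō"], ["阿藤", "andō"],
  ["安藤", "antō"], ["安藤", "anzō"], ["安藤", "yasuzuka"]
];

// Build one index per direction so either side can be looked up.
function buildIndexes(entries) {
  const byKanji = new Map();
  const byReading = new Map();
  for (const [kanji, reading] of entries) {
    if (!byKanji.has(kanji)) byKanji.set(kanji, new Set());
    if (!byReading.has(reading)) byReading.set(reading, new Set());
    byKanji.get(kanji).add(reading);
    byReading.get(reading).add(kanji);
  }
  return { byKanji, byReading };
}

const { byKanji, byReading } = buildIndexes(pairs);
console.log([...byKanji.get("安藤")]);   // every reading listed for 安藤
console.log(byReading.get("andō").size); // number of kanji spellings read "andō"
```

Because neither direction is unique, a romanized name alone cannot determine the kanji, and the kanji alone cannot determine the reading.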
  • 28. Sharaku Toshusai 東洲斎写楽
  • 29. Sharaku Toshusai 東洲斎写楽 Is this the family name? Where are the stress marks? How do you “split” this name? Which name parts
 correlate?
  • 30. Tools (all are Node modules!) • https://github.com/lovell/hepburn • https://github.com/jeresig/node-enamdict • https://github.com/jeresig/node-ndlna • https://github.com/jeresig/node-romaji-name [Diagram labels: ndlna, hepburn, enamdict, romaji-name]
  • 31. Hepburn • https://github.com/lovell/hepburn • Takes in the English form of a Japanese word. • Returns it written in Hiragana or Katakana (phonetic Japanese alphabets). • Example: Utagawa Hiroshige → うたがわひろしげ [Diagram labels: ndlna, hepburn, enamdict, romaji-name]
  • 32. Enamdict • https://github.com/jeresig/node-enamdict • Downloads and queries the ENAMDICT database • (A mapping of Japanese proper names to Hiragana and English.) • Used to correct typos and figure out surname/given name. [Diagram labels: ndlna, hepburn, enamdict, romaji-name]
  • 33. NDLNA • https://github.com/jeresig/node-ndlna • Queries the NDLNA database • Finds the correct Kanji for an English name. • Or the correct English for a Kanji name. [Diagram labels: ndlna, hepburn, enamdict, romaji-name]
  • 34. [Diagram labels: ndlna, hepburn, enamdict, romaji-name]
  • 35. { "original" : "Sharaku Toshusai (東洲斎写楽 )", "locale" : "ja", "kanji" : "東洲斎写楽", "given" : "Sharaku", "given_kana" : "しゃらく", "surname" : "Tōshūsai", "surname_kana" : "とおしゅうさい", "surname_kanji" : "東洲斎", "given_kanji" : "写楽", "name" : "Tōshūsai Sharaku", "ascii" : "Tooshuusai Sharaku", "plain" : "Toshusai Sharaku", "kana" : "とおしゅうさいしゃらく" }
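
The `name`, `ascii`, and `plain` fields in this record differ only in how long vowels are written. The sketch below reproduces those two derived forms for this one example. It is an illustration, not the romaji-name implementation: `toAscii` and `toPlain` are hypothetical names, and doubling every macron vowel is an assumption that happens to match the slide.

```javascript
// Macron vowels and the doubled-vowel spelling assumed for the "ascii" form.
const LONG_VOWELS = {
  "ā": "aa", "ī": "ii", "ū": "uu", "ē": "ee", "ō": "oo",
  "Ā": "Aa", "Ī": "Ii", "Ū": "Uu", "Ē": "Ee", "Ō": "Oo"
};

// "ascii" form: spell each long vowel out as a doubled vowel.
function toAscii(name) {
  return name
    .normalize("NFC") // make sure macron vowels are single code points
    .replace(/[āīūēōĀĪŪĒŌ]/g, (ch) => LONG_VOWELS[ch]);
}

// "plain" form: drop the macrons entirely by decomposing each character
// and removing the combining marks.
function toPlain(name) {
  return name.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

console.log(toAscii("Tōshūsai Sharaku"));
console.log(toPlain("Tōshūsai Sharaku"));
```

Keeping all three spellings lets a search match whichever form a museum or auction listing happens to use.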
  • 36. Dates • https://github.com/jeresig/node-yearrange
       var yr = require("yearrange");

       yr.parse("1877")
       // {"start": 1877, "end": 1877}

       yr.parse("1847-48")
       // {"start": 1847, "end": 1848}

       yr.parse("ca. 1810-20s")
       // {"start": 1810, "end": 1829, "circa": true}

       yr.parse("18th–19th century")
       // {"start": 1700, "end": 1899}

       yr.parse("Meiji era")
       // {"start": 1868, "end": 1912}
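
To show the idea behind the first two cases on this slide, here is a deliberately small sketch that handles only a single four-digit year or a range such as "1847-48". It is not the yearrange module, which covers far more formats; `parseYears` is a hypothetical name introduced here.

```javascript
// Parse "1877", "1847-48", or "1847-1848" into {start, end}.
// Returns null for anything else (eras, centuries, "ca." prefixes, ...).
function parseYears(text) {
  const match = /^(\d{4})(?:\s*[-–]\s*(\d{2}|\d{4}))?$/.exec(text.trim());
  if (!match) return null;

  const start = Number(match[1]);
  let end = start;
  if (match[2]) {
    // A two-digit end year borrows its century from the start year.
    end = match[2].length === 2
      ? Number(match[1].slice(0, 2) + match[2])
      : Number(match[2]);
  }
  return { start, end };
}

console.log(parseYears("1877"));
console.log(parseYears("1847-48"));
console.log(parseYears("Meiji era")); // not handled by this sketch
```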
  • 37. Artist Rectification
  • 38. Miyagawa Shuntei • Printed in 1897 • Sold for: $550 • Prints sell for $100-$400 individually • True Estimate: $2100 - $8400* (* You just have to find someone willing to buy them!)
  • 39. • http://ejohn.org/research/ • http://ukiyo-e.org/ • https://github.com/jeresig