memories of tumblr gear & Tumblrowl
tumblr developer's meetup jp 2011

Presentation Transcript

  • memories of tumblr gear & Tumblrowl, @honishi, tumblr developer’s meetup jp 2011, 17 dec 2011
  • honishi?
  • @honishi (hiroyuki onishi), honishi.tumblr.com since mid 2008
      • not a truly seasoned programmer
      • writing code just for fun
      • consultant for FAST Search Server at Microsoft
  • “honishi” is my secondary identity on the web, my primary ones are:
  • @notomamiko (fuckyeahnotomamiko.tumblr.com), @kugimiyarie (fuckyeahkugimiyarie.tumblr.com)
  • summary
      ✴ living in yukari kingdom
      ✴ a dedicated notomamist
      ✴ a patient with kugimiya disease (type: n)
  • today’s topic
      ✓ tumblr gear for iPhone
      ✓ Tumblrowl for Mac OS X
  • tumblr gear
  • tumblr gear? ancient tumblr client for iPhone
  • core concept
      • prerequisite
          • no api, except login
          • poor performance of iPhone 3g
      • scraping
          • text-based scraping
          • not xml-based:
              • dom ... fat? slow?
              • sax ... complex?
      • as fast as possible
          • minimize processing
          • minimize network traffic
  • main user interface
      • lots of webviews: 13 UIWebView instances
          • x11 for browsing (1 unhidden, 10 hidden)
          • x2 for reblogging (always hidden)
  • main user interface (cont’d)
      • starting to render all webviews at one time is slow
      • so webviews are gradually warmed up (debug view)
  • days of fixing app
      • initial release ... jun 2009
          • released after 4 rejections by Apple
      • days of fixing app ... after release
          • every little modification to the dashboard affects the app’s scraping logic
  • an opinion from the opinion leader
      • scraping should be executed on server side
          • when the structure of the html changes, the scraping logic needs to be modified
          • if the logic is implemented within the client application, it usually takes a long time to release the fixed app: submit the build, wait for Apple’s review, be reviewed by Apple...
          • it’s also better for cross-platform application provisioning
  • maybe it’s true, but...
  • weakness of server side scraping
      • scalability?
          • all connections & accesses go through a single point
          • need to invest in computing resources there
      • possibility of ban?
          • service provider can easily identify massive transactions from one location
          • once banned, it’s over
      • security?
          • no oauth provided at that time
          • so need to have & use user’s password at server side
  • still, client side scraping...
  • restructuring for fault-tolerance
      • splitting the scraping process into 2 blocks:
          • logic for scraping
          • metadata for the above
      • store them in different places:
          • logic inside of the app
          • metadata outside of the app, on s3
      • metadata is read by the app at startup
  • logic & metadata

      logic (process), inside app      metadata, outside app
      1. read dashboard                base url?
      2. pre-process                   target? how?
      3. split posts                   boundaries for: html header? footer? post?
      4. find next link                base url? elements for the url?
         (then back to 1.)

  • scraping metadata
      • simple property list
      • almost all rules are written as simple strings or regular expressions
      • located on amazon s3
          • http://s3.amazonaws.com/tumblrgear/parsemeta.plist
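The startup step described above (read a property list of scraping rules, then let the in-app logic consult it) could be sketched roughly as follows. This is a minimal illustration in Python, not the app's Objective-C code. The layout of the real parsemeta.plist is not shown in the slides, so the dictionary-with-arrays structure and the `load_metadata` helper are assumptions; only the key names and rule strings come from the slides.

```python
import plistlib

# Hypothetical sample standing in for the downloaded metadata file.
# Key names are taken from the slides; the nesting is an assumption.
SAMPLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>baseUrl</key>
    <string>http://www.tumblr.com</string>
    <key>pageHeaderSplitter</key>
    <string>&lt;!-- START POSTS --&gt;</string>
    <key>pageReplace</key>
    <array>
        <string>&lt;(script .*?&lt;/script)&gt; ;; &lt;!--$1--&gt;</string>
    </array>
</dict>
</plist>
"""


def load_metadata(raw: bytes) -> dict:
    """Parse scraping metadata from property-list bytes (hypothetical helper)."""
    return plistlib.loads(raw)


metadata = load_metadata(SAMPLE_PLIST)
```

In a real client the bytes would come from an HTTP download at startup, with a bundled copy as a fallback if the download fails; neither is shown here.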
  • scraping metadata (cont’d)
  • reading dashboard
  • overview: dashboard html → preprocessed html → html header, post #1, post #2, ... post #9, post #10, (next link), html footer
  • #1. read dashboard
      • login, v1 api
      • read regular html from the url defined in metadata

      #  key      value
      1  baseUrl  http://www.tumblr.com

  • #2. pre-process
      • regular expression (replace)
      • reduce the weight of html
          • disable unwanted images
          • disable javascripts
          • change img src to *_250.[jpg|gif] etc...

      #  key          value
      1  pageReplace  src="http://assets.tumblr.com/images/.*" ;; removed
      2  pageReplace  <(script .*?</script)> ;; <!--$1-->
      3  pageReplace  src=(".*?)_(400|500).(jpg|png|gif)(") ;; ORIGINAL_SRC=$1_$2.$3$4 src=$1_250.$3$4
      :  :
      ( A ;; B ... replace A with B )

  • #2. pre-process (cont’d)
      • override css for iPhone
      • re-replace img src to *_500.[jpg|gif], if high reso required

      #  key              value
      1  pageReplace      (</head>) ;; <style type="text/css"> <!-- body { margin: 0px; padding: 43px 0px 43px 0px; (snip) --> </style> <meta name="viewport" content="width=320"> $1
      2  highResoReplace  ORIGINAL_SRC="(.*)" src=".*?" ;; HIGH_RESO src="$1"
      3  :                :
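To make the "A ;; B" rule format concrete, here is a small Python sketch that applies such rules in order. The rule strings are copied from the slides; `parse_rule`, `apply_rules`, and the sample page are hypothetical. The app itself used a different regex engine, so translating `$1` into Python's `\g<1>` is an adaptation, not the original behaviour.

```python
import re
from typing import Iterable, Tuple


def parse_rule(rule: str) -> Tuple[str, str]:
    """Split an 'A ;; B' rule into (pattern, replacement)."""
    pattern, replacement = rule.split(" ;; ", 1)
    # Convert $1-style backreferences into Python's \g<1> form.
    replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return pattern, replacement


def apply_rules(html: str, rules: Iterable[str]) -> str:
    """Apply each replace rule to the html, in the order given."""
    for rule in rules:
        pattern, replacement = parse_rule(rule)
        html = re.sub(pattern, replacement, html)
    return html


# Rules taken from the slides: comment out scripts, downgrade images to _250.
PAGE_REPLACE = [
    '<(script .*?</script)> ;; <!--$1-->',
    'src=(".*?)_(400|500).(jpg|png|gif)(") ;; ORIGINAL_SRC=$1_$2.$3$4 src=$1_250.$3$4',
]
# Rule taken from the slides: restore the original image when high reso is wanted.
HIGH_RESO_REPLACE = ['ORIGINAL_SRC="(.*)" src=".*?" ;; HIGH_RESO src="$1"']

# Made-up single-line page fragment for illustration.
page = '<img src="http://example.com/a_500.jpg"><script type="text/javascript">alert(1)</script>'
light = apply_rules(page, PAGE_REPLACE)
restored = apply_rules(light, HIGH_RESO_REPLACE)
```

Because the original src is kept in an `ORIGINAL_SRC` attribute, the high-resolution rule can undo the downgrade later without fetching the page again. The sketch assumes replacement strings contain no backslashes and that each tag sits on a single line.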
  • #3. split post
      • detect boundaries in the html
      • then split them into header, footer and posts

      #  key                 value
      1  pageHeaderSplitter  <!-- START POSTS -->
      2  pageFooterSplitter  <!-- END POSTS -->
      3  postBeginSplitter   <li id="post_
      4  postEndSplitter     <!-- END POSTS -->
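A plain string split is enough to illustrate this step. The Python sketch below uses the splitter values from the slide. The `split_page` function and the sample page are hypothetical, and the choice to drop the header and footer marker comments is an assumption. Since `postEndSplitter` has the same value as `pageFooterSplitter` here, the sketch does not treat it separately.

```python
from typing import List, Tuple

# Splitter values as listed in the metadata table.
META = {
    "pageHeaderSplitter": "<!-- START POSTS -->",
    "pageFooterSplitter": "<!-- END POSTS -->",
    "postBeginSplitter": '<li id="post_',
}


def split_page(html: str, meta: dict) -> Tuple[str, List[str], str]:
    """Split preprocessed html into (header, posts, footer) by plain text search."""
    header, _, rest = html.partition(meta["pageHeaderSplitter"])
    body, _, footer = rest.partition(meta["pageFooterSplitter"])
    begin = meta["postBeginSplitter"]
    chunks = body.split(begin)
    # chunks[0] is whatever precedes the first post; each later chunk is one post.
    posts = [begin + chunk for chunk in chunks[1:]]
    return header, posts, footer


# Made-up page for illustration.
sample = ('HEAD<!-- START POSTS --><li id="post_1">a</li>'
          '<li id="post_2">b</li><!-- END POSTS -->FOOT')
header, posts, footer = split_page(sample, META)
```

No DOM or SAX parser is involved, which matches the deck's preference for text-based scraping on slow hardware.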
  • #4. find next link
      • find next link elements
      • assemble the next link using elements

      #  key               value
      1  nextLinkUrl       http://www.tumblr.com{1}
      2  nextLinkElements  <a id="next_page_link" href="(.*)">

      • then read next page
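The "find next link, then back to step 1" loop might look like the following Python sketch. The two metadata values come from the slide. `find_next_url`, `read_pages`, the `fetch` callable, the `max_pages` cap, and the fake pages are all hypothetical, introduced only so the example runs without a network.

```python
import re
from typing import Callable, List, Optional

# Metadata values as listed in the table.
META = {
    "nextLinkUrl": "http://www.tumblr.com{1}",
    "nextLinkElements": '<a id="next_page_link" href="(.*)">',
}


def find_next_url(html: str, meta: dict) -> Optional[str]:
    """Extract the next-link element and substitute it into the url template."""
    match = re.search(meta["nextLinkElements"], html)
    if match is None:
        return None
    return meta["nextLinkUrl"].replace("{1}", match.group(1))


def read_pages(fetch: Callable[[str], str], meta: dict,
               start_url: str, max_pages: int = 3) -> List[str]:
    """Follow next links from start_url, reading at most max_pages pages."""
    pages: List[str] = []
    url: Optional[str] = start_url
    while url is not None and len(pages) < max_pages:
        html = fetch(url)
        pages.append(html)
        url = find_next_url(html, meta)
    return pages


# Fake pages standing in for real HTTP responses.
FAKE_SITE = {
    "http://www.tumblr.com/dashboard":
        '<a id="next_page_link" href="/dashboard/2/100">Next</a>',
    "http://www.tumblr.com/dashboard/2/100": "<p>last page</p>",
}
pages = read_pages(FAKE_SITE.__getitem__, META, "http://www.tumblr.com/dashboard")
```

Passing `fetch` in as a parameter keeps the paging logic testable without touching the network.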
  • stored posts
      • split html is stored separately: html header, posts array (post #1, post #2, ... post #9, post #10), html footer
      • concatenated on demand: header + post #n + footer
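One way to picture "stored separately, concatenated on demand" is a small container like the one below. The class and method names are hypothetical, and the assumption that a single post is displayed as header + post + footer is inferred from the slide's diagram.

```python
from typing import List


class StoredDashboard:
    """Keeps header, footer and posts apart; builds full html only when asked."""

    def __init__(self, header: str, footer: str) -> None:
        self.header = header
        self.footer = footer
        self.posts: List[str] = []

    def add_posts(self, posts: List[str]) -> None:
        """Append posts split out of a newly read page."""
        self.posts.extend(posts)

    def render_post(self, index: int) -> str:
        """Build a standalone html document for one post."""
        return self.header + self.posts[index] + self.footer

    def render_all(self) -> str:
        """Build one document containing every stored post."""
        return self.header + "".join(self.posts) + self.footer


store = StoredDashboard("<html><body>", "</body></html>")
store.add_posts(["<li>1</li>", "<li>2</li>"])
```

Storing the header and footer once avoids duplicating them for every post held in memory.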
  • reblog & like
  • reblog
      • detect reblog url of the post

      #  key             value
      1  reblogUrl       http://www.tumblr.com{1}
      2  reblogElements  <a href="(/reblog/.*?)">

      • get the raw html from the url
  • reblog (cont’d)
      • preprocess the html (disable img src etc...)

      #  key            value
      1  reblogReplace  <(script .*?</script)> ;; <!--$1-->
      2  reblogReplace  <link ;; <disabled_link
      3  reblogReplace  <img ;; <disabled_img

      • send the html to webview for reblogging
  • reblog (cont’d)
      • do the javascript things
          • put the comment into text area, if provided
          • push the submit button

      #  key                 value
      1  reblogAddCommentJS  (javascript here ... snip)
      2  reblogSubmitJS      (javascript here ... snip)

      • wait for redirect back to dashboard

      #  key                value
      1  reblogRedirectUrl  http://www.tumblr.com/dashboard

      • done
  • like
      • detect like url of the post

      #  key           value
      1  likeUrl       http://www.tumblr.com/like/{2}?form_key={3}&id={1}
      2  likeElements  type="hidden" name="id" value="(.*?)"
      3  likeElements  action="/like/(.*?)"
      4  likeElements  name="form_key"\s+value="(.*?)"

      • do the simple post
      • wait for response code 200
      • done
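The like url is assembled from several captured elements, where `{n}` in the template refers to the n-th `likeElements` pattern. A rough Python sketch follows. `build_url` and the sample post html are hypothetical. The third pattern is written with `\s+`, on the assumption that the `s+` in the transcript lost its backslash.

```python
import re
from typing import List, Optional

# Template and element patterns from the metadata table.
LIKE_URL = "http://www.tumblr.com/like/{2}?form_key={3}&id={1}"
LIKE_ELEMENTS = [
    r'type="hidden" name="id" value="(.*?)"',
    r'action="/like/(.*?)"',
    r'name="form_key"\s+value="(.*?)"',
]


def build_url(template: str, element_patterns: List[str], html: str) -> Optional[str]:
    """Fill {1}, {2}, ... in the template with the first capture of each pattern."""
    values = []
    for pattern in element_patterns:
        match = re.search(pattern, html)
        if match is None:
            return None  # the page structure changed; metadata needs an update
        values.append(match.group(1))
    return re.sub(r"\{(\d+)\}", lambda m: values[int(m.group(1)) - 1], template)


# Made-up post fragment for illustration.
post_html = ('<form action="/like/abc123">'
             '<input type="hidden" name="id" value="42">'
             '<input type="hidden" name="form_key"  value="KEY">'
             '</form>')
like_url = build_url(LIKE_URL, LIKE_ELEMENTS, post_html)
```

The resulting url would then be sent as a simple POST and the response checked for code 200. That network step is left out of the sketch.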
  • sales & trends
      • average 1,800 downloads/week?
  • sales & trends (cont’d)
      • US market is now 3 times larger than the Japanese one?
  • enhancement plan
      • n/a
  • recommended migration path
      • for iOS users ... Tumbletail
      • for Android users ... Tumblife
  • conclusion
      • i currently do not use this app myself, because:
          • softbank’s very poor signal everywhere
          • reducing number of following accounts, so it’s enough for me to check the dashboard using pc in the bed
  • Tumblrowl
  • Tumblrowl? Growl-like dashboard application for Mac OS X
  • motivation to build
      • recently...
          • i don’t wanna do anything...
          • just wanna watch niconama... hataratti aka
  • motivation to build (cont’d)
      • i don’t wanna do anything (reprise)
  • tumblr? tired of pressing j & k, but missing...
  • my requirements
      • no user input required
      • effective utilization of screen (screenshot labels: yorufukurou, chrome, here!)
  • it’s tumblr + Growl
  • architecture
      • Growl, forked
      • network
          • OAuthConsumer for v2 api
          • ASIHTTPRequest
      • misc
          • RegexKitLite
          • JSON Framework
          • Sparkle
  • overview
      • dashboard api → post queue (mutable array): polling every 10 sec, w/ since_id
      • post queue → display queue (mutable array): polling every 2 sec, 1 post / dequeue
      • display queue → display (nswindow): open post? reblog? like?
      • suspend?
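The two-queue flow in the overview can be sketched in Python as below. The timers (10 sec and 2 sec) are left out so the example stays deterministic; in the app each method would be driven by its own timer. The class, the method names, the post dictionaries, and the fake fetch function are hypothetical, and the way `since_id` is advanced is an assumption about how the polling avoids duplicates.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class DashboardPipeline:
    """Moves posts from the dashboard api to a post queue, then to a display queue."""

    def __init__(self, fetch_since: Callable[[Optional[int]], List[dict]]) -> None:
        self.fetch_since = fetch_since
        self.since_id: Optional[int] = None
        self.post_queue: Deque[dict] = deque()
        self.display_queue: Deque[dict] = deque()

    def poll_dashboard(self) -> None:
        """Slow poll: fetch posts newer than since_id and enqueue them, oldest first."""
        for post in sorted(self.fetch_since(self.since_id), key=lambda p: p["id"]):
            self.post_queue.append(post)
            self.since_id = post["id"]

    def dequeue_one(self) -> Optional[dict]:
        """Fast poll: move at most one post to the display queue."""
        if not self.post_queue:
            return None
        post = self.post_queue.popleft()
        self.display_queue.append(post)
        return post


# Fake api standing in for the real dashboard call.
FAKE_POSTS = [{"id": 1}, {"id": 2}, {"id": 3}]


def fake_fetch(since_id: Optional[int]) -> List[dict]:
    return [p for p in FAKE_POSTS if since_id is None or p["id"] > since_id]


pipeline = DashboardPipeline(fake_fetch)
pipeline.poll_dashboard()
first = pipeline.dequeue_one()
```

Releasing one post per fast tick spreads a burst of new posts over time instead of popping every notification window at once.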
  • Growl, forked
      • extracting the displaying module from Growl
      • extending the display window
          • out of box window: x, icon, title, description
          • extended window: x, r, o, l, avatar, blog name, image area, upper text area, lower text area, title, source
  • miscellaneous
      • oauth & webview
          • all cookies are shared in safari & all instances of webview (default behavior)
          • so the login sequence to get authorized doesn’t work as expected
          • need to override delegate in webview to handle cookie container manually
          • xauth...?
  • miscellaneous (cont’d)
      • avoiding reblog display limit storm
          • implement free space seeking logic
          • by hooking
  • conclusion
      • i myself currently do not use this app, because i find it distracting... seriously...
  • appendix: icons
  • icons for tumblr gear
      • designed by charactoy
      • 3,000- for each
      • http://www.charactoy.com/
  • icons for Tumblrowl
      • designed by diwakar ganesh (designcrowd)
      • $365.65-
      • http://www.designcrowd.com/
  • thank you. @honishi, onishi.hiroyuki@gmail.com. special thanks: inu (nihon henshu ongaku kyokai), nonSectRadicals, mamiko noto, shingo yamanaka, jeffrey kuo, midori yokoyama, naoto ohara, masami iwasawa