Data Processing @ bit.ly - Posscon 2011
Transcript of "Data Processing @ bit.ly - Posscon 2011"

    1. Data processing @ bit.ly
    2. http://www.jsps.go.jp/english/e-jafos/2010_01.html
    3. http://www.jsps.go.jp/english/e-jafos/2010_01.html
    4. http://www.jsps.go.jp/english/e-jafos/2010_01.html http://bit.ly/gFNuXa
    5. DATA
    6. [Chart: iPhone vs iPad Usage, NYC, 2011-03-16; x-axis: hour of day, 12:00 AM to 11:00 PM]
    7. The Big Data Problem
    8. sortdb
       $ sort data.csv > sorted_data.csv
       $ sortdb -F , -f sorted_data.csv -p 8080
       $ curl http://127.0.0.1:8080/get?key=...
       http://bit.ly/simplehttp
    9. simplequeue
       $ curl -d "data=..." http://simplequeue/put
       $ data=`curl http://simplequeue/get`
       http://bit.ly/simplehttp
    10. simplequeue http://bit.ly/simplehttp
    11. simplequeue http://bit.ly/simplehttp
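    12. class BackoffTimer(object):
            def __init__(self):
                self.interval = 0

            def failure(self):
                # As transcribed: 0 * 2 is 0, so starting from 0 the interval never grows.
                self.interval = min(self.interval * 2, 1)

            def success(self):
                # Quarter the interval, then subtract 1, never going below 0.
                self.interval = max(self.interval * .25, 1) - 1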
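    13. import time

        class QueueReader(object):
            def __init__(self):
                self.backoff_timer = BackoffTimer()

            def run(self):
                while True:
                    try:
                        # `queue` is not defined on the slide; presumably a simplequeue client.
                        data = queue.get()
                        if not data:
                            # Queue is empty: wait briefly before polling again.
                            time.sleep(.5)
                            continue
                        # `handle` is not shown on the slide; subclasses are expected to supply it.
                        self.handle(data)
                        self.backoff_timer.success()
                    except Exception:
                        self.backoff_timer.failure()
                    if self.backoff_timer.interval:
                        time.sleep(self.backoff_timer.interval)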
    14. pubsub http://bit.ly/simplehttp
    15. $ curl --silent http://pubsubserver/sub
        { "a": "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3 like Mac OS X; ja-jp) AppleWebKit/533.17.9 (KHTML, like Gecko) Mobile/8F190", "c": "JP", "nk": 1, "tz": "Asia/Tokyo", "gr": "40", "g": "hkIdmh", "h": "g0ABCf", "k": "4d8547e6-0022b-04438-d8ac8fa8", "l": "portalexcite", "al": "ja-jp", "hh": "bit.ly", "r": "direct", "u": "http://paltyyuria.exblog.jp/15685838/", "t": 1300587345, "hc": 1300587133, "cy": "Tokyo", "ll": [ 35.685001, 139.751404 ], "i": "613d872b3663f1f0cd54b48653ec788" }
        { "a": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.5.30729; OfficeLiveConnector.1.5; OfficeLivePatch.1.3; .NET4.0C; .NET CLR 3.0.30729)", "c": "US", "nk": 0, "tz": "America/Chicago", "gr": "TX", "g": "fTPW1w", "h": "fK3C3I", "k": "4d856351-001b9-07054-c6ac8fa8", "l": "espn", "al": "en-us", "hh": "es.pn", "r": "http://espn.go.com/mlb/", "u": "http://espn.go.com/blog/dallas/texas-rangers/post/_/id/4861596/surprise-six-saturday-camp-recap-4", "t": 1300587345, "hc": 1300584373, "cy": "Dallas", "ll": [ 32.809799, -96.799301 ], "i": "6686654b9493543ff18d36120d5caa9" }
    16. tools we like
    17. github.com/bitly/simplehttp
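
Slide 8 sorts a delimited file before serving key lookups from it. A minimal sketch of why the sort matters, assuming each row is `key,value` and the rows are ordered by key: a lookup can then be a binary search rather than a full scan. The `lookup` function and the sample rows are hypothetical, and this illustrates the general idea only; it does not reproduce sortdb's implementation or its HTTP responses.

```python
def lookup(sorted_rows, key, sep=","):
    """Return the value stored under key, or None if the key is absent.

    sorted_rows must be ordered by the first field of each row.
    """
    lo, hi = 0, len(sorted_rows)
    # Binary search for the first row whose key is >= the requested key.
    while lo < hi:
        mid = (lo + hi) // 2
        row_key = sorted_rows[mid].split(sep, 1)[0]
        if row_key < key:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sorted_rows):
        row_key, _, value = sorted_rows[lo].partition(sep)
        if row_key == key:
            return value
    return None


# Hypothetical sample data standing in for sorted_data.csv.
rows = sorted(["b,2", "a,1", "c,3"])
print(lookup(rows, "b"))
print(lookup(rows, "z"))
```

Each lookup touches about log2(n) rows, which is what makes a plain sorted file usable as a read-only key-value store.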

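As transcribed, the `failure()` method on slide 12 computes `min(0 * 2, 1)` from the initial interval of 0, so the interval stays at 0. A minimal sketch of one way to get a backoff that grows and is capped, assuming a starting floor and a ceiling; `min_interval`, `max_interval`, and their default values are assumptions for illustration, not figures from the talk.

```python
class BackoffTimer(object):
    """Double the wait after each failure, shrink it after each success."""

    def __init__(self, min_interval=0.1, max_interval=1.0):
        # Both bounds are assumed values, chosen only for illustration.
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.interval = 0.0

    def failure(self):
        if self.interval == 0:
            # Step off zero first; doubling zero would never grow.
            self.interval = self.min_interval
        else:
            self.interval = min(self.interval * 2, self.max_interval)

    def success(self):
        # Quarter the wait, as on the slide, and snap to zero below the floor.
        self.interval *= 0.25
        if self.interval < self.min_interval:
            self.interval = 0.0


timer = BackoffTimer()
for _ in range(5):
    timer.failure()
print(timer.interval)
```

Shrinking faster than growing lets a reader recover quickly once the downstream service is healthy again, while repeated failures still slow it down.
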
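
Slide 15 shows the pubsub stream as a sequence of JSON objects, and slide 6 charts iPhone against iPad usage by hour. A minimal sketch of how counts for such a chart might be derived from the stream, assuming one JSON object per line, that `a` holds the user agent, and that `t` is a Unix timestamp in seconds (both inferred from the sample records, not documented in the talk). `device_of` and `hourly_counts` are hypothetical helpers, the sample lines are shortened versions of the records on the slide, and hours are bucketed in UTC rather than local time.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def device_of(user_agent):
    """Classify a user agent as 'iPad', 'iPhone', or None."""
    # Check iPad first: some iPad user agents also contain "iPhone OS".
    if "iPad" in user_agent:
        return "iPad"
    if "iPhone" in user_agent:
        return "iPhone"
    return None


def hourly_counts(lines):
    """Count events per (UTC hour, device) from newline-delimited JSON."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        device = device_of(record.get("a", ""))
        if device is None:
            continue
        hour = datetime.fromtimestamp(record["t"], tz=timezone.utc).hour
        counts[(hour, device)] += 1
    return counts


# Shortened versions of the two records on slide 15.
sample = [
    '{"a": "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3 like Mac OS X; ja-jp)", "c": "JP", "t": 1300587345}',
    '{"a": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)", "c": "US", "t": 1300587345}',
]
print(hourly_counts(sample))
```

In practice the lines would come from a streaming HTTP response rather than a list, but the per-record logic would be the same.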