Akisato Kimura ( @_akisato )

ICWSM2012 BRIEF REVIEW
Conference venue

 Trinity College Dublin, Ireland

 Famous landmark: Old Library. Actual venue: biomedical building (not on the main campus)
Reception venue

 Guinness Storehouse, west Dublin
   300-degree view of west Dublin from the 7th floor
What’s ICWSM?

 International AAAI Conference on Weblogs
  and Social Media
   Annual conference; this year was the 6th.
   In practice a conference on Twitter & other
    social media, with few papers on weblogs.
   Many participants from companies and labs
    working on SNS, mass media, ads, and marketing.
   Sociologists form a major cluster, which makes it
    an unusual conference among those hosted by AAAI.
Representative panel discussions

 I Want to (Net)work With You, But I Don't
  Know What/Where/Who You Are
   Panelists from Cisco, IBM, LinkedIn & Datahug
 News Generation and Consumption Through
  Social Media
   Panelists from Storyful, Newswhip, Irish Times,
    C-SPAN & Guardian

   Machine learning accounts for a small portion.
Basic statistics
 Single track only
 Quality is not as high as the acceptance rate would suggest
 Our presentation (I saw no other presentations from Japan)
 Attendees: over 330 in advance registration (about 3x the number of papers),
  half of them from the USA, only 5 from Japan.
General overview

 Computer science << sociology
   Data collection, analysis & discussion
    > results > performance > technical novelty
 Most oral presentations were of high quality
   Especially in terms of analysis and discussion.
   Theoretical soundness and novelty matter less.
 Two giants: Twitter & Facebook
   But we should not rely only on the giants.
   Future directions include cross-platform analysis.
Interesting events & efforts

 Town hall meeting
   Discussing future directions of the conference
    with all the participants, not only PC members.
 Industrial panel
   With strong panelists from various industries
 Dataset sharing service
   Hosts new datasets used in the papers.
   All datasets are released as openly available
    community resources. http://icwsm.cs.mcgill.ca
Resources

 All papers presented at the main
  conference are freely accessible at
   http://www.aaai.org/Library/ICWSM/icwsm12contents.php

 All the workshop papers are also free:
   http://www.aaai.org/Library/Workshops/workshops-library.php

 I collected most of the tweets about ICWSM-12;
  they are freely accessible at
   http://togetter.com/id/_akisato
Our presentation

 Creating Stories : Social Curation of Twitter
  Messages
   Curated lists = supervised corpora for analyzing
    microblog messages
    http://www.brl.ntt.co.jp/people/akisato/socialweb1.html
Interesting presentations 1

 The Livehoods Project: Utilizing Social Media
  to Understand the Dynamics of a City
   Won the Best Paper Award
   From location information available in the Twitter
    timeline (tweets with geotags, Foursquare check-ins, etc.),
    changes in the character of quite small local areas can be captured.
   URL: http://livehoods.org
   Twitter ID: @livehoods
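Livehoods project

 [Key idea of the analysis]
   Reveal the character of places from
    people's daily behavior patterns.
 [Data collection]
   11M from another study's data,
    7M from the Twitter timeline.
   The paper itself uses
    40K check-ins (4K people, 5K places).
 [Analysis method]
   Spectral clustering with locations as nodes.
   Building the features of each node is the key.
   http://livehoods.org/research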
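Clustering method

 [Features]
   Compute a bag of check-ins at each location:
    a vector with as many dimensions as there are users.
 [Feature analysis]
   The similarity between two bags of check-ins is
    taken as the similarity between the two locations.
   = If the same people visit two locations about
    equally often, the two locations belong together.
 [Clustering]
   Spectral clustering.
   Two locations that are physically far apart
    are treated as unrelated.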
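
The affinity construction described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy check-ins, the use of cosine similarity, and the `max_dist` cutoff are all assumptions made for the example. The resulting matrix is what a spectral clustering routine would take as input.

```python
import math
from collections import Counter

def bag_of_checkins(checkins):
    """Map each venue to a Counter {user: number of check-ins there}."""
    bags = {}
    for user, venue in checkins:
        bags.setdefault(venue, Counter())[user] += 1
    return bags

def cosine(a, b):
    """Cosine similarity between two sparse user-count vectors."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def affinity(bags, coords, max_dist):
    """Pairwise venue affinity; venues farther apart than max_dist get 0.

    The hard distance cutoff is an assumption standing in for
    "physically distant locations are treated as unrelated".
    """
    venues = sorted(bags)
    A = {}
    for i in venues:
        for j in venues:
            if i == j:
                continue
            near = math.dist(coords[i], coords[j]) <= max_dist
            A[i, j] = cosine(bags[i], bags[j]) if near else 0.0
    return A

# Hypothetical toy data: (user, venue) check-ins and venue coordinates.
checkins = [
    ("u1", "cafe"), ("u2", "cafe"),
    ("u1", "bar"), ("u2", "bar"),
    ("u3", "gym"),
    ("u1", "mall"), ("u2", "mall"),
]
coords = {"cafe": (0, 0), "bar": (0, 1), "gym": (1, 0), "mall": (10, 10)}

A = affinity(bag_of_checkins(checkins), coords, max_dist=2.0)
```

In this toy example the cafe and the bar share the same visitors and are close, so their affinity is high; the gym is close but has different visitors, and the mall has the same visitors but is too far away, so both get zero affinity to the cafe.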
And the results...

 It is probably quicker to look at the web site.
  http://livehoods.org/maps
Interesting presentations 2

 Modeling Spread of Disease from Social
  Interactions
   Best paper award candidate
   Predict how an infectious disease
    spreads using only Twitter
    (+ location information).
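What could not be done before, and why?

 Hospitals were the only source of information.
  → Only global aggregates were available.
    Google Flu Trends: http://www.google.org/flutrends/
    CDC Statistics: http://www.cdc.gov/datastatistics/
    Infectious Disease Surveillance Center, NIID (Japan): http://idsc.nih.go.jp/disease/influenza/

 But the information we really need is:
  when, where, and who is infected?
  After all, nobody wants to catch it...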
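Who is the source of infection?

 Infection through Twitter follow relationships alone...
  of course not! (That only happens in movies...)
  What matters is being in the same place at the same time.
 Build the analysis around the co-occurrence of location and time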
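
The location-and-time co-occurrence idea can be sketched as below. This is only an illustration under stated assumptions: the grid size `cell_deg`, the time slot `slot_sec`, the function name, and the sample tweets are hypothetical and not taken from the paper. Bucketing by grid cell is a simplification; two users standing on opposite sides of a cell boundary would be missed.

```python
import math
from collections import defaultdict
from itertools import combinations

def colocated_pairs(tweets, cell_deg=0.001, slot_sec=3600):
    """Return user pairs that tweeted from the same grid cell in the same time slot.

    tweets: iterable of (user, unix_timestamp, latitude, longitude).
    cell_deg and slot_sec are illustrative thresholds (assumptions).
    """
    buckets = defaultdict(set)
    for user, ts, lat, lon in tweets:
        key = (math.floor(lat / cell_deg),
               math.floor(lon / cell_deg),
               ts // slot_sec)
        buckets[key].add(user)

    pairs = set()
    for users in buckets.values():
        # Every pair of distinct users in one bucket counts as co-located.
        pairs.update(combinations(sorted(users), 2))
    return pairs

# Hypothetical geotagged tweets.
tweets = [
    ("alice", 100, 53.3438, -6.2546),
    ("bob", 1800, 53.3439, -6.2548),    # same cell, same hour as alice
    ("carol", 200, 53.3500, -6.2546),   # same hour, different cell
    ("dave", 7300, 53.3438, -6.2546),   # same cell, later time slot
]

pairs = colocated_pairs(tweets)
```

Only alice and bob end up as a co-located pair here; following each other on Twitter plays no role in the computation.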
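How do we detect that someone is infected?

 Classify with a semi-supervised SVM cascade using
  [uni, bi, tri]-gram features (plus heavy post-processing).
 [Diagram: small (partially) labeled corpus + large unlabeled corpus → self-training]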
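
A self-training loop over n-gram features can be sketched as follows. To keep the example self-contained, a toy n-gram count-difference scorer stands in for the SVM, and the cascade and post-processing are not modeled at all. The function names, the `threshold`, and the sample texts are assumptions made for illustration.

```python
from collections import Counter

def ngrams(text, n_max=3):
    """Uni-, bi- and tri-grams of a whitespace-tokenized, lowercased text."""
    toks = text.lower().split()
    return [" ".join(toks[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(toks) - n + 1)]

def train(labeled):
    """Toy linear model (stand-in for the SVM): n-gram weight = sum of labels."""
    weights = Counter()
    for text, y in labeled:          # y is +1 (sick) or -1 (not sick)
        for g in set(ngrams(text)):
            weights[g] += y
    return weights

def score(weights, text):
    """Signed score; the absolute value serves as a crude confidence."""
    return sum(weights.get(g, 0) for g in set(ngrams(text)))

def self_train(labeled, unlabeled, threshold=2, rounds=5):
    """Repeatedly move confidently scored unlabeled texts into the training set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        weights = train(labeled)
        scored = [(t, score(weights, t)) for t in pool]
        confident = [(t, 1 if s > 0 else -1)
                     for t, s in scored if abs(s) >= threshold]
        if not confident:
            break
        labeled += confident
        moved = {t for t, _ in confident}
        pool = [t for t in pool if t not in moved]
    return train(labeled)

# Hypothetical small labeled corpus and larger unlabeled corpus.
seed = [
    ("i have the flu", 1),
    ("feeling sick with fever", 1),
    ("great day at the park", -1),
    ("love this sunny day", -1),
]
unlabeled = ["sick with the flu again", "sunny day at the beach", "fever and headache"]

model = self_train(seed, unlabeled)
```

After self-training, n-grams that never appeared in the labeled seed (for example "again") carry weight because a confidently classified unlabeled message contributed them, which is the effect self-training is meant to achieve.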
Results

 Again, it is probably quicker to look at the web site.
  http://health.scenedipity.com/pollution
Interesting presentations: other highlights 1

 Crossing Media Streams with Sentiment:
  Domain Adaptation in Blogs, Reviews and
  Twitter
   Sentiment analysis using Twitter alone is infeasible,
    so reviews and blogs are used as training data.
 Exploring Social-Historical Ties on Location-Based
  Social Networks
   A Foursquare study. Uses both topics and locations.
   Modeling with a hierarchical Pitman-Yor process
Interesting presentations: other highlights 2

 The Emergence of Conventions in Online
  Social Networks
   Won the Best Paper Award
   The "grammar"-like conventions on Twitter
    basically emerged bottom-up.
   The paper verifies this comprehensively.
The end
