Data Centric
Art, Science, and Humanities
김홍기
Seoul National University
Biomedical Knowledge Engineering Laboratory
Why Data Centric?
∎ Big Data(?)
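∎ Openness, sharing, convergence, and collaboration are the spirit of the times
∎ Creative economy(?), collective intelligence, collective creativity
∎ Data moves easily and almost instantly
∎ Pervasive, real-time data everywhere
∎ Data can be processed with little effort
∎ The added value of data can be very high
∎ So many computational tools and methodologies
→ Analytics & Visualization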
Source: Pacific Northwest
How Science Works
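∎ “Philosophies of Funding”, Cell, 2009.
∎ Hypothesis-driven scientific research → data-centric, large-scale convergence research
∎ Fragmentation → Integration
∎ Emphasis on discovering new hypotheses from big data, and on data analysis and feedback based on collective intelligence
∎ Emphasis on the participation of members of society in defining real-world problems and in sharing research results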
Key Challenges of Data Centric Science
Source: Pacific Northwest
Big Data (Large Volumes)?
→ Fast Data Processing
→ Big Analytics
→ Deep insight
Open Data Space in Biology
Diversity of Data (Heterogeneity, Diversity)
Data Silos
Source: BioPax
Relating and Linking
Linked Open Data
Layers of Biological Research (Vertical Linking)
System Science
Interrelationships,
Dynamics
Reductionism
Time
Space
Context
Components
System Biology
Structural Biology
Complexity Analysis (Network Biology)
Source: Barabasi (Nature Reviews, 2004)
Assortative vs. Disassortative Networks
Social Network
Biological or Technological Network
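A minimal sketch of how this distinction is commonly quantified: degree assortativity is the Pearson correlation between the degrees found at the two ends of each edge. Positive values mean high-degree nodes tend to link to other high-degree nodes (assortative), and negative values mean hubs tend to link to low-degree nodes (disassortative). The function name and the two toy graphs below are illustrative assumptions, not data from the slides.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys have the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return float("nan")  # undefined when every edge end has the same degree
    return cov / var


# Hypothetical toy graphs (made up for illustration).
# A hub-and-spoke graph: the hub connects only to low-degree leaves.
star = [("hub", leaf) for leaf in "abcd"]
# A tightly knit group of four plus a separate pair: like connects to like.
clubs = [(1, 2), (2, 3), (1, 3), (1, 4), (2, 4), (3, 4), (5, 6)]

print("star :", degree_assortativity(star))
print("clubs:", degree_assortativity(clubs))
```

The hub-and-spoke graph comes out negative and the clustered graph comes out positive, which mirrors the contrast drawn on the slide.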
Governmental Open Data in Healthcare
Collaborative and Multi-disciplinary Research
neuroscientists
physicians
statisticians
computer scientists
Scientific Investigation with Transdisciplinarity
Disciplinary
Adapted from: www.hent.org/transdisciplinary.htm
Interdisciplinary Transdisciplinary
Multidisciplinary
Association vs. Bisociation
∎ Association is most commonly used in ICT to discover new information relevant to the evidence already known to the user.
∎ Bisociation occurs when two seemingly unrelated things are shown to have unanticipated connections.
∎ Bisociations are the context-crossing “associations” that are often needed in innovative domains.
∎ The history of engineering and science is full of serendipitous discoveries that are based on bisociative processes.
Example of Bisociation: Swanson Linking
Diagram: A, B, C
AB: articles about an AB relationship.
BC: articles about a BC relationship.
∎ AB and BC are complementary but disjoint: together they can reveal an implicit relationship between A and C in the absence of any explicit relation.
∎ Goal: suggest a novel hypothesis that connects A with C, an implicit but not explicit connection.
∎ Goal: call attention to possible implicit links between the text passages that are selected.
Source: Swanson. 2003. A literature based Approach to Scientific Discovery. http://hdl.handle.net/10027/41
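The A-B-C idea above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Swanson's actual procedure: each article is reduced to a set of terms, and a term C is proposed for A when some B co-occurs with both while A and C never appear together. The function name `abc_candidates` and the toy records are hypothetical.

```python
from collections import defaultdict


def abc_candidates(records, a_term):
    """Find C terms linked to a_term only through intermediate B terms.

    records: iterable of sets of terms, one set per article (e.g. title words).
    Returns a dict mapping each candidate C term to its sorted bridging B terms.
    """
    # Build a term co-occurrence index over all articles.
    cooccur = defaultdict(set)
    for terms in records:
        for t in terms:
            cooccur[t] |= terms - {t}

    b_terms = cooccur[a_term]  # everything that appears together with A
    candidates = defaultdict(set)
    for b in b_terms:
        for c in cooccur[b]:
            # Keep C only if no article mentions A and C together.
            if c != a_term and c not in b_terms:
                candidates[c].add(b)
    return {c: sorted(bs) for c, bs in candidates.items()}


# Hypothetical toy records: each set stands for the title words of one article.
records = [
    {"magnesium", "epilepsy"},
    {"epilepsy", "migraine"},
    {"magnesium", "rat"},
]

print(abc_candidates(records, "magnesium"))
```

A real system would also need term normalization and some ranking of the candidates, both of which are left out here.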
An unintended link: an example based on title words in Medline
Venn diagram: sets of Medline records; A and C are disjoint.
A: magnesium (8011)
B: epilepsy
C: migraine (2756)
Overlap counts shown in the diagram: 22 and 45
A-B example: "Magnesium-deficient rat as a model of epilepsy." Lab Animal Sci 28:680-5, 1978
B-C example: "The relation of migraine and epilepsy." Brain 92: 285-300, 1969
Fields of the Humanities
» Korean Studies
» English Literature
» European Studies
» Cultural Studies
» Linguistics
» Other Languages and Literatures
» Philosophy
» History and Philosophy of Science
» History of Ideas
» History
» Environmental Studies
» Multicultural Studies
» Classics and Ancient History
» Archeology
» History of Art, Architecture, Design
» Law
» Theology and Religious Studies
» Communication and Media Studies
» Music and History of Music
» Film Studies
» Drama and Theatre Studies
» Studies of other Performing Arts
» Medical Humanities
» Women’s Studies
Semantic Data for Historical Informatics
The evolution of Germany over time
Source: Bykau et al. (J Data Semantics, 2012)
Data Journalism
∎ Data-driven journalism as a process
∎ Raw data needs to be (1) available, (2) filtered for patterns, (3) visualized to help people understand its meaning, and (4) turned into stories
∎ Mostly uses open data with open source tools
∎ Can help a journalist tell a complex story through engaging infographics
Source: Wikipedia
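The four steps above can be sketched as a tiny pipeline. Everything here is an illustrative assumption: the figures are made up, the function names are hypothetical, and a plain-text bar chart stands in for a real infographic.

```python
# Step 1: the data is available (hypothetical, made-up figures).
rows = [
    {"district": "North", "year": 2011, "incidents": 40},
    {"district": "North", "year": 2012, "incidents": 70},
    {"district": "South", "year": 2011, "incidents": 55},
    {"district": "South", "year": 2012, "incidents": 50},
]


def changes_by_district(rows, start, end):
    """Step 2: filter for a pattern, here the change between two years."""
    by_key = {(r["district"], r["year"]): r["incidents"] for r in rows}
    districts = sorted({r["district"] for r in rows})
    return {d: by_key[(d, end)] - by_key[(d, start)] for d in districts}


def text_bars(changes):
    """Step 3: visualize, here as a plain-text bar chart (one mark per 5 units)."""
    lines = []
    for district, delta in changes.items():
        bar = ("+" if delta >= 0 else "-") * (abs(delta) // 5)
        lines.append(f"{district:<6}{delta:+4d} {bar}")
    return "\n".join(lines)


def headline(changes):
    """Step 4: turn the strongest pattern into a one-sentence story."""
    district, delta = max(changes.items(), key=lambda kv: abs(kv[1]))
    direction = "rose" if delta > 0 else "fell"
    return f"Incidents in {district} {direction} by {abs(delta)} between the two years."


changes = changes_by_district(rows, 2011, 2012)
print(text_bars(changes))
print(headline(changes))
```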
Example (Data Journalism)
Musicology as a ‘data-rich’ discipline
∎ A computer program can take as input a representation of a score and produce as output an analysis of that music.
→ ‘What is the cause of emotion in music?’
∎ Music Information Retrieval
∎ Music Recognition
∎ Data driven research on music history
∎ Multi-modal research (Music + Image)
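As a minimal sketch of "score in, analysis out", the example below assumes a score is represented as a list of (MIDI pitch number, duration in beats) pairs and computes two simple descriptive statistics. The representation, the function names, and the score fragment are all assumptions for illustration, and real musicological analysis goes far beyond this.

```python
from collections import Counter

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_profile(notes):
    """Total duration per pitch class for notes given as (midi_pitch, beats)."""
    profile = Counter()
    for midi_pitch, beats in notes:
        profile[PITCH_CLASSES[midi_pitch % 12]] += beats
    return profile


def interval_histogram(notes):
    """Count melodic intervals (in semitones) between consecutive notes."""
    pitches = [pitch for pitch, _ in notes]
    return Counter(b - a for a, b in zip(pitches, pitches[1:]))


# Hypothetical score fragment: a G major arpeggio going up and back down.
score = [(55, 1.0), (59, 1.0), (62, 1.0), (67, 2.0), (62, 1.0), (59, 1.0), (55, 2.0)]

profile = pitch_class_profile(score)
intervals = interval_histogram(score)
print("Most prominent pitch class:", profile.most_common(1)[0][0])
print("Interval counts:", dict(intervals))
```

Profiles like these are one kind of feature that music information retrieval and data-driven music history can build on.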
Data Art
∎ Data artists paint a picture with data to construct imaginative representations of the world in their own way
∎ Creative visualizations can translate terabytes of data into meaningful business information
∎ Touch will be the next generation user interface for data, spanning to every screen and every surface around you
∎ Everybody will be able to create his or her own data art with data painting tools
Example: Glowing landscape shows river history
(Daniel E. Coe)
Example: The family tree for All in the Family
(James Grady)
Bach Cello Suites visualized
Art & Science
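∎ The role of artists in future industry, science, and technology seems likely to become even more important. My own definition of art is the creation of tension between chaos and order. A state of excessive complexity and chaos has high information entropy (Claude Shannon's concept) and high uncertainty, and the resulting cognitive overload makes it hard to understand. Excessive order and taken-for-granted regularity produce boredom. Scientific discovery is the process of finding regularity in natural or social phenomena. Might the role of the artist be to create motifs (a kind of aesthetic pattern) out of chaos, and to create order-breaking chaos in phenomena that are familiar to ordinary people? In this sense, might the artist's intuition give science a creative tension by offering a transcendent (meta-level) perspective on phenomena?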
- 김홍기
Collective Creativity
∎ No more Einstein or too many Einsteins
Collective Creativity
