Weaving the Visual Web

Ramesh Jain
Dept. of Computer Science
University of California, Irvine
jain@ics.uci.edu
  
Duality of Photos
Why do we take photos?
•  Memories of experiences.
•  Explicit contextual communication.
– Data is a new utility.
– Text is too abstract in many situations.
  
Where do photos come from?
•  Drawings
•  Paintings
•  Cameras
•  Smart Cameras
•  Graphics
  
Camera History
  
The Role of Photos continues to Evolve
•  Memory
•  Information
– Notes
– Documentation
•  Ephemeral
Photos have ‘half life’!
  
In the 20th century, we tolerated photos
in our textual documents.
In the 21st century, we create visual
documents that tolerate text.
  
Major Disruption in Photos: From
Memories to Information Sources.
Photos are the most compelling source
of information.	
  
Life = Events + Experiences
Progress: Knowledge Creation and Propagation
Visual
Visual knowledge
Oral
Oral knowledge
Textual knowledge
  
Berners-Lee: Suppose all the information stored on computers everywhere were linked. Suppose I could program my computer to create a space in which anything could be linked to anything.
  
Web: Human experiences, knowledge,
and understanding captured using
associative links in DOCUMENTS.
Text-based documents are not NATURAL.
Language and literacy get in the way.
  
Imagine if every photo and video
captured were connected to every
other! And to other information!!	
  
What is a camera?
Captures intensity from a point in the world.
Is a Smartphone camera still a camera?
Many sensors capture metadata related to the moment and capture.
  
•  Exposure Time
•  Aperture Diameter
•  Flash
•  Metering Mode
•  ISO Ratings
•  Focal Length
•  Time
•  Location
•  Face
	
  
Smartphone camera captures events.
  
Computational Representation of an event.
Experiential Data:
•  Photos
•  Video
•  Audio
•  Accelerometer
•  Heart rate
•  …
  
EMPT: Extractable Mobile Photo Tags
Exif
Content Analysis
EMPT
People
Place
Objects
Events
…
  
Krumbs: Capture and Connect holistic experience of a moment.
What: Objects
Who: People
When: Events
Where: Location
Why: Intent/Emotions
All photos and associated context
are automatically organized in a Web,
and shared (if desired).
Duality of Photos: Relevance to MM and Computing
•  Memories: Search and Retrieval
•  Information communication: Web, Big Data
  
How do you search photos?
What
Who
When
Where
Why
Associations related to these.
  
Searching ‘For a Photo’
and
Searching ‘From a Photo’.
  
Who
When
Where
What
Why
Associations related to these.
  
Marking Moments: Micro Blogs
•  Facebook Status and Tweets started Micro Blogs.
– Now there are many
– Instagram, Snapchat, …
•  Problem with Tweets: More Noise less Data.
•  Time to add Focused Micro Blogs
– Sensors
– Importance of marking a moment
  
Waze

Crowdsourced Situations

Crowdsourced Situations
  
A Photo is a Click in Real World.
•  Remember Kodak Moment!
•  For each photo:
•  Unique ID,
•  All metadata of the event,
•  Tags,
•  Links,
•  Annotations.
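
The per-photo record listed above can be sketched as a small data structure. This is only an illustration in Python, not the Krumbs implementation: the class name `PhotoRecord`, its field names, the `link_to` helper, and the choice of a random UUID as the unique ID are all assumptions, and the sample values are made up for the example.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class PhotoRecord:
    """Hypothetical record for one photo; every field name is an assumption."""
    # Unique ID: a random UUID is one possible choice, not the talk's scheme.
    photo_id: str = field(default_factory=lambda: uuid4().hex)
    # All metadata of the event (time, location, camera settings, ...).
    metadata: dict = field(default_factory=dict)
    tags: list = field(default_factory=list)
    # Links to other photos or information, stored here as plain ID strings.
    links: list = field(default_factory=list)
    annotations: list = field(default_factory=list)

    def link_to(self, other: "PhotoRecord") -> None:
        """Add a link to another photo, skipping self-links and duplicates."""
        if other.photo_id != self.photo_id and other.photo_id not in self.links:
            self.links.append(other.photo_id)


# Illustrative values only.
flood = PhotoRecord(metadata={"place": "435 Main", "time": "8:37 AM"}, tags=["flood"])
shelter = PhotoRecord(tags=["shelter"])
flood.link_to(shelter)
flood.link_to(shelter)  # a repeated link is ignored
```

Storing links as IDs rather than object references keeps each record serializable on its own, which is one way the "click in the real world" analogy with web hyperlinks could be carried over.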
  
Crowdsourced Situations
At 435 Main
8:37 AM
  
10/20/15
  
Flood level - Shelter
Flood Level
Shelter
Twitter
Classify (Flood level - Shelter)
Krumbs Used as Focused Micro Blog
•  One Click upload to the FMB-App Server.
•  The client sends:
– Sender ID
– Photo
– GPS and Place
– Time and Event
– Emoji based Context and annotation
– Any additional comments
•  As JSON
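
The items the client sends could be serialized along the following lines. This is a minimal sketch: the slides list the items but give no schema, so the function name `build_upload_payload`, every key name, and all sample values are hypothetical, and the photo is assumed to be sent as a reference rather than as raw bytes.

```python
import json


def build_upload_payload(sender_id, photo_ref, lat, lon, place,
                         timestamp, event, emoji_context, comments=""):
    """Assemble the items the client sends into one JSON string.

    All key names are hypothetical; the talk lists the items but not a schema.
    """
    payload = {
        "sender_id": sender_id,
        "photo": photo_ref,  # assumed: a reference to the photo, not raw bytes
        "gps": {"lat": lat, "lon": lon},
        "place": place,
        "time": timestamp,
        "event": event,
        "context": emoji_context,  # emoji-based context and annotation
        "comments": comments,
    }
    # ensure_ascii=False keeps emoji readable in the serialized text.
    return json.dumps(payload, ensure_ascii=False)


# Illustrative values only.
message = build_upload_payload(
    sender_id="user-001",
    photo_ref="photo-0001.jpg",
    lat=33.6405, lon=-117.8443,
    place="435 Main",
    timestamp="2015-10-20T08:37:00",
    event="flood",
    emoji_context=["🌊"],
    comments="Water is rising",
)
decoded = json.loads(message)
```

Because sensors fill in most fields, only the emoji context and the optional comment would need user input in such a scheme, which fits the one-click upload described above.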
  
FMB-App Server
•  FMB-App Server uses EventShop for appropriate aggregations.
– Location
– Geographic area (ward, town, city, …)
– Also computes some rates of changes.
– Determines Trends
– Classifies areas based on evolving and current situations.
•  Allows drill-down to show even individual photos.
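
The aggregation steps above (counting by geographic area, computing a rate of change, and classifying areas) can be illustrated in plain Python. This sketch does not use EventShop or reflect its API; the function names, the report format, the trend labels, and the threshold are assumptions chosen for the example.

```python
from collections import Counter


def count_by_area(reports, window):
    """Count reports per area whose time falls in [start, end)."""
    start, end = window
    return Counter(r["area"] for r in reports if start <= r["t"] < end)


def rate_of_change(reports, earlier, later):
    """Per-area change in report count between two equal-length windows."""
    before = count_by_area(reports, earlier)
    after = count_by_area(reports, later)
    return {area: after[area] - before[area] for area in set(before) | set(after)}


def classify(change, rising_threshold=2):
    """Label each area by trend; the labels and threshold are assumptions."""
    labels = {}
    for area, delta in change.items():
        if delta >= rising_threshold:
            labels[area] = "worsening"
        elif delta < 0:
            labels[area] = "improving"
        else:
            labels[area] = "stable"
    return labels


# Hypothetical reports: area name and time in minutes since some start point.
reports = [
    {"area": "ward-1", "t": 5}, {"area": "ward-2", "t": 10}, {"area": "ward-2", "t": 20},
    {"area": "ward-1", "t": 65}, {"area": "ward-1", "t": 70}, {"area": "ward-1", "t": 90},
    {"area": "ward-2", "t": 80},
]
change = rate_of_change(reports, earlier=(0, 60), later=(60, 120))
labels = classify(change)
```

Keeping the individual reports in the input list, rather than only the counts, is what would make a drill-down to single photos possible in a design like this.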
  
Tweets VS Krumbs
Tweets
1.  Tweets require thinking and typing.
2.  Location of a tweet and the corresponding event may be different.
3.  Tweets are subjective.
Krumbs
1.  Krumbs have more information and are spontaneous.
2.  Krumbs maintain event location.
3.  Krumbs are objective.
  
Semantic Links for a Photo
•  Photo Level
– Automatic creation
– Manual Annotation
– Manual Creation
•  Segment Level
– Automatic Creation
– Manual Creation
  
	
  
Objective Self: From Personal Big Data

Life Events relate disparate streams to life.
Personal photos on smart phones TELL a lot about you.
  
Visual Web Platform
Contextual Reasoning and Event Computations
Navigation and Search
Krumbs
Next App
Knowledge Discovery: Event Analytics and Visualization
Agro-Tech
Clean India
Personal
Sharing and Communication
  
Your Personal Visual Web on Smartphone
Photo Cloud
Moment Capture
Contextual Reasoning and Event Computations
Event Analytics and Visualization
Navigation and Search
•  Available on Android and iOS.
•  Your data remains on your phone unless shared.
Sharing
  
Developing An App: Agro-Tech
Contextual Reasoning and Event Computations
Navigation and Search
Krumbs for Agro-Tech
Agro Knowledge Discovery: Event Analytics and Visualization
Agro-Tech
Sharing and Communication
Popular, Selfie, Food, Agriculture, Shopping
  
Technology for Building Visual Web
•  Visual Authoring Environment: HTML for Visual Data.
•  Content Analysis
– Deep Contextual Reasoning: From Smartphones, sensors, IoT, personal history, social, personal and all other events.
– Event Clustering and recognition.
– Photo Ranking
– Combine with Deep Learning based Content Analysis
•  Visual Navigation
•  Cross sharing and integration with other Platforms.
•  Combine Algorithmic and Interactive tools.
  
For Visual Web, we need
•  Addressing (URI)
•  Transfer Protocol (HTTP)
•  Authoring and Presentation (HTML)
•  Ranking
•  Contextual Processing
•  Content Analysis
•  Privacy and Security
•  Information Vs Experience
Challenges	
  
Sky is the Limit.
Applications:
Lifestyle
Health
Commerce
Surveillance and Monitoring
Agriculture
Research and development
Construction
…
  
	
  
Thanks for your time and attention.
For questions: jain@ics.uci.edu
  

Visual Web keynote at MMSP 2015
