Where’s all that data? What’s it good for?
                      Gordon Bell
                  Microsoft Research
               Silicon Valley Laboratory




            Fujitsu 5th Technology Forum
     From Sensor Networks to Human Networks:
      Turning Big Data into Actionable Wisdom
                   25 January 2012
Where do you get all those bits?
            Some Stories…
• World’s commercial transactions
• The Cloud
• Personal lives from recording everything (MyLifeBits)
   – Individuals
   – Social sites
   – Libraries e.g. Mormon Library for preserving member archives
• Fourth Paradigm of Science based on data
   – Our world is being instrumented for observing everything
• Monitoring the earth and water for energy, food, and
  pleasure
Courtesy of Barabba (or Steve Haeckel, IBM) or someone else
[Figure: four sources of bits: Commercial Transactions; People & All Their Bits; Science & 4th Paradigm; Real Time, Real World Sense & Effect]
Courtesy of Gordon Bell, Barabba, Steve Haeckel, IBM and probably someone else
Lifelogging

With extreme lifelogging, all of us will have the
ability to recall or have recalled everything
we’ve ever said, seen, and done
… just like today’s political candidates



        Are people basically narcissistic?
My five lifelogging epiphanies
1. It’s capture and digitization (1998)
2. It’s organization and recall (2001)
3. It’s a transaction processor for everything in and
   about your life (2005)
4. It’s your true e-memory (2007)
  Bio-memory is just the metadata and URL for e-memory
5. Your e-memory is everywhere and beyond your
   control (2011)
The challenge now
With extreme lifelogging, all of us will have the ability
to recall or have recalled everything we’ve ever said,
seen, and done
… just like today’s political candidates

The Challenge:
Collecting the bits from the individuals.
• Where are the bits?
• Can they be recalled?
• Who owns them?
• How much does it cost to store forever?
Let’s look at individuals
  & their e-memories
I’m losing my mind
THE ULTIMATE DIARY

      WHAT IF YOU COULD REMEMBER
               EVERYTHING?
“Dalgliesh, he knew, had almost total recall.”
- P.D. James, Death of an Expert Witness
Everything you’ve ever read
Everything you’ve ever seen
Everything you’ve ever heard
And much more… your very state
• Location, bio, temperature, light level… sensors galore
• Your heart beat, blood pressure, stress level, etc.
As little or as much as you like
Certainly much more than ever before


IF YOU WANT, YOU CAN HAVE
TOTAL RECALL
Recording, Storage, Recall

THREE STREAMS OF
TECHNOLOGY
LEAD TO TOTAL RECALL
Storage: cheap and abundant
[Chart: storage capacity in gigabytes (GB), log scale from 1 to 10,000, by year from 2000 to 2015, for PC (3.5"), Notebook (2.5"), PDA (2"), and Cellphone (Flash)]
Recall: search, analyze & present
The next 10 years will see a revolution in life and society


TOTAL RECALL IS INEVITABLE

A TOOL OR THE ULTIMATE DIARY?
We think life-Blogging is nuts!

NOT LIFE-BLOGGING
Personal and private

LIFE-LOGGING NOT LIFE-BLOGGING
Recording/Sensors
Why just paper?
Memex
             As We May Think, Vannevar Bush, 1945

“A memex is a device in which an individual stores all his
  books, records, and communications, and which is
  mechanized so that it may be consulted with exceeding
  speed and flexibility”
• Full-text search, text & audio annotations, and hyperlinks
Recording Everything
Bits per person…
  A One Terabyte, Low Resolution Life
• 2000: a very-low-resolution (VL res) life can be stored in a TB (on the order of a GB/month)
• 2005: MyLifeBits captured about 1 GB per month…
   – Very little audio or video, lower-resolution photos
   – Web pages, photos, and video take up the space
• 2010 10-20 Terabytes is more realistic
   – SenseCam 3 GB/month 3 samples/minute
   – Audio          17 GB/month
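
The arithmetic behind these figures can be sketched in a few lines of Python. This is only a rough estimate, assuming an 80-year recorded life and decimal units (1 TB = 1,000 GB); neither assumption appears on the slide, and the function name is illustrative.

```python
# Rough lifetime storage estimate from a constant monthly capture rate.
# Assumptions (not from the slides): an 80-year recorded life and
# decimal units, 1 TB = 1000 GB.

YEARS_RECORDED = 80      # assumed length of a recorded life
GB_PER_TB = 1000         # decimal terabyte


def lifetime_terabytes(gb_per_month: float, years: int = YEARS_RECORDED) -> float:
    """Return total storage in TB for a constant monthly capture rate."""
    return gb_per_month * 12 * years / GB_PER_TB


# Monthly rates quoted on the slide.
rates_gb_per_month = {
    "MyLifeBits 2005 (about 1 GB/month)": 1.0,
    "SenseCam (3 GB/month)": 3.0,
    "Audio (17 GB/month)": 17.0,
}

for label, rate in rates_gb_per_month.items():
    print(f"{label}: {lifetime_terabytes(rate):.2f} TB over {YEARS_RECORDED} years")

# SenseCam plus audio together, the two 2010 streams listed on the slide.
combined = lifetime_terabytes(3.0 + 17.0)
print(f"SenseCam + audio: {combined:.1f} TB over {YEARS_RECORDED} years")
```

Under these assumptions, 1 GB/month comes to just under 1 TB, and the SenseCam and audio streams together come to about 19 TB, which sits inside the 10-20 TB range quoted for 2010.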
Special Persons Archives
•   Charles Vest: former President of MIT
•   Einstein
•   National Library of Medicine: Lederberg
•   Salman Rushdie
•   LDS Church (Mormon)
Public 21st century figure legacies
• Charles Vest, president of MIT from 1990 to 2004,
  delivered a hard drive with nearly all of the files of
  his 14-year tenure to the MIT Archivist. It included
  speeches and letters (drafts), presentations,
  planning documents, meeting minutes, e-mails,
  and a few photos. The only items Vest had deleted
  were a few files about his personal finances.
• Nothing had been scanned, so there was no incoming
  material such as letters, web page views, or articles
  unless they were attachments.
www.alberteinstein.info
National Library of Medicine

• Top 30: 265 GB; 181K Files; 1.5 MB/file
   –   99.99%              tiff images
   –   less than .01%      plain text
   –   less than .01%      html files
   –   less than .001%     AVI video
• Web derivative files 36 GB; 75K files;         0.5 MB/file
• Items      Pages      Video   Who
  18,615     49,951     8       Lederberg
  1,738      37,110     25      RMP
  1,054      10,811     5       Koop
  580        4,374      1       Avery
  469        1,833      -       Crick
  279        1,160      1       Pauling
  302        893        -       Varmus
  …
  27.5 K     143 K
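
The per-file averages quoted for the "Top 30" and web-derivative collections can be checked with a short calculation. This sketch assumes decimal units (1 GB = 1,000 MB) and reads "181K" and "75K" as 181,000 and 75,000 files; the helper name is illustrative.

```python
# Average file size for the two National Library of Medicine collections.
# Assumption (not from the slide): decimal units, 1 GB = 1000 MB.

MB_PER_GB = 1000


def mb_per_file(total_gb: float, file_count: int) -> float:
    """Return the mean file size in MB."""
    return total_gb * MB_PER_GB / file_count


top30 = mb_per_file(265, 181_000)           # "Top 30: 265 GB; 181K files"
web_derivatives = mb_per_file(36, 75_000)   # "Web derivative files 36 GB; 75K files"

print(f"Top 30: {top30:.2f} MB/file")
print(f"Web derivatives: {web_derivatives:.2f} MB/file")
```

Both results round to the slide's figures of 1.5 MB/file and 0.5 MB/file.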
Lederberg Finder page
Lederberg papers: official reports
[Chart: number of document segments]
Emory University: Rushdie Archives
    Snooping Through Salman Rushdie's Computer
             http://www.youtube.com/watch?v=pBtFNpgzlsg

• 200 cardboard boxes
• Four emulated Macs; 18 GB; 40,000 files
SenseCam: Recording everything seen
GB with SenseCam & Voice Recorder

Camera now available: Viconreview.com
Capturing every step
The “killer app”… Health!
Capturing every heartbeat
• 72.6 beats/min; 38.16 Million beats/year
• 3.13 billion beats per life
• Battery life: the expected time to next surgery!
   – St. Jude battery was 4-4.5 years, or ETS
   – Medtronic current, 8 years.
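
The heartbeat totals above follow from the per-minute rate. A minimal check, assuming a 365-day year (the slide does not say which year length was used):

```python
# Reproduce the heartbeat totals from the per-minute rate on the slide.
# Assumption (not from the slide): a 365-day year.

BEATS_PER_MINUTE = 72.6
MINUTES_PER_YEAR = 60 * 24 * 365
BEATS_PER_LIFE = 3.13e9          # "3.13 billion beats per life"

beats_per_year = BEATS_PER_MINUTE * MINUTES_PER_YEAR
implied_lifespan_years = BEATS_PER_LIFE / beats_per_year

print(f"{beats_per_year / 1e6:.2f} million beats/year")
print(f"{implied_lifespan_years:.1f} years for 3.13 billion beats")
```

The yearly total matches the 38.16 million on the slide, and the lifetime figure implies a lifespan of roughly 82 years.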
Audiometric
Test 050117
Sensors with IP On Everything
HR, weight, BP, distance,
Health monitoring devices
In-body health sensing




[Images: pillcam; nanobot in the bloodstream; EndoSure Wireless Pressure Sensor in an aneurysm sac]
Navigenics Report 2006-03
Health Monitoring: “Your husband just died, … here’s his black box”
[Chart: one bar per category, axis from 0 to 100]
• Work: email, im, social sites
• Work: Who & When
• Work: Legacy documents
• Work: Web pages
• >Work: T&M (VIBE)
• >Work: Meetings
• >Work: Telephone…
• Home: Finance, Legal
• Learning: Books, journals, etc.
• Health: PHR
• >>Health: Diet & Exercise
• Health: On & inbody metrics
• Life: Music (CDs, cassettes,…
• Life: Photos
• Life: Memorabilia, ephemera
• >Life: Tracked Days
• Life: Video Productions
• Life: SenseCam Days
A View of Preserving Digital Lives
• Preserving the analog life of a 20th century person:
  10-100 GB. 2 Mpgs, 100K images, 100-1,000 hrs. video
   – Won’t analog people need to be converted to digital,
     … if not they’re really gone and forgotten history?
   – Gresham’s Law: digital lives drive out analog lives
• How will a 21st century, digital person be preserved?
   – Which “lives” of a person e.g. personal, professional?
   – Depth of each life?
   – Size. Who’s in a library’s digital lifeboat?
• Preserving Everybody?
   – Role of public institutions vs. the cloud for “all of us”
Fire in the Library
Technology Review January 2012
How far do we trust our institutions to save lives?
• Re a comment on NPR in late January
  http://www.npr.org/templates/story/story.php?storyId=99372779
  about people saving recordings of early pre-bluegrass American folk
  music:
• "He considered giving his collection to the Library of Congress, …
  Alden says he worried that they'd be hard for musicians … to access,
  and that they'd gather dust lying …, what librarian … would let
  someone into the stacks with a banjo or a fiddle …?"
• … they're burning CDs and shipping them all over, which is the "lots
  of copies keeps stuff safe" philosophy (www.lockss.com). They
  haven't taken the next step and put them online, and anyway don't
  have a virtual place to put them that has a good chance of surviving
  and caring for them in perpetuity.
Scientific Data Deluge
•   CERN detectors
•   Radio telescopes
•   New telescopes and observatories
•   Gene sequencers
•   Global weather sensors
•   Earth science sensors
Science Paradigms
1. Thousand years ago:
    science was empirical
    describing natural phenomena
2. Last few hundred years:
    theoretical branch
    using models, generalizations
    $\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$
3. Last few decades (FORTRAN):
    a computational branch
    simulating complex phenomena
4. Today: data-intensive science
    data exploration (eScience)
   • unify theory, experiment, and simulation
   – Data captured by instruments
     or generated by simulation
   – Processed by software
   – Information/knowledge stored in computer
   – Scientist analyzes database / files
     using data management and statistics
                         Jim Gray NRC-CSTB 2007-01
• Make sure the scientists have a data
  problem – otherwise they won’t take the
  time to talk with you
• Define 20 questions/plots – this drives
  the technical design, but also helps
  cross-discipline communication
• Spread the 20 questions/plots across
  “easy”, “tricky”, “too hard to do now”
• Ask about sharing and security and get
  to a shared pragmatic consensus
• Don’t forget to write the papers on both
  sides - they help drive adoption
                                    Courtesy Catharine van Ingen
Synthesizing Imagery, Sensors, Models and Field Data
• Climate classification: ~1 MB (1 file)
• Vegetative clumping: ~5 MB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• NASA MODIS imagery archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
Sizes given are 1 US year
20 US years ~ 1 global land surface year
[Figure labels: Download; Reprojection; Reduction; Archive; Continental US; Global Scale]
By the numbers….
• 22 months
• 2 CS interns; 1 architect; 1 science intern; 1
  senior scientist; 3 hangers-on
• 522 K cpu hours
• 14 TB upload
• 10 TB max storage
• 5 TB download
• 2.3 B storage operations
• 1.3 M re-projected tiles
• 25 M reduction files
• (TBD) VM scaleup/scaledown operations
• (TBD) Lines of (nonMatLab) code
• $79K external billing
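
A few derived quantities help put these totals in perspective. The sketch below assumes an average month of 730 hours and, for the cost figure, attributes the entire bill to compute; since the bill presumably also covers storage and transfer, that figure is only an upper bound. All variable names are illustrative.

```python
# Derived figures from the project totals listed on the slide.
# Assumptions (not from the slide): 730 hours in an average month, and
# the whole external bill attributed to compute, which overstates the
# real per-hour price because storage and transfer are billed too.

MONTHS = 22
CPU_HOURS = 522_000
BILLING_USD = 79_000
HOURS_PER_MONTH = 730    # 8760 hours per year / 12

cpu_hours_per_month = CPU_HOURS / MONTHS
avg_busy_cores = CPU_HOURS / (MONTHS * HOURS_PER_MONTH)
usd_per_cpu_hour_upper = BILLING_USD / CPU_HOURS

print(f"CPU hours per month: {cpu_hours_per_month:,.0f}")
print(f"Average continuously busy cores: {avg_busy_cores:.1f}")
print(f"Upper bound on cost per CPU hour: ${usd_per_cpu_hour_upper:.3f}")
```

Read this way, the project kept the equivalent of about 32 cores busy around the clock for 22 months, at no more than roughly 15 cents per CPU hour.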
The South Esk Hydrological Sensor Web:
Next-Generation Catchment Management
 Water for a Healthy Country

 Andrew Terhorst
 Tasmanian ICT Centre (Hobart WSM real time) award
 winner
 9 September 2011
The sustainability challenge …
• Australia is the driest inhabited continent
• River flows can be extremely fickle/unreliable
• Sustainable management of freshwater resources
  requires good situation awareness
[Diagram: "Requires good situation awareness" at the centre, surrounded by flood early warning, hydro-power generation, water quality, environmental flows, irrigation planning, water trading, reservoir management, and water regulations]
   2011 iAwards - Sustainability and Green IT
South Esk River, Tasmania
• Catchment receives variable rainfall - river flows are
  very erratic
• Water resource managers require better situation
  awareness for managing water restrictions
• Sustainability goal is to maximise water harvesting
  opportunities without compromising environmental flows

Hydro-meteorological sensor network




Integrating sensor data from multiple agencies




Project goal
Develop a prototype water information system made up of
two linked sub-systems:

• Continuous flow forecast system
   - Based on emerging Sensor Web standards
• Provenance management system
   - Provides information on how flow forecasts are produced
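
The link between the two sub-systems can be illustrated with a small sketch in which every forecast carries a record of how it was produced. Everything here is hypothetical: the class names, the sensor and agency identifiers, and the "mean of inputs" model are placeholders for illustration, not the project's actual design or forecasting method.

```python
# Illustrative only: a forecast value that carries its own provenance.
# All names and the averaging "model" are hypothetical placeholders.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from statistics import mean


@dataclass(frozen=True)
class Observation:
    sensor_id: str     # hypothetical sensor identifier
    agency: str        # hypothetical owning agency
    flow_m3s: float    # river flow in cubic metres per second


@dataclass
class FlowForecast:
    value_m3s: float
    # Provenance: enough detail to explain how the value was produced.
    model: str
    inputs: list = field(default_factory=list)
    produced_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def forecast_flow(observations):
    """Placeholder model: the forecast is the mean of the input flows."""
    if not observations:
        raise ValueError("at least one observation is required")
    return FlowForecast(
        value_m3s=mean(o.flow_m3s for o in observations),
        model="mean-of-inputs (placeholder)",
        inputs=[(o.agency, o.sensor_id) for o in observations],
    )


obs = [
    Observation("gauge-A", "agency-1", 12.0),
    Observation("gauge-B", "agency-2", 18.0),
]
fc = forecast_flow(obs)
print(fc.value_m3s, fc.model, fc.inputs)
```

Keeping the input list and model name on the forecast itself means a user can trace any value back to the agencies and sensors that fed it, which is the kind of question a provenance system is meant to answer.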
Current practice

[Diagram: two layers]
• Application Layer: Numeric Models; Decision Support Tools
• Sensor Layer: Physical Sensors, Observation Archives
Paradigm shift
[Diagram: three layers]
• Application Layer: Numeric Models; Decision Support Tools; Semantic Broker
• Services Layer: Sensor Web
• Sensor Layer: Physical Sensors, Observation Archives
Architectural framework

[Diagram components: Network Management and Provenance; Scientific workflow; Sensor data feeds; Clients; Atmospheric models; Flow forecast models]
Key system features
[Diagram: "Key Features" at the centre with four branches]
• Unique
   - Provenance management
   - First hydrological sensor web built in Australia
   - Uses near real-time data feeds from multiple agencies
• Quality
   - Published research articles
   - Included in the Global Earth Observing System of Systems
     implementation pilot
   - Described as next-generation water information system in
     ITU technology briefing
• Open Architecture
   - Interoperable
   - Highly scalable
   - Re-locatable
   - Redundancy
   - Rapid integration of sensor assets
   - Standards-based
   - Reusable software components
• Value Proposition
   - Improved understanding of natural system behaviour
   - Enables sustainable management of scarce water resources
   - Generic applications
   - Serves regulators and community
   - Provides economic benefit to irrigators
   - Serve other purposes e.g. flood warning, fire-danger risk
     assessment
The end

Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 

12 gordon bell

  • 1. Where’s all that data? What’s it good for? Gordon Bell Microsoft Research Silicon Valley Laboratory Fujitsu 5th Technology Forum From Sensor Networks to Human Networks: Turning Big Data into Actionable Wisdom 25 January 2012
  • 2. Where do you get all those bits? Some Stories… • World’s commercial transactions • The Cloud • Personal lives from recording everything (MyLifeBits) – Individuals – Social sites – Libraries e.g. Mormon Library for preserving member archives • Fourth Paradigm of Science based on data – Our world is being instrumented for observing everything • Monitoring the earth and water for energy, food, and pleasure
  • 3. Courtesy of Barabba (or Steve Haeckel, IBM) or someone else
  • 4. Commercial Transactions | People & All Their Bits | Science & 4th Paradigm | Real time, Real World Sense & Effect. Courtesy of Gordon Bell, Barabba, Steve Haeckel, IBM and probably someone else
  • 5. Lifelogging With extreme lifelogging, all of us will have the ability to recall or have recalled everything we’ve ever said, seen, and done … just like today’s political candidates Are people basically narcissistic?
  • 6. My five lifelogging epiphanies 1. It’s capture and digitization (1998) 2. It’s organization and recall (2001) 3. It’s a transaction processor for everything in and about your life (2005) 4. It’s your true e-memory (2007) Bio-memory is just the meta-data and URL for e-memory 5. Your e-memory is everywhere and beyond your control (2011)
  • 7. The challenge now With extreme lifelogging, all of us will have the ability to recall or have recalled everything we’ve ever said, seen, and done … just like today’s political candidates The Challenge: Collecting the bits from the individuals. • Where are the bits? • Can they be recalled? • Who owns them? • How much does it cost to store forever?
  • 8. Let’s look at individuals & their e-memories
  • 10. THE ULTIMATE DIARY WHAT IF YOU COULD REMEMBER EVERYTHING? “Dalgliesh, he knew, had almost total recall.” -P.D. James, Death of an Expert Witness
  • 14. And much more… your very state • Location, bio, temperature, light level… sensors galore • Your heart beat, blood pressure, stress level, etc.
  • 15. As little or as much as you like Certainly much more than ever before IF YOU WANT, YOU CAN HAVE TOTAL RECALL
  • 16. Recording, Storage, Recall THREE STREAMS OF TECHNOLOGY LEAD TO TOTAL RECALL
  • 17. Storage: cheap and abundant [Chart: capacity in gigabytes (GB), axis 1 to 10,000, for 2000-2015; series: PC (3.5"), Notebook (2.5"), PDA (2"), Cellphone (Flash)]
  • 20. Next 10 years will see revolution of life and society TOTAL RECALL IS INEVITABLE A TOOL OR THE ULTIMATE DIARY?
  • 21. We think life-Blogging is nuts! NOT LIFE-BLOGGING
  • 25. Memex As We May Think, Vannevar Bush, 1945 “A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility” • Full-text search, text & audio annotations, and hyperlinks
  • 27. Bits per person… A One Terabyte, Low Resolution Life • 2000: VL res life can be stored in a TB (GB/month) • 2005: MyLifeBits captured about 1 GB per month… – Very little audio, video, lower resolution photos – Web pages, photos, and video take up the space • 2010: 10-20 Terabytes is more realistic – SenseCam 3 GB/month 3 samples/minute – Audio 17 GB/month
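The per-month rates on this slide can be turned into a lifetime total with simple arithmetic. The sketch below is only an illustration: the 80-year lifespan and the helper name `lifetime_terabytes` are assumptions, not figures from the slides, and decimal units (1 TB = 1000 GB) are used.

```python
# Rates quoted on the slide, in GB per month.
MYLIFEBITS_2005_GB_PER_MONTH = 1
SENSECAM_GB_PER_MONTH = 3
AUDIO_GB_PER_MONTH = 17

ASSUMED_LIFESPAN_YEARS = 80  # assumption for illustration, not from the slides


def lifetime_terabytes(gb_per_month: float, years: float) -> float:
    """Total storage in TB (1 TB = 1000 GB) for a constant monthly capture rate."""
    return gb_per_month * 12 * years / 1000


low_res_tb = lifetime_terabytes(MYLIFEBITS_2005_GB_PER_MONTH, ASSUMED_LIFESPAN_YEARS)
full_tb = lifetime_terabytes(
    SENSECAM_GB_PER_MONTH + AUDIO_GB_PER_MONTH, ASSUMED_LIFESPAN_YEARS
)

print(f"1 GB/month for {ASSUMED_LIFESPAN_YEARS} years: {low_res_tb:.2f} TB")
print(f"SenseCam + audio for {ASSUMED_LIFESPAN_YEARS} years: {full_tb:.1f} TB")
```

Under that assumed lifespan, 1 GB/month comes to just under 1 TB, and SenseCam plus audio comes to roughly 19 TB, which is in line with the "one terabyte" and "10-20 Terabytes" figures on the slide.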
  • 28. Special Persons Archives • Charles Vest: former President of MIT • Einstein • National Library of Medicine: Lederberg • Salman Rushdie • LDS Church (Mormon)
  • 29. Public 21st century figure legacies • Charles Vest, president of MIT from 1990 to 2004, delivered a hard drive with nearly all of the files of his 14-year tenure to the MIT Archivist. It included speeches and letters (drafts), presentations, planning documents, meeting minutes, e-mails and a few photos. The only items Vest had deleted were a few files about his personal finances. • Nothing had been scanned, so there was no incoming material such as letters, web page views, or articles, unless they were attachments.
  • 32. National Library of Medicine • Top 30: 265 GB; 181K files; 1.5 MB/file – 99.99% TIFF images – less than .01% plain text – less than .01% HTML files – less than .001% AVI video • Web derivative files: 36 GB; 75K files; 0.5 MB/file • Collections (items / pages / videos): Lederberg 18,615 / 49,951 / 8; RMP 1,738 / 37,110 / 25; Koop 1,054 / 10,811 / 5; Avery 580 / 4,374 / 1; Crick 469 / 1,833 / -; Pauling 279 / 1,160 / 1; Varmus 302 / 893 / -; … 27.5 K / 143 K
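As a quick check on the per-file figures above, the mean file size follows directly from the totals on the slide. This is a minimal sketch assuming decimal units (1 GB = 1000 MB); the function name `mean_file_size_mb` is illustrative.

```python
def mean_file_size_mb(total_gb: float, n_files: int) -> float:
    """Mean file size in MB, using decimal units (1 GB = 1000 MB)."""
    return total_gb * 1000 / n_files


# Totals quoted on the slide.
top30_mb = mean_file_size_mb(265, 181_000)  # "Top 30" collections
web_mb = mean_file_size_mb(36, 75_000)      # web derivative files

print(f"Top 30 collections: {top30_mb:.2f} MB/file")
print(f"Web derivatives:    {web_mb:.2f} MB/file")
```

The results, about 1.46 and 0.48 MB per file, round to the 1.5 and 0.5 MB/file stated on the slide.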
  • 34. Lederberg papers official reports Number of document segments
  • 35. Emory University: Rushdie Archives Snooping Through Salman Rushdie's Computer http://www.youtube.com/watch?v=pBtFNpgzlsg • 200 cardboard boxes • Four emulated Macs; 18 GB; 40,000 files
  • 37. GB with SenseCam & Voice Recorder Camera now available: Viconreview.com
  • 41. Capturing every heartbeat • 72.6 beats/min; 38.16 Million beats/year • 3.13 billion beats per life • Battery life: the expected time to next surgery! – St. Jude battery was 4-4.5 years, or ETS – Medtronic current, 8 years.
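The heartbeat figures can be reproduced from the per-minute rate. The sketch below assumes a 365-day year; the lifespan it prints is only what the slide's own numbers imply, and the helper `beats_over` is an illustrative name.

```python
BEATS_PER_MINUTE = 72.6    # from the slide
BEATS_PER_LIFE = 3.13e9    # from the slide
MINUTES_PER_YEAR = 60 * 24 * 365  # assumes a 365-day year

beats_per_year = BEATS_PER_MINUTE * MINUTES_PER_YEAR
implied_lifespan_years = BEATS_PER_LIFE / beats_per_year


def beats_over(years: float) -> float:
    """Number of heartbeats in the given number of years at the slide's rate."""
    return beats_per_year * years


print(f"Beats per year: {beats_per_year / 1e6:.2f} million")
print(f"Lifespan implied by 3.13 billion beats: {implied_lifespan_years:.1f} years")
print(f"Beats over an 8-year battery life: {beats_over(8) / 1e6:.0f} million")
```

The per-year figure matches the slide's 38.16 million, and 3.13 billion beats corresponds to a life of roughly 82 years.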
  • 44. Sensors with IP On Everything
  • 45. HR, weight, BP, distance,
  • 47. In-body health sensing pillcam Nanobot in the bloodstream EndoSure Wireless Pressure Sensor in an aneurysm sac
  • 48. Navigenics Report 2006-03
  • 50. [Chart, axis 0 to 100, by category:] Work: email, IM, social sites; Work: Who & When; Work: Legacy documents; Work: Web pages; >Work: T&M (VIBE); >Work: Meetings; >Work: Telephone…; Home: Finance, Legal; Learning: Books, journals, etc.; Health: PHR; >>Health: Diet & Exercise; Health: On & inbody metrics; Life: Music (CDs, cassettes,…; Life: Photos; Life: Memorabilia, ephemera; >Life: Tracked Days; Life: Video Productions; Life: SenseCam Days
  • 51. Bits per person… A One Terabyte, Low Resolution Life • 2000: VL res life can be stored in a TB (GB/month) • 2005: MyLifeBits captured about 1 GB per month… – Very little audio, video, lower resolution photos – Web pages, photos, and video take up the space • 2010: 10-20 Terabytes is more realistic – SenseCam 3 GB/month 3 samples/minute – Audio 17 GB/month
  • 52. A View of Preserving Digital Lives • Preserving the analog life of a 20th century person: 10-100 GB. 2 M pages, 100 K images, 100-1,000 hours of video – Won’t analog people need to be converted to digital, … if not they’re really gone and forgotten history? – Gresham’s Law: digital lives drive out analog lives • How will a 21st century, digital person be preserved? – Which “lives” of a person e.g. personal, professional? – Depth of each life? – Size. Who’s in a library’s digital lifeboat? • Preserving Everybody? – Role of public institutions vs. the cloud for “all of us”
  • 53. Fire in the Library Technology Review January 2012
  • 54. How far do we trust our institutions to save lives? • Re a comment on NPR in late January http://www.npr.org/templates/story/story.php?storyId=99372779 about people saving recordings of early pre-bluegrass American folk music: • "He considered giving his collection to the Library of Congress, … Alden says he worried that they'd be hard for musicians … to access, and that they'd gather dust lying …, what librarian … would let someone into the stacks with a banjo or a fiddle …?" • … they're burning CDs and shipping them all over, which is the "lots of copies keeps stuff safe" philosophy (www.lockss.com). They haven't taken the next step and put them online, and anyway don't have a virtual place to put them that has a good chance of surviving and caring for them in perpetuity.
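The "lots of copies keeps stuff safe" idea can be illustrated with a toy audit: hash every copy, take the majority as the intact version, and overwrite any copy that disagrees. This is a minimal sketch of the general idea only, not the LOCKSS protocol itself; the function names and the holder names in the usage example are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a copy's contents."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> list[str]:
    """Overwrite copies that disagree with the strict majority.

    `copies` maps a holder name to that holder's bytes and is modified
    in place. Returns the names of the holders whose copies were repaired.
    """
    digests = {holder: sha256_hex(data) for holder, data in copies.items()}
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) // 2:
        raise ValueError("no strict majority; cannot tell which copy is intact")
    good = next(
        data for holder, data in copies.items()
        if digests[holder] == majority_digest
    )
    repaired = [h for h, d in digests.items() if d != majority_digest]
    for holder in repaired:
        copies[holder] = good
    return repaired


# Usage: three holders of the same recording, one of them corrupted.
copies = {
    "collector": b"fiddle tune, 1928",
    "friend_a": b"fiddle tune, 1928",
    "friend_b": b"fiddle tXne, 1928",  # simulated bit rot
}
repaired = audit_and_repair(copies)
print("repaired:", repaired)
```

A strict majority is required so that a tie between two differing copies is reported rather than silently resolved in favor of either one.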
  • 55. Scientific Data Deluge • CERN detectors • Radio telescopes • New telescopes and observatories • Gene sequencers • Global weather sensors • Earth science sensors
  • 56. Science Paradigms 1. Thousand years ago: science was empirical, describing natural phenomena 2. Last few hundred years: theoretical branch using models, generalizations, e.g. $\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$ 3. Last few decades (FORTRAN): a computational branch simulating complex phenomena 4. Today: data-intensive science (eScience): data exploration, unifying theory, experiment, and simulation – Data captured by instruments or generated by simulation – Processed by software – Information/knowledge stored in computer – Scientist analyzes database / files using data management and statistics Jim Gray NRC-CSTB 2007-01
  • 57. • Make sure the scientists have a data problem – otherwise they won’t take the time to talk with you • Define 20 questions/plots – this drives the technical design, but also helps cross-discipline communication • Spread the 20 questions/plots across “easy”, “tricky”, “too hard to do now” • Ask about sharing and security and get to a shared pragmatic consensus • Don’t forget to write the papers on both sides – they help drive adoption Courtesy Catharine van Ingen
  • 58. Synthesizing Imagery, Sensors, Models and Field Data • Climate classification: ~1 MB (1 file) • FLUXNET curated sensor dataset: 30 GB (960 files) • Vegetative clumping: ~5 MB (1 file) • NASA MODIS imagery archives: 5 TB (600K files) • FLUXNET curated field dataset: 2 KB (1 file) • NCEP/NCAR: ~100 MB (4K files) Sizes given are for 1 US year; 20 US years ~ 1 global land surface year
  • 64. Global Scale [Diagram labels: Global Scale Archive, Download, Reprojection, Reduction, Continental US]
  • 66. By the numbers… • 22 months • 2 CS interns; 1 architect; 1 science intern; 1 senior scientist; 3 hangers-on • (TBD) lines of (non-MATLAB) code • $79K external billing • 1.3 M re-projected tiles • 25 M reduction files • (TBD) VM scaleup/scaledown operations • 522 K CPU hours • 14 TB upload • 10 TB max storage • 5 TB download • 2.3 B storage operations
  • 67. The South Esk Hydrological Sensor Web: Next-Generation Catchment Management Water for a Healthy Country Andrew Terhorst Tasmanian ICT Centre (Hobart WSM real time) award winner 9 September 2011
  • 68. The sustainability challenge … • Australia is the driest inhabited continent • River flows can be extremely fickle/unreliable • Sustainable management of freshwater resources requires good situation awareness [Diagram: "Requires good situation awareness" at the center, surrounded by flood early warning, water regulations, hydro-power generation, reservoir management, water quality, water trading, environmental flows, irrigation planning] 2011 iAwards - Sustainability and Green IT
  • 69. South Esk River, Tasmania • Catchment receives variable rainfall - river flows are very erratic • Water resource managers require better situation awareness for managing water restrictions • Sustainability goal is to maximise water harvesting opportunities without compromising environmental flows
  • 70. Hydro-meteorological sensor network
  • 71. Integrating sensor data from multiple agencies
  • 72. Project goal Develop a prototype water information system made up of two linked sub-systems: • Continuous flow forecast system - Based on emerging Sensor Web standards • Provenance management system - Provides information on how flow forecasts are produced
  • 73. Current practice [Diagram: Application Layer (decision support tools, numeric models) on top of Sensor Layer (physical sensors, observation archives)]
  • 74. Paradigm shift [Diagram: Application Layer (decision support tools, numeric models, semantic broker); Sensor Web Services Layer; Sensor Layer (physical sensors, observation archives)]
  • 75. Architectural framework [Diagram labels: network management and provenance, scientific workflow, sensor data feeds, atmospheric models, flow forecast models, clients]
  • 77. Key system features [Diagram: "Key Features" at the center, with segments Unique, Architecture, Quality, Value Proposition, and Generic applications] • Interoperable • Provenance management • Highly scalable • Re-locatable • Redundancy • Open • Standards-based • Reusable software components • First hydrological sensor web built in Australia • Rapid integration of sensor assets • Uses near real-time data feeds from multiple agencies • Improved understanding of natural system behaviour • Published research articles • Included in the Global Earth Observing System of Systems implementation pilot • Described as next-generation water information system in ITU technology briefing • Enables sustainable management of scarce water resources • Serves regulators and community • Provides economic benefit to irrigators • Serve other purposes e.g. flood warning, fire-danger risk assessment