Search Methods for Multidimensional Data

Shawn Wolfe
Nikunj Oza, PhD
Yi Zhang (UCSC), PhD

Abstract
NASA and its partners produce staggering amounts of data: petabytes per day for both earth and space science, with some missions producing over a petabyte/day individually. Examining all the data by hand is clearly impossible; progress has been made in automated discovery, but little has been done to enable the user to search for potential items of interest directly. The goal of our research is to address this gap.

Strategic Alignment:
• OCT Roadmap TA11 (Intelligent Data Understanding, Data Lifecycle elements)
• National Aero R&D Plan (organization and mining of safety data elements)
• Initial TRL: 3

Problem
No existing technology adequately supports a user who wants to search for data in today’s vast data sets:
• The strict query interpretation in databases often over-restricts or under-restricts the results.
• Data mining does not support ad hoc search as it is not user directed.
• Information retrieval is applied to text, not data.
As a result, the user misses important items in the data.

Solution
Our approach is to adapt elements from several methods to address the problem.
• Relevance estimation concepts from information retrieval instead of strict constraint application.
• Multidimensional utility function from utility theory.
• Query refinement by data mining explicit and implicit user feedback.
• Multiattribute query specifications from database systems.

Novelty/Contribution
• Decrease difficulty of finding important data
• Utilize strengths from multiple technologies
• Combine human and machine intelligence

Systems Operations Data
Partner: System-Wide Safety and Assurance Technologies
Customer: Operations and Maintenance
• Search begins with prototype of anomaly
• System ranks candidates by variation from prototype
• System automatically expands on user’s initial specification.

Martian Images
Partner: Planetary Data System
Customer: Scientists
• Search for images by metadata
• Scientists enter desired values
• Tradeoffs over matching values are used to rank images

Safety Reports
Partner: Aviation Safety Reporting System
Customer: Safety Analysts
• Search over different field types (numbers, categories, text)
• Fine-tune performance by data mining past queries and results
• Users can also provide explicit feedback to refine results

Contact:
Shawn Wolfe
Intelligent Systems (TI)
(650) 604-4760
<Shawn.Wolfe@nasa.gov>

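The ranking idea named in the Solution bullets (relevance estimation in place of strict constraints, combined through a multidimensional utility function) and in the Martian Images panel (scientists enter desired values; tradeoffs over matching values rank the images) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the poster: the scoring functions, the weighted-average form of the utility, the field names (sol, camera, caption), and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def numeric_match(desired: float, scale: float) -> Callable[[Any], float]:
    """Hypothetical numeric scorer: 1.0 at the desired value, decaying with distance."""
    def score(value: Any) -> float:
        return 1.0 / (1.0 + abs(float(value) - desired) / scale)
    return score


def category_match(desired: str) -> Callable[[Any], float]:
    """Hypothetical categorical scorer: 1.0 on an exact match, otherwise 0.0."""
    def score(value: Any) -> float:
        return 1.0 if value == desired else 0.0
    return score


def text_match(desired_terms: str) -> Callable[[Any], float]:
    """Hypothetical text scorer: fraction of the desired terms found in the field."""
    wanted = set(desired_terms.lower().split())

    def score(value: Any) -> float:
        if not wanted:
            return 0.0
        found = set(str(value).lower().split())
        return len(wanted & found) / len(wanted)
    return score


@dataclass
class Preference:
    field: str                      # which attribute of the record to score
    score: Callable[[Any], float]   # maps a field value to a match degree in [0, 1]
    weight: float = 1.0             # how much this attribute matters in the tradeoff


def utility(record: Record, prefs: List[Preference]) -> float:
    """Assumed utility form: weighted average of per-field match scores.

    A missing field contributes 0.0 rather than excluding the record.
    """
    total_weight = sum(p.weight for p in prefs)
    total = sum(
        p.weight * p.score(record[p.field]) for p in prefs if p.field in record
    )
    return total / total_weight


def rank(records: List[Record], prefs: List[Preference]) -> List[Tuple[float, Record]]:
    """Return (utility, record) pairs, best match first. No record is filtered out."""
    scored = [(utility(r, prefs), r) for r in records]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up image metadata, for illustration only.
images = [
    {"id": "A", "sol": 100, "camera": "navcam", "caption": "dust devil near crater rim"},
    {"id": "B", "sol": 120, "camera": "mastcam", "caption": "layered rock outcrop"},
    {"id": "C", "sol": 101, "camera": "mastcam", "caption": "dust storm over crater"},
]

prefs = [
    Preference("sol", numeric_match(100, scale=10), weight=1.0),
    Preference("camera", category_match("mastcam"), weight=1.0),
    Preference("caption", text_match("dust crater"), weight=2.0),
]

ranked = rank(images, prefs)
print([record["id"] for _, record in ranked])  # ['C', 'A', 'B']
```

On this sample data, a strict conjunctive query (sol equal to 100 and camera equal to mastcam) would return nothing, which is the over-restriction the Problem section describes. The utility-based ranking instead places image C first because it nearly matches every desired value, and still surfaces image A, which misses only on the camera.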