Modernizing your information architecture with AI
June 13, 2019
Matt Aslett, Research Vice President, 451 Research’s Data, AI and Analytics Channel
Sam Lightstone, CTO for Data & IBM Master Inventor, IBM Data and AI
Survey Question #1
What best describes your current
occupation?
451RESEARCH.COM
©2019 451 Research. All Rights Reserved.
An increasing proportion of enterprises are using
data to drive strategic decision-making
Source: 451 Research, Voice of the Enterprise: Data and Analytics, 1H19
Enterprises are bullish on AI and optimistic about its
impact across multiple domains
Source: 451 Research, Voice of the Enterprise: AI/ML, 2H18
AI and machine learning are important components
of data platform and analytics initiatives
Source: 451 Research’s VotE Data Platforms and Analytics, 1H19
[Chart: Percentage of companies agreeing that AI and machine learning are important components of data platforms and analytics initiative(s). Groups: the most data-driven companies; all respondents. Series: Mostly agree; Completely agree. Values as extracted: 34%, 43%, 54%, 23%.]
Data management for AI:
Data management is a fundamental part of the
AI data pipeline
Business impact of AI and data management
• Improve operational efficiencies.
• Improve query performance and accuracy.
• Empower business analysts.
• Accelerate data scientist productivity.
• Evolve the role of the DBA.
Panel
Discussion
Why is artificial intelligence such a hot
topic right now?
Survey Question #2
Do you currently have AI applications
already deployed at your organization?
Survey Question #3
Do you have AI applications that are
under development?
Is “information architecture” important
to implementing AI and why?
Survey Question #4
Is your next AI application targeted for
cloud, private cloud or on-prem?
What is your definition of an “AI
Database”?
How can AI help in data management, what
are some of the potential outcomes?
Survey Question #5
What are the repositories that you would
want to run your data science jobs
(training and scoring) against?
What are ways that AI can be applied to
databases?
ML extensions for SQL with Deep Feed-Forward Neural Nets
[Diagram with four numbered stages; labels as extracted: Structured data sources (relational tables); ML queries in structured query systems; Structured results (relational tables); Unsupervised machine learning creates model]
Think 2019 / © 2019 IBM Corporation
Future Tech
Uses:
• Similarity/dissimilarity queries
• Inductive reasoning queries such
as semantic clustering, analogies,
odd-man-out
• Semantic group-by operations
• Pattern anomalies (for example,
fraud detection)
• Extend to image, audio, video
Discover hidden semantic
relationships and trends in the
data.
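To make the similarity and odd-man-out uses above concrete, here is a minimal sketch. It assumes each entity has already been mapped to a numeric vector; the slide does not show how the model produces those vectors, so the hand-written values, the entity names, and the helper functions below are all hypothetical stand-ins rather than anything from the product.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical hand-written vectors standing in for learned embeddings.
vectors = {
    "customer_a": [0.9, 0.1, 0.0],
    "customer_b": [0.8, 0.2, 0.1],
    "customer_c": [0.0, 0.1, 0.9],
}

def most_similar(name, k=1):
    """Similarity query: the k entities closest to `name`."""
    target = vectors[name]
    scored = [(cosine(target, v), other)
              for other, v in vectors.items() if other != name]
    return [other for _, other in sorted(scored, reverse=True)[:k]]

def odd_one_out():
    """Odd-man-out query: the entity least similar, on average, to the rest."""
    def mean_similarity(name):
        others = [v for other, v in vectors.items() if other != name]
        return sum(cosine(vectors[name], v) for v in others) / len(others)
    return min(vectors, key=mean_similarity)

print(most_similar("customer_a"))
print(odd_one_out())
```

Cosine similarity is only one possible distance measure; it is used here because it is simple to compute without external libraries.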
© 2018 International Business Machines Corporation
SELECT
  inventory.inv_item_sk,
  promotion.p_channel_demo
FROM
  promotion
  JOIN catalog_returns ON catalog_returns.cr_item_sk = promotion.p_item_sk
  JOIN reason ON reason.r_reason_sk = catalog_returns.cr_reason_sk
  JOIN inventory ON catalog_returns.cr_warehouse_sk = inventory.inv_warehouse_sk
    AND inventory.inv_item_sk = catalog_returns.cr_item_sk
    AND inventory.inv_item_sk = promotion.p_item_sk
Example SQL query
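As a quick way to try the query, the sketch below runs it against an in-memory SQLite database. Only the table and column names come from the slide; the column types and the handful of rows are hypothetical, and SQLite stands in for the real database purely for illustration. The select list is written with a comma between the two columns, which SQL requires.

```python
import sqlite3

# In-memory database with hypothetical toy rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE promotion (p_item_sk INTEGER, p_channel_demo TEXT);
CREATE TABLE catalog_returns (cr_item_sk INTEGER, cr_reason_sk INTEGER,
                              cr_warehouse_sk INTEGER);
CREATE TABLE reason (r_reason_sk INTEGER);
CREATE TABLE inventory (inv_item_sk INTEGER, inv_warehouse_sk INTEGER);
INSERT INTO promotion VALUES (1, 'Y'), (2, 'N');
INSERT INTO catalog_returns VALUES (1, 10, 100), (2, 20, 200), (3, 10, 100);
INSERT INTO reason VALUES (10), (20);
INSERT INTO inventory VALUES (1, 100), (2, 999);
""")

QUERY = """
SELECT inventory.inv_item_sk, promotion.p_channel_demo
FROM promotion
JOIN catalog_returns ON catalog_returns.cr_item_sk = promotion.p_item_sk
JOIN reason ON reason.r_reason_sk = catalog_returns.cr_reason_sk
JOIN inventory ON catalog_returns.cr_warehouse_sk = inventory.inv_warehouse_sk
  AND inventory.inv_item_sk = catalog_returns.cr_item_sk
  AND inventory.inv_item_sk = promotion.p_item_sk
"""

# Only item 1 satisfies every join condition in the toy data:
# item 2 fails the warehouse match, item 3 has no promotion row.
rows = conn.execute(QUERY).fetchall()
print(rows)
```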
Without machine learning
[Execution plan diagram; operators as extracted: Return; Hash join (x3); Table scan: catalog returns, reason, reason, inventory]
With machine learning
[Execution plan diagram; operators as extracted: Return; Hash join (x2); Table scan: inventory, promotion, catalog returns, reason]
IBM Cloud / Db2 AI database / February 2019 / © 2019 IBM Corporation
Test 1: 343 vs 2,927 = 8.5X faster
Test 2: 281 vs 2,333 = 8.3X faster
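The speedup figures follow from dividing the larger measurement by the smaller one. The short check below reproduces that arithmetic; the slide does not state the units of the measurements, so the function treats them as opaque numbers, and the function name is an illustrative choice.

```python
def speedup(faster, slower):
    """Ratio of the slower measurement to the faster one."""
    return slower / faster

# (faster, slower) pairs as printed on the slide; units are not stated there.
tests = {"Test 1": (343, 2927), "Test 2": (281, 2333)}

for name, (faster, slower) in tests.items():
    print(f"{name}: {speedup(faster, slower):.1f}X faster")
```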
Learn more – go to ibm.com/Db2
Q&A


Editor's Notes

  • #9 Improve operational efficiencies. Enterprises often struggle to ensure that database systems are running efficiently. Queries that overload the system, consume excessive resources, or impact other running jobs not only hurt performance but also require manual effort to rectify. AI can help by automating the management of queries based on their likely resource consumption, providing a more stable and reliable system that can prioritize queries and reducing manual governance and monitoring of the database.
    Improve query performance and accuracy. AI-enabled database querying can have a dramatic impact on the overall accuracy of, or confidence in, the query result. By executing queries more efficiently, enterprises can lower the time taken to generate insight and improve business decisions.
    Empower business analysts. One of the primary challenges in analytics has been to ‘democratize’ the technology so that a broader range of people can make analytics-driven decisions. New query interfaces lower the barriers to insight, while accelerating the development of AI-based applications can place the output of machine learning models in the hands of domain experts and business decision-makers.
    Accelerate data scientist productivity. 451 Research survey results indicate that accessing and preparing data is one of the three most significant barriers to machine learning adoption. An AI-enabled database can help overcome this barrier to insight by accelerating data exploration and lowering development times through the integration of developer tools and frameworks.
    The automation of database administration tasks is set to change the role of the DBA. Through the automation of mundane database administration tasks such as database provisioning and performance tuning, DBAs can focus their time on higher-impact tasks such as architecture planning and data security.
  • #24 Here is an example of a SQL query we tried, just one of about 300 that were run in the demo you’ll soon see. It joins 4 tables, and there are many possible ways the database can compute a correct result.
  • #25 Without the benefit of machine learning, the database uses statistical and resource modeling (CPU, I/O, network consumption) to evaluate possible strategies, and selects the execution strategy you see on top. That strategy joins two tables, joins two other tables, and finally joins the results of the two joins. Machine learning, benefiting from experience, finds a superior execution strategy: it joins two tables, then joins a third table, and finally joins that result with a fourth table. The query executes correctly in both cases, but the ML-based strategy runs faster.
  • #26 So let's see the technology in action. We studied a workload of over 300 complex queries running against a TPC-DS database. Most of the time the database finds a great execution strategy even without the benefit of machine learning. In those cases the performance of the queries is similar with and without machine learning. But for a number of queries, machine learning found profoundly better execution strategies. In this demo we are showing you the queries where machine learning found a superior execution strategy. The workload running with machine learning is on the right, and the queries without the benefit of machine learning are running on the left. Both workloads are running on the same data, same hardware, same SQL queries. Let's see how they compare. On your marks, get set, go!
  • #27 Put CTA Learn more go to IBM/Db2
  • #28 Put CTA Learn more go to IBM/Db2
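Note #25 contrasts two plan shapes that return the same rows: one that joins two pairs of tables and then joins the two results, and one that adds one table at a time. The sketch below illustrates that equivalence with a small hand-written hash join. The toy rows are hypothetical, and the particular table pairings chosen for each shape are an assumption made for illustration; the notes do not say which tables the database paired at each step.

```python
def hash_join(left, right, left_key, right_key):
    """Build a hash table on `right`, probe it with `left`, merge matches."""
    table = {}
    for r in right:
        table.setdefault(right_key(r), []).append(r)
    return [{**l, **r} for l in left for r in table.get(left_key(l), [])]

# Hypothetical toy rows using the column names from the example query.
promotion = [{"p_item_sk": 1, "p_channel_demo": "Y"},
             {"p_item_sk": 2, "p_channel_demo": "N"}]
catalog_returns = [{"cr_item_sk": 1, "cr_reason_sk": 10, "cr_warehouse_sk": 100},
                   {"cr_item_sk": 2, "cr_reason_sk": 20, "cr_warehouse_sk": 200}]
reason = [{"r_reason_sk": 10}, {"r_reason_sk": 20}]
inventory = [{"inv_item_sk": 1, "inv_warehouse_sk": 100},
             {"inv_item_sk": 2, "inv_warehouse_sk": 999}]

# Shape 1: join two pairs of tables, then join the two intermediate results.
promo_inv = hash_join(promotion, inventory,
                      lambda r: r["p_item_sk"], lambda r: r["inv_item_sk"])
ret_reason = hash_join(catalog_returns, reason,
                       lambda r: r["cr_reason_sk"], lambda r: r["r_reason_sk"])
pairwise = hash_join(promo_inv, ret_reason,
                     lambda r: (r["inv_item_sk"], r["inv_warehouse_sk"]),
                     lambda r: (r["cr_item_sk"], r["cr_warehouse_sk"]))

# Shape 2: join two tables, then a third, then a fourth.
step1 = hash_join(promotion, catalog_returns,
                  lambda r: r["p_item_sk"], lambda r: r["cr_item_sk"])
step2 = hash_join(step1, reason,
                  lambda r: r["cr_reason_sk"], lambda r: r["r_reason_sk"])
one_at_a_time = hash_join(step2, inventory,
                          lambda r: (r["cr_item_sk"], r["cr_warehouse_sk"]),
                          lambda r: (r["inv_item_sk"], r["inv_warehouse_sk"]))

def project(rows):
    """Keep only the two columns in the query's select list."""
    return sorted((r["inv_item_sk"], r["p_channel_demo"]) for r in rows)

print(project(pairwise) == project(one_at_a_time))
```

The two shapes differ in the size of their intermediate results, which is where the cost difference between plans comes from; the final rows are identical.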