The Briefing Room with David Loshin and Phasic Systems
Slides from the Live Webcast on July 10, 2012
Getting disparate groups of professionals to agree on business terminology can take forever, especially when big dollars or major issues are at stake. Many data governance programs languish indefinitely because of simple hang-ups. But a new approach has recently achieved monumental results for the United States Navy. The detailed process has since been codified and combined with a NoSQL technology that enables even the most complex data models and definitions to be distilled into simple, functional data flows.
Check out this episode of The Briefing Room to hear Analyst David Loshin of Knowledge Integrity explain why effective Data Governance requires cooperation. Loshin will be briefed by Geoffrey Malafsky of Phasic Systems who will tout his company's proprietary protocol for extracting, defining and managing critical information assets and processes. He'll explain how their approach allows everyone to be "correct" in their definitions, without causing data quality or performance issues in associated information systems. And he'll explain how their Corporate NoSQL engine enables real-time harmonization of definitions and dimensions.
Visit us at: http://www.insideanalysis.com
3. • Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
Twitter Tag: #briefr
5. • Disruptive innovation produces an unexpected new market and value network, and is usually geared toward a new set of customers.
• The consumer technology market teems with such game-changers: mp3 players, iPhones/iPads, portable storage devices, digital media, etc.
• While disruptive technologies often take time to gain a foothold in the market, they can have a serious impact on industry incumbents, who can be slow to innovate.
6. David Loshin, president of Knowledge Integrity,
Inc., is a recognized thought leader and expert
consultant in the areas of data quality, master
data management and business intelligence.
David is a prolific author regarding business
intelligence best practices and has written
numerous books and papers on data
management, including the just-published
“Practitioner’s Guide to Data Quality
Improvement.” David is a frequent invited
speaker at conferences, web seminars, and
sponsored web sites and channels including
www.b-eye-network.com. His best-selling
book, “Master Data Management,” has been
endorsed by data management industry
leaders, and his valuable MDM insights can be
reviewed at www.mdmbook.com.
David can be reached at:
loshin@knowledge-integrity.com or
(301) 754-6350.
7. • Focuses on agility and flexibility for data governance and standards
• Offers a core technology suite, DataStar, that delivers data modeling, integration, aggregation and automation
• Developed a NoSQL alternative for data consolidation
8. Dr. Geoffrey Malafsky earned a Ph.D. in
Nanotechnology from Pennsylvania State
University. He was a research scientist at the
Naval Research Laboratory before becoming a
technology consultant in advanced system
capabilities for numerous Government
agencies and corporate clients. He has over
thirty years of experience and is an expert in
multiple fields including Nanotechnology,
Knowledge Discovery and Dissemination, and
Information Engineering. He founded and
operated the technology consulting company
TECHi2 prior to founding Phasic Systems Inc.,
where he is the CEO and CTO.
9. Bringing Agility and Flexibility to
Data Design and Integration
Phasic Systems Inc
Delivering Agile Data
www.phasicsystemsinc.com
10.
Introduction to Phasic Systems Inc
• Bringing Agile capabilities to data lifecycle for business success
• Methods and tools tested and refined over years of in-depth large-
scale efforts
• Solve toughest data problems where traditional methods fail
• Based on extensive consulting lessons learned and real-world
results
• Began in 2005 to commercialize advanced Agile methods
successfully deployed in competitive development contracts
11.
Phasic Systems Inc Management
• Geoffrey Malafsky, Ph.D., Founder and CEO
▫ Research scientist
▫ Supported many organizations in their quest to access the right information at the right time
• Tim Traverso, Sr VP Federal
▫ Technical Director, Navy Deputy CIO
• Marshall Maglothin, Sr VP HealthCare
▫ Sr. Executive, multiple large health care systems
• Deborah Malafsky, Sr VP Business Development
12.
Our Agile Methods
• Why be Agile?
▫ Provide flexibility and adaptability to changing business needs while
maintaining accuracy and commonality
▫ Segmented approach is too slow, rigid, and costly
• How?
▫ Treat data lifecycle as one continuous operation from governance to
modeling to integration to warehouses to Business Intelligence
▫ Emphasize value produced at each step and overall coordination
▫ Seamlessly fit with existing organization, procedures, tools but add Agility,
commonality, flexibility, and reduced cost and time
• We are Agile and comprehensive
▫ Typical 60-90 day engagement
▫ Deliver completed products not just plans or partial results
13.
Methods and Tools
• DataStar Discovery: Agile data governance, standards and design
▫ Add business and security context to data
▫ Flexible, common data definitions/semantics, models
• DataStar Unifier: Agile warehousing and aggregation
▫ Simplified, common semantics using Corporate NoSQL™
▫ Source to target mapping with flexibility, standardization
▫ Aggregate data using all use case and system variations simply and
easily into standard or NoSQL databases
14.
PSI Customer Testimonial
“As a COO of a Wall Street firm and a former Vice Admiral in the United
States Navy in charge of a large integrated organization of thousands of people
and numerous IT systems, I have seen firsthand the critical role that high-quality
enterprise data plays in day-to-day operations of an organization. Without
timely access to reliable and trusted data all of our operations were vulnerable
to poor decision making, weak performance, and a failure to compete. With
Phasic Systems Inc.’s agile methodology and technology, we were finally able to
solve our data challenges at a fraction of the time, cost, and organizational
turmoil that all the previous and more expensive, time-consuming approaches
failed to do. Phasic Systems Inc. offers a new and much-needed approach to
this important area of Business Intelligence.”
VADM (ret) J. “Kevin” Moran
15.
The Business Case
Today’s Response Timeline (15 to 27 Months)
• Business Groups (3 to 6 Months)
▫ Requirements
▫ Conceptual/Logical Models
▫ Data Quality
▫ Business Rules
▫ Standards
• IT Groups (6 to 9 Months)
▫ Develop Systems & Applications
▫ Physical Data Models
▫ Databases / Data Warehouse
▫ ETL controls
▫ MDM
• BI Groups (3 to 6 Months)
▫ BI Data Models
▫ Reports
▫ Dashboards
• Users (3 to 6 Months)
▫ Capability Problems
▫ New Capabilities
▫ Missing Data
Tomorrow’s Initial Response Timeline with PSI: 2 to 6 Months (Subsequent Response Timeline: Days)
• Requirements
• Conceptual Data Model
• Logical Data Model
• Business Rules
• Standards
• Develop Systems & Applications
• Physical Data Models
• Databases / Data Warehouse
• ETL controls
• MDM
• BI Data Models
• Data Quality
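The 15 to 27 month total for today's timeline is the sum of the four per-group ranges. A minimal Python sketch of that arithmetic, assuming the four phases run strictly one after another (the names `TODAY_PHASES` and `total_range` are illustrative and do not come from the slides):

```python
# Phase durations in months as (minimum, maximum), taken from the slide.
TODAY_PHASES = {
    "Business Groups": (3, 6),
    "IT Groups": (6, 9),
    "BI Groups": (3, 6),
    "Users": (3, 6),
}


def total_range(phases):
    """Sum minimum and maximum durations, assuming sequential phases."""
    low = sum(minimum for minimum, _ in phases.values())
    high = sum(maximum for _, maximum in phases.values())
    return low, high


print(total_range(TODAY_PHASES))  # (15, 27)
```

Against that baseline, the 2 to 6 month initial response claimed for the PSI approach sits below even the 15 month minimum of the segmented timeline.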
16.
Agile: Overcome Hurdles
• Group rivalry
▫ Embrace important business variations; recognize no valid reason
to force everyone to use only one view exclusively.
• Terminology confusion
▫ Use a guided framework of well-known concepts to rapidly identify and implement variations as related entities.
• Poor knowledge sharing
▫ Use integrated metadata where important products (business
models, data models, glossaries, code lists, and integration rules)
are visible, coordinated, and referenceable
• Inflexible designs
▫ Use a hybrid approach (Corporate NoSQL™) for Agile
warehousing and integration blending traditional tables and
NoSQL for its immense flexibility and inherent speed
17. Schemas Are Not Enough
[Figure: diagram with labels Governance, Design, Integration, MDM, CEO/CFO/CIO, Sales, Accounting, SAP/IBM/ORACLE; credit: D. Loshin 2008]
Which Value? Whose? My “customer” or your “customer”?
How is data used?
• Must be agile in order to adapt quickly to new business needs
▫ Continuous change is the norm: requirements, consolidation
▫ We must use all the important business variations of key terms (e.g. account, client, policy). There is no such thing as a single version for all!
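One way to picture "my customer or your customer" is a glossary that stores several group-specific definitions under one term instead of forcing a single version. The Python sketch below is only an illustration under that assumption: the `Glossary` class, the group names, and the sample definitions are hypothetical and are not taken from Phasic Systems' tools.

```python
class Glossary:
    """Keeps every business variation of a term, keyed by the owning group."""

    def __init__(self):
        self._variations = {}  # term -> {group: definition}

    def define(self, term, group, definition):
        # Terms are matched case-insensitively; groups keep their own wording.
        self._variations.setdefault(term.lower(), {})[group] = definition

    def lookup(self, term, group):
        return self._variations[term.lower()][group]

    def groups(self, term):
        return sorted(self._variations.get(term.lower(), {}))


glossary = Glossary()
# Hypothetical definitions, for illustration only.
glossary.define("customer", "Sales", "Party with an open quote or order")
glossary.define("customer", "Accounting", "Party with at least one billed invoice")

print(glossary.groups("Customer"))           # ['Accounting', 'Sales']
print(glossary.lookup("customer", "Sales"))  # Party with an open quote or order
```

Both definitions stay valid side by side, and a consumer asks for the variation that matches the context in which the data was produced.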
20.
Real Estate Listing Example
• Seems simple and well-defined
▫ Each house has a type, id, address, etc.
▫ Industry standards: OSCRE, RETS
• Yet, data systems are very different
▫ Data model tied tightly to business workflow
▫ Extensions and “make-it-work” changes added over time
• Similar to customer relationship mgmt, ERP, and many
other fields
21.
Semantic Conflict in Real Estate Models
[Figure: two data models, labeled NKY and HOMESEEKERS]
• NKY attribute ‘basement’ does not have a counterpart in HOMESEEKERS
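A conflict like the unmatched ‘basement’ attribute can be surfaced mechanically by comparing the attribute names of the two models. The Python sketch below is a simplified illustration: apart from `basement`, the attribute lists are hypothetical stand-ins, and matching by name alone will miss attributes that share a name but differ in meaning.

```python
def unmatched_attributes(source, target):
    """Return attributes of `source` with no same-named attribute in `target`."""
    target_names = {name.lower() for name in target}
    return sorted(name for name in source if name.lower() not in target_names)


# Hypothetical attribute sets; only 'basement' is taken from the slide.
nky = {"id", "type", "address", "basement"}
homeseekers = {"id", "type", "address"}

print(unmatched_attributes(nky, homeseekers))  # ['basement']
```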
22.
Data Value Semantic Errors = Inconsistent, Difficult to Merge, Report, Analyze
• Lot_dimensions: implied semantics for size data. Actually has all sorts of data
• Semiannual_taxes: implied semantics for numeric data. Actually has all sorts of data
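A quick profile of a column's actual values shows how far it strays from its implied semantics. The sketch below checks which values in a field such as Semiannual_taxes fail to parse as numbers; the sample values and the function names are hypothetical, and stripping `$` and `,` before parsing is an assumption about how amounts might be written.

```python
import math


def is_numeric(value):
    """True if the text parses as a finite number after removing '$' and ','."""
    cleaned = value.strip().replace("$", "").replace(",", "")
    try:
        number = float(cleaned)
    except ValueError:
        return False
    return math.isfinite(number)  # rejects 'nan' and 'inf', which float() accepts


def non_numeric_values(values):
    """Return the values that break the column's implied numeric semantics."""
    return [v for v in values if not is_numeric(v)]


# Hypothetical sample of a Semiannual_taxes column.
sample = ["1250.00", "$980", "see remarks", "N/A", "1,100"]
print(non_numeric_values(sample))  # ['see remarks', 'N/A']
```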
28.
DataStar Corporate NoSQL™
• Large systems use NoSQL for its flexibility, performance, and adaptability
▫ But, it is poorly suited for corporate use – lacks connection to business
• DataStar Corporate NoSQL™ (Speed & Agility)
▫ Blends traditional techniques and NoSQL
▫ Entities come directly from Unified Business Model
▫ Object structure with simple tables
▫ Key-value pairs are basic repeating structure of all tables
▫ Business driven terminology
▫ Easily handles semantic variations & updates w/o changes to logical or physical models
▫ Can be as ‘dimensional’ or ‘normalized’ as desired
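The idea of simple tables whose repeating structure is a key-value pair can be sketched with an ordinary relational table. The example below uses Python's built-in `sqlite3` module and is not Phasic Systems' implementation: the table name `entity_kv`, its columns, and the sample listing are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One simple table; every attribute of every entity is a key-value row.
conn.execute(
    "CREATE TABLE entity_kv ("
    " entity_type TEXT NOT NULL,"
    " entity_id TEXT NOT NULL,"
    " attr_key TEXT NOT NULL,"
    " attr_value TEXT,"
    " PRIMARY KEY (entity_type, entity_id, attr_key))"
)


def put(entity_type, entity_id, attributes):
    """Insert or update attributes; new keys need no schema change."""
    conn.executemany(
        "INSERT OR REPLACE INTO entity_kv VALUES (?, ?, ?, ?)",
        [(entity_type, entity_id, k, str(v)) for k, v in attributes.items()],
    )


def get(entity_type, entity_id):
    """Reassemble one entity as a dict of its key-value rows."""
    rows = conn.execute(
        "SELECT attr_key, attr_value FROM entity_kv"
        " WHERE entity_type = ? AND entity_id = ?",
        (entity_type, entity_id),
    )
    return dict(rows)


put("listing", "H-1", {"type": "single family", "address": "12 Main St"})
# A source with an extra attribute is absorbed without ALTER TABLE.
put("listing", "H-1", {"basement": "finished"})
print(get("listing", "H-1"))
```

The trade-off in this sketch is that all values are stored as text, so typing and validation have to be enforced by business rules rather than by the table definition.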
30. Results
• Applied to production data:
▫ Fully cleaned & integrated data, governance approved
– Requirement: 500,000 records in 2 hrs on Sun E25K
– Actual: 50 minutes on a 3-year-old low-cost server
• Governance documents produced and approved
▫ Legacy data models – first time in ten years
▫ Common data model – directly derived from ontology.
Position-Resume model
• Standing governance board created with short monthly decision-making meetings
▫ Position-Resume Governance Board
• Process approach and technology applied to new IT
systems
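The processing figures above can be turned into throughput numbers. This Python sketch assumes the 50 minute run covered the same 500,000 records as the stated requirement, which the slide implies but does not say outright; the variable and function names are illustrative.

```python
RECORDS = 500_000
required_minutes = 2 * 60  # requirement: 2 hours on the Sun E25K
actual_minutes = 50        # measured run on the low-cost server


def records_per_second(records, minutes):
    """Average throughput over the whole run."""
    return records / (minutes * 60)


required_rate = records_per_second(RECORDS, required_minutes)
actual_rate = records_per_second(RECORDS, actual_minutes)
speedup = required_minutes / actual_minutes

print(f"{required_rate:.1f} {actual_rate:.1f} {speedup:.1f}")  # 69.4 166.7 2.4
```

Under that assumption the run finished in under half the allowed time, at roughly 167 records per second against a required 69.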
31. Navy HR Data Analysis
• Groups “share” data and control only if they don’t lose
project control or funds
• Governance, business process, data engineers create
separate designs and don’t know how to coordinate
• Try hard to follow industry guidance but get stuck
• Actual data is very different from policy and mgmt awareness
▫ Example 1: Multiple Rate/Rating entries. Person xxxxxx has 5 entries: 4 end on the same date, 2 have start dates after their end dates, 2 start and end on the same days but are different
▫ Example 2: 30 different values used for RACE but only 6
allowed values in the Navy Military Personnel Manual derived
from DoD policy
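Both examples describe checks that are easy to automate once the rules are written down. The Python sketch below shows three such checks on hypothetical data: the record layout, the rating codes, and the allowed-value codes are placeholders, not the Navy's actual fields or code list.

```python
from datetime import date

# Placeholder domain of 6 allowed codes (hypothetical, not the real code list).
ALLOWED_RACE_CODES = {"A", "B", "C", "D", "E", "F"}


def inverted_intervals(entries):
    """Entries whose start date falls after their end date."""
    return [e for e in entries if e["start"] > e["end"]]


def conflicting_intervals(entries):
    """Date spans that carry more than one distinct rating."""
    by_span = {}
    for e in entries:
        by_span.setdefault((e["start"], e["end"]), set()).add(e["rating"])
    return {span: r for span, r in by_span.items() if len(r) > 1}


def disallowed_values(values, allowed):
    """Distinct values that fall outside the allowed domain."""
    return sorted(set(values) - allowed)


# Hypothetical Rate/Rating entries for one person.
entries = [
    {"rating": "R1", "start": date(2005, 1, 1), "end": date(2008, 6, 30)},
    {"rating": "R2", "start": date(2005, 1, 1), "end": date(2008, 6, 30)},
    {"rating": "R3", "start": date(2009, 3, 1), "end": date(2008, 6, 30)},
]

print(len(inverted_intervals(entries)))                        # 1
print(conflicting_intervals(entries))
print(disallowed_values(["A", "X", "B", "Z", "A"], ALLOWED_RACE_CODES))  # ['X', 'Z']
```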
61. • One of the common themes in the material you provided is the need for collaboration as part of the lifecycle management for the creation of a unified business model. To what extent is this collaboration driven by the software and how much requires processes designed around the software?
• What is your approach for transferring the knowledge for identifying semantic conflicts and resolving them within the organization?
• A lot of the slides suggest that the technology is intended for developing data warehouse or business intelligence models. Is the use limited to consuming data from existing systems, or can it be used for reengineering operational or transaction systems? If so, how? If not, why not?
62. • One of the barriers to value for existing metadata and governance tools is the need for ongoing maintenance of the content. How can the product be used to facilitate ongoing management and assurance of consistency of business terminology?
• Presuming that I am now a data consumer (say a business analyst) within the organization, how would I use this technology to clarify the definitions and lineage of business terms presented to me in a BI report?
63. • What is your approach for capturing the semantics of implicit business concepts? In your real estate example, one of the columns for lot dimensions had implied semantics for size data, with an implication of measurement systems, units of measure, and even “topography” of the lot size. This implies the use of business concepts that are not explicit (acreage vs. square footage, transformations across frames of reference, qualification of lot shape, presentation of dimensionality). How does the tool capture implicit semantic information?
• Going back to collaboration: What types of interactive notifications are integrated into your environment to apprise individuals of changes to business terms, data element concepts, data elements, value domains, etc.?