2. What is Data Architecture?
Defines a target state for a computer information system
(much as a building architect's plan or model defines a house)
• Describes how data is processed, stored, and utilized
• Describes business entities, data process flows, and information usage
• Shows the data structures used by the business and its application software,
with the eventual interactions/integrations between those data systems
• Sets data standards for all data systems as a vision or a model
• Addresses data in storage and data in motion
• Provides descriptions of data stores, data groups, and data items,
with mappings of those artifacts to data qualities, applications, locations, etc.
o Can apply to an enterprise or to a single software application
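The descriptions of data stores, data groups, and data items, and their mappings to qualities, applications, and locations, can be sketched as a small catalog. This is only an illustration: every name in it (DataItem, DataGroup, DataStore, crm_db, crm_app, and so on) is hypothetical and not taken from any particular tool or standard.

```python
from dataclasses import dataclass, field


# Hypothetical catalog types, for illustration only.
@dataclass
class DataItem:
    name: str
    data_type: str
    quality_rule: str  # a data quality expectation, e.g. "unique"


@dataclass
class DataGroup:
    name: str
    items: list = field(default_factory=list)


@dataclass
class DataStore:
    name: str
    location: str
    applications: list
    groups: list = field(default_factory=list)


def items_by_application(stores):
    """Map each application to the qualified data items it can reach."""
    usage = {}
    for store in stores:
        for app in store.applications:
            for group in store.groups:
                for item in group.items:
                    path = f"{store.name}.{group.name}.{item.name}"
                    usage.setdefault(app, []).append(path)
    return usage


customer = DataGroup("customer", [
    DataItem("customer_id", "integer", "unique"),
    DataItem("email", "text", "not null"),
])
crm = DataStore("crm_db", "eu-datacenter", ["crm_app", "billing_app"], [customer])
catalog = items_by_application([crm])
```

Here catalog answers the question "which data items does each application touch?", which is one of the mappings a data architecture is expected to document.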
3. Benefits of Data Architecture
1) Higher quality
It’s a blueprint for a building. About 70% of SW efforts fail.
Promotes consistency. Choose a path for success.
2) Reduced cost
Spending 10% earlier can reduce errors, oversights, and fixes by 70%.
3) Quicker time to market
Faster SW dev, catch errors early, comprehensive, better disaggregation.
4) Clearer scope
Precise requirements, stakeholder consensus, vocabulary agreement
5) Faster performance
Data systems tuning, discipline scales horizontally.
6) Better documentation (best practices, reference architecture)
Concepts agreement, auditability and policies, long-term support and maintenance.
7) Fewer application errors
Less confusion, common and re-usable, faster resolution.
8) Fewer data errors
Data errors are worse than app errors. Better referential integrity, less corruption.
9) Managed risk
Better estimate complexity, intensity, sizing, and project risks.
10) A good start for data mining
Documentation to warehouse/lake, BI, analytics, AI/ML
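As one concrete illustration of benefit 8 (better referential integrity), the sketch below uses Python's built-in sqlite3 module to show a declared foreign key rejecting an orphan row. The table and column names (customer, purchase, customer_id) are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: every purchase must point at an existing customer.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE purchase ("
    " purchase_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"
    " amount REAL NOT NULL)"
)

conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase VALUES (10, 1, 25.0)")  # valid reference

try:
    # Customer 99 does not exist, so this row would be an orphan.
    conn.execute("INSERT INTO purchase VALUES (11, 99, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

purchase_count = conn.execute("SELECT COUNT(*) FROM purchase").fetchone()[0]
```

The data error is stopped at the data layer, before any application code has a chance to read a purchase with no customer.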
4. Defining the Target State
3 traditional architectural processes*:
1) Conceptual – Represents all enterprise entities
Types and descriptions, such as customer, product, store, location, asset
2) Logical – Business PoV, represents the logic of how entities are related
Fully attributed, independent of DBMS, tech, storage, org
3) Physical – Realization of the data mechanisms for a specific type of functionality
Fully attributed, with data persistence tech, components, implementations
[Figure labels: "Atomic", "Whole Picture"]
* A reference architecture will document such things as HW, SW, processes,
specs, and configs, as well as logical components and interrelationships.
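The three levels above can be sketched in a few lines, assuming a toy model with two entities. Everything here is hypothetical: the Attribute type, the logical type names, the rule that the first attribute is the key, and the choice of SQLite as the physical target are all assumptions for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

# 1) Conceptual: entity types and descriptions only.
conceptual = {
    "store": "A location where products are sold",
    "customer": "A person or organization that buys products",
}


# 2) Logical: fully attributed and related, but independent of any DBMS.
@dataclass(frozen=True)
class Attribute:
    name: str
    logical_type: str  # "identifier", "text", or "number"
    references: Optional[str] = None  # name of a related entity, if any


logical = {
    "store": [Attribute("store_id", "identifier"), Attribute("city", "text")],
    "customer": [
        Attribute("customer_id", "identifier"),
        Attribute("name", "text"),
        Attribute("home_store_id", "identifier", references="store"),
    ],
}

# 3) Physical: the same model realized for one persistence technology.
SQLITE_TYPES = {"identifier": "INTEGER", "text": "TEXT", "number": "REAL"}


def to_sqlite_ddl(entity, attributes):
    """Render a CREATE TABLE statement; assumes the first attribute is the key."""
    columns = []
    for position, attr in enumerate(attributes):
        column = f"{attr.name} {SQLITE_TYPES[attr.logical_type]}"
        column += " PRIMARY KEY" if position == 0 else " NOT NULL"
        if attr.references:
            column += f" REFERENCES {attr.references}"
        columns.append(column)
    return f"CREATE TABLE {entity} ({', '.join(columns)})"


conn = sqlite3.connect(":memory:")
for entity, attributes in logical.items():
    conn.execute(to_sqlite_ddl(entity, attributes))

tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Only to_sqlite_ddl and SQLITE_TYPES know about SQLite, so the conceptual and logical levels would stay unchanged if the physical target were swapped for another technology.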
5. Influencers and Questions
Enterprise requirements
Users, functionality, performance, warehousing, sizing, access/speed, scalability
Technology drivers
Database designs, organization integrations, standards, new/legacy, virtualization,
existing resources
Economics
Costs, longevity, business cycles, budgeting, market conditions
Business policies
Regulations, security, governmental laws, prof. standards
Data processing needs
Transaction volumes/velocity/variety, accuracy, reproducibility, MIS, data mining,
periodic/ad-hoc reporting, product development
6. A Few Principles
Build decoupled systems
• Data → Store → Process → Store → Analyze → Answers
Use the right tool for the job
• Data structure, latency, throughput, access patterns
Leverage managed services
• Scalable/elastic, available, reliable, secure, no/low admin
Use log-centric design patterns
• Immutable logs, materialized views
Be cost-conscious
• Big data ≠ big cost
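The log-centric principle above (immutable logs, materialized views) can be illustrated with a minimal in-memory sketch. The names Event, EventLog, and BalanceView are hypothetical, and a real system would use a durable log service rather than a Python list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable fact; frozen=True blocks changes after creation."""
    account: str
    amount: int  # positive = deposit, negative = withdrawal


class EventLog:
    """Append-only log: events can be added and read, never edited."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read(self, offset=0):
        return tuple(self._events[offset:])


class BalanceView:
    """Materialized view: balances derived from the log, rebuildable any time."""

    def __init__(self):
        self.balances = {}
        self.offset = 0  # how far into the log this view has read

    def catch_up(self, log):
        for event in log.read(self.offset):
            current = self.balances.get(event.account, 0)
            self.balances[event.account] = current + event.amount
            self.offset += 1


log = EventLog()
view = BalanceView()

log.append(Event("alice", 100))
log.append(Event("bob", 50))
view.catch_up(log)

log.append(Event("alice", -30))
view.catch_up(log)  # only the new event is applied

rebuilt = BalanceView()
rebuilt.catch_up(log)  # a fresh view replays the whole log
```

The writer and the view share nothing except the log, so the store and process stages stay decoupled, and a lost or corrupted view can be rebuilt by replaying the log from offset 0.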
7. A Few Tools and Techniques
Business process and object modeling languages such as:
UML and BPMN
Data Modeling tools such as:
ER/Studio, erwin, PowerDesigner, or
the ArchiMate tool.
Helpful to know about Services Architectures such as:
The IT Infrastructure Library (ITIL)
and what an IT service catalog is
Methodologies such as:
The Integration Definition (IDEF) methodology
Disciplined Agile Delivery (DAD)
or the Zachman Framework
Reference Architecture
http://internetofthingsagenda.techtarget.com/definition/reference-architecture
A reference architecture is a document or set of documents to which a project manager or other interested party can refer for best practices. In information technology, a reference architecture can be used to select the best delivery method for particular technologies within an IT service catalog.
The reference may be built in-house or it may be supplied by a third-party service provider or vendor. Typically, a reference architecture will document such things as hardware, software, processes, specifications and configurations, as well as logical components and interrelationships.