Executive White Paper
Flow
Enterprise Data-Automation Framework and Generic-Hypercube Database
 
 
Data + Actions = Results
 
  
Harness All Data
 
Executive Briefing  
The most pressing issue facing IT leaders is how to integrate massive influxes of data from a variety of disconnected sources. Because data formats and structures are so disparate, incoming and existing data remains siloed, preventing a synchronized view of the entirety of an organization's data. This prevents complete analysis and limits the data's potential value.
Even after the integration of data into one consolidated stream of information, a new problem arises: how to efficiently analyze the collection to produce results that matter.
This typically involves database managers specializing in a query language such as SQL, data scientists specializing in an analytics language such as R, and algorithm experts specializing in a massively distributed parallel processing paradigm such as MapReduce. This process creates enormous overhead between raw data and insightful results.
Current solutions are crafted by patching together many disconnected components in an attempt to assemble this complete big data architecture, but the development process is long and resource-intensive. At 4D-IQ, we have architected an answer to these challenges.
Flow is the Most Complete and Simple Data Science Architecture 
Based on the concepts of Generic Data and Generic HyperCubes, Flow is the first true generic HyperCube data container...
Backed by a groundbreaking parallel processing architecture, Flow is one of the most 
efficient computation engines on the market. 
By abstracting away traditional coding methods and providing a simple interface to automate virtually any procedure across the entirety of your data, Flow delivers a breakthrough in data science.
 
Simply put... 
...the technology behind Flow can automate virtually any data science task... 
...irrespective of scale or complexity 
What Does Flow Do?
Flow is a universal ‘system of systems’ that provides streamlined, automated layers of communication logic between any number and variety of disconnected data systems and applications, at the highest scale possible.
By using Flow to leverage the numerous sources of data available (both internally and
externally), an enterprise can gain significant competitive advantage.
The Motto
 
Data + Actions = Results  
 
Universalize data and simplify actions to create meaningful, automated results.
 
The Objective
 
To automate any data science task irrespective of scale or complexity.
 
The Flow Ecosystem
 
The Flow ecosystem is derived from the core components that make up the entire data science process, A to Z:
❏ database creation and cell manipulation
❏ data integration
❏ database management and cleansing
❏ the new ability to create ‘HyperCubes’
❏ data analytics
❏ large scale data processing
❏ a visualization and reporting environment
❏ a community for sharing and collaborating on the Workflows you create
The premise of Flow is to abstract away and simplify the complexities associated with data science, ETL, and machine learning (such as syntax-specific custom code), without compromising the capabilities of programming...
Flow retains the raw power and functionality of any high-level programming language, but without the need for custom code or scripts. Workflow development is highly dynamic, allowing complex data science problems within an enterprise to be solved, automated, and easily maintained at an unprecedented rate.
The Components
 
 
Generic Data 
 
Flow contains an entirely new type of database architecture, which we call a ‘generic database’ or the ‘jagged HyperCube model’.
Generic data is a key-value data structure. The Flow system can adapt to restructure any and every* source of data into a single, universal format that is easily manipulated and transferred. Data persists in memory within the generic tables. The generic format is bidirectional, so data can be transferred effortlessly between any combination of data formats.
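Flow's internal format is proprietary and not published here; purely as an illustration, a generic key-value record set might be sketched in Python as follows (the helper names csv_to_generic, json_to_generic, and generic_to_json are hypothetical, not part of Flow):

```python
import csv
import json
from typing import Any, Dict, List

# A "generic" record is simply a flat key-value mapping; a generic table is a
# list of such records, so rows from different sources can sit side by side
# even when their keys differ (a "jagged" layout).
GenericRecord = Dict[str, Any]
GenericTable = List[GenericRecord]

def csv_to_generic(path: str) -> GenericTable:
    """Load a delimited file into the generic key-value layout."""
    with open(path, newline="") as fh:
        return [dict(row) for row in csv.DictReader(fh)]

def json_to_generic(path: str) -> GenericTable:
    """Load a JSON array of objects into the same layout."""
    with open(path) as fh:
        return [dict(obj) for obj in json.load(fh)]

def generic_to_json(table: GenericTable, path: str) -> None:
    """Write a generic table back out, illustrating bidirectionality."""
    with open(path, "w") as fh:
        json.dump(table, fh, indent=2)

# Records from different sources can then be combined into one jagged data set:
# combined = csv_to_generic("customers.csv") + json_to_generic("orders.json")
```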
Data from disparate systems can essentially interact as if it came from the same source. Supported sources include, but are not limited to:
● Local files
○ Excel
○ Access
○ Delimited
○ Positional
○ XML
○ JSON
● RDBMS and custom databases
○ MySQL
○ SQL Server
○ Hyperion
○ Informix
○ Postgres
○ SAP
○ Legacy Systems
● CRMs and common applications
○ Redtail
○ Pipedrive
○ Zoho
○ Salesforce
○ QuickBooks
○ Outlook
● Web APIs
○ Twitter
○ LinkedIn
○ Facebook
○ Instagram
○ HTML
○ RSS Feeds
○ Natural Language Text
○ LDAP
○ Google Feeds
○ Magento
○ eBay
○ Google Analytics
○ Yellow Pages
○ NY Times
○ Tap into the ‘Internet of Things’
○ Any RESTful APIs
○ Any SOAP APIs
 
 
*A custom plugin for any data source not listed above can be created in approximately one day.
 
Expression Builder 
 
Flow provides an advanced development environment on top of the generic data to facilitate the design and delivery of reusable, automatable Workflows.
The Expression Builder is a simple interface that provides a layer of abstraction over the programming languages and commands traditionally used for:
● Datapoint Management
○ Excel
● Database Management
○ SQL
● Data Analysis
○ R/Python  
The Expression Builder contains eleven complete function libraries plus special-function tabs. Workflows are produced by sequencing operations in the Expression Builder in a rapid and intuitive manner to create higher-level algorithmic procedures.
Flow presents a systematic way of feeding one function's output as the input to the next, instantaneously creating diverse new features and characteristics across the data. Flow provides the building blocks required to automate virtually any data science task.
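The Expression Builder itself is a visual interface; the sketch below only illustrates, in ordinary Python, the underlying idea of sequencing operations so that each step consumes the previous step's output (the Workflow class and the three example steps are hypothetical, not Flow's API):

```python
from typing import Any, Callable, List

class Workflow:
    """A minimal pipeline: each step receives the previous step's output."""

    def __init__(self) -> None:
        self.steps: List[Callable[[Any], Any]] = []

    def add_step(self, fn: Callable[[Any], Any]) -> "Workflow":
        self.steps.append(fn)
        return self  # allow fluent chaining

    def run(self, data: Any) -> Any:
        for step in self.steps:
            data = step(data)
        return data

# Hypothetical steps: normalize text fields, derive a new feature, filter rows.
normalize = lambda rows: [{k: str(v).strip() for k, v in r.items()} for r in rows]
add_full_name = lambda rows: [dict(r, full_name=f"{r['first']} {r['last']}") for r in rows]
keep_active = lambda rows: [r for r in rows if r.get("status") == "active"]

result = (Workflow()
          .add_step(normalize)
          .add_step(add_full_name)
          .add_step(keep_active)
          .run([{"first": " Ada ", "last": "Lovelace", "status": "active"}]))
print(result)
```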
 
 
 
Agent Architecture  
 
 
Flow is backed by a parallel processing architecture comparable to Hadoop's distributed file system, allowing for the distributed deployment and processing of massive data sets in a parallel, asynchronous manner.
Monitoring Agents are deployed to as many disconnected systems in as many locations as necessary, executing chunks of algorithmic logic in parallel and then pooling the results together at any desired frequency. The computation process is distributed over as many physical CPUs as required and is orchestrated by a master Workflow Agent via the cloud. Command-line entries are eliminated from this aspect of the system as well.
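The Agent protocol itself is not described in this paper; as a point of reference, the general fan-out-and-pool pattern it describes can be sketched with Python's standard library (the word-count task, chunking scheme, and worker count are arbitrary assumptions):

```python
from concurrent.futures import ProcessPoolExecutor
from collections import Counter
from typing import Iterable, List

def process_chunk(chunk: List[str]) -> Counter:
    """Stand-in for a Monitoring Agent's unit of algorithmic logic:
    here, a simple word count over its slice of the data."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts

def split(data: List[str], n_chunks: int) -> Iterable[List[str]]:
    """Divide the data set into roughly equal chunks for the workers."""
    size = max(1, len(data) // n_chunks)
    for i in range(0, len(data), size):
        yield data[i:i + size]

def run_parallel(data: List[str], workers: int = 4) -> Counter:
    """The 'master Workflow Agent' role: fan out chunks, pool the results."""
    total = Counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(process_chunk, split(data, workers)):
            total.update(partial)
    return total

if __name__ == "__main__":
    lines = ["alpha beta", "beta gamma", "gamma gamma delta"] * 1000
    print(run_parallel(lines).most_common(3))
```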
 
 
 
 
 
 
 
 
HyperCube 
 
 
Combine all disparate data sources together to create a massive, jagged data set. Flow's logarithmic-time algorithm then explodes this massive data set into a HyperCube, which creates all permutations for any number of dimensions that exist.

The HyperCube is a high-dimensional structure that vectorizes and contains every possible combination and link based on the dimensional axes of the data. The HyperCube is projected and displayed in a two-dimensional matrix format. Each vector of the HyperCube represents an independent aggregation of the underlying data point values, ranging from one to n dimensions.
The HyperCube data structure is a unique and powerful tool for training advanced AI learning procedures and optimization techniques. It scales linearly in a massively parallel fashion to accommodate (via live communication streams) all the data necessary as it is continually updated and generated across an enterprise.
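Flow's HyperCube construction is not detailed here; conceptually, "all permutations for any number of dimensions" resembles a classic data cube, with one aggregation per non-empty subset of the dimensional axes. A minimal pandas sketch of that idea (the column names and the sum aggregation are assumptions for illustration):

```python
from itertools import combinations
import pandas as pd

# A small data set with three dimensions and one measure.
df = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "year":    [2023, 2023, 2024, 2024],
    "sales":   [100, 150, 200, 250],
})

dimensions = ["region", "product", "year"]

# One aggregated "vector" per non-empty subset of dimensions,
# from 1-dimensional up to n-dimensional groupings.
cube = {}
for k in range(1, len(dimensions) + 1):
    for dims in combinations(dimensions, k):
        cube[dims] = df.groupby(list(dims))["sales"].sum()

# e.g. the 2-D ('region', 'product') slice of the cube:
print(cube[("region", "product")])
```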
 
 
 
 
 
 
 
Visualization 
 
 
Add Workflow steps to extract any subset or dimensional view from the HyperCube via pivot tables. Easily translate those pivot tables into the graphical representation of your choice.
Metrics and KPI calculation steps can be added as well. Group all of these reporting features into a dashboard and push it to the cloud for a painless, portable, and autonomous reporting experience.
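In Flow this is done through Workflow steps rather than code; for comparison, the same extract-pivot-plot pattern looks like this in pandas and matplotlib (the column names and chart type are placeholder assumptions):

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [120, 135, 90, 110],
})

# Extract a two-dimensional view of the data as a pivot table...
pivot = df.pivot_table(index="quarter", columns="region",
                       values="revenue", aggfunc="sum")

# ...then translate it directly into a graphical representation.
pivot.plot(kind="bar", title="Revenue by quarter and region")
plt.ylabel("Revenue")
plt.tight_layout()
plt.savefig("revenue_dashboard.png")  # publish the rendered chart as needed
```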
 
 
 
 
 
 
 
 
 
The Potential
 
Flow has the potential to automate and optimize virtually every aspect of data science within an enterprise. The Flow interface allows creativity and logic to completely replace coding expertise. The scope and diversity of procedures you can design are virtually limitless.
Workflow procedures can accomplish tasks including, but not limited to:
 
❖ Creating autonomous data communication streams across systems
❖ Implementing data standards across disconnected systems
❖ Linking and relating data across disconnected systems
❖ Creating automated cleansing logic
❖ Monitoring data systems in real time for anomalies
❖ Triggering condition-based notifications
❖ Migrating and unifying legacy systems
❖ Evaluating advanced conditions in data
❖ Scrubbing and standardizing address data
❖ Validating emails and customer names
❖ Automating reporting tasks across many systems
❖ Training advanced machine learning and predictive models
❖ Performing any type of statistical analysis
❖ Reconstructing and transforming datasets to create new features
❖ Validating and looking up city, state, and ZIP fields against the UPS database
❖ Performing fuzzy matches and fuzzy joins (see the sketch after this list)
❖ Identifying and extracting duplicates
❖ Delivering reports and streamlining dashboard generation
❖ Performing semantic analysis and semantic matching
❖ Implementing data dictionaries
❖ Executing FTP data transfers
❖ Finding hidden anomalies and patterns
❖ Performing optimized search across n systems
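As one concrete illustration of the fuzzy-matching item above, a basic fuzzy join can be approximated with Python's standard library alone (the similarity threshold and the name field are illustrative assumptions; Flow's own matcher is not described in this paper):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] measuring how closely two strings match."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def fuzzy_join(left, right, key: str, threshold: float = 0.85):
    """Pair each left record with the best-matching right record on `key`,
    keeping only pairs above the similarity threshold."""
    matches = []
    for l in left:
        best = max(right, key=lambda r: similarity(l[key], r[key]), default=None)
        if best and similarity(l[key], best[key]) >= threshold:
            matches.append((l, best))
    return matches

crm = [{"name": "Jon Smith"}, {"name": "Acme Corp."}]
erp = [{"name": "John Smith"}, {"name": "ACME Corporation"}]
for a, b in fuzzy_join(crm, erp, key="name"):
    print(a["name"], "<->", b["name"])
```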
Summary
 
Have one master Workflow orchestrate the execution of as many sub-Workflows as needed across all disconnected systems via autonomous streaming Agents.
Use the system's join and set operations to rapidly cleanse and unify the entirety of an enterprise's internal data, as well as any desired target data from external sources, into one all-encompassing jagged generic data set.
Warp this jagged generic data set into a high-dimensional generic HyperCube that contains every link and relation across every dimensional axis within your data.
Optimize and train learning algorithms across the HyperCube; extract and then group together the essential statistical evaluators for an autonomous reporting experience.
The scope and diversity of what can be achieved in this system are bounded only by the creativity of the user.
 
 
 
 
 
 
 
 
 
 
 
 
 
For further inquiry, please contact: 
 
 
Hypercube Artificial Intelligence Division
 
Andrew McLaughlin, CEO and Lead Developer
andrew.mclaughlin@4diq.com
M: 1.484.283.530
Jeremy Villar, CMO and Developer
jeremy.villar@uconn.edu
M: 1.860.309.2788 