Executive White Paper

Flow
Enterprise Data-Automation Framework and Generic-Hypercube Database

Data + Actions = Results
Harness All Data

Executive Briefing

The most overwhelming issue facing IT leaders is how to integrate mass influxes of data from a variety of disconnected sources. Because data formats and structures are disparate, incoming and existing data is siloed, preventing a synchronized view of the entirety of an organization's data. This prohibits complete analysis, limiting the data's potential value.

Even after the integration of data into one consolidated stream of information, a new problem arises: how to efficiently analyze the collection to produce results that matter. This typically involves database managers specializing in a query language such as SQL, data scientists specializing in a data-analytics language such as R, and algorithms experts specializing in a massively distributed parallel-processing paradigm such as MapReduce. This process creates an extreme amount of overhead that stands between raw data and insightful results.

Current solutions are crafted by patching together many disconnected components in an attempt to assemble a complete big-data architecture, but the development process is long and resource intensive. At 4D-IQ, we have architected an answer to these challenges.

Flow is the Most Complete and Simple Data Science Architecture

Based on the concepts of Generic Data and Generic HyperCubes, Flow is the first true generic HyperCube data container...

Backed by a groundbreaking parallel-processing architecture, Flow is one of the most efficient computation engines on the market.

By abstracting traditional coding methods and providing a simple interface to automate virtually any procedure across the entirety of your data, Flow delivers a breakthrough in data science.

Simply put...
...the technology behind Flow can automate virtually any data science task...
...irrespective of scale or complexity.
What Does Flow Do?

Flow is a universal 'system of systems' that provides streamlined, automated layers of communication logic between any number and variety of disconnected data systems and applications, at the highest scale possible. By using Flow to leverage the numerous sources of data available (both internally and externally), an enterprise can gain significant competitive advantage.

The Motto

Data + Actions = Results

Universalize data and simplify actions to create meaningful, automated results.

The Objective

To automate any data science task irrespective of scale or complexity.

The Flow Ecosystem

The Flow ecosystem is derived from the core premises that compose the entire process, A-Z, of data science:

❏ database creation and cell manipulation
❏ data integration
❏ database management and cleansing
❏ the new ability to create 'HyperCubes'
❏ data analytics
❏ large-scale data processing
❏ a visualization and reporting environment
❏ a community to share and collaborate on Workflows

The premise of Flow is to abstract away and simplify the complexities associated with data science, ETL, and machine learning (such as syntax-specific custom code) without compromising the capabilities of programming. Flow retains the raw power and functionality of any high-level programming language, but without the need for custom code or scripts. Developing Workflows is extremely dynamic, allowing complex data science problems within an enterprise to be solved, automated, and easily maintained at an unprecedented rate.
The Components

Generic Data

Flow contains an entirely new type of database architecture which we call a 'generic database' or the 'jagged HyperCube model'. Generic data is a key-value data structure. The Flow system can adapt to restructure any and every* source of data into a single, universal format that is easily manipulated and transferred (a minimal sketch of the idea follows the source list below). Data persists in-memory within the generic tables. The generic data is bidirectional, so data can be transferred effortlessly between any combination of data formats; data from disparate systems can essentially interact as if it came from the same source. Supported sources include, but are not limited to:

● Local files
  ○ Excel
  ○ Access
  ○ Delimited
  ○ Positional
  ○ XML
  ○ JSON
● RDBMS and custom databases
  ○ MySQL
  ○ SQL Server
  ○ Hyperion
  ○ Informix
  ○ Postgres
  ○ SAP
  ○ Legacy systems
● CRMs and common applications
  ○ Redtail
  ○ Pipedrive
  ○ Zoho
  ○ Salesforce
  ○ QuickBooks
  ○ Outlook
● Web APIs
  ○ Twitter
  ○ LinkedIn
  ○ Facebook
  ○ Instagram
  ○ HTML
  ○ RSS feeds
  ○ Natural-language text
  ○ LDAP
  ○ Google Feeds
  ○ Magento
  ○ eBay
  ○ Google Analytics
  ○ Yellow Pages
  ○ NY Times
  ○ The 'Internet of Things'
  ○ Any RESTful API
  ○ Any SOAP API

*A custom plugin for any unsupported data source can be created in approximately one day.
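Flow's internal record layout is not published in this paper, so the following is a minimal, hypothetical sketch of the "generic data" idea: rows from any source are flattened into uniform (source, record_id, key, value) tuples so that, say, a CSV row and a JSON API payload can be filtered and joined as if they came from the same table. All names and structures here are illustrative assumptions.

```python
# Hypothetical sketch of a "generic data" key-value layout. The tuple shape
# and helper names are illustrative; Flow's actual schema is not published.
from typing import Any, Dict, Iterable, List, Tuple

GenericCell = Tuple[str, int, str, str]  # (source, record_id, key, value)

def to_generic(source: str, records: Iterable[Dict[str, Any]]) -> List[GenericCell]:
    """Flatten heterogeneous records into one universal key-value form."""
    cells: List[GenericCell] = []
    for rid, record in enumerate(records):
        for key, value in record.items():
            cells.append((source, rid, key.lower(), str(value)))
    return cells

# Two "disparate" sources: a CSV-like table and a JSON-like API payload.
crm_rows = [{"Name": "Acme Corp", "State": "CT"}]
api_rows = [{"name": "Acme Corp", "followers": 1200}]

table = to_generic("crm", crm_rows) + to_generic("api", api_rows)

# Both sources now share one schema and can be queried uniformly:
print([c for c in table if c[2] == "name" and c[3] == "Acme Corp"])
```

Because every cell carries its own key, rows with different columns ('jagged' data) can coexist in the same table without a fixed schema.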
Expression Builder

Flow provides an advanced development environment on top of the generic data to facilitate the design and delivery of reusable, automatable Workflows. The Expression Builder is a simple interface that provides a layer of abstraction over the languages and commands involved in:

● Datapoint management
  ○ Excel
● Database management
  ○ SQL
● Data analysis
  ○ R/Python

The Expression Builder contains eleven complete libraries of functions plus special-function tabs. Workflows are produced by sequencing operations together in the Expression Builder in a rapid, intuitive manner to create higher-level algorithmic procedures. Flow presents a systematic way of feeding one function's output as an input into the next, instantaneously creating diverse new features and characteristics across the data (see the pipeline sketch at the end of this section). Flow provides the building blocks required to automate virtually any data science task.

Agent Architecture

Flow is backed by a parallel-processing architecture comparable to Hadoop's distributed file system, allowing distributed deployment and processing of massive data sets in a parallel, asynchronous manner. Monitoring Agents are deployed to as many disconnected systems, at as many locations, as necessary; they execute chunks of algorithmic logic in parallel and then pool the results together at any desired frequency. The computation process is distributed over as many physical CPUs as required and is orchestrated by a master Workflow Agent via the cloud. Command-line entries are eliminated from this aspect of the system as well.
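The Expression Builder itself is proprietary, but the composition pattern it describes, feeding each function's output into the next, can be sketched with plain Python. The step functions below are hypothetical stand-ins for Flow's function libraries.

```python
# Minimal sketch of Workflow composition: each step's output feeds the next.
# Step names are hypothetical stand-ins for Flow's function libraries.
from functools import reduce

def workflow(*steps):
    """Compose steps left-to-right into one reusable pipeline."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)

clean  = lambda rows: [r.strip().lower() for r in rows]
dedupe = lambda rows: sorted(set(rows))
enrich = lambda rows: [{"value": r, "length": len(r)} for r in rows]

standardize = workflow(clean, dedupe, enrich)
print(standardize(["  Acme ", "ACME", "Widget Co"]))
# -> [{'value': 'acme', 'length': 4}, {'value': 'widget co', 'length': 9}]
```

Similarly, the Agent pattern, a master orchestrator fanning chunks of work out to parallel workers and pooling the results, can be approximated with the standard library. This is an analogy to the architecture described above, not Flow's actual agent protocol.

```python
# Hedged sketch of master/agent fan-out: chunks of work run in parallel and
# the partial results are pooled, as the master Workflow Agent is described
# doing. Uses the standard library; Flow's agent protocol is not published.
from concurrent.futures import ProcessPoolExecutor

def agent_task(chunk):
    """One 'agent' executing its chunk of algorithmic logic."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i::4] for i in range(4)]  # split work across 4 "agents"
    with ProcessPoolExecutor(max_workers=4) as pool:
        partials = pool.map(agent_task, chunks)
    print(sum(partials))  # pooled result matches the single-machine answer
```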
HyperCube

Combine all disparate data sources together to create a massive, jagged data set. Flow's logarithmic-time algorithm then explodes this data set into a HyperCube, creating all permutations across any number of dimensions.

The HyperCube is a high-dimensional structure which vectorizes and contains every possible combination and link along the dimensional axes of the data. The HyperCube is projected and displayed in a two-dimensional matrix format. Each vector of the HyperCube represents an independent aggregation of the underlying data point values, ranging from one to n dimensions (a small sketch of this axis-subset aggregation follows the Visualization section below). The HyperCube data structure is a unique and powerful tool for training advanced AI learning procedures and optimization techniques. It scales linearly in a massively parallel fashion to accommodate (via live communication streams) all the data necessary as it is updated and generated across an enterprise.

Visualization

Add Workflow steps to extract any subset or dimensional view from the HyperCube via pivot table. Easily translate those pivot tables into a choice of graphical representations. Metrics and KPI calculation steps can be added as well. Group all of these reporting features into a dashboard and push it to the cloud for a painless, portable, and autonomous reporting experience.
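The paper does not define the HyperCube construction precisely, but "every possible combination and link" along the dimensional axes reads like the classic data cube: one independent aggregation per subset of dimensions, the result SQL's GROUP BY CUBE produces. A hedged sketch with illustrative data:

```python
# Hedged sketch of the HyperCube idea as a classic data cube: one independent
# aggregation per subset of dimension axes. Data and columns are illustrative.
from itertools import combinations
import pandas as pd

df = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "sales":   [100, 150, 200, 250],
})

dimensions = ["region", "product"]
cube = {}
for r in range(1, len(dimensions) + 1):
    for axes in combinations(dimensions, r):
        # One "vector" of the cube: an aggregation along this axis subset.
        cube[axes] = df.groupby(list(axes))["sales"].sum()

print(cube[("region",)])            # totals by region (1-dimensional view)
print(cube[("region", "product")])  # totals by region x product
```

Note that the number of axis subsets doubles with each added dimension, which is presumably why the paper stresses parallel, linear scaling of the construction.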
The Potential

Flow has the potential to automate and optimize virtually every data science aspect within an enterprise. The Flow interface allows creativity and logic to completely replace coding expertise. The scope and diversity of procedures you can design is virtually limitless. Workflow procedures can accomplish, but are not limited to:

❖ Creating autonomous data-communication streams across systems
❖ Implementing data standards across disconnected systems
❖ Linking and relating data across disconnected systems
❖ Creating automated cleansing logic
❖ Monitoring data systems in real time for anomalies
❖ Triggering condition-based notifications
❖ Migrating and unifying legacy systems
❖ Evaluating advanced conditions in data
❖ Scrubbing and standardizing address data
❖ Validating emails and customer names
❖ Automating reporting tasks across many systems
❖ Training advanced machine learning and predictive models
❖ Performing any type of statistical analysis
❖ Reconstructing and transforming datasets to create new features
❖ Validating and looking up city, state, and ZIP fields against the UPS database
❖ Performing fuzzy match-ups and fuzzy joins (see the sketch after this list)
❖ Identifying and extracting duplicates
❖ Delivering reports and streamlining dashboard generation
❖ Performing semantic analysis and semantic matching
❖ Implementing data dictionaries
❖ Executing FTP data transfers
❖ Finding hidden anomalies and patterns
❖ Performing optimized search across n systems
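As one concrete illustration of a listed capability, fuzzy match-ups can be sketched with the standard library's difflib. Flow's actual fuzzy-join implementation is not described in this paper, and the threshold and sample data below are illustrative assumptions.

```python
# Hedged sketch of fuzzy match-ups between two systems' name fields, using
# difflib from the standard library. Threshold and data are illustrative.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

crm_names = ["Acme Corporation", "Widget Co.", "Globex LLC"]
erp_names = ["ACME Corp", "Widgets Company", "Initech"]

THRESHOLD = 0.6  # illustrative cutoff; tune per data set
for crm in crm_names:
    best = max(erp_names, key=lambda erp: similarity(crm, erp))
    score = similarity(crm, best)
    if score >= THRESHOLD:
        print(f"{crm!r} matches {best!r} (score {score:.2f})")
```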
Summary

Have one master Workflow orchestrate the execution of as many sub-Workflows as needed across all disconnected systems via autonomous streaming Agents. Use the system's join and set operations to rapidly cleanse and unify the entirety of an enterprise's internal data, along with any desired target data from external sources, into one all-encompassing jagged generic data set. Warp this jagged generic data set into a high-dimensional generic HyperCube which contains every link and relation across every dimensional axis within your data. Optimize and train learning algorithms across the HyperCube; extract and then group together essential statistical evaluators for an autonomous reporting experience. The scope and diversity of what can be achieved in this system is bounded only by the creativity of the user.

For further inquiry, please contact:

Hypercube Artificial Intelligence Division

Andrew McLaughlin, CEO and Lead Developer
andrew.mclaughlin@4diq.com
M: 1.484.283.530

Jeremy Villar, CMO and Developer
jeremy.villar@uconn.edu
M: 1.860.309.2788
