Competitive Advantage. Elegantly Engineered.
A Different Data Science Methodology
We use data, analytics, and design to help clients
perform at their best.
Machine Intelligence catalyzes innovation, engineers machine learning
applications, and builds enduring capabilities.
We’re creative, rigorous, and efficient. We bring the sophistication of a
large strategy firm with the speed and value of a focused boutique.
We apply proven techniques, designs, and world-class expertise to:
• Improve how companies engage customers
• Optimize machine performance
• Enhance process results
Machine Learning is Simple
Models reproduce how questions are answered in training data.
Business, not IT, should design training data.
Most project time is used understanding how data is generated and building training data sets.
Real world: The full set of all consumers, machines, or business results that a model will forecast.
Training data: Generally a subset of scenarios in the real world.
Results: Data trains models that reproduce decisions in the training data with 80-95% accuracy.
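The relationship between real world, training data, and results can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the firm's method: the population, the `monthly_spend` feature, the 10% rate of inconsistent past decisions, and the single-threshold model are all hypothetical and chosen only to keep the example small.

```python
import random

def make_real_world(n, rng):
    """Hypothetical population of (monthly_spend, decision) pairs."""
    population = []
    for _ in range(n):
        spend = rng.uniform(0, 100)
        decision = 1 if spend > 50 else 0
        if rng.random() < 0.10:  # assumed: 10% of past decisions are inconsistent
            decision = 1 - decision
        population.append((spend, decision))
    return population

def fit_threshold(training_data):
    """Pick the spend threshold that best reproduces the training decisions."""
    best_threshold, best_correct = 0.0, -1
    for candidate, _ in training_data:
        correct = sum(
            (spend > candidate) == bool(decision)
            for spend, decision in training_data
        )
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold

def accuracy(threshold, data):
    """Share of records where the model matches the recorded decision."""
    hits = sum((spend > threshold) == bool(decision) for spend, decision in data)
    return hits / len(data)

rng = random.Random(0)
real_world = make_real_world(1000, rng)       # the full set the model will forecast
training_data = rng.sample(real_world, 200)   # a subset of real-world scenarios
threshold = fit_threshold(training_data)

print(f"learned threshold: {threshold:.1f}")
print(f"accuracy on real world: {accuracy(threshold, real_world):.0%}")
```

The model can only reproduce the decisions it was shown, so inconsistent decisions in the training data cap the accuracy it can reach on the real world.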
A Different Data Science Methodology
Many data science projects jump into
algorithms and technology.
We reverse the usual approach by first
rigorously defining the business question
and understanding data.
The methodology:
• Aligns the whole business
• Sets practical expectations
• Leads change
• Builds sustaining capabilities
[Diagram: time and focus given to business goals, business question, data, and technology]
Steps
1. Foundation
• Define the business question
• Align change across the business
• Understand data
2. Model
• Build training data
• Pilot models
• Iterate production model
3. Results
• Build application
• Communicate value
• Sustain capabilities
Project Phasing
• Most time is spent understanding data and building training data.
• An early pilot is key to refining training data and building support for change.
• Developing the full application starts early with a UX for the pilot model.
1. Set Foundation
A. Define the business question
• Learn and set expectations on the data science process and cloud hosting.
• Define precise business questions.
• Model how answering the business question delivers results.
B. Align change
• Link business and regulatory needs to training data design and algorithm selection, e.g. does a model require easy explainability?
• Build a coalition of sponsors and communicate the vision.
• Define roles for compliance, customer service, finance, marketing, product, and sales.
C. Understand data
• Understand the data generating process: genchi genbutsu.
• Visualize the “shape of the data”: distributions, sensitivity, clusters, anomalies, and sparseness. Identify quality issues.
• Capture rules and map data flows from source systems.
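A first pass at the “shape of the data” can be as simple as profiling each column for sparseness, typical values, and anomalies. The sketch below is one possible approach, assuming missing values are recorded as `None` and using a median-based outlier rule; the `profile_column` helper, the `mad_multiplier` parameter, and the sample sensor readings are hypothetical.

```python
import statistics

def profile_column(values, mad_multiplier=5.0):
    """Summarize one column: sparseness, median, spread, and anomalies."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = statistics.median(abs(v - median) for v in present)
    anomalies = [v for v in present if abs(v - median) > mad_multiplier * mad]
    return {
        "sparseness": (len(values) - len(present)) / len(values),
        "median": median,
        "mad": mad,
        "anomalies": anomalies,
    }

# Hypothetical machine temperature readings with gaps and one suspect value.
readings = [12.0, 11.5, None, 12.3, 11.8, 95.0, None, 12.1, 11.9, 12.2]
profile = profile_column(readings)

print(f"sparseness: {profile['sparseness']:.0%}")
print(f"anomalies:  {profile['anomalies']}")
```

A flagged value is a prompt to go back to the data generating process, not a verdict: the reading of 95.0 may be a sensor fault or a real event the model needs to learn.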
2. Build Models
A. Build training data
• Form business and IT team: roles, super-labelers, biases.
• Design the data set’s scenarios and set quality criteria.
• Visualize attributes and confirm with business sponsors.
B. Pilot models
• Define rules to pre-process data and select open source algorithms.
• Visualize and communicate results. Show an early win. Ideally, prototype the UX.
• Plan enhancements to training data, algos, and applications.
C. Iterate production models
• Refine data (feature shaping and dimensionality reduction).
• Customize rules and algorithms.
• Connect into the broader application starting with the data model.
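Pre-processing rules and data refinement can start small. The sketch below shows one feature-shaping rule (min-max scaling) and a low-variance filter, which is a very simple stand-in for dimensionality reduction and not a full technique such as PCA. The field names and sample records are hypothetical.

```python
import statistics

def scale_min_max(rows, field):
    """Rule: rescale one numeric field to the 0-1 range."""
    values = [row[field] for row in rows]
    low, high = min(values), max(values)
    span = high - low
    return [
        {**row, field: (row[field] - low) / span if span else 0.0}
        for row in rows
    ]

def drop_low_variance(rows, min_variance=1e-9):
    """Remove fields whose values barely change across the data set."""
    keep = [
        field for field in rows[0]
        if statistics.pvariance([row[field] for row in rows]) > min_variance
    ]
    return [{field: row[field] for field in keep} for row in rows]

# Hypothetical customer records; region_code is identical in every row.
rows = [
    {"visits": 2, "spend": 10.0, "region_code": 1},
    {"visits": 4, "spend": 30.0, "region_code": 1},
    {"visits": 6, "spend": 50.0, "region_code": 1},
]

shaped = drop_low_variance(scale_min_max(rows, "spend"))
for row in shaped:
    print(row)
```

Keeping each rule as a small named function makes it easier for business sponsors to review what was done to the data before any algorithm sees it.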
3. Deliver Results
A. Build application
• Visualize UX, define data model and APIs.
• Set non-functional requirements such as scalability, latency, and security.
• Define test plan.
B. Communicate value
• Communicate how the solution makes jobs better and brings value to customers.
• Build understanding and support with key influencers.
• Use multiple channels (meetings, email, calls) repeatedly to make sure the message reaches people.
C. Sustain capabilities
• Optimize costs and scalability. Plan for decreased costs.
• Confirm team skills and capacity to evolve the models.
• Plan and automate model re-training. Set expectations that models may expand the range of scenarios covered and/or may improve precision.
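Automated re-training needs a trigger. One common option, sketched here as an assumption and not as a prescribed design, is to track accuracy over a rolling window of recent predictions whose real outcomes are known and flag re-training when it drops below a floor. The `RetrainingMonitor` class, the window size, and the 80% floor are hypothetical.

```python
from collections import deque

class RetrainingMonitor:
    """Tracks recent prediction accuracy and flags when to re-train."""

    def __init__(self, window_size=100, min_accuracy=0.80):
        self.outcomes = deque(maxlen=window_size)  # oldest results drop off
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Store whether a prediction matched the real outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if it is empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """Only decide once the window is full, to avoid noisy early alarms."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy

monitor = RetrainingMonitor(window_size=5, min_accuracy=0.80)
for _ in range(5):
    monitor.record(predicted="renew", actual="renew")
healthy = monitor.needs_retraining()

for _ in range(2):
    monitor.record(predicted="renew", actual="cancel")
degraded = monitor.needs_retraining()

print(f"after 5 correct predictions: re-train = {healthy}")
print(f"after 2 misses:              re-train = {degraded}")
```

In practice the flag would start a scheduled job that rebuilds the training data and re-fits the model; that wiring depends on the hosting platform and is left out here.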
Contact
Machine Intelligence Partners LLC serves clients
globally. Our people are centered in Boston,
Bozeman, Grand Rapids, London, New York, San
Francisco, and Washington, D.C.
Client relationship leaders:
New York
Jeremy Lehman
917.225.2011
jeremy.lehman@machineintel.com
Washington, D.C.
Philippe Berckmans
804.405.6009
philippe.berckmans@machineintel.com
Machine Intelligence is an Amazon Technology Partner
and member of the Microsoft Partner Network.
We are a veteran-owned small business.

Machine intelligence data science methodology 060420