Predictive Modelling
Preserving Data Heuristics Present for Solutioning
Data Science Life Cycle
1 - Scale Wide Data: Explicit Feature Selection (Dimensionality Reduction Methods)
2 - Modelled Data: Implicit Feature Selection (Modelling / Segmentation Algorithms)
3 - Unchecked Bounds: Unbounded at Leaf Nodes; Unchecked Attributes (Greedy Attribute Split)
4 - Unprocessed Dimensions: Features Left in Steps 1 to 3
5 - Unprocessed Samples: Left by Modelling Algorithms to Reduce Variance
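
Step 3 can be illustrated with a small sketch. It assumes the simplest possible greedy model, a one-level decision tree (a stump) chosen by Gini impurity, and made-up toy data; none of the function names or values come from the slides. The point it shows is that a greedy split checks only the attribute it selects, so every other attribute stays unbounded at the leaf.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_stump(rows, labels):
    """Greedy search for the single (feature, threshold) split with the
    lowest weighted Gini impurity. Hypothetical helper for illustration."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    _, f, t, left_label, right_label = best
    return f, t, left_label, right_label

def predict(stump, row):
    """Classify a row using only the one attribute the stump split on."""
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

# Toy data (assumed): feature 0 separates the classes,
# feature 1 stays between 10 and 14 for every training sample.
rows = [(1.0, 10.0), (2.0, 12.0), (3.0, 11.0),
        (7.0, 13.0), (8.0, 12.0), (9.0, 14.0)]
labels = ["a", "a", "a", "b", "b", "b"]

stump = best_stump(rows, labels)
in_range = predict(stump, (2.0, 11.0))
out_of_range = predict(stump, (2.0, 9999.0))  # feature 1 far outside training range
```

Both samples receive the same class because the stump never looks at feature 1. A value of 9999.0 passes through unnoticed, which is the unchecked-bounds gap the later steps aim to close.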
Extract Per-Class Feature Heuristics
Automate Bound Checks
Automate: Confidence Score Generation, Bound Check Violations
Configurable Bound Checks
Index
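
The slides do not define what a per-class feature heuristic is. The sketch below assumes one simple reading: the observed minimum and maximum of each feature within each class. The function names, the margin parameter, and the confidence formula (the share of features inside their class range) are all illustrative assumptions, not the author's method.

```python
def extract_class_bounds(rows, labels):
    """Per class, record the (min, max) observed for every feature.
    Assumed stand-in for 'per-class feature heuristics'."""
    bounds = {}
    for row, label in zip(rows, labels):
        if label not in bounds:
            bounds[label] = [(v, v) for v in row]
        else:
            bounds[label] = [(min(lo, v), max(hi, v))
                             for (lo, hi), v in zip(bounds[label], row)]
    return bounds

def check_bounds(bounds, label, row, margin=0.0):
    """Return the indices of features that fall outside the range seen
    for the predicted class. margin widens each range by a fraction of
    its width, which makes the check configurable."""
    violations = []
    for i, ((lo, hi), v) in enumerate(zip(bounds[label], row)):
        slack = margin * (hi - lo)
        if v < lo - slack or v > hi + slack:
            violations.append(i)
    return violations

def confidence(bounds, label, row, margin=0.0):
    """Assumed score: fraction of features inside the class bounds,
    returned together with the list of violating feature indices."""
    violations = check_bounds(bounds, label, row, margin)
    return 1.0 - len(violations) / len(row), violations

# Toy training data (assumed), two features and two classes.
rows = [(1.0, 10.0), (2.0, 12.0), (3.0, 11.0),
        (7.0, 13.0), (8.0, 12.0), (9.0, 14.0)]
labels = ["a", "a", "a", "b", "b", "b"]

bounds = extract_class_bounds(rows, labels)
score_ok, viol_ok = confidence(bounds, "a", (2.0, 11.0))
score_bad, viol_bad = confidence(bounds, "a", (2.0, 9999.0))
strict = check_bounds(bounds, "a", (2.0, 12.5))
relaxed = check_bounds(bounds, "a", (2.0, 12.5), margin=0.5)
```

A sample predicted as class "a" with every feature in range scores 1.0, while one with feature 1 far out of range scores 0.5 and reports that feature as a violation. Raising margin relaxes the check, so a borderline value such as 12.5 is flagged at the default setting but accepted at margin=0.5.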
References – PhD Thesis / Book
• Feature Selection Problem
• Feature Selection for High Dimensional Data
