Big Data trap
francis@qmining.com
@fraka6
Data/Big Data -> Knowledge -> Action
People care about knowledge/actions, not data
Agenda
● Big data dilemma
● When are we doing Big Data?
● Maturity/Evolution steps
● The big data trap
● Optimal design = real-time data-mining
● Increase your chances of success
The Big Data Dilemma
Big Data = Data + IO bound (disk)
● CPU < 100% -> data is IO bound
Maturity levels
[Figure: maturity levels (QA, BI) against barriers of entry]
Big Data = just another barrier of entry
Trap = no KPI
● No KPI -> batch processing -> big data
● KPI -> real time -> no big data complexity
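A minimal sketch of the second bullet, assuming the KPI is an error rate (the KPI choice, the `RunningKPI` name, and the event fields are hypothetical, not from the slides). It illustrates why a KPI defined up front can be maintained per event, with no stored history to batch-process later:

```python
from dataclasses import dataclass


@dataclass
class RunningKPI:
    """Error-rate KPI updated one event at a time (hypothetical example)."""
    total: int = 0
    errors: int = 0

    def update(self, event: dict) -> float:
        # Each event only touches two counters, so no raw data
        # has to be kept around for a later batch job.
        self.total += 1
        if event.get("status") == "error":
            self.errors += 1
        return self.rate

    @property
    def rate(self) -> float:
        # Guard against division by zero before the first event arrives.
        return self.errors / self.total if self.total else 0.0


kpi = RunningKPI()
events = [{"status": "ok"}, {"status": "error"}, {"status": "ok"}, {"status": "ok"}]
for event in events:
    kpi.update(event)
```

Memory use stays constant however many events arrive, which is the sense in which the KPI-first design avoids big data complexity.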
Optimal design = real-time data-mining
● Events -> everything is an event
● + Rule -> create signal from events
● + KPIs -> selection of signals (top level)
● + Incident = signal crossing static/dynamic thresholds
● + Root cause analysis
○ Bayesian inference (ratio signal)
○ Signal correlation (std signal)
○ Rule filtering (domain specific)
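The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's implementation: the event fields (`t`, `status`, `host`), the function names, the window size, and the thresholds are all hypothetical. The root-cause step here is a plain per-host error ratio, a simplified stand-in for the Bayesian inference named on the slide.

```python
from collections import Counter
from statistics import mean, pstdev


def rule_error_signal(events, window):
    """Rule: turn raw events into a signal = error count per time window."""
    counts = Counter(e["t"] // window for e in events if e["status"] == "error")
    n_windows = max(e["t"] for e in events) // window + 1
    return [counts.get(i, 0) for i in range(n_windows)]


def static_incidents(signal, threshold):
    """Incident when the signal exceeds a fixed threshold."""
    return [i for i, v in enumerate(signal) if v > threshold]


def dynamic_incidents(signal, history=3, k=3.0):
    """Incident when the signal exceeds mean + k * std of the preceding windows."""
    incidents = []
    for i in range(history, len(signal)):
        past = signal[i - history:i]
        # With a perfectly flat history the std is 0, so any increase triggers.
        if signal[i] > mean(past) + k * pstdev(past):
            incidents.append(i)
    return incidents


def ratio_by(events, window_index, window, key):
    """Root-cause hint: share of the window's errors per value of `key`."""
    errors = [e for e in events
              if e["t"] // window == window_index and e["status"] == "error"]
    counts = Counter(e[key] for e in errors)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}


# Hypothetical event stream: a steady trickle of errors, then a burst on host "b".
events = [{"t": t, "status": "ok", "host": "a"} for t in range(0, 50, 5)]
events += [{"t": t, "status": "error", "host": "a"} for t in (3, 13, 23, 33)]
events += [{"t": t, "status": "error", "host": "b"} for t in (41, 42, 43, 44, 46, 47)]
events += [{"t": 48, "status": "error", "host": "a"}]

signal = rule_error_signal(events, window=10)
static_hits = static_incidents(signal, threshold=5)
dynamic_hits = dynamic_incidents(signal)
ratios = ratio_by(events, window_index=4, window=10, key="host")
suspect = max(ratios, key=ratios.get)
```

Both threshold styles flag the last window in this toy stream, and the ratio points at host "b" as the likely source. Every stage works on the events as they arrive or on small derived signals, which is what keeps the design simple.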
Increase chances of success
● Data driven culture
● Data quality culture (Avoid logs)
● Reach Analytics/BI level
● KISS
Recap
● Big Data = Small Data + IO bound
● Big Data -> Data -> Analytics -> Mining -> Predictive
○ Data Quality = BIGGEST PROBLEM
○ Big Data = another barrier of entry
● Big data trap = no KPI
● KISS = real-time data-mining
hum...
Questions?
francis@qmining.com
