
Big Data in Malaysia - Emerging Sector Profile

Big Data is a buzz term with global traction. But while interest and awareness are high, is that buzz being converted effectively into significant economic activity in Malaysia? What are the inhibitors to driving Big Data solutions? And where are the opportunities we should nurture? In this presentation, Big Data Malaysia shares insights from a new survey of a range of stakeholders in this emerging industry.

Published in: Business, Technology

  1. Converting Buzz into Activity in the Big Data Ecosystem. Sandra Hanchard & Tirath Ramdas, Big Data Malaysia. Big Data World Show, JW Marriott, Kuala Lumpur, November 2013.
  2. 10 events since May 2012. Plus 9 during • A networking group for people passionate about data. • We talk about applications in social media analytics, financial data, consumer insight, telecommunications, etc. • Participants from end-users, vendors, academia – engineers, analysts, managers, professors, entrepreneurs. • Wide technical breadth (Hadoop, R, Greenplum, Postgres, Cassandra, MongoDB, 0MQ, Prudsys, Storm, Acunu Analytics, Google BigQuery, Dremel, Oracle, Datasift, Tableau, GPUs, NetApp, Hive, HBase, AWS, MySQL, Teradata…). www.BigDataMalaysia.org
  3. Rationale: Opportunities and inhibitors to Big Data activity in Malaysia? • Who’s interested vs. involved? • What is the current and future capacity for big data skills? • Where are the critical gaps in skills? • What are the soft inhibitors, including data access, regulation and perception?
  4. About the survey.. Methodology: • Collection over October 2013 • Distribution via Big Data Malaysia email & social media channels and partners • 108 respondents over 90 organizations • Not intended to be representative. Illustrative of Big Data Malaysia network. Content: • About you • Enabling Your Organization • End-Uses • Skills • Capabilities • Data Sources. This deck contains preliminary findings: we are generating hypotheses for further analysis.
  5. Lucky draw sponsors
  6. Technical assistance
  7. Half of respondents in ICT, even spread across other sectors. [Pie chart, n = 108. Sectors: Information and communications technology (48%); Marketing services; Professional, scientific, and technical services; Educational services; Media and Classifieds; Finance and Insurance; Government; Manufacturing; Not applicable (e.g. I am a student); Other. Two further slices are labelled 7% and 8%.] Preliminary findings - November 2013
  8. 48% management, 40% practitioner. [Bar chart of role by No. respondents. Roles: CEO, or equivalent; CIO/CTO, or equivalent; Senior/middle manager; C-level Other; Software engineer; Analyst; Data scientist; Technical consultant (e.g. pre-sales); Academic or Scientist; Full time student; Other.] C-level: 24%. Senior/middle manager: 24%. Practitioner: 40%. Emerging number of Data scientists (6%): who are they? Practitioners include Software engineers, Analysts, Data scientists, Technical consultants and Academic/Scientists.
  9. Top management concentrated in SMEs, other roles spread Enterprise > Boutique. [Three bar charts of No. respondents by organization size, one each for C-level, Senior/middle manager and Practitioner. Size bands: Employees exceeding 1000; Employees from 200 to not exceeding 1000; Employees from 75 to not exceeding 200; Employees from 5 to less than 75; Employees of less than 5; Self-employed.]
  10. Some complacency in how data leveraged, bullish anticipated spend. [Stacked bar, % Respondents, n=108: “My organization is effectively deriving tangible benefits from our organizational data assets”. Options: Strongly agree; Agree; Neutral; Disagree; Strongly disagree. Values shown: 21%, 39%, 29%, 4%, 7%.] [Bar chart, No. ‘Managers’: “How do you expect your spend on Big Data to change in 2014 compared to 2013?” Options: Increasing by more than 25%; Increasing by between 10% and 25%; Increasing by between 5% and 10%; Increasing by less than 5%; No change. Don’t know/prefer not to say = 16.] Managers (n=75) defined as respondents who selected ‘yes’ to having managerial responsibility in their role.
  11. ..but actual and planned headcount remain low. Big Data Headcount (HC): Current vs. Next 12 months. [Grid chart of No. ‘Managers’. Axes: Current HC (None yet; 1-3; 4-10; 11-20; 21-50; > 50) against HC Next 12 months (None; 1-3; 4-10; 11-20; 21-50; > 50; Don’t know/Prefer not to say). Quadrants: Low HC, High growth; High HC, High growth; Low HC, Low growth; High HC, Low growth.] Analysis based on Managers; excluding those who selected ‘Don’t know/Prefer not to say’ for Current headcount. n=68.
  12. ICT high outsourcing intent suggests technical fragmentation. [Stacked bars, % Respondents, Non-ICT vs. ICT (sample sizes shown: n=29; ICT n=33): “How willing would you be to outsource high-skill tasks in your Big Data initiatives to external consultants?” Options: Quite/Extremely willing; Moderately willing; Slightly willing; Not at all willing. Values shown: 24%, 42%, 24%, 27%, 24%, 12%, 28%, 18%.] [Bar chart, Non-ICT vs. ICT: “How do you expect your spend on Big Data to change in 2014 compared to 2013?” Options: Increasing by more than 25%; Increasing by between 10% and 25%; Increasing by less than 10%.] • Boutique opportunities amongst ICT, but Non-ICT also priority targets given matching expected spend.
  13. Whatever Big Data offers, we’re all focused on the ‘customer’. End-uses ranked by priority of relevance: Customer behavioural profiling; Customer service and/or experience; Competitive intelligence; Customer retention; Social trends monitoring; Customer acquisition; Customer cross-sell and/or up-sell; Forecasting supply and demand; Brand monitoring; Product and service innovation; Operational cost management; Risk management; Supply-chain monitoring; Infrastructure and assets monitoring; Compliance and regulatory issues. [Stacked bars for All (n=108) with industry swing by All for Non-ICT (n=56) and ICT (n=52). Legend: Very relevant; Moderately relevant; Slightly relevant; Not at all relevant. Swing values shown: 103, 101, 105, 104, 108, 117, 104, 115, 115, 135, 132, 158, 111, 110, 109.] • ICT greater production focus as well as Forecasting • Non-ICT greater external focus e.g. social trends & brand monitoring • Where are your blind spots? Are you too internally or externally focused with aspirations? • Big data providers: Specialise or provide holistic solutions? Swing is an indexed number based on relevance score.
  14. Managers have higher aspirations for End-uses vs. practitioners. End-uses ranked by priority of relevance: Customer behavioural profiling; Customer service and/or experience; Customer retention; Customer cross-sell and/or up-sell; Customer acquisition; Competitive intelligence; Social trends monitoring; Forecasting supply and demand; Product and service innovation; Brand monitoring; Operational cost management; Risk management; Supply-chain monitoring; Infrastructure and assets monitoring; Compliance and regulatory issues. [Stacked bars for All* (n=95) with role function swing by All for Practitioner (n=44) and Manager (n=51). All* excludes Students and Others. Legend: Very relevant; Moderately relevant; Slightly relevant; Not at all relevant. Swing values shown: 109, 102, 114, 104, 111, 101, 107, 109, 110, 107, 105, 124, 113, 108, 91, 108.] • Practitioners have stronger internal org. focus • Managers need to align perception of Big Data’s ‘value’ throughout organization with business objectives. What internal end-uses are being overlooked by Managers? • Managers’ key priorities: Profiling, Retention, Acquisition. Swing is an indexed number based on relevance score.
  15. Desired skills: distributed data analysis, nod to fundamentals. Capabilities ranked by priority of need: Specialised data analysis, modeling, simulation (op. research, machine learning); Distributed systems (e.g. Hadoop) deployment and/or administration; Fundamental computer science and/or software engineering; Industry-specific/domain knowledge; Applied math and/or statistics; Web/mobile development and/or visualization; Research experience from any quantitative discipline; Business (strategy, marketing, product development, etc.); Hardware/sensor design. [Stacked bars for All (n=108) with industry swing by All for Non-ICT (n=56) and ICT (n=52), and skill-level swing by All for Intermediate/advanced and Entry (sample sizes shown: n=47, n=61). Legend: Critical need; High need; Little need; No need. Industry swing values shown: 101, 109, 123, 106, 119, 109, 113, 116, 111. Skill-level swing values shown: 105, 117, 122, 102, 111, 103, 104, 101.] • Those with Intermediate/Advanced skills prioritize distributed systems • Those with Basic tech skills prioritize domain knowledge • Soft skills undervalued (Strategy, marketing etc.) • No love for Internet of Things (hardware/sensor design skills). Swing is an indexed number based on need score.
  16. Specific skills: ICT demands Hadoop, Non-ICT wants algorithms. [Bar chart of Normalized Priority Score (axis 0 to 1.6) for ICT (n=35), Combined (n=61) and Non-ICT (n=26).] Key priority areas: (1) Big and Distributed Data (Hadoop, MapReduce); (2) Algorithms (computational complexity, CS theory); (3) Machine Learning (decision trees, neural nets, SVM, clustering); (4) Back-End Programming (e.g. Java/C++/Python/Rails/Objective C).
  17. Desired capabilities: uncovering and visualizing patterns in real-time. Capabilities ranked by priority of relevance: Real-time insights from real-time data streams; Uncovering patterns (e.g. segments, correlations) from multi-structured data sets; Visualizing/presenting insights; Data discovery and exploration across many data sources; Statistical analysis on big working data sets (>100GB); Automated decision making; Machine-generated data (e.g. log files, periodic diagnostics); Content and sentiment from online media (e.g. social media); Efficiently and safely storing large data sets on infrastructure controlled by my org.; Image, video, and audio data; Physical sensor networks (e.g. "Internet of Things"). [Stacked bars for All (n=108) with industry swing vs. All for Non-ICT (n=56) and ICT (n=52). Legend: Very relevant; Moderately relevant; Slightly relevant; Not at all relevant. Swing values shown: 120, 137, 173, 109, 110, 125, 107, 119, 119, 117, 136.] • Clear desire to derive ‘meaning’ from Big Data (i.e. insights) • Those in a non-ICT role more likely to prioritize content; social and media-rich data (i.e. very unstructured data). Swing is an indexed number based on relevance score.
  18. Strong willingness to outsource bodes well for service providers. Capabilities ranked by priority of relevance: Visualizing/presenting insights; Uncovering patterns (e.g. segments, correlations) from multi-structured data sets; Real-time insights from real-time data streams; Data discovery and exploration across many data sources; Statistical analysis on big working data sets (>100GB); Automated decision making; Efficiently and safely storing large data sets on infrastructure controlled by my org.; Machine-generated data (e.g. log files, periodic diagnostics); Content and sentiment from online media (e.g. social media); Image, video, and audio data; Physical sensor networks (e.g. "Internet of Things"). [Stacked bars for All Managers (n=75) with Willingness to Outsource swing vs. All Managers for Unwilling and Willing (sample sizes shown: n=38, n=37). Legend: Very relevant; Moderately relevant; Slightly relevant; Not at all relevant. Swing values shown: 111, 120, 113, 103, 124, 120, 122, 146, 130, 167, 110.] • Top priority by Managers to communicate ‘meaning’ from Big Data (visualization & insights) • Desire for ‘discovery’: leverage through better information management. Swing is an indexed number based on relevance score.
  19. High-commitment managers had higher prioritization of capabilities. Only exception was media-rich data. Capabilities ranked by priority of relevance: Visualizing/presenting insights; Uncovering patterns (e.g. segments, correlations) from multi-structured data sets; Real-time insights from real-time data streams; Data discovery and exploration across many data sources; Statistical analysis on big working data sets (>100GB); Automated decision making; Efficiently and safely storing large data sets on infrastructure controlled by my org.; Machine-generated data (e.g. log files, periodic diagnostics); Content and sentiment from online media (e.g. social media); Image, video, and audio data; Physical sensor networks (e.g. "Internet of Things"). [Stacked bars for All Managers (n=46) with Forward-capacity swing vs. All Managers for High-FC (n=31) and Cautious-FC (n=15). Legend: Very relevant; Moderately relevant; Slightly relevant; Not at all relevant. Swing values shown: 102, 102, 109, 109, 114, 124, 129, 107, 113, 117, 120.] Forward capacity: Measures resource commitment (Headcount; Expected headcount; Expected spend; Willingness to outsource) ..which will increase likelihood of delivering desired Big Data outcomes. Swing is an indexed number based on relevance score.
  20. Organizations are sourcing data both internally and externally. “My organization uses sources for Big Data initiatives primarily from the following:” [Stacked bars. Legend: None yet; Internal data; Open-access third-party data (incl. government); Proprietary third-party data; Combination. ICT (n=45), values shown in legend order: 7%, 27%, 20%, 11%, 36%. Non-ICT (n=47), values shown in legend order: 17%, 28%, 6%, 6%, 43%.] • ICTs more open sources; Non-ICTs should prioritize content opportunities e.g. data journalism • Respondents focused on profiling customers more dependent on third-party data e.g. social media. [Bar chart, n=45: respondents who selected "Very high" for relevance of Customer behavioural profiling. Sources sorted by highest %: Combination; Internal data; Open-access third-party data; Proprietary third-party data; None yet. Bars illustrate swing against remaining sample. Swing values shown: 104, 59, 146, 313, 87.]
  21. Open data from Government needed to support ecosystem. [Stacked bar, % of respondents, n=108: “Having access to some government data does/will create valuable Big Data opportunities for me”. Options: Strongly agree; Agree; Neutral; Disagree; Strongly disagree. Values shown: 21%, 19%, 49%, 4%, 6%.] What kinds of government data will assist you? • Demographic; socioeconomic; behaviour • Population (online & offline), migratory • Crime (by ethnic group); Border Security • Public & community services (utilities, health, education) • Location (by utility); GIS • Financial; credit • Weather
  22. Government data needed for benchmarking & consolidation. “In order for us to understand the needs of Malaysians, statistical data from the population census is important to identify correlations with internal behavioural data” – Head of Decision Science, MNC bank. “Government has many different sets of survey data, collected from various sources. For instance, Ministry of Health data can be sourced from private and general hospitals, clinics or consultancies. Big data offers a mechanism to speed up consolidation of all this information, without any processing delays to configure each and every source.” – Jin Chuan Tai, Director, ChrysaSys Consulting Sdn Bhd
  23. General wariness of “red-tape”, PDPA identified as biggest concern. [Stacked bar, % of respondents, n=108: “Do you believe your local legal/regulatory environment is a hindrance to your planned Big Data initiatives?” Options: Not at all a hindrance; Not a hindrance; Neutral; Moderate hindrance; Severe hindrance; Don’t know. Values shown: 2%, 13%, 47%, 17%, 6%, 15%.] What hindrances in particular? • Personal Data Protection Act 2010 (PDPA) • Red-tape • Bureaucracy and organizational structure • Data compliance and data risk • Loss of data. • 15% “Don’t know”: greater education around PDPA needed? “Regulation does prevent some of our products or product features being deployed in some markets.” – Head of Development, global marketing analytics firm
  24. Less than a third willing to upload internal data to a Cloud service. [Stacked bar, % of respondents, n=108: “I am willing to upload my internal data to third-party infrastructure (e.g. a public cloud)”. Options: Strongly agree; Agree; Neutral; Disagree; Strongly disagree. Values shown: 20%, 6%, 37%, 23%, 14%.] [Bar chart, High-forward Capacity (n=31), same options. Bars illustrate swing against Cautious-forward Capacity. Swing values shown: 175, 113, 41, 210, 153.] • Vendors need to identify concerns – privacy, migration cost, perceptions specific to Malaysian professionals • Opinion is divided starkly amongst High-forward capacity respondents.
  25. Some conclusions.. • Strong ‘Grass-roots’ and Mid-tier support for Big Data in Malaysia. Unknown at local Enterprise, C-level. • Aspirations high but Human Resource commitment a concern. • Immediate skilling priorities include R and Hadoop. • Opportunities for boutique firms in Malaysia to meet specialist technical needs with global punch.
  26. ..further research • How are Big Data budgets being split between infrastructure / personnel / data? • Who qualifies as a ‘Data scientist’ – what skills do they have, and what value do they add? • How can Big Data activity contribute to Malaysia’s push to become a high-income nation by 2020?
  27. 5 Questions to ask yourself today • Where are your aspirational blind-spots? Internal vs. External. • Are your aspirations unrealistic? • Are you committing resources aggressively enough? • Are you prioritizing the right blend of skills? • Are you driving/participating in cultural change for Big Data advocacy?
  28. Get in touch. Tirath Ramdas, Founder, tirath@bigdatamalaysia.org. Sandra Hanchard, Researcher, sandra@bigdatamalaysia.org. You can find us on.. • Feedback / questions / comments • What do you need to know? • How can Big Data Malaysia serve your organization? www.bigdatamalaysia.org
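
Several slides report a "swing", described only as "an indexed number based on relevance score". The deck does not give the formula, so the following is a minimal sketch under stated assumptions rather than the authors' method: each answer is given a hypothetical weight (3 for "Very relevant" down to 0 for "Not at all relevant"), a group's score is the mean weight, and the swing is the subgroup's score indexed against the whole sample, with 100 meaning parity. The weights, the function names and the sample answers are all illustrative.

```python
from statistics import mean

# Hypothetical weights for the four relevance options (an assumption,
# the deck does not state how answers were scored).
WEIGHTS = {
    "Very relevant": 3,
    "Moderately relevant": 2,
    "Slightly relevant": 1,
    "Not at all relevant": 0,
}


def relevance_score(responses):
    """Mean weighted relevance of a list of answer labels."""
    return mean(WEIGHTS[r] for r in responses)


def swing(subgroup, everyone):
    """Index a subgroup's score against the whole sample (100 = parity)."""
    base = relevance_score(everyone)
    if base == 0:
        raise ValueError("overall score is zero; the index is undefined")
    return round(100 * relevance_score(subgroup) / base)


# Made-up answers for one end-use, not survey data.
ict = ["Very relevant", "Very relevant", "Moderately relevant", "Slightly relevant"]
non_ict = ["Moderately relevant", "Slightly relevant", "Slightly relevant", "Not at all relevant"]
everyone = ict + non_ict

print(swing(ict, everyone))      # 138: rated above the overall sample
print(swing(non_ict, everyone))  # 62: rated below the overall sample
```

Under this reading, a value such as 124 would mean the subgroup's average rating sits about 24% above that of the comparison group; a different scoring scheme would change the numbers but not the interpretation of 100 as parity.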
