Unstructured data is growing 62% per year faster than structured data. According to Gartner, data volumes are set to grow 800% in aggregate over the next 5 years, and 80% of it will be unstructured data.
This on-demand webinar will highlight and discuss:
How applying big data analytics to unstructured data can give you richer, deeper and more accurate insights and a competitive advantage
The sources of unstructured data, including email, social media platforms, CRM systems, call center platforms (including notes and speech-to-text transcripts), and web scrapes
How monitoring the communications of your customers and prospects enables you to make time-sensitive decisions and jump on new business opportunities
2. View Recording
You can view the recording of this webinar at:
http://info.datameer.com/Online-Slideshare-Analyzing-Unstructured-Data-in-Hadoop-On-Demand.html
4. Agenda
• Market & Data trends
• Tuning into new channels
• The good news
• The rise of wrangling
• Analytics requirements
• Bringing order to chaos
• Use Cases
6. Market & Data Trends
• Data volumes will grow 800% in 5 years
• Unstructured data is growing 62% faster
• 80% of all data will be unstructured in 2019
• “Big Unstructured Data” requires new tech.
• 85% of the Fortune 500 will be unable to exploit Big Data for competitive advantage through 2015
Source: Gartner
7. Market & Data Trends
• ‘Multi-structured’ is the word of the day
• Mainstream IT tools broadening the base
• Competitive advantage lies outside your firewall
9. Tuning Into New Channels
• Public & social data is available by the firehose
• The new discipline: connecting, filtering, switching
• Find the right keywords, dictionaries, segments
• Learn from, but don’t emulate search engines
• Beware of point solutions
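The "connecting, filtering, switching" idea above can be illustrated with a minimal sketch, assuming a stream of short text messages and a set of keyword dictionaries. The dictionary contents, the segment names, and the `route` helper are hypothetical and chosen only for illustration; they are not taken from the webinar.

```python
import re
from typing import Dict, Iterable, Iterator, Set, Tuple

# Hypothetical dictionaries: segment name -> keywords of interest.
DICTIONARIES: Dict[str, Set[str]] = {
    "pricing": {"price", "discount", "expensive", "cheap"},
    "support": {"refund", "broken", "help", "outage"},
}

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its distinct word tokens."""
    return set(TOKEN_RE.findall(text.lower()))


def route(messages: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (segment, message) for every dictionary a message matches.

    Messages that match no dictionary are filtered out.
    """
    for message in messages:
        tokens = tokenize(message)
        for segment, keywords in DICTIONARIES.items():
            if tokens & keywords:
                yield segment, message


sample = [
    "Is there a discount on the new model?",
    "My unit arrived broken, I want a refund",
    "Lovely weather today",
]
routed = list(route(sample))
for segment, message in routed:
    print(segment, "<-", message)
```

Exact keyword matching like this is deliberately simple; in practice the dictionaries would need tuning per segment, which is the point of the "find the right keywords" bullet.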
10. The Good News
• All data has structure
• Storage is cheap (Hadoop ~= $300 / TB)
• Processing is cheap (“free”)
• Unstructured data compresses well
• Data APIs abound
• Public data blossoming (data.gov, etc.)
11. The Rise of Wrangling
• A ‘record’ is no longer a record
• Event streams need different angles of attack
• Explode, project, align, window, search
• New companies/technologies specializing in it
Source: Gartner
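Three of the wrangling verbs listed above (explode, project, window) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the tooling discussed in the webinar: the event layout, the field names, and the 60-second tumbling window are all hypothetical, and "align" and "search" are left out.

```python
from collections import defaultdict

# Hypothetical raw events: one "record" bundles several nested items.
raw_events = [
    {"user": "a", "ts": 5, "items": [{"sku": "x", "qty": 1}, {"sku": "y", "qty": 2}]},
    {"user": "b", "ts": 42, "items": [{"sku": "x", "qty": 3}]},
    {"user": "a", "ts": 65, "items": [{"sku": "y", "qty": 1}]},
]


def explode(events):
    """Turn each nested item into its own flat row."""
    for event in events:
        for item in event["items"]:
            yield {"user": event["user"], "ts": event["ts"], **item}


def project(rows, fields):
    """Keep only the named fields of each row."""
    for row in rows:
        yield {field: row[field] for field in fields}


def window(rows, size):
    """Sum qty per sku inside fixed (tumbling) windows of `size` seconds."""
    totals = defaultdict(int)
    for row in rows:
        bucket = (row["ts"] // size) * size  # start of the window the row falls in
        totals[(bucket, row["sku"])] += row["qty"]
    return dict(totals)


rows = project(explode(raw_events), ["ts", "sku", "qty"])
per_minute = window(rows, size=60)
print(per_minute)
```

The three input records become four flat rows after `explode`, which is one way to read the bullet "a 'record' is no longer a record".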
12. Analytics Requirements (1)
• A scalable Big Data foundation (Hadoop)
• Schema-on-read
• Data profiling & cleansing
• Fast, visual iteration over samples
Source: Gartner
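Schema-on-read, listed above, means the raw data is stored as it arrived and a schema is applied only when someone reads it. A minimal sketch follows, assuming newline-delimited JSON as the raw format; the sample lines, the `SCHEMA` mapping, and the `read_with_schema` helper are hypothetical names introduced for illustration.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator

# Raw lines are kept exactly as they arrived; no schema was enforced on write.
RAW_LINES = [
    '{"id": "1", "amount": "19.99", "note": "first order"}',
    '{"id": "2", "amount": "5"}',
    "not json at all",
]

# Hypothetical schema chosen by the reader at query time: field -> converter.
SCHEMA: Dict[str, Callable[[Any], Any]] = {"id": int, "amount": float}


def read_with_schema(
    lines: Iterable[str], schema: Dict[str, Callable[[Any], Any]]
) -> Iterator[Dict[str, Any]]:
    """Parse each raw line and cast only the fields the schema names.

    Lines that fail to parse or lack a required field are skipped.
    """
    for line in lines:
        try:
            record = json.loads(line)
            yield {field: cast(record[field]) for field, cast in schema.items()}
        except (ValueError, KeyError, TypeError):
            continue


typed_rows = list(read_with_schema(RAW_LINES, SCHEMA))
print(typed_rows)
```

A different reader could apply a different `SCHEMA` (for example, one that also keeps `note`) to the same stored lines without rewriting them.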
13. Analytics Requirements (2)
• Text mining, without programming
• Helper functions for semi/un-structured formats
• Data connectors, new visualizations
• Patience, and a culture of data discovery
16. Bringing Order to Chaos
• ‘Big Data Visualization’ is an oxymoron
• Rich, detailed summaries are the goal
• ‘It’s the analytics, stupid’
17. Industry Use Cases
• Retail: Competitive pricing through web scraping
• MFG: Product sentiment through Twitter
• FSI: Brand preferences from Facebook “likes”
• Gov: Nefarious behavior through email seizure