Everyone’s talking about big data, but the bulk of the conversation seems to focus on a new level of business intelligence and an ever-increasing volume of data organised into OLTP, OLAP and NoSQL siloes. In this talk, Mark Wilson puts forward the view that the real value comes not from the big data itself but from how we can employ linked data concepts to integrate structured, unstructured and semi-structured data sets – and then use this unified data source to derive new value.
The problem with big data: and a solution
The problem: “New reference architectures will include both big data and enterprise data warehouses” [IDC, 19 January 2012]
Two worlds: structured and unstructured data (plus external data sources, documents stored in structured databases, etc.)
Siloes create issues with management, integration, etc.
The solution: Linked data – a single reference point for all data in the enterprise
#CloudCamp 1 UNCLASSIFIED
Some history
Fixed structure
Difficult to change schema
Simple reporting capabilities
Complex to create new reports
Some history
Completed transactions transferred to separate database for analysis
“Data warehouse”
Better reporting, data mining, etc.
Still highly structured
Data is historical
May be aggregated
The smart guys
Real-time update of completed transactions
Transactions moved to data warehouse upon completion
Smaller transactional database
Allows for alerts to be generated when specific conditions are met, and action taken
A third “data silo”
Masses of unstructured/semi-structured data being processed in NoSQL databases
May, or may not, be transferred to/from structured databases
Time-consuming and inefficient
Three types of data, each with their own limitations and own management considerations
Linked Data
Tie records together – even from separate data sets
We can express these links as triples with a specific grammar: subject, predicate, object
Build up a graph to show machine-readable data in human form
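To make the triple idea concrete, here is a minimal sketch in Python. It is not taken from the talk: the record identifiers, predicate names and the two example data sets are all hypothetical, and a plain dictionary stands in for a real triple store. It shows records from separate data sets being tied together and then read back in a human-friendly form.

```python
from collections import defaultdict

# Hypothetical triples from two separate data sets (all names are illustrative).
crm_triples = [
    ("customer:42", "hasName", "Alice Example"),
    ("customer:42", "placedOrder", "order:1001"),
]
orders_triples = [
    ("order:1001", "containsProduct", "product:7"),
    ("product:7", "hasLabel", "Widget"),
]


def build_graph(*triple_sets):
    """Merge triples into one graph: subject -> list of (predicate, object)."""
    graph = defaultdict(list)
    for triples in triple_sets:
        for subject, predicate, obj in triples:
            graph[subject].append((predicate, obj))
    return graph


def describe(graph, subject):
    """Render a subject's outgoing edges as readable statements."""
    return [f"{subject} {predicate} {obj}" for predicate, obj in graph[subject]]


graph = build_graph(crm_triples, orders_triples)
for line in describe(graph, "customer:42"):
    print(line)
```

Because both data sets use the same identifier for the order, the graph links the customer record to the product record without either source knowing about the other.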
Then add lots more data…
Source: http://lod-cloud.net/
Each node is itself another graph (zoom in)
Aren’t we missing a trick?
Use linked data as the optimal reference source
  Broker of all data sources
Single view on structured and unstructured data
  Bring in external sources too
Mapping, interconnecting, indexing and feeding
  In real time
Query linked data to derive new value from old
  Infer relationships
  Gain new insights
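As a rough illustration of querying linked data to infer relationships, the Python sketch below joins triples that might have come from a transactional database, a data warehouse and unstructured text. Everything in it is an assumption made for the example: the identifiers, the predicate names, the "bought" inference rule and the sample data are hypothetical, and a real deployment would more likely use a triple store and a query language such as SPARQL.

```python
# Hypothetical triples drawn from three kinds of source (names are illustrative).
triples = {
    ("customer:42", "placedOrder", "order:1001"),    # transactional database
    ("order:1001", "containsProduct", "product:7"),  # data warehouse
    ("review:9", "mentionsProduct", "product:7"),    # unstructured text
    ("review:9", "hasSentiment", "negative"),        # unstructured text
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in sorted(triples)
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def infer_bought(triples):
    """Derive (customer, 'bought', product) from placedOrder + containsProduct."""
    inferred = set()
    for customer, _, order in match(triples, predicate="placedOrder"):
        for _, _, product in match(triples, subject=order, predicate="containsProduct"):
            inferred.add((customer, "bought", product))
    return inferred


def customers_with_negatively_reviewed_products(triples):
    """Join the inferred 'bought' links with review sentiment from text sources."""
    all_triples = triples | infer_bought(triples)
    negative = {s for s, _, _ in match(all_triples, predicate="hasSentiment", obj="negative")}
    products = {
        o for s, _, o in match(all_triples, predicate="mentionsProduct") if s in negative
    }
    return sorted(s for s, _, o in match(all_triples, predicate="bought") if o in products)


print(customers_with_negatively_reviewed_products(triples))
```

The answer depends on facts held in different siloes, and the "bought" relationship is stored in none of them: it only appears once the data is linked.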
About the author
Mark Wilson, Strategy Manager, Fujitsu
Mark is an analyst working within Fujitsu’s UK and Ireland Office of the CTO, providing thought leadership both internally and to customers, shaping business and technology strategy. He has 17 years’ experience of working in the IT industry, 12 of which have been with Fujitsu. Mark has a background in leading large IT infrastructure projects with customers in the UK, mainland Europe and Australia. He has a degree in Computer Studies from the University of Glamorgan. Mark is also active in social media and won the Individual IT Professional (Male) award in the 2010 Computer Weekly IT Blog Awards. Mark may be found on Twitter @markwilsonit.
If you would like to comment on the topics in this presentation, Mark would welcome your feedback, by email to firstname.lastname@example.org.