BigDataCloud meetup Feb 16th - Microsoft's Saptak Sen's presentation

  1. Reference: The Information: A History, a Theory, a Flood by James Gleick
  2. http://blogs.forrester.com/brian_hopkins/11-08-29-big_data_brewer_and_a_couple_of_webinars
  3. Data Acquisition and Modeling / Dissemination, Sharing, and Preservation / Collaboration / Visualization / Analysis and Data Mining
     Consolidation / Storage / Analysis / Consumption
     • Incremental enhancements benefit all stakeholders
     • Federated data cleansing benefits all
     • Shared analytical efforts build consensus
  4. • SSIS • SQL Server • Datamarket • SQL Server • SSAS, SSRS • Power View • Hadoop • BizTalk • PDW • FAST • PowerPivot • MDS / DQS • StreamInsight • Azure Storage • HPC • Silverlight
     Discovery / Routing / Storage / Analysis / Visualization / Acquisition / Alerting / OLTP
     Azure – Windows Server – System Center
  5. // Map Reduce function in JavaScript
     var map = function (key, value, context) {
         var words = value.split(/[^a-zA-Z]/);
         for (var i = 0; i < words.length; i++) {
             if (words[i] !== "") {
                 context.write(words[i].toLowerCase(), 1);
             }
         }
     };
     var reduce = function (key, values, context) {
         var sum = 0;
         while (values.hasNext()) {
             sum += parseInt(values.next());
         }
         context.write(key, sum);
     };
  6. Hadoop
     Operations: • AD integration • Kerberos • Cluster Management and Monitoring
     Information Worker: • Excel Integration • HiveODBC: PowerPivot • Power View
     Developer: • Visual Studio Tools Integration • JavaScript Map/Reduce • HiveQL
     Data Scientist: • Analytics • Mahout • Pegasus…
  7. On Premise Enterprise Content: • Transactional DBs • On Prem logs • Internal sensors
     Cloud Enterprise Content: • Generated in Azure • Generated/stored elsewhere
     3rd Party Content: • Azure Datamarket • Public content • Delivered online
     Diagram labels: Name Node; Data Nodes; Azure Blob Storage; SQL Azure; Application end point; S3
  8. Situation & Requirements
     • Yahoo attracts 700 million monthly visitors worldwide
     • Help advertisers better understand their targeted ad campaigns
     • Each consumer is a member of an average of 10 segments, which explodes the data by 10x
     Technical Solution
     • Centralize data aggregation on grid using Hadoop
     • Update 24TB SSAS cube continuously
     • Use SSAS aggregations extensively, cutting down on Hadoop aggregations dramatically
     Outcome
     • Doubled customer revenue for eCPM campaigns and increased advertiser spending by 15 percent
     • Provides more relevant advertising data
     • Loads data faster and produces better results
  9. • Simplicity & Manageability of Windows for Hadoop
     • Combine internal and external data
     • Broaden accessibility of Big Data Analytics to all users
     • Develop once, deploy on-premises or in the cloud
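
The word-count functions on slide 5 depend on a context object and a values iterator that the Hadoop runtime would normally supply. As a rough sketch of how they behave, the snippet below runs them against a small in-memory stand-in. The `runLocally` driver, its two context objects, and the iterator are hypothetical helpers written only for this illustration; they are not part of any Hadoop or Microsoft API, and a radix argument has been added to `parseInt`.

```javascript
// map and reduce as shown on slide 5 (radix added to parseInt).
var map = function (key, value, context) {
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

var reduce = function (key, values, context) {
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};

// Hypothetical in-memory driver: imitates the map, shuffle, and reduce
// phases on a single machine. Illustration only.
function runLocally(lines) {
    // Map phase: collect every emitted value under its key.
    var grouped = {};
    var mapContext = {
        write: function (k, v) {
            if (!Object.prototype.hasOwnProperty.call(grouped, k)) {
                grouped[k] = [];
            }
            grouped[k].push(v);
        }
    };
    lines.forEach(function (line, index) {
        map(index, line, mapContext);
    });

    // Reduce phase: hand each key's values to reduce through an iterator
    // exposing the hasNext()/next() methods the slide's code calls.
    var counts = {};
    var reduceContext = {
        write: function (k, v) { counts[k] = v; }
    };
    Object.keys(grouped).forEach(function (k) {
        var vals = grouped[k];
        var pos = 0;
        var iterator = {
            hasNext: function () { return pos < vals.length; },
            next: function () { return vals[pos++]; }
        };
        reduce(k, iterator, reduceContext);
    });
    return counts;
}

var counts = runLocally(["Big data, big cloud", "Data in the cloud"]);
console.log(counts);
```

With these two input lines, "big", "data", and "cloud" are each counted twice and "in" and "the" once, because the map step lowercases every word before emitting it.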