This document discusses integrating Supermicro, Greenplum, and SAS to enable big data analytics platforms and infrastructure. The agenda covers big data analytics platforms and infrastructure as well as a 1,000-node Hadoop cluster built with EMC and Supermicro.
Big Data World Forum (BDWF, http://www.bigdatawf.com/) is specially designed for data-driven decision makers, managers, and data practitioners who are shaping the future of big data.
Emerging Big Data & Analytics Trends with Hadoop (InnoTech)
The document discusses big data opportunities with Hadoop solutions from EMC. It describes how big data is transforming business through use cases in healthcare, financial services, and utilities. EMC addresses challenges of the Hadoop platform through its Isilon scale-out NAS storage and Greenplum's unified analytics platform. The solutions provide enterprise-grade data protection, management, and scalability for Hadoop implementations.
Progress with Confidence into Next Generation IT (Paul Muller)
The keynote from my recent Amazing Summer 2012 tour, where I spoke about the need for us to flip our thinking from traditional change control to a more forward-looking approach by moving change and security up to the design phase.
The document discusses breakthroughs in information technology that can make cities smarter. It describes how sensors, networks, and data analytics can provide insights that improve outcomes across various city systems, including transportation, energy, water, and public safety. The core idea is that digital and physical systems are converging, allowing cities to leverage data to develop insight and wisdom. Examples are provided of cities using these technologies to monitor infrastructure in real-time, predict problems, and better coordinate resources.
The document discusses how exponential data growth is straining centralized cloud infrastructure and driving up costs due to lack of economies of scale. It argues that a more distributed and decentralized approach is needed to better manage and leverage the vast amount of unused capacity at the edge. This includes distributing data across devices, networks, and data centers instead of concentrating it within massive centralized data centers. A hybrid model is proposed that keeps some functions like policy centralized while pushing processing and storage out closer to where the data is created and used.
Datalicious is a data analytics agency founded in late 2007 that has grown to provide 360-degree data services. It has a team of analysts and developers with expertise in web analytics, data modeling, and executing smart data-driven marketing campaigns. The document outlines Datalicious' history and services, which include collecting data from various platforms, generating insights through analytics and modeling, and applying those insights through targeted campaigns and marketing automation. Examples of the types of clients and industries served are also provided.
This document discusses big data and Talend's goal of democratizing big data through its open source integration platform. It begins by defining big data and explaining the challenges it poses related to volume, velocity, variety, and other factors. It then outlines Talend's goal of providing intuitive graphical tools to design and run big data jobs within Hadoop, abstracting away the underlying code generation. The document stresses that data quality is especially important for big data and explains how Talend supports implementing data quality checks either as part of loading data into Hadoop or as a separate job after loading. Finally, it provides an overview of Talend's roadmap to add support for additional Hadoop technologies over time, such as HCatalog, Oozie, and more.
Core Solutions President Ravi Ganesan along with co-presenter Adam Bauer, Senior Manager at Deloitte, recently spoke about the importance of establishing effective data collection and evaluation processes to better support today’s healthcare organizations. This SlideShare consists of that presentation, which included:
The use of the Digital Transformation Framework
The role of IoT in deciphering automated data
The clinical tools used to demonstrate data value
The implications of visual perception in sharing data
Discover what comes next for IBM Watson and the industries particularly suited for Watson solutions, such as healthcare, banking, and the financial sector, all of which deal with massive amounts of unstructured data coming from various sources. Find out how the advanced analytics used in Watson are being put to work in businesses around the world.
A Capability Maturity Framework for Sustainable ICT (Edward Curry)
The document proposes a Capability Maturity Framework for Sustainable ICT developed by the Innovation Value Institute. It aims to help organizations assess and improve their maturity in sustainable ICT practices. The framework evaluates capabilities across nine building blocks including strategy, processes, people and culture, and governance. Assessments provide insight into an organization's strengths and challenges to develop sustainable ICT. Increasing maturity involves systematically improving each of the nine building blocks over multiple levels from ad hoc to optimized practices.
Developing a Sustainable IT Capability: Lessons from Intel's Journey (Edward Curry)
Intel Corporation set itself a goal to reduce its global-warming greenhouse gas footprint by 20% by 2012 from 2007 levels. Through the use of sustainable IT, the Intel IT organization is recognized as a significant contributor to the company’s sustainability strategy by transforming its IT operations and overall Intel operations. This article describes how Intel has achieved IT sustainability benefits thus far by developing four key capabilities. These capabilities have been incorporated into the Sustainable ICT Capability Maturity Framework (SICT-CMF), a model developed by an industry consortium in which the authors were key participants. The article ends with lessons learned from Intel’s experiences that can be applied by business and IT executives in other enterprises.
ESG WP: Isilon Scale-Out NAS Comes of Age, Sep 08 (sydcarr)
This document discusses the rise of scale-out NAS storage to address growing unstructured file data needs. Scale-out NAS uses clustering and a global namespace to scale capacity and performance linearly by adding commodity servers and disks. Isilon IQ is highlighted as an example scale-out NAS system that uses a shared memory architecture for high performance and efficiency. The growth of unstructured file data like videos and documents is outpacing structured data, driving more organizations to adopt scale-out NAS that can easily expand to handle petabytes of file storage requirements.
Enterprise Energy Management using a Linked Dataspace for Energy Intelligence (Edward Curry)
Energy Intelligence platforms can help organizations manage power consumption more efficiently by providing a functional view of the entire organization so that the energy consumption of business activities can be understood, changed, and reinvented to better support sustainable practices. Significant technical challenges exist in terms of information management, cross-domain data integration, leveraging real-time data, and assisting users to interpret the information to optimize energy usage. This paper presents an architectural approach to overcome these challenges using a Dataspace, Linked Data, and Complex Event Processing. The paper describes the fundamentals of the approach and demonstrates it within an Enterprise Energy Observatory.
E. Curry, S. Hasan, and S. O’Riáin, “Enterprise Energy Management using a Linked Dataspace for Energy Intelligence,” in The Second IFIP Conference on Sustainable Internet and ICT for Sustainability (SustainIT 2012), 2012.
Datalicious is a data marketing agency that was founded in late 2007. It has since grown to become a 360 degree data agency with specialist teams in data collection, analytics, and campaign execution. Datalicious utilizes best of breed data platforms and tools to generate insights from customer data and then implement targeted marketing campaigns to optimize performance. The company works across many industries and offers a wide range of data services to help clients capture insights and take action.
The New York Times is the largest metropolitan newspaper and the third-largest newspaper in the United States. The Times website, nytimes.com, is ranked as the most popular newspaper website in the United States and is an important source of advertising revenue for the company. The NYT has a rich history of curating its articles, and its 100-year-old curated repository has ultimately defined its participation as one of the first players in the emerging Web of Data.
Data curation is a process that can ensure the quality of data and its fitness for use. Traditional approaches to curation are struggling with increased data volumes and near real-time demands for curated data. In response, curation teams have turned to community crowd-sourcing and semi-automated metadata tools for assistance.
E. Curry, A. Freitas, and S. O’Riáin, “The Role of Community-Driven Data Curation for Enterprises,” in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25-47.
Hadoop, Oracle and the Industrial Revolution of Data (Guy Harrison)
The document discusses the rise of big data and how companies are leveraging it. It covers key aspects of big data including the volume, velocity, and variety of data. Hadoop and MapReduce are presented as important technologies for processing and analyzing large datasets in a distributed manner. Case studies of Google and how it uses big data are provided. Related technologies like HDFS, Hive, Pig, and HBase are also overviewed along with how they fit into the Hadoop ecosystem. The value of extracting insights from big data through machine learning, collective intelligence, and predictive analytics is highlighted.
Infosys International, Inc. is a technology company established in 1986 that provides IT services and products to government and commercial clients. It has received numerous awards and recognition for its work. The company offers services including consulting, application development, and staffing augmentation, as well as products such as digital pen technology, biometric security solutions, and content management systems.
BCS APSG: The Landscape of Enterprise Applications (Geoff Sharman)
It's a cliché that modern enterprise applications are simply web applications. But is that the whole truth? And if it isn't, what are all the pieces of an enterprise application and how do they fit together? How can we continue to use older technologies within these applications and how might we exploit new technologies in the future? What new challenges do enterprises face in the 21st Century and how might they affect the design of applications and programming systems?
There has been a lack of substantive data about the state of open source in the business intelligence and data warehousing market. In this presentation noted industry analyst Mark Madsen will present the results of recent market research on adoption profiles and characteristics for open source BI/DW.
This research surveyed adopters of open source to understand their reasons for adoption and the benefits they experienced. It also captured user demographics to identify who is adopting open source for BI/DW, where they are deploying it, and how it’s being used. Two highly experienced open source BI practitioners, Bruce Belvin (President, Monolith Software Solutions) and Jay Webster (President and COO at Consorte Media) will describe their BI implementations, their criteria and selection methodology, and share best practices.
The document outlines an ADMA short course on data, measurement, and ROI presented by Datalicious. It provides an overview of the course, which covers basic and advanced analytics concepts over two days. Participants will learn how to define a metrics framework, attribute media, analyze campaigns, and extract insights from data to optimize marketing performance. The course teaches analytics and data-driven strategies to improve marketing outcomes.
THE 3V'S OF BIG DATA: VARIETY, VELOCITY, and VOLUME (Gigaom)
The document discusses the 3 V's of big data: volume, velocity, and variety. It provides examples of how big data is characterized by these attributes and defines what makes data "big data". Specific challenges of big data like its mixed structured and unstructured nature are also examined.
This document discusses e-government and digital cities in Korea. It provides an overview of key e-government initiatives and digital city projects in Seoul. The three main points are:
1) E-government programs in Korea focus on improving public services, administrative efficiency, and transparency. Major programs include G4C for citizens and various information systems for government agencies.
2) Seoul is developing as a digital city through projects providing online government services, a geographical information system, and connecting government offices through digital networks.
3) Future directions of e-government include expanding mobile and TV-based services, and developing human resources to support innovation in government.
PCs for People October 2012 Broadband Task Force presentation (Ann Treacy)
PCs for People presented to the Minnesota Broadband Task Force (Oct 2012) on their computer donation and refurbishing programs, which put computers in the hands of low-income households.
The document discusses several technological trends that will reshape the future in the coming years:
1. The rise of "knowledge individuals" who are always connected via mobile devices and cloud services, blurring work and personal lives.
2. Dramatic reductions in the cost of storage and bandwidth will enable virtually unlimited sharing of information online.
3. Mobile applications and specialized devices will replace the PC as the primary means of internet access, ushering in a "post-PC" era.
4. Social networks and user-generated content will continue growing in importance both personally and professionally through platforms like Facebook.
5. Location-based services and embedded sensors will create new types of applications using real-time
The document discusses tapping into the $4 billion market opportunity for laptop data protection solutions for small to medium sized businesses (SMBs). It outlines the high costs of data loss for SMBs and how Sony's Laptop Data Protection solution can provide an attractive return on investment by reducing costs from data loss and IT administration of data recovery. The solution is scalable from 6 to 1,500 users and offers benefits like easy installation and management, automated backups, and security features. It represents a significant opportunity in the SMB space given the growth of laptop usage and inadequate existing solutions.
This document discusses the rise of the information center and Hitachi Data Systems' vision and strategy. It summarizes that HDS sees a major growth in digital data requiring new IT approaches, and that its blueprint is to create a common virtualized platform for all data, applications, and information. HDS claims its strategy can deliver significant cost savings and value to customers through virtualization, automation, cloud-readiness, and sustainability.
Kim Escherich - How Big Data Transforms Our World (BigDataViz)
This document discusses how big data is transforming our world. It notes that the volume and velocity of data is exploding, with more connected devices, sensors, and digital interactions creating petabytes and zettabytes of data. It also discusses how this data can provide insights if analyzed for patterns and trends using advanced analytics. Examples are given of how big data insights can help businesses innovate new products, optimize operations in real-time, better understand customer behavior, and more effectively measure risk and fraud.
NIIT and Denodo: Business Continuity Planning in the times of the Covid-19 Pa... (Denodo)
Watch: https://bit.ly/349QjYr
Currently, the most common Analytical Solutions are implemented on large scalable ecosystems which involve massive Data Lakes and Data Warehouses. These solutions take time to build and incur substantial TCO. In today’s environment we need rapid technologies, and NIIT has developed a compelling solution powered by Denodo’s Data Virtualization and Data Catalog.
The document provides an overview of a conference on building a globally competitive position for digital media in Canada. It discusses managing content in the cloud and gearing up for growth. The conference was held on March 30, 2010 in London and was presented by Tom Jenkins, the Executive Chairman and Chief Strategy Officer of Open Text Corporation.
Smarter Planet: How Big Data changes our world (Kim Escherich)
This document discusses how big data is transforming our world through the increasing instrumentation, interconnection, and intelligence of people, processes, and things. It notes that by 2015, there will be over 50 billion connected devices and over 80% of all available data will be uncertain. The document highlights opportunities that big data creates, such as analyzing information in motion, extreme volumes of information, and managing and planning with data. It also discusses challenges like verifying the veracity of data. Overall, the summary highlights how big data is creating new opportunities through the increasing connections between people, processes, and things.
Information is the principal driver of competitive advantage. How it is collected, analysed and communicated determines our success. No single resource is more critical to organisational survival.
The amount of data in the world is exponentially increasing, to a point where companies capture significant amounts of information about their customers, suppliers, and operations. Millions of networked sensors are being embedded in everything from mobile phones to cars. Social networks and location data from mobile devices will continue to fuel this exponential data growth. These huge data pools are commonly being referred to as "big data".
This talk examines how analytics and big data are exploiting information to drive competitive advantage.
This document summarizes a presentation on leveraging big data in new product development. It discusses what big data is, the growth of data volumes and sources, and how companies can create value from big data. Specifically, it outlines frameworks for using big data to improve existing solutions or create new ones to meet unrecognized needs. It also identifies critical success factors and common pitfalls for new product development with big data. Finally, it provides contact information for the presentation and an overview of an upcoming workshop on the product manager's role in new product development processes.
The document discusses top storage trends that will reshape datacenters in 2012 according to IDC predictions. It finds that data is exploding due to more connected devices and digital content creation. Survey results show organizations prioritizing IT security and cost reduction. IDC predicts that in 2012, storage virtualization will go mainstream, SSDs will be integrated into ROI strategies, unified storage will be standard, and cloud storage services will provide more sophisticated features to help organizations manage big data.
The End of the Wild-West of Data – Relevance and Regulation: the Cornerstones... (auexpo Conference)
1) Big data is getting exponentially larger each year due to growth in social media, e-commerce, mobile devices, and the internet of things. However, the quality and relevance of data is more important than just the quantity.
2) When data is enriched by linking it to other contextual data sources and then visualized effectively, it can provide meaningful insights. However, regulations and privacy concerns must also be adequately addressed.
3) While big data can improve efficiency and business strategies, its primary benefit should be better customer service, experiences, and identification of new products/services. End user trust and perception of privacy is paramount.
Inspiring Analytics: Tips and Examples for Achieving Better Business, Not Jus... (SAP Analytics)
http://spr.ly/SBOUC_VP - The last few years have seen massive changes in analytics technology, but organizations often struggle to take full advantage of these changes because they are focused on existing ways of working rather than future possibilities. This presentation aims to educate, entertain, and inspire, with a wide range of examples of how people have used brand-new technology (big data, social analytics, mobile analytics, etc.) not only to remove existing analytics bottlenecks, but also rethink business processes and flip industry business models.
Presenter: Timo Elliott, SAP
The Zen and Art of IT Management (VM World Keynote 2012) (CA Technologies)
This document discusses strategies for IT management to drive business innovation. It suggests allocating resources between maintaining current systems versus delivering new business services, with 63% of spending going towards the latter. Maintaining systems is seen as a "chore" while investing in new services enables innovation. It also discusses using tools like CA Service Assurance to improve efficiency, streamline processes, and increase capacity for innovation. Case studies show how these tools helped companies like Jack Henry & Associates and Wikimedia Foundation improve service quality and the user experience.
CentriLogic's Downtown Toronto Data Center Grand Opening (CentriLogic)
Slideshow from CentriLogic's Downtown Toronto data center grand opening, Thursday May 17, 2012. Presented by:
Robert Offley - CentriLogic
Davin Juusola - Info-Tech Research Group
Kristian Kataila - WIND Mobile
About ActuateOne for Utility Analytics
Water and Energy Utilities are under tremendous pressure to demonstrate progress in asset optimization, grid optimization and performance gains across traditional business drivers such as customers, revenue protection, utility regulatory compliance and financials. ActuateOne for Utility Analytics provides a comprehensive portfolio of software and utility analytics industry expertise to ensure today’s utility leaders and customers always have access to the right information, insight and collaborative capabilities for accurate and informed decisions. Delivered through a single platform, ActuateOne for Utility Analytics ignites any utility or grid Analytics initiative with integrated asset optimization dashboards, grid optimization dashboards, utility compliance reports as well as Transformer Management Scorecards, Substation & Equipment Management Scorecards and Utility KPI Dashboards which help today’s Utility enhance performance and maximize grid performance.
This document discusses the emergence of a new style of IT driven by trends like cloud computing, big data, mobility, and the internet of things. It outlines how HP is leading in this new style of IT by offering converged infrastructure, a converged cloud platform, and software-defined data center solutions. The document also discusses how HP's comprehensive portfolio of hardware, software, and services can help customers transform their IT environments and take advantage of new opportunities in this changing technology landscape.
Modernizing Your IT Infrastructure with Hadoop - Cloudera Summer Webinar Seri... (Cloudera, Inc.)
You will also learn about key challenges when deploying a Hadoop cluster in production, how to manage the entire Hadoop lifecycle using a single management console, and how to deliver integrated management of the entire cluster to maximize IT and business agility.
1) Big data refers to the immense volume, variety and velocity of data that is now available.
2) As our ability to analyze big data increases, it will lead to changes such as the rise of data scientist roles and more accessible information.
3) These changes will impact media, mentality and accelerate the pace of change in society. Governance of big data use is needed to balance business, social and individual interests.
Big data analytics can provide businesses with new insights from large volumes of structured and unstructured data. It allows analyzing customer sentiment, detecting medical conditions, predicting weather patterns, assessing risk, and identifying threats. To leverage big data, businesses need to capture data from various sources, analyze it in real-time, and turn it into insights to predict customer, competitive, and market behavior. Deploying big data analytics competencies consistently across an enterprise correlates with higher financial performance and competitive advantage long-term.
This document discusses moving NEON optimizations to 64-bit ARM architectures. Some key points:
- NEON is an ARM instruction set extension that allows single-instruction multiple data (SIMD) processing. It has more registers and capabilities in AArch64, including double precision floating point.
- Migrating NEON code to AArch64 usually only requires minor changes to assembly code due to compatibility in C/intrinsics code and clearer register mappings. Existing NEON documentation still applies.
- Open source libraries and compilers support NEON optimizations, providing performance boosts such as 3-4x faster video codecs. The Android NDK fully supports 64-bit development.
- Examples show optimized
The document discusses the advantages of 64-bit ARMv8-A architecture for Android. It describes how Android Lollipop provides support for both 32-bit and 64-bit applications. Native and ART applications can see performance gains by taking advantage of the ARMv8-A architecture's modern instruction set and use of more registers. The document encourages developers to explore 64-bit development and provides additional resources.
The document discusses ARM's Intelligent Power Allocation (IPA) technology, which aims to maximize performance within thermal limits. It describes three types of power consumption scenarios and the limitations of the current Linux thermal framework. IPA uses a closed-loop control system to dynamically allocate power between components like the CPU and GPU based on temperature, power estimates, and performance requests. Test results show IPA achieving up to 31% higher FPS in games compared to static thermal policies, with more consistent temperature control.
This document discusses how Serengeti can be used to automate the deployment and management of Hadoop clusters on VMware vSphere. Some key points:
- Serengeti is a virtual appliance that can be deployed on vSphere and automates the provisioning of Hadoop clusters within 10 minutes from templates.
- It allows separating storage and compute by deploying Hadoop data nodes on shared storage and compute nodes as VMs for better elasticity and utilization.
- Serengeti supports elastic scaling of Hadoop clusters, multi-tenancy by isolating tenant workloads, and live configuration changes with rolling upgrades and no downtime.
This document discusses recommended architectures and best practices for deploying Hadoop on VMware vSphere. It recommends deploying Hadoop nodes across multiple virtualization hosts with 10Gb networking for high performance. The standard deployment places data nodes on shared storage and task trackers on local disks. It also discusses planning the cluster size, hardware requirements including CPU, memory, storage and networking considerations. Configuration recommendations include using NTP, proper virtual disk settings, enabling NUMA and avoiding overcommitting resources.
1. Beyond Mission Critical: Virtualizing Big Data and Hadoop (Chiou-Nan Chen)
Virtualizing big data platforms like Hadoop provides organizations with agility, elasticity, and operational simplicity. It allows clusters to be quickly provisioned on demand, workloads to be independently scaled, and mixed workloads to be consolidated on shared infrastructure. This reduces costs while improving resource utilization for emerging big data use cases across many industries.
Pivotal HD is a Hadoop distribution that includes additional components to configure, deploy, monitor and manage Hadoop clusters. It provides tools like the Command Center for visual cluster monitoring and job management, Hadoop Virtualization Extensions to improve resource utilization, and HAWQ for high performance SQL queries and analytics across Hadoop data.
The document discusses EMC's transformation to an IT-as-a-Service model. It summarizes how EMC has virtualized 90% of its server workloads, consolidated data centers, and transformed its IT infrastructure to deliver services through a cloud foundation. This allows EMC to enhance agility, optimize costs, and deliver business value through offerings like infrastructure-as-a-service, platform-as-a-service, and software-as-a-service.
This document discusses how IT is transforming through trends like cloud computing and big data. It summarizes that EMC can help customers navigate these changes by providing solutions like hybrid cloud infrastructure and big data analytics to help businesses transform their applications and IT infrastructure. The document also emphasizes that EMC is committed to innovation through R&D investment and acquisitions to ensure it continues to lead customers on their journey to the cloud and with big data.
The document discusses disaster recovery for mission critical applications. It notes challenges in ensuring application availability with data growth and budget pressures, while meeting regulatory requirements. It discusses using replication, snapshots, and continuous data protection to reduce recovery point objectives (RPO) from hours to minutes or less. EMC provides integrated solutions using technologies like Data Domain, Avamar, RecoverPoint, and VPlex to automate backup, replication, and recovery for applications.
The document discusses desktop virtualization and cloud computing. It compares the PC era to the current cloud era and how workstyles have shifted from PCs to mobile devices that can access cloud services from any location using various devices. It discusses how users can access their desktops, applications, files, and services from any cloud through mobile workstyles. It also mentions some benefits of desktop virtualization like security, collaboration, application migration, integration and managing services from various devices and clouds.
The document discusses virtualizing mission critical applications. It notes that the primary drivers for virtualizing applications are cost savings and service improvement. It provides statistics showing an increasing percentage of workload instances running on VMware for applications like Microsoft Exchange, SharePoint, SQL, Oracle, and SAP. It then discusses EMC IT's journey towards a private cloud, moving from an infrastructure focus to an applications focus to an IT-as-a-service model. The document also discusses challenges around data protection and backup/recovery for virtualized applications and provides solutions using technologies like Avamar, Data Domain, and VFCache. It provides an example case study of EMC IT successfully virtualizing their Oracle 11i CRM system.
The document discusses EMC and Oracle's long-standing partnership in developing solutions to optimize Oracle applications. It outlines three common deployment models for Oracle (aggregation, verticalized, virtualization) and describes the benefits of virtualizing Oracle software, such as 3x higher performance with lower total cost of ownership. It also introduces EMC solutions like Vblock infrastructure platforms, FAST automated storage tiering, and VFCache server flash caching that help address challenges of Oracle I/O performance and optimize storage for virtualized Oracle environments.
This document describes virtualization solutions using Microsoft Hyper-V and System Center with EMC storage components. It provides configuration details for solutions supporting 50 and 100 virtual machines, including servers, hypervisors, networking, storage and backup components. It also discusses features for virtualizing Microsoft applications and the benefits of using System Center for management.
This document discusses the transformation of IT backup and recovery due to trends in data growth and regulations. It presents EMC's backup solutions including Data Domain for disk-based backup with deduplication, Avamar for fast VMware backups, and NetWorker for centralized backup management. These solutions provide faster backups, recovery and scalability compared to traditional tape-based systems. Case studies show customers achieving up to 98% data reduction, replacing tapes completely and saving over $200k annually with EMC's backup products.
The document discusses EMC's strategy called "FLASH 1st" for data storage over the next decade. It argues that traditional hard disk drives will not be able to keep up with rapidly growing data and increasing IO demands. FLASH/solid state technology on the other hand is improving much faster than HDDs and will provide dramatically better performance and cost efficiency. EMC's FLASH 1st strategy leverages automated tiering software to place active "hot" data on high-performance FLASH storage and less active "cold" data on lower-cost capacity HDDs to maximize benefits.
3. THE ERA OF BIG DATA IS HERE…
"Big Data Is Less About Size, And More About Freedom" ― TechCrunch
"Findings: 'Big Data' Is More Extreme Than Volume" ― Gartner
"Total data: 'bigger' than big data" ― 451 Group
"Big Data! It's Real, It's Real-time, and It's Already Changing Your World" ― IDB
4. Data Sources Are Expanding
The digital universe will grow 44x in the next 10 years.
Source: 2011 IDC Digital Universe Study
5. BIG Data is Just a Bunch of Data to Store…? OR
[Chart: big data sources, 2009-2014. File-based: 60.7% CAGR; block-based: 21.8% CAGR.]
By 2012, 80% of all storage capacity sold will be for file-based data.
Source: IDC
7. Make BIG Data Accessible
Identify the data source
Store the data
Connect applications and users
Utilize the data in different views
8. EMC UAP Solutions – Analytics Platform
"This is what my analytics environment looks like…"
9. Building the Big Data Analytics "Stack"
Analytic Toolsets (Business Analytics, BI, Statistics, etc.)
Greenplum Chorus: Enterprise Collaboration Platform for Data
Greenplum Data Computing Appliances: Purpose-built for Big Data Analytics
Greenplum Database (Enterprise & Community Editions): World's Most Scalable MPP Database Platform
Greenplum HD (Hadoop Enterprise & Community Editions): Enterprise Analytics Platform for Unstructured Data
10. Greenplum Becomes the Foundation of EMC's Data Computing Division
EMC acquires Greenplum in July 2010.
"For three years, Gartner has identified Greenplum as the most advanced vendor in the visionary quadrant of its data warehouse DBMS Magic Quadrant…" – Gartner
11.
12. SAS at a Glance
Company Highlights:
• Founded 1976; 11,000+ employees in 400+ offices
• 2010 worldwide revenue: $2.43B
• IDC: SAS is the leader in analytics and reporting with a 34.5% market share
• 4.5 million users worldwide
• 50,000+ sites in 114 countries
• From tools to vertical solutions
[Pie chart by industry: Financial Services 42%, Government 14%, Retail 11%, Communications 8%, Healthcare & Life Sciences 8%, Manufacturing 6%, Other 4%, Education 3%, Services 2%, Energy & Utilities 2%]
13. Overview
Locations: SMC Inc. HQ (San Jose, CA), SMC BV (The Netherlands), SMC TW (Taiwan)
Founded in 1993; HQ in San Jose, CA; 2007 NASDAQ listing: SMCI
Revenues: FY09 $500M, FY10 $721M, FY11 ~$1B
Global footprint: >100 countries
Production: US, EU, and Asia production facilities
Engineering: 70% of workforce in engineering (30% growth through the recession)
Market share: #1 server channel (SMCI enables ~10% of the global server market)
Brand equity: growing public profile since the 2007 IPO
Corporate focus: energy efficiency, earth-friendly, green technology innovation
14. Product Family
Resource Optimized (WIO/UIO), Twin Architecture, GPU SuperComputing, Data Center Optimized, Embedded, Application Optimized (Multi I/O), SuperBlade, Workstation, Mainstream Business Solutions, Storage Server
15. In-House Design and Server Building Block Solutions®
[Diagram: Technology Partners feed in-house design and Server Building Block Solutions®, which are matched to customer requirements (application optimized, OEM specs, Tri-Lab optimized, data center).]
Building blocks: >550 motherboards, >1,300 chassis, >140 power supplies, >350 cooling modules, CPU/memory options, open operating systems and applications (as of Q2 2009)
16. Big Data Analytics on Hadoop
Internet companies are not built on SQL; they are building analytics on Hadoop/NoSQL.
Existing Hadoop users (Internet): "This is what I think my analytics environment looks like…"
[Diagram: web portals and social networks feed Hadoop storage and a MapReduce layer, with Pig, Hive, and HBase plus management and coordination on top, serving BI & reporting, ETL tools, and web apps.]
17. Hadoop Components (hadoop.apache.org)
HDFS: Hadoop Distributed File System
MapReduce: framework for writing scalable data applications
Pig: procedural language that abstracts lower-level MapReduce
ZooKeeper: highly reliable distributed coordination
Hive: data warehouse infrastructure built on top of Hadoop
HBase: database for random, real-time read/write access
Oozie: workflow/coordination system to manage jobs
Mahout: scalable machine learning libraries
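To make the MapReduce component above concrete, here is a minimal word-count job written against the Hadoop 2.x Java client API. It is an illustrative sketch; the class name and the input/output paths are placeholders, not part of the original deck.

// Minimal MapReduce word count: mappers emit (word, 1), reducers sum the counts.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in each input line read from HDFS.
  public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts for each word across all mappers.
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(SumReducer.class);   // combiner cuts shuffle traffic
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Packaged as a jar, it would be submitted with something like "hadoop jar wordcount.jar WordCount /input /output", with both paths living in HDFS.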
18. What Can Hadoop Do for You?
Financial Services: knowing customers better, risk analysis and management, fraud detection and security analytics
Web & e-Tailing: web usage and clickstream behavior, market and customer segmentation, ad/customer targeting, online fraud detection
Telecommunications: customer churn prevention, price optimization and marketing, network analysis and optimization, customer experience management
Government: fraud detection, compliance and regulatory analytics
Healthcare: patient care quality, drug development
Retail: market and consumer segmentation, merchandizing and cross-selling, promotion and campaign analysis
Data source: Cloudera
19. Hadoop Use Cases
LinkedIn – "People You May Know" and other facts
Yahoo! – Hadoop to support AdSystems and web search
Visa – Credit card fraud detection and analysis
T-Mobile – Churn analysis, user experience
Amazon, Baidu, AOL, eBay, Facebook, Twitter, …
Data Source: Cloudera
20. Hadoop Cluster HW Selection
What is the HW configuration for Hadoop clusters? It depends; workloads matter.
CPU-intensive workloads: machine learning, natural language processing, complex data mining, feature extraction
I/O-intensive workloads: data importing and exporting, indexing, searching, grouping, decoding/decompressing
Sizing considerations: data storage capacity, number of data mirrors (replication), TCO, rack space, power consumption, different workloads
General configuration: 2 quad-core CPUs, 16-96 GB memory, 2 x GbE, 1-2 TB disks x n, 1U/2U rack mount
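As a rough illustration of how a configuration like the one above translates into usable HDFS capacity, the sketch below applies HDFS's default replication factor of 3 and holds back space for intermediate MapReduce data. The node count, disks per node, and 25% reserve are assumptions chosen for the example, not figures from the deck.

// Back-of-the-envelope usable-capacity estimate for a cluster built from the
// "general configuration" above. Node count, disks per node, and the 25%
// reserve for intermediate data are illustrative assumptions.
public class HadoopSizingSketch {
  public static void main(String[] args) {
    int dataNodes = 40;          // assumed number of data nodes
    int disksPerNode = 6;        // "1-2 TB disks x n"
    double diskTb = 2.0;         // 2 TB SATA disks
    int replication = 3;         // HDFS default replication factor
    double tempReserve = 0.25;   // space held back for MapReduce intermediate data

    double rawTb = dataNodes * disksPerNode * diskTb;
    double usableTb = rawTb * (1.0 - tempReserve) / replication;

    System.out.printf("Raw capacity:    %.0f TB%n", rawTb);     // 480 TB
    System.out.printf("Usable capacity: %.0f TB%n", usableTb);  // 120 TB
  }
}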
21. Proven at Scale with Worldwide Support
Production-scale testing of Apache trunk and a hosted environment for customer POCs
Industry's largest Hadoop support team
Industry's most accomplished Hadoop talent (from Yahoo!, LinkedIn, Talend, etc.)
Tested at scale on the Greenplum Analytics Workbench: a 1,000-node, 24-petabyte cluster
Multi-million dollar investment by EMC and partners
Reduced risk for EMC customers
Bringing rapid innovation to Hadoop
Certification of partner products
22. Supermicro Server Functions in the Cluster
Supermicro data nodes: 2U Storage Server
Supermicro infrastructure nodes: 2U Twin2 Server
• 1,000+ physical Supermicro server nodes (10k virtual nodes)
• 12,000 processor cores
• 24 petabytes of storage capacity (6 Gbps SATA)
• 48 terabytes of RAM
• 56 Gbps InfiniBand connectivity
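Dividing the cluster totals above by the stated 1,000+ physical nodes gives a rough per-node profile of about 12 cores, 24 TB of disk, and 48 GB of RAM. The short sketch below is just that arithmetic on the slide's own figures, not a published node specification.

// Per-node averages derived from the Analytics Workbench totals quoted above.
public class WorkbenchPerNode {
  public static void main(String[] args) {
    int nodes = 1000;              // "1,000+ physical Supermicro server nodes"
    int totalCores = 12000;        // 12,000 processor cores
    double storagePb = 24.0;       // 24 petabytes of storage
    double ramTb = 48.0;           // 48 terabytes of RAM

    System.out.printf("Cores per node:   %d%n", totalCores / nodes);            // 12
    System.out.printf("Storage per node: %.0f TB%n", storagePb * 1000 / nodes); // ~24 TB
    System.out.printf("RAM per node:     %.0f GB%n", ramTb * 1000 / nodes);     // ~48 GB
  }
}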
26. Supermicro Advantages
Why Supermicro…
Building blocks for different workloads and requirements: meet any Hadoop workload by model; I/O, CPU, disks, density; customize to a specific workload requirement
High efficiency, high quality: green IT; high-efficiency power; high quality for highest system availability and best utilization
Proven solutions: EMC Greenplum proven solutions; 100% Apache Hadoop compatible; benchmark and testing programs with partners
TCO: cost-effective Hadoop cluster solutions; best choice of Hadoop hardware platforms
27. Turnkey Hadoop: Supermicro Complete Rack Solutions
One-stop shop for hardware, end-to-end total solutions
Speed up deployment with ready-to-run rack systems
Single source, consistent build quality and delivery time
Multi-vendor compatibility testing, zero compatibility issues
Premium service with competitive pricing
Shipped directly from the US, NL, and TW