This document provides an overview of various big data concepts and perspectives from leading technology companies. It discusses the growing volumes, varieties, and velocities of data being generated. It also summarizes different companies' views on big data and how they are applying analytics, including perspectives from IBM, Intel, Microsoft, Oracle, and EMC. Additionally, it examines the emerging role of data scientists and provides examples of data science courses and topics.
1. Big Data Concepts & Practice
Vladimir Suvorov
vladimir.suvorov@emc.com
EMC & DataScienceSquad.com
Non-commercial educational use only. The information presented belongs to its respective owners, including EMC, IBM, Microsoft, Oracle, Gartner, etc.
2. About myself
4. In 2005 there were 1.3 billion RFID tags in circulation… by the end of 2011, this was about 30 billion and growing even faster
5. An increasingly sensor-enabled and instrumented business environment generates HUGE volumes of data with MACHINE SPEED characteristics…
1 BILLION lines of code
EACH engine generating 10 TB every 30 minutes!
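As a rough back-of-the-envelope check, the per-engine figure above can be turned into a sustained data rate. The sketch below assumes decimal units (1 TB = 10^12 bytes) and uses a hypothetical helper name; the slide does not say how long an engine runs, so the daily figure only applies if it ran continuously.

```python
TB = 10**12  # decimal terabyte in bytes (assumption: the slide uses SI units)

def sustained_rate_bytes_per_s(volume_tb: float, minutes: float) -> float:
    """Average data rate in bytes per second for a volume produced over a duration."""
    return volume_tb * TB / (minutes * 60)

# 10 TB every 30 minutes, as quoted on the slide.
rate = sustained_rate_bytes_per_s(10, 30)

# Daily volume if one engine ran for a full 24 hours (an assumption, not a slide claim).
per_day_tb = 10 * (24 * 60 / 30)

print(f"{rate / 10**9:.2f} GB/s per engine")
print(f"{per_day_tb:.0f} TB/day per engine at continuous operation")
```

Under these assumptions a single engine sustains several gigabytes per second, which is why the slide speaks of "machine speed" data.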
6. 350B Transactions/Year
Meter Reads every 15 min.
120M – meter reads/month
3.65B – meter reads/day
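One way to read these three figures, offered as an assumption rather than something the slide states, is that they are yearly totals for the same fleet of meters read monthly, daily, and every 15 minutes. A fleet of 10 million meters (a number inferred here, not given in the deck) reproduces all three:

```python
# Assumption: fleet size inferred from the slide's figures, not stated on it.
METERS = 10_000_000

def reads_per_year(meters: int, reads_per_meter_per_year: int) -> int:
    """Total meter reads per year for a fleet at a given read frequency."""
    return meters * reads_per_meter_per_year

monthly = reads_per_year(METERS, 12)                    # one read per month
daily = reads_per_year(METERS, 365)                     # one read per day
quarter_hourly = reads_per_year(METERS, 4 * 24 * 365)   # one read every 15 minutes

print(f"monthly reads:        {monthly:,} per year")
print(f"daily reads:          {daily:,} per year")
print(f"15-minute reads:      {quarter_hourly:,} per year")
```

Moving from monthly to 15-minute reads multiplies the yearly volume by 2,920, which is the jump from roughly 120M to roughly 350B.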
7. In August of 2010, Adam Savage, of “MythBusters,” took a photo of his vehicle using his smartphone. He then posted the photo to his Twitter account including the phrase “Off to work.”
Since the photo was taken by his smartphone, the image contained metadata revealing the exact geographical location where the photo was taken.
By simply taking and posting a photo, Savage revealed the exact location of his home, the vehicle he drives, and the time he leaves for work.
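The metadata in question is typically stored as EXIF GPS tags, which record latitude and longitude as degrees, minutes and seconds plus a hemisphere letter. The sketch below shows how such values turn into a map coordinate. The tag values are hypothetical and not taken from any real photo, and reading the tags out of an image file (which needs an image library) is deliberately left out.

```python
from fractions import Fraction

def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) and a hemisphere
    reference ('N', 'S', 'E', 'W') to signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(x) for x in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    return float(-value if ref in ("S", "W") else value)

# Hypothetical tag values for illustration only.
exif_gps = {
    "GPSLatitude": (40, 30, 0),
    "GPSLatitudeRef": "N",
    "GPSLongitude": (74, 15, 0),
    "GPSLongitudeRef": "W",
}

lat = dms_to_decimal(exif_gps["GPSLatitude"], exif_gps["GPSLatitudeRef"])
lon = dms_to_decimal(exif_gps["GPSLongitude"], exif_gps["GPSLongitudeRef"])
print(f"{lat}, {lon}")
```

Combined with the capture timestamp that EXIF also stores, a single coordinate pair like this is enough to reconstruct the "where" and "when" described above.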
8. The Social Layer in an Instrumented Interconnected World
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 12+ TBs of tweet data every day
• 100s of millions of GPS-enabled devices sold annually
• ? TBs of data every day
• 25+ TBs of log data every day
• 2+ billion people on the Web by end 2011
• 76 million smart meters in 2009… 200M by 2014
9. Twitter Tweets per Second Record Breakers of 2011
10. Extract Intent, Life Events, Micro Segmentation Attributes
[Figure: example profiles (Pauline, Tom Sit, Tina Mu, Jo Jobs) annotated with extracted attributes. Labels shown: Name, Birthday, Family; Location; Relocation; Wishful Thinking; Monetizable Intent; Not Relevant - Noise; SPAMbots]
11. Big Data Includes Any of the Following Characteristics
Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible
Variety: Manage the complexity of data in many different structures, ranging from relational, to logs, to raw text
Velocity: Streaming data and large volume data movement
Volume: Scale from Terabytes to Petabytes (1K TBs) to Zettabytes (1B TBs)
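The scale factors quoted for volume can be checked with a small conversion helper. This is a minimal sketch assuming decimal (SI) units; the function and table names are illustrative.

```python
# Decimal (SI) storage units expressed in bytes.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}

def convert(value: float, src: str, dst: str) -> float:
    """Convert a quantity between the storage units listed in UNITS."""
    return value * UNITS[src] / UNITS[dst]

print(convert(1, "PB", "TB"))  # one petabyte in terabytes (1K TBs)
print(convert(1, "ZB", "TB"))  # one zettabyte in terabytes (1B TBs)
```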
12. Bigger and Bigger Volumes of Data
• Retailers collect click-stream data from Web site interactions and loyalty card data
– This traditional POS information is used by retailers for shopping basket analysis, inventory replenishment, and more
– But data is being provided to suppliers for customer buying analysis
• Healthcare has traditionally been dominated by paper-based systems, but this information is getting digitized
• Science is increasingly dominated by big science initiatives
– Large-scale experiments generate over 15 PB of data a year, which can’t be stored within the data center and is sent to laboratories
• Financial services are seeing larger and larger volumes through smaller trading sizes, increased market volatility, and technological improvements in automated and algorithmic trading
• Improved instrument and sensory technology
– The Large Synoptic Survey Telescope’s GPixel camera generates 6 PB+ of image data per year; the oil and gas industry is another example
13. The Big Data Conundrum
• The percentage of available data an enterprise can analyze is decreasing proportionately to the data available to it
Quite simply, this means as enterprises, we are getting “more naive” about our business over time
We don’t know what we could already know…
[Chart: Data AVAILABLE to an organization vs. Data an organization can PROCESS]
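The conundrum can be illustrated numerically: if available data grows faster than processing capacity, the analyzable share shrinks every year even though capacity keeps rising. The starting volumes and growth multipliers below are hypothetical, chosen only to show the shape of the effect, and are not figures from the deck.

```python
def analyzable_share(available: float, processable: float) -> float:
    """Fraction of available data an organization can actually process."""
    return min(processable / available, 1.0)

# Hypothetical starting volumes (arbitrary units) and yearly growth multipliers.
available, processable = 100.0, 50.0
AVAILABLE_GROWTH, PROCESS_GROWTH = 1.5, 1.2

shares = []
for _ in range(5):
    shares.append(analyzable_share(available, processable))
    available *= AVAILABLE_GROWTH
    processable *= PROCESS_GROWTH

for year, share in enumerate(shares):
    print(f"year {year}: {share:.0%} of available data can be processed")
```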
14. Why Not All of Big Data Before: Didn’t have the Tools?
15. Applications for Big Data Analytics
• Smarter Healthcare
• Multi-channel Sales
• Finance
• Log Analysis
• Homeland Security
• Traffic Control
• Telecom
• Search Quality
• Manufacturing
• Trading Analytics
• Fraud and Risk
• Retail: Churn, NBO
16. Most Requested Uses of Big Data
• Log Analytics & Storage
• Smart Grid / Smarter Utilities
• RFID Tracking & Analytics
• Fraud / Risk Management & Modeling
• 360° View of the Customer
• Warehouse Extension
• Email / Call Center Transcript Analysis
• Call Detail Record Analysis
17. What companies & analysts think of Big Data
18. Gartner & McKinsey
19. Hype Cycle of Big Data
20. Priority matrix
21. Key vision
• Predictive modeling is gaining momentum with property
and casualty (P&C) companies, which are using it to
support claims analysis, CRM, risk management, pricing
and actuarial workflows, quoting, and underwriting.
• Social content is the fastest growing category of new
content in the enterprise and will eventually attain 20%
market penetration.
• Gartner reports that 45% of sales management teams
identify sales analytics as a priority to help them
understand sales performance, market conditions and
opportunities.
• Over 80% of Web Analytics solutions are delivered via
Software-as-a-Service (SaaS).
22. Big Data deliverables by McKinsey
24. Intel
25. Intel Big Data Cluster Example
Application | Big Data | Algorithms | Compute Style
Scientific study (e.g. earthquake study) | Ground model | Earthquake simulation, thermal conduction, … | HPC
Internet library search | Historic web snapshots | Data mining | MapReduce
Virtual world analysis | Virtual world database | Data mining | TBD
Language translation | Text corpuses, audio archives, … | Speech recognition, machine translation, text-to-speech, … | MapReduce & HPC
Video search | Video data | Object/gesture identification, face recognition, … | MapReduce
There has been more video uploaded to YouTube in the last 2 months than if ABC,
NBC, and CBS had been airing content 24/7/365 continuously since 1948. - Gartner
26. Example Motivating Application: Online Processing of Archival Video
• Research project: Develop a context recognition system that is 90% accurate over
90% of your day
• Leverage a combination of low- and high-rate sensing for perception
• Federate many sensors for improved perception
• Big Data: Terabytes of archived video from many egocentric cameras
• Example query 1: “Where did I leave my briefcase?”
– Sequential search through all video streams [Parallel Camera]
• Example query 2: “Now that I’ve found my briefcase, track it”
– Cross-cutting search among related video streams [Parallel Time]
27. Oracle
28. Big Data Use Cases
Industry | Today’s Challenge | New Data | What’s Possible
Healthcare | Expensive office visits | Remote patient monitoring | Preventive care, reduced hospitalization
Manufacturing | In-person support | Product sensors | Automated diagnosis, support
Location-Based Services | Based on home zip code | Real time location data | Geo-advertising, traffic, local search
Public Sector | Standardized services | Citizen surveys | Tailored services, cost reductions
Retail | One size fits all marketing | Social media | Sentiment analysis, segmentation
29. What’s in Big Data for Public Sector
• Operational efficiency and productivity
• Fraud detection and prevention
• Close tax gaps
• Value for money for citizens
• Prevent crime waves
• Customize actions based on population segments
• Public utilities to reduce consumption
• Produce safety from farm to fork
31. New opportunities
• Massive Volumes
– Increases ad revenue by processing 3.5 billion events per day
– Processes 464 billion rows per quarter, with average query time under 10 secs.
• Cloud Connectivity
– Measures and ranks online user influence by processing 3 billion signals per day
– Connects across 15 social networks via the cloud for data and API access
• Real-Time Insight
– Improving investigation time by analyzing large volume & variety of data
– Cut investigation time from 2 years to 15 days
32. Microsoft’s Approach to Big Data
33. A Holistic Big Data Solution from Microsoft
34. Data Scientist Job
35. Sexy Job of Data Scientist
Tom Davenport, who is teaching an executive
program in Big Data and analytics at Harvard
University, said some data scientists are
earning annual salaries as high as $300,000,
which is “pretty good for somebody that
doesn't have anyone else working for them.”
Davenport also said such workers are
motivated by the problems and opportunities
data provides.
36. What EMC Thinks of Data Scientists
37. Job evolution
38. What Forbes Thinks of Data Scientists
39. Data Science Courses
40. Course Modules and Navigation Icons
Data Science and Big Data Analytics
1. Introduction to Big Data Analytics
2. Data Analytics Lifecycle + Lab
3. Review of Basic Data Analytics Methods Using R +
Labs
4. Advanced Analytics - Theory & Methods + Labs
5. Advanced Analytics - Technology & Tools + Labs
6. The Endgame, or Putting it All Together + Final Lab
41. Topics: Data Science and Big Data Analytics Course
• Introduction to Big Data Analytics + Data Analytics Lifecycle
– Big Data Overview
– State of the Practice in Analytics
– The Data Scientist
– Big Data Analytics in Industry Verticals
– Data Analytics Lifecycle
• Review of Basic Data Analytic Methods Using R
– Using R to Look at Data - Introduction to R
– Analyzing and Exploring the Data
– Statistics for Model Building and Evaluation
• Advanced Analytics – Theory and Methods
– K-means Clustering
– Association Rules
– Linear Regression
– Logistic Regression
– Naive Bayesian Classifier
– Decision Trees
– Time Series Analysis
– Text Analysis
• Advanced Analytics - Technology and Tools
– Analytics for Unstructured Data (MapReduce and Hadoop)
– The Hadoop Ecosystem
– In-database Analytics – SQL Essentials
– Advanced SQL and MADlib for In-database Analytics
• The Endgame, or Putting it All Together + Final Lab on Big Data Analytics
– Operationalizing an Analytics Project
– Creating the Final Deliverables
– Data Visualization Techniques
– Final Lab – Application of the Data Analytics Lifecycle to a Big Data Analytics Challenge
42. Hadoop
43. Top companies need Hadoop
44. What is Hadoop and Where did it start?
• Created by Doug Cutting, formerly of Yahoo!,
now at Cloudera
– HDFS (storage) & MapReduce (compute)
– Inspired by Google’s MapReduce and Google
File System (GFS) papers
• Much of the initial work on Hadoop was done
by Yahoo
• It is now a top-level Apache project backed by
a large open source development community
45. What is Hadoop?
Two Core Components
• HDFS: storage in the Hadoop Distributed File System
• MapReduce: compute via the MapReduce distributed processing platform
• Storage & Compute in 1 Framework
• Open Source Project of the Apache Software Foundation
• Written in Java
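The MapReduce model named above can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle and reduce steps running in a single process, not Hadoop's Java API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework would do between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "big cluster"])
```

In a real cluster the map and reduce calls would run on many machines over data blocks stored in HDFS; here everything runs locally so the data flow stays visible.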
46. Hadoop cluster architecture
47. MapReduce example
48. Hadoop versions
49. Hadoop Wave Report
“EMC Greenplum is the first mover in Hadoop
appliances. EMC Greenplum is the first EDW vendor to
provide a full-featured enterprise-grade Hadoop
appliance and roll out an appliance family that integrates
its Hadoop, EDW, and data integration in a single rack. It
provides its own open source Hadoop distribution
software, integrates EMC’s strong storage product
portfolio in its appliances, and has an extensive
professional services force of EMC technical consultants
and data scientists with Hadoop expertise.”
50. Hadoop Players Today
51. Get Started With Hadoop Today
Data Scientists & Hadoop Architecture teams deliver customer success
Hadoop Architecture Services
– POC planning and deployment
– Installation and best practices
– Educate the team
Greenplum Analytics Labs
– Leverage the expertise of Greenplum’s
Data Scientists
– Packaged solutions that produce business
value and actionable results
– Accelerate Hadoop capabilities on your
data with your analysts
Establish a strategic vision
– Roadmap for Hadoop and unified analytics
52. The Greenplum Unified Analytics Platform
53. NoSQL
54. Definition
from nosql-database.org
• Next Generation Databases mostly addressing
some of the points: being non-relational,
distributed, open-source and horizontally
scalable. The original intention has been modern
web-scale databases. The movement began
early 2009 and is growing rapidly. Often more
characteristics apply, such as: schema-free, easy
replication support, simple API, eventually
consistent / BASE (not ACID), a huge amount
of data, and more. So the misleading term "nosql"
(the community now translates it mostly with "not
only sql") should be seen as an alias to
something like the definition above.
55. NoSQL
http://nosql-database.org/
• Non-relational
• Scalability
– Vertically: add more capacity to a single node
– Horizontally: add more nodes
• Collection of structures
– Hashtables, maps, dictionaries
• No pre-defined schema
• No join operations
• CAP, not ACID
– CAP: Consistency, Availability and Partition tolerance (but not all three at
once!)
– ACID: Atomicity, Consistency, Isolation and Durability
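One simple way to picture horizontal scalability is hash partitioning: each key is assigned to one of several nodes by a stable hash. The sketch below is an assumption-laden illustration, not how any particular NoSQL product does it; the helper node_for_key and the node names are hypothetical.

```python
import hashlib
from typing import List


def node_for_key(key: str, nodes: List[str]) -> str:
    """Pick the node responsible for a key using a stable hash."""
    if not nodes:
        raise ValueError("at least one node is required")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; modulo maps it onto a node index.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
owner = node_for_key("user:42", nodes)
```

With plain modulo placement, adding or removing a node moves most keys to a different owner. Production systems commonly use schemes such as consistent hashing to limit that movement.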
56. Advantages of NoSQL
• Cheap, easy to implement
• Data are replicated and can be partitioned
• Easy to distribute
• Don't require a schema
• Can scale up and down
• Quickly process large amounts of data
• Relax the data consistency requirement (CAP)
• Can handle web-scale data, whereas Relational
DBs cannot
57. Disadvantages of NoSQL
• New and sometimes buggy
• Data is generally duplicated, potential for
inconsistency
• No standardized schema
• No standard format for queries
• No standard language
• Difficult to impose complicated structures
• Depend on the application layer to enforce data
integrity
• No guarantee of support
• Too many options: which one, or ones, to pick?
58. NoSQL Options
Key-Value Stores
• This technology you know and love and use all the
time
– Hashmap for example
• Put(key,value)
• value = Get(key)
• Examples
– Redis (my favorite!!) – in memory store
– Memcached
– and 100s more
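The Put/Get interface above maps directly onto a hash map. A minimal in-memory sketch, assuming a hypothetical KeyValueStore class (this is not the Redis or Memcached client API):

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store backed by a Python dict (a hash map)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # The value is an opaque blob; the store never looks inside it.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get(key)


store = KeyValueStore()
store.put("user:42", b'{"name": "Ada"}')
value = store.get("user:42")
```

Because the value is opaque, the only way to find a record is by its key. That is the main contrast with the document stores described later.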
59. Column Stores
• Not to be confused with the relational-db version
of this
– Sybase-IQ etc.
• Multi-dimensional map
• Not all entries are relevant each time
– Column families
• Examples
– Cassandra
– HBase
– Amazon SimpleDB
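The "multi-dimensional map" idea can be sketched as nested dictionaries: row key, then column family, then column name. This is a simplified model for illustration only (real column stores such as Cassandra and HBase also keep timestamps and persist data on disk); the helpers make_table, put and get_family are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict

# row key -> column family -> column name -> value
Table = Dict[str, Dict[str, Dict[str, Any]]]


def make_table() -> Table:
    """Create an empty table whose rows and families appear on first write."""
    return defaultdict(lambda: defaultdict(dict))


def put(table: Table, row: str, family: str, column: str, value: Any) -> None:
    """Write one cell; rows do not need to share the same columns."""
    table[row][family][column] = value


def get_family(table: Table, row: str, family: str) -> Dict[str, Any]:
    """Read one column family of one row without touching other families."""
    return dict(table.get(row, {}).get(family, {}))


table = make_table()
put(table, "u1", "profile", "name", "Ada")
put(table, "u1", "activity", "last_login", "2012-01-01")
put(table, "u2", "profile", "name", "Bob")
profile = get_family(table, "u1", "profile")
```

Reading only the "profile" family reflects the slide's point that not all entries are relevant each time.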
60. Document Stores
• Key-document stores
– However, the document can be seen as a value, so
you can consider this a superset of key-value
• Big difference is that in document stores one can
query also on the document, i.e. the document
portion is structured (not just a blob of data)
• Examples
– MongoDB
– CouchDB
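The difference from a plain key-value store can be sketched as follows: documents are still stored by key, but because they are structured, the store can also query on their fields. The DocumentStore class and its find method are hypothetical and do not reflect the MongoDB or CouchDB APIs.

```python
from typing import Any, Dict, List, Optional

Document = Dict[str, Any]


class DocumentStore:
    """Toy key-document store that can also query on document fields."""

    def __init__(self) -> None:
        self._docs: Dict[str, Document] = {}

    def put(self, key: str, document: Document) -> None:
        self._docs[key] = document

    def get(self, key: str) -> Optional[Document]:
        return self._docs.get(key)

    def find(self, **criteria: Any) -> List[str]:
        """Return the sorted keys of documents matching every criterion."""
        return sorted(
            key
            for key, doc in self._docs.items()
            if all(
                field in doc and doc[field] == value
                for field, value in criteria.items()
            )
        )


store = DocumentStore()
store.put("p1", {"name": "Ada", "city": "London"})
store.put("p2", {"name": "Bob", "city": "Berlin"})
store.put("p3", {"name": "Cy", "city": "London", "age": 41})
londoners = store.find(city="London")
```

Note that the three documents do not share the same fields, which matches the "no pre-defined schema" property. This sketch scans every document, whereas a real store would typically use indexes.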
61. Graph Stores
• Use a graph structure
– Labeled, directed, attributed multi-graph
• Label for each edge
• Directed edges
• Multiple attributes per node
• Multiple edges between nodes
– Relational DBs can model graphs, but an edge
requires a join which is expensive
• Example: Neo4j
– http://www.infoq.com/articles/graph-nosql-neo4j
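A labeled, directed, attributed multi-graph can be sketched with an adjacency list, so that following an edge is a direct lookup rather than a join. The Graph and Edge classes below are hypothetical illustrations and are unrelated to Neo4j's actual API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Edge:
    source: str
    target: str
    label: str  # every edge carries a label


@dataclass
class Graph:
    # node id -> attributes (multiple attributes per node)
    nodes: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    # node id -> outgoing edges (directed; several edges may join the same pair)
    out_edges: Dict[str, List[Edge]] = field(default_factory=dict)

    def add_node(self, node_id: str, **attributes: Any) -> None:
        self.nodes[node_id] = attributes
        self.out_edges.setdefault(node_id, [])

    def add_edge(self, source: str, target: str, label: str) -> None:
        self.out_edges.setdefault(source, []).append(Edge(source, target, label))

    def neighbors(self, node_id: str, label: str) -> List[str]:
        """Follow outgoing edges with the given label from one node."""
        return [e.target for e in self.out_edges.get(node_id, []) if e.label == label]


graph = Graph()
graph.add_node("alice", age=30, city="Berlin")
graph.add_node("bob")
graph.add_edge("alice", "bob", "knows")
graph.add_edge("alice", "bob", "works_with")
friends = graph.neighbors("alice", "knows")
```

The two edges between "alice" and "bob" show the multi-graph property, and the edges are one-way, so "bob" has no outgoing "knows" edge.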