There is a flood of information online from tweets, feeds, status updates, photos, government, private, and other
sources. Just how big is “big data”? This presentation will share examples of big and open data in the cloud: where it
comes from, how it’s stored, and what you can do with it. Learn to incorporate real-world online data for your
students to analyze using Excel, create data visualizations and infographics, and understand the impact of Data
as a Service as a model for cloud computing.
This presentation is about creating more effective and holistic digital strategies that return real results for libraries and library initiatives.
To schedule a training, workshop, or to have me speak at your conference, visit my website at pcsweeney.com
Yes, we face a data deluge and big data seems to be largely about how to deal with it. But 99% of what has been written about big data is focused on selling hardware and services. The truth is that until the concept of big data can be objectively defined, any measurements, claims of success, quantifications, etc. must be viewed skeptically and with suspicion. While both the need for and approaches to these new requirements are faced by virtually every organization, jumping into the fray ill-prepared has (to date) reproduced the same dismal IT project results.
The very real, very rapid, very great increases in data of all forms (charts showing data types and volume increases)
Challenges faced by virtually all data management programs
Means by which big data techniques can complement existing data management practices
Necessary but insufficient prerequisites to exploiting big data techniques
Prototyping nature of practicing big data techniques
You can sign up for future Data-Ed webinars here: http://www.datablueprint.com/resource-center/webinar-schedule/
Bigdata.
Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation, search, sharing, transfer, visualization, querying, updating and information privacy. The term "big data" often refers simply to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set. "There is little doubt that the quantities of data now available are indeed large, but that’s not the most relevant characteristic of this new data ecosystem."[2] Analysis of data sets can find new correlations to "spot business trends, prevent diseases, combat crime and so on."[3] Scientists, business executives, practitioners of medicine, advertising and governments alike regularly meet difficulties with large data-sets in areas including Internet search, fintech, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics,[4] connectomics, complex physics simulations, biology and environmental research.[5]
Data sets grow rapidly - in part because they are increasingly gathered by cheap and numerous information-sensing Internet of things devices such as mobile devices, aerial (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers and wireless sensor networks.[6][7] The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s;[8] as of 2012, every day 2.5 exabytes (2.5×10^18 bytes) of data are generated.[9] One question for large enterprises is determining who should own big-data initiatives that affect the entire organization.[10]
Relational database management systems and desktop statistics- and visualization-packages often have difficulty handling big data. The work may require "massively parallel software running on tens, hundreds, or even thousands of servers".[11] What counts as "big data" varies depending on the capabilities of the users and their tools, and expanding capabilities make big data a moving target. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."
It has been said that Mobile + Cloud + Social + Big Data = Better Run the World. IBM has invested over $20 billion since 2005 to grow its analytics business, and companies will invest more than $120 billion by 2015 on analytics, hardware, software and services that are critical in almost every industry, such as healthcare, media, sports, finance and government.
It has been estimated that there is a shortage of 140,000 – 190,000 people with deep analytical skills to fill the demand of jobs in the U.S. by 2018.
Decoding the human genome originally took 10 years to process; now it can be achieved in one week with the power of analytics and BI (Business Intelligence). This lecture’s key message is that analytics provides a competitive edge to individuals, companies and institutions, and that analytics and BI are often critical to the success of any organization.
The methodology is to teach analytic techniques through real-world examples and real data, with the goal of convincing the audience of the analytics edge and the power of BI, and inspiring them to use analytics and BI in their careers and lives.
Discusses the things that happen in 60 seconds, big data, and the 4 V's of big data. The presentation covers analytics, its evolution and applications.
What is Big Data? With Victoria Galano, Data Scientist at Air France Jedha Bootcamp
Over the last 5 years, we have created more data than since the beginnings of humanity. Today we produce so much data that it is becoming difficult to manage. This is what is called Big Data. During this workshop we will discuss the challenges of Big Data and its concrete applications in our society.
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Simplilearn
This presentation about Big Data will help you understand how Big Data evolved over the years, what is Big Data, applications of Big Data, a case study on Big Data, 3 important challenges of Big Data and how Hadoop solved those challenges. The case study talks about Google File System (GFS), where you’ll learn how Google solved its problem of storing increasing user data in early 2000. We’ll also look at the history of Hadoop, its ecosystem and a brief introduction to HDFS which is a distributed file system designed to store large volumes of data and MapReduce which allows parallel processing of data. In the end, we’ll run through some basic HDFS commands and see how to perform wordcount using MapReduce. Now, let us get started and understand Big Data in detail.
Below topics are explained in this Big Data presentation for beginners:
1. Evolution of Big Data
2. Why Big Data?
3. What is Big Data?
4. Challenges of Big Data
5. Hadoop as a solution
6. MapReduce algorithm
7. Demo on HDFS and MapReduce
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
This course will enable you to:
1. Understand the different components of the Hadoop ecosystem such as Hadoop 2.7, YARN, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro Schema, using Avro with Hive, and Sqoop and Schema evolution
7. Understand Flume, Flume architecture, sources, flume sinks, channels, and flume configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand resilient distributed datasets (RDD) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL, creating, transforming, and querying DataFrames
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
A Review of Big data for Social Policy Decision Making Ridi Fe
This presentation was given at ICSC 2018 at University Club, Universitas Gadjah Mada. It discusses how big data and cloud computing enhance social policy decision making. This research is supported by Microsoft Innovation Center UGM.
Discovering History Through Digital Newspaper Collection Cengage Learning
Hear from Seth Cayley, Director of Research Publishing at Gale, a part of Cengage Learning, as he discusses the historic media coverage of familiar and little known events, cultural phenomena, and everyday life found in 19th and early 20th century newspapers. Learn how historical newspapers can support faculty research, drive inquiry and critical thinking among students, and stimulate classroom debate.
Are Your Students Ready for Lab?
11/5/2015
Presenters: Bill Heslop and Tony Baldwin, Directors and Co-founders, Learning Science Ltd.
LabSkills is an online program that prepares students for their lab sessions through assignments in OWLv2, the leading online learning system for Chemistry. LabSkills makes it easy for you to require students to complete laboratory preparation prior to attending lab with demonstrations, interactive simulations, and quizzes. The newest version of LabSkills PreLabs is an enhanced course with 10 new techniques, plus new mobile-compatible simulations. LabSkills content is easy to assign and is automatically graded. LabSkills is currently used by schools and universities in more than 30 countries worldwide. In this webinar, you will learn how to get your students:
- Engaged with practical work
- Prepared when they get to the lab
- Confident in performing the experiments
- Using the time in the lab effectively
Similar to Course Tech 2013, Mark Frydenberg, Drinking from the Fire Hose: Tools for Interpreting and Teaching with Big Data
5 Course Design Tips to Increase Engagement and Outcomes Cengage Learning
Facilitated by: Professor Greg Gellene, Texas Tech University, Lubbock, Texas
10/21/2015
How do you get the most out of your students? Do you wish for them to participate more? Complete their homework? Improve their outcomes? Listen as Greg Gellene reveals his 5 tips for designing a course to better engage college students. Greg will share his experience building a digitally-infused course that increased class attendance and drove homework completion rates to over 80%. Attend this second webinar in our Journey to Digital Professional Development Series to hear from Greg, ask advice for implementing such methods in your own course, and discover why Greg’s students say technology helped to keep them well-engaged in his course.
The Journey to Digital: Incorporating Technology to Strengthen Critical Minds Cengage Learning
Dr. Dale Prentiss, Special Lecturer, Oakland University, Rochester, Michigan
Have you gone digital? 74% of surveyed college students feel that they would fare better if their instructors would use more technology. Whether you are a technology novice or a digital pro, we welcome you to a webinar inspired by a recent case study at Oakland University. Dr. Dale Prentiss will share his journey to digital, his mission to help students strengthen their critical thinking skills, and how personalizing his course resulted in better student engagement. Join Dale as he discusses the highs and lows of moving from a non-digital to a fully-digital experience and offers tips on how to make the transition in your own course in this first webinar of The Journey to Digital Professional Development Series.
Google Drive Plus TexQuest Equals a Match Made in Research Heaven Cengage Learning
Learn more about how Prosper (TX) High School is using their Gale In Context resources through the Google integration with tools such as Drive, Docs, and Apps, to help their students and teachers more easily access and share content within the classroom, library and from home.
Improving Time Management: Tips that Will Help College Students Start the Yea...Cengage Learning
Successful time management can have a major positive impact on grades and classroom performance. In addition, students who improve their time management report less stress, better focus and improved quality of life. Keep reading to review Cengage Learning’s top time-management tips!
How successful is MindTap? Just ask the Students! We asked and you answered, students are more likely to recommend to fellow students and professors alike!
Getting Started with Enhanced WebAssign 8/11/15 Presented by: Mike Lafreniere...Cengage Learning
Get up and running with Enhanced WebAssign (EWA) quickly! In this hour long peer-to-peer training session you will learn how to log in, create your own course, build and schedule assignments, and more. In addition, you’ll also get advice on what to require of students during the first couple of weeks of class.
Taming the Digital Tiger: Implementing a Successful Digital or 1:1 Initiative Cengage Learning
Hear from respected educational technologist, Lenny Schad, as he shares his experiences in leading a large Texas school district through a program of inclusion – creating an environment where it no longer matters which brands of hardware are being used or who owns the devices. Lenny is also an author with a recent ISTE published title Bring Your Own Learning.
Decimal and Fraction Jeopardy - A Game for Developmental Math Cengage Learning
Each year colleges identify a significant number of students needing developmental math classes. Classes include capable students who may have fallen behind as well as students who have never acquired the skills to be successful in math. Game-based learning can enhance motivation and help students succeed. Creating a game does not require advanced technical skills. This user-friendly PowerPoint game is modeled on the popular Jeopardy game show and provides students with the opportunity to develop basic math skills. With game-based learning, your lesson plan will become a focused, interactive opportunity for learning.
Game it up! Introducing Game Based Learning for Developmental Math Cengage Learning
Addressing the needs of developmental math students is a difficult but important challenge facing instructors. Game-based learning adds excitement to your lesson and helps students focus. In this presentation, Dr Kathleen Offenholly reviews best practices and simple steps for adding game-based learning to your class. The games are not flashy and do not require advanced technical skills. They are simple to implement and have proven to be effective.
Our esteemed guest, and author of the ASCD published title "Overcoming Textbook Fatigue", ReLeah Lent, shares ways in which over-reliance on textbooks as a sole source of curriculum instruction can unintentionally create a barrier between our students and 21st Century effectiveness. Ms. Lent discusses actionable strategies for navigating this barrier while engaging our students more effectively.
Adult Student Success: How Does Awareness Correlate to Program Completion? Cengage Learning
Presented by: Dr. Barbara Calabro and Dr. Melanie Yerk
Date Recorded: 12/9/2014
This installment of Cengage Learning’s College Success Faculty Engagement Webinar Series will help instructors and administrators to better understand the multi-faceted approaches to adult student success and retention by exploring the factors that specifically impact how adult students learn (including motivation, personality development, Maslow’s Hierarchy of Needs as they relate to adult students, self-esteem, and financial literacy) and by discussing the foundational competencies necessary for success both in college and in the workplace.
You're responsible for teaching, and your students are responsible for learnin...Cengage Learning
Presenter: Dr. Debora Katz, United States Naval Academy
We've all heard the expression, "You can lead a horse to water, but you cannot make him drink." Many of us think this expression applies to our physics students. We lead them to physics, but we cannot make them drink it in. Put in more concrete terms, we are responsible for teaching, but our students are responsible for learning. So how can we get them to learn? In this webinar, Dr. Debora Katz, author of the new calculus-based physics text, Physics for Scientists and Engineers: Foundations and Connections, will discuss how flipping her classroom has shifted the focus from her teaching to her students' learning.
What is the Impact of the New Standard on the Intermediate Accounting Course? Cengage Learning
Presented by: Jefferson P. Jones Auburn University and Donald P. Pagach North Carolina State University
This session will address why the new standard was issued, its impact on the intermediate accounting course, and guidance on how to teach the new standard in the intermediate accounting course. Authors Jeff Jones and Don Pagach will also discuss how the new standard will be addressed in the second edition of Wahlen/Jones/Pagach Intermediate Accounting 2e.
The ABCs Approach to Goal Setting and Implementation Cengage Learning
Presented by: Dr. Christine Harrington - Director for the Center for the Enrichment of Learning and Teaching, Middlesex County College
Despite its widespread use, you may be surprised to discover the research supporting the SMART goal setting framework is lacking. In fact, the SMART model is missing the most important factor in goal setting. Come discover a research-based framework (and the most important goal setting factor!) that will assist your students with setting and implementing effective goals that will lead to high levels of success.
Competency-based Education: Out with the new, in with the old? Cengage Learning
Presented by: Sally M. Johnstone, PhD - Vice President for Academic Advancement, Western Governors University; Dr. Larry Banks - Provost, Daymar Colleges Group, Competency Based Education Consultant, Wonderlic Assessments; and Anne Gupton, L.P.C., N.C.C. - Counselor and Associate Professor, Mott Community College
Date Recorded: 10/3/2014
The idea of competency-based education has steadily gained traction in the media, but its appropriateness in the educational arena remains questioned. How does this drive critical thinking? Should we measure learning based on the application of existing knowledge, or the ability to acquire and apply new knowledge?
Student-to-Student Learning, Powered by FlashNotes Cengage Learning
Presented by: Lester Lefton, President Emeritus of Kent State and Lou Lataif, Dean Emeritus of the School of Business at Boston University
Join Lester Lefton, President Emeritus of Kent State and Lou Lataif, Dean Emeritus of the School of Business at Boston University as they share the power of peer-to-peer education. We’ll also be joined by Michael Matousek as he shares the story of his company, Flashnotes.com, and its mission to complement and reinforce the in-class experience and assigned textbook through the Flashnotes.com marketplace. By leveraging original student-created content, students have another opportunity to get help in real time, preventing them from falling behind throughout the semester, to improve academic outcomes, student retention and graduation rate. In addition, hear the thoughts and experiences of fellow educators on this topic, and learn how you can help your students to take advantage of this technology.
Presented by: Francine Fabricant, MA, EdM - Lecturer at Hofstra University Continuing Education
It is possible for today's students to look at an unpredictable world and feel confident about their career potential. Students are facing a rapidly-changing, technologically-advanced, global economy, where job security is a thing of the past. To help students feel more secure and optimistic, they need a new set of tools.
Using strategies from the latest academic research and best-selling authors, we'll explore the new skills for career success, including open-mindedness, proactive behavior, creative thinking, sponsorship, personal branding, and lifelong learning. We'll also discuss how structured tools can help your students, such as a career portfolio and a flexible plan of action.
16. 3 V's
• Volume - amount of data is larger than
those conventional relational database
infrastructures can handle
• Velocity - the rate at which data is
generated, processed and analyzed in
(real) time
• Variety – data formats are unstructured
and inconsistent
19. Walmart
• Walmart collects more than 2.5
petabytes of data every hour from its
customer transactions.
• A petabyte is one quadrillion bytes, or the
equivalent of about 20 million filing
cabinets’ worth of text.
http://hbr.org/2012/10/big-data-the-management-revolution/ar
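To put the Walmart figure in perspective, here is a rough back-of-the-envelope calculation. It uses only the numbers quoted on this slide (2.5 petabytes per hour, one quadrillion bytes per petabyte, about 20 million filing cabinets per petabyte); the variable names are illustrative and not from the presentation.

```python
# Back-of-the-envelope arithmetic using only the figures quoted on the slide.
BYTES_PER_PETABYTE = 10**15          # one quadrillion bytes
PETABYTES_PER_HOUR = 2.5             # Walmart figure quoted on the slide
CABINETS_PER_PETABYTE = 20_000_000   # "about 20 million filing cabinets"

bytes_per_hour = PETABYTES_PER_HOUR * BYTES_PER_PETABYTE
petabytes_per_day = PETABYTES_PER_HOUR * 24
cabinets_per_hour = PETABYTES_PER_HOUR * CABINETS_PER_PETABYTE

print(f"{bytes_per_hour:.2e} bytes per hour")
print(f"{petabytes_per_day:.0f} petabytes per day")
print(f"{cabinets_per_hour:,.0f} filing cabinets of text per hour")
```

At the quoted rate, that works out to 60 petabytes per day, or roughly 50 million filing cabinets of text every hour.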
20. Velocity: Drinking from the Firehose
• Scrutinize 5 million trade events created
each day to identify potential fraud
• Analyze 500 million daily call detail
records in real-time to predict customer
churn faster
22. McKinsey&Company Report (2011)
• Data is part of every
industry and business
function.
• Data creates value.
• Big data becomes a basis
of competition and growth.
• Some sectors will achieve
greater gains.
• Shortage of people with
analytical skills.
• Need policies related to
privacy, security,
ownership.
27. Big Data Technologies
• HADOOP: scalable
storage, parallel
computation
• NoSQL: distributed
querying
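The "parallel computation" idea behind Hadoop can be sketched with the classic word-count example. The following is a single-process imitation of the map, shuffle and reduce phases in plain Python, not the Hadoop API; the function names and sample documents are assumptions made for illustration. On a real cluster, the map and reduce calls would run on many servers at once.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine; here we loop.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input.
docs = ["big data is big", "data creates value"]
counts = word_count(docs)
print(counts)
```

Because each map call looks at only one document and each reduce call at only one word, the work splits cleanly across servers, which is why adding servers adds capacity.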
28. What this Means
• Change your web page and Google finds it
in minutes.
• Ten years ago, you would have to submit a
request to Yahoo! to reindex your site.
• All you need is a lot of servers.
• Google has a million of them.
• No problem.
48. Mark Frydenberg
mfrydenberg@bentley.edu
cis.bentley.edu/mfrydenberg
CourseMate
Enhanced
Edition
Invite me to your school!
Editor's Notes
6 Degrees of Kevin Bacon. Name is dumb luck. 6 Degrees of Separation: within networks of people or things, there is a theoretical maximum of 6 points between any two nodes. That’s the Bacon Index. Bob is 1, Ann is 2, Joe is 3. The index can only get so big because of interconnections. If Kim is connected to Bob, Kim is 2, not 4.
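The Bacon Index in this note is a shortest-path distance in a network, which a breadth-first search computes directly. Below is a minimal sketch; the graph is a hypothetical one built to match the note (the names Bob, Ann, Joe and Kim come from the note, while the exact connections are assumed).

```python
from collections import deque


def bacon_index(graph, start):
    """Return the fewest hops from `start` to every reachable person."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbor in graph.get(person, ()):
            if neighbor not in distances:  # first visit is the shortest path
                distances[neighbor] = distances[person] + 1
                queue.append(neighbor)
    return distances


# Assumed connections: Kim knows both Joe (index 3) and Bob (index 1).
graph = {
    "Kevin Bacon": ["Bob"],
    "Bob": ["Kevin Bacon", "Ann", "Kim"],
    "Ann": ["Bob", "Joe"],
    "Joe": ["Ann", "Kim"],
    "Kim": ["Joe", "Bob"],
}

index = bacon_index(graph, "Kevin Bacon")
print(index)
```

Kim gets an index of 2 through Bob rather than 4 through Joe, because breadth-first search always reaches a node by its shortest route first.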
Twitter can’t be structured. Twitter is a bunch of words that humans are the best at parsing. And so again we’re back to the 3 V’s: Volume, Velocity, and Variety. Not only is Twitter’s data disorganized, it handles over 3000 new tweets per second. Twitter is using this data to recommend things to you, and it does it all lightning fast through an engine called Storm.
If Amazon can see that lots of people buy forks and knives together, or that people buy curtains and curtain rods together, how do they avoid recommending a wrench set to everyone who buys a copy of Black Beauty just because someone else once bought both? This is where things get complicated.
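One simple way to think about the question in this note is to count how often pairs of items appear in the same basket and to recommend only pairs seen often enough. The sketch below is an illustration of that idea, not Amazon's actual method; the baskets, the function names and the `min_support` threshold are all assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


def recommend(item, counts, min_support=2):
    """Suggest items bought with `item` in at least `min_support` baskets."""
    suggestions = []
    for (a, b), n in counts.items():
        if n >= min_support and item in (a, b):
            suggestions.append((b if a == item else a, n))
    # Most frequent partner first, ties broken alphabetically.
    return sorted(suggestions, key=lambda s: (-s[1], s[0]))


# Hypothetical purchase history.
baskets = [
    ["fork", "knife"],
    ["fork", "knife", "spoon"],
    ["curtains", "curtain rod"],
    ["curtains", "curtain rod"],
    ["wrench set", "Black Beauty"],  # a one-off coincidence
]

counts = pair_counts(baskets)
print(recommend("fork", counts))
print(recommend("wrench set", counts))
```

The threshold filters out the single wrench-set-and-novel purchase while keeping the fork-and-knife pairing. Choosing that threshold well across millions of items is part of what makes the real problem hard.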
Twitter isn’t the only place where unstructured, real-time data is being processed. Facial recognition is a massive big data problem. Your iPhone does facial recognition. Facebook does facial recognition. Aperture learns about faces from hundreds of data points and can help you find who is in what photos. Amazing. How do we do this so quickly?
Should it be opt-in only? http://www.code.org/sites/all/themes/codedotorg/logo.png
- Here is a blood pressure monitor from iHealth that stores your blood pressure data in the cloud.
Here’s an app that monitors your heart rate from your phone’s camera, amazing stuff. So all this wellness data is now being collected ubiquitously. How can it be used securely and effectively to make all of us healthier? This is the big data problem in health care.