Big data is a term for data sets so large and complex that traditional data processing applications are inadequate. Examples of big data sources include social media, jet engine sensor data, and stock exchange trade data. Big data has certain characteristics including volume, velocity, variety, and veracity. It can be structured, unstructured, or semi-structured. Big data analytics involves collecting and analyzing large data sets to find patterns and other useful information. Big data is applied in healthcare, education, e-commerce, media and entertainment, finance, and more.
Big data is data that is too large and complex for traditional data processing applications to analyze in a timely manner. It is characterized by high volume, velocity, and variety. Big data analytics involves collecting and analyzing large data sets to uncover hidden patterns, unknown correlations, market trends, and customer preferences that can help organizations make better business decisions. Some key applications of big data today include using it in healthcare to improve treatment outcomes, in education to personalize learning, in e-commerce to enhance customer experience, and in various other industries like media, finance, travel, telecom, and automobiles.
Big data is a large and complex collection of data that is difficult to process using traditional data management tools. It is characterized by its volume, variety, velocity, and variability. Examples of big data sources include social media data, sensor data from jet engines, stock exchange data, and more. Big data can be structured, unstructured, or semi-structured. Analyzing big data provides advantages like improved customer service, risk identification, and operational efficiency. However, big data also poses challenges around rapid data growth, storage, data security, and unreliable data.
Big Data refers to large, complex datasets that traditional data processing applications are unable to handle efficiently. Spark is a fast, general engine for large-scale data processing that supports multiple languages and data sources. Spark uses resilient distributed datasets (RDDs) that operate on data stored in cluster memory for faster performance compared to the disk-based MapReduce model. DataFrames provide a distributed collection of data organized into named columns similar to a relational database, enabling SQL-like queries and optimizations.
This document provides an overview of a course on fundamentals of big data. It outlines 5 course outcomes related to identifying big data characteristics, implementing Hadoop and MapReduce, analyzing structured and semi-structured data using Hive and Pig, using MongoDB for CRUD operations, and exploring visualization techniques. It also describes 2 course units, the first of which covers characteristics of big data like volume, velocity, and variety, as well as challenges. Traditional business intelligence is contrasted with big data approaches.
The document discusses big data, including the different units used to measure data size like bytes, kilobytes, megabytes, etc. It notes that big data is difficult to store and process using traditional tools due to its large size and complexity. Big data is growing rapidly in volume, velocity and variety. Some challenges in analyzing big data include its unstructured nature, size that exceeds capabilities of conventional tools, and need for real-time insights. Security, access control, data classification and performance impacts must be considered when protecting big data.
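The size units mentioned above can be made concrete with a short sketch. The helper name `human_size` and the choice of decimal (base-1000) units are assumptions made for illustration; neither comes from the summarized document.

```python
# Decimal (SI) unit names. Binary units (KiB, MiB, ...) would use base=1024 instead.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_size(num_bytes: int, base: int = 1000) -> str:
    """Format a byte count with the largest unit that keeps the value >= 1."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the last known unit.
        if value < base or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= base


print(human_size(1_500_000))  # 1.5 MB
```

Each step up the list multiplies the size by 1,000, so a terabyte is a million megabytes.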
Big Data Mining - Classification, Techniques and Issues | Karan Deep Singh
The document discusses big data mining and provides an overview of related concepts and techniques. It describes how big data is characterized by large volume, variety, and velocity of data that is difficult to manage with traditional methods. Common techniques for big data mining discussed include NoSQL databases, MapReduce, and Hadoop. Some challenges of big data mining are also mentioned, such as dealing with high volumes of unstructured data and limitations of traditional databases in handling diverse and continuously growing data sources.
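As a rough illustration of the MapReduce technique named above, here is a minimal single-process sketch of a word count. It imitates the map, shuffle, and reduce phases in plain Python and does not use Hadoop's API. All function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over an iterable of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["big data", "Big deal"]))  # {'big': 2, 'data': 1, 'deal': 1}
```

In a real cluster the map and reduce calls run in parallel on different machines and the shuffle moves data across the network. The sketch shows only the data flow.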
IRJET- Big Data Management and Growth Enhancement | IRJET Journal
1. The document discusses big data management and growth, including definitions of big data, properties of big data like volume, variety, and velocity, and applications of big data in various domains.
2. It describes how big data is used in education to improve student outcomes, in healthcare to enable prevention and more personalized care, and in industries like banking and fraud detection to enhance customer segmentation and risk assessment.
3. Big data analytics refers to analyzing large and complex datasets to extract useful insights and make better decisions. The document provides examples of machine learning and predictive analytics techniques used for big data analysis.
Big data refers to data sets that are too large to be processed using traditional data processing applications. It is characterized by high volume, variety, velocity, and variability. Examples of big data sources include social media, jet engines, stock exchanges, and more. Big data can be structured, unstructured, or semi-structured. Analyzing big data can provide benefits like improved customer service, better operational efficiency, and more informed decision making for organizations in various industries.
An Comprehensive Study of Big Data Environment and its Challenges. | ijceronline
Big Data is a data analysis methodology enabled by recent advances in technologies and architecture. Big data is a massive volume of both structured and unstructured data that is so large it is difficult to process with traditional database and software techniques. This paper provides insight into big data and discusses its nature and definition, including features such as volume, velocity, and variety. It also covers the sources of big data generation, the tools available for processing large volumes of varied data, the applications of big data, and the challenges involved in handling it.
Introduction to big data – convergences. | saranya270513
Big data is high-volume, high-velocity, and high-variety data that is too large for traditional databases to handle. The volume of data is growing exponentially due to more data sources like social media, sensors, and customer transactions. Data now streams in continuously in real-time rather than in batches. Data also comes in more varieties of structured and unstructured formats. Companies use big data to gain deeper insights into customers and optimize business processes like supply chains through predictive analytics.
This document provides information about big data analytics. It defines what data and big data are, explaining that big data refers to extremely large data sets that are difficult to process using traditional data management tools. It discusses the volume, variety, velocity, and veracity characteristics of big data. Examples of big data sources and sizes are provided, such as the terabytes of data generated each day by the New York Stock Exchange and Facebook. The document also covers structured, unstructured, and semi-structured data types; advantages of big data processing; and types of digital advertising.
This document discusses big data, including what it is, its characteristics, advantages, and challenges. Big data refers to extremely large data sets that cannot be processed with traditional data processing tools. It is characterized by its volume, variety, velocity, variability, and veracity. Big data has advantages in fields like predicting diseases and improving transportation safety. However, challenges include storing large amounts of data from various sources and processing it quickly. The document outlines tools used for big data like Hadoop and MongoDB and concludes that big data plays a vital role in today's world.
Big data is a collection of large and complex data sets that are difficult to process using traditional data processing applications. It is characterized by high volume, velocity, and variety of data. Big data is stored in data lakes and processed using technologies like Hadoop, Spark, and cloud platforms. While big data enables new insights and opportunities, it also presents challenges around data management, integration, and developing skills to work with diverse data types and systems.
Bda assignment can also be used for BDA notes and concept understanding. | Aditya205306
Big data refers to large and complex datasets that are difficult to analyze using traditional methods. It is characterized by high volume, velocity, and variety of data from numerous sources. Big data analytics uses tools like Hadoop and Spark to extract meaningful insights from large, unstructured datasets in real-time. This allows companies to gain valuable business insights, reduce costs, enhance customer experience, innovate products, and make faster decisions.
Overview of mit sloan case study on ge data and analytics initiative titled g... | Gregg Barrett
GE collects sensor data from industrial equipment to analyze equipment performance and predict failures. It created a "data lake" to integrate raw flight data from 3.4 million flights with other data sources. This allows data scientists to identify issues reducing equipment uptime for customers. However, GE faces challenges in finding qualified analytics talent and establishing effective data governance as it scales its data and analytics efforts.
1. Data refers to raw facts and statistics that are stored or transmitted, while information is data that has been organized and structured to be useful.
2. Big data is a large collection of data that cannot be processed by traditional data management tools due to its huge size and complexity.
3. Big data has three key characteristics - volume, referring to the large amount of data; velocity, referring to the speed at which new data is generated and collected; and variety, referring to the different types and sources of data.
This document provides an overview of big data, including its definition, size and growth, characteristics, analytics uses and challenges. It discusses operational vs analytical big data systems and technologies like NoSQL databases, Hadoop and MapReduce. Considerations for selecting big data technologies include whether they support online vs offline use cases, licensing models, community support, developer appeal, and enabling agility.
This document discusses data mining techniques for big data. It defines big data as large, complex collections of data from various sources that contain both structured and unstructured data. Big data is growing rapidly due to data from sources like social media, sensors, and digital content. Data mining can extract useful insights from big data by discovering patterns and relationships. The document outlines common data mining techniques like classification, prediction, clustering and association rule mining that can be applied to big data. It also discusses challenges of big data like its huge volume, variety of data types, and rapid growth that require new data management approaches.
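Of the techniques listed above, association rule mining reduces to two simple measures, support and confidence. The following is a minimal sketch. The function names and the shopping-basket data are hypothetical and are not taken from the summarized document.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from transaction counts."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical shopping baskets, used only to exercise the functions.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # 0.5
print(confidence(baskets, {"bread"}, {"milk"}))  # about 0.667
```

A rule such as "bread implies milk" is usually kept only if both measures exceed chosen thresholds. At big data scale the counting step is what gets distributed across machines.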
Intro to big data and applications - day 1 | Parviz Vakili
This document provides an overview and introduction to big data and its applications. It defines key concepts related to big data, including the five V's of big data (volume, velocity, variety, veracity, and value). It also discusses where big data comes from, different data types (structured, semi-structured, unstructured), and common applications of big data across different industries. Finally, it introduces concepts of data governance, data strategy, and how big data can support digital transformation.
The document discusses big data challenges faced by organizations. It identifies several key challenges: heterogeneity and incompleteness of data, issues of scale as data volumes increase, timeliness in processing large datasets, privacy concerns, and the need for human collaboration in analyzing data. The document describes surveying various organizations in Pakistan, including educational institutions, telecommunications companies, hospitals, and electrical utilities, to understand the big data problems they face. Common challenges included data errors, missing or incomplete data, lack of data management tools, and issues integrating different data sources. The survey found that while some organizations used big data tools, many educational institutions in particular did not, limiting their ability to effectively manage and analyze their large and growing datasets.
This document discusses big data analytics and how it is transforming business intelligence. It defines big data analytics as combining large datasets ("big data") with advanced analytic techniques. It describes big data using the three V's: volume, variety, and velocity. Volume refers to the large size of datasets. Variety means data comes from many different sources and formats. Velocity means data streams in continuously and in real-time. The document provides examples of how companies are using big data analytics to discover new insights and track changing customer behavior.
This document provides an overview of big data, including definitions, characteristics, and technologies. It defines big data as large datasets that cannot be processed by traditional databases due to size and complexity. It describes the key aspects of big data as volume, variety, velocity, and veracity. The document also discusses how big data differs from traditional transaction systems, the promise and challenges of big data, and Hadoop as a framework for distributed processing of big data.
Communications of the Association for Information SystemsV.docx | monicafrancis71118
Communications of the Association for Information Systems
Volume 34 Article 65
5-2014
Tutorial: Big Data Analytics: Concepts,
Technologies, and Applications
Hugh J. Watson
University of Georgia, [email protected]
Follow this and additional works at: http://aisel.aisnet.org/cais
This material is brought to you by the Journals at AIS Electronic Library (AISeL). It has been accepted for inclusion in Communications of the
Association for Information Systems by an authorized administrator of AIS Electronic Library (AISeL). For more information, please contact
[email protected]
Recommended Citation
Watson, Hugh J. (2014) "Tutorial: Big Data Analytics: Concepts, Technologies, and Applications," Communications of the Association
for Information Systems: Vol. 34, Article 65.
Available at: http://aisel.aisnet.org/cais/vol34/iss1/65
Tutorial: Big Data Analytics: Concepts, Technologies, and Applications
Hugh J. Watson
Department of MIS, University of Georgia
[email protected]
We have entered the big data era. Organizations are capturing, storing, and analyzing data that has high volume,
velocity, and variety and comes from a variety of new sources, including social media, machines, log files, video,
text, image, RFID, and GPS. These sources have strained the capabilities of traditional relational database
management systems and spawned a host of new technologies, approaches, and platforms. The potential value of
big data analytics is great and is clearly established by a growing number of studies. The keys to success with big
data analytics include a clear business need, strong committed sponsorship, alignment between the business and
IT strategies, a fact-based decision-making culture, a strong data infrastructure, the right analytical tools, and people
skilled in the use of analytics. Because of the paradigm shift in the kinds of data being analyzed and how this data is
used, big data can be considered to be a new, fourth generation of decision support data management. Though the
business value from big data is great, especially for online companies like Google and Facebook, how it is being
used is raising significant privacy concerns.
Keywords: big data, analytics, benefits, architecture, platforms, privacy
Big data is a huge collection of structured, semi-structured, and unstructured data that grows exponentially over time. It is too large and complex for traditional data management tools to process efficiently. Big data comes from a variety of sources and is characterized by its volume, variety, velocity, and variability. Tools like Hadoop, Spark, and Cassandra are used to analyze big data to provide businesses insights for improved decision making, customer service, and operational efficiency.
Big Data Analytics: Recent Achievements and New Challenges | Editor IJCATR
Big data is generated by everything around us at all times. Every digital process and social media
exchange produces it; systems, sensors, and mobile devices transmit it. Big data arrives from multiple
sources at an alarming velocity, volume, and variety. Extracting meaningful value from it requires optimal
processing power, analytics capabilities, and skills. Big data has become an important issue for many
research areas, such as data mining, machine learning, computational intelligence, information fusion,
the semantic Web, and social networks. The combination of big data technologies and traditional machine
learning algorithms has generated new and interesting challenges in other areas such as social media and
social networks. These challenges center on data processing, data storage, data representation, and how
data can be used for pattern mining, analysing user behaviours, and visualizing and tracking data, among
others. This paper discusses big data and data analytics: their concepts, tools, and methodologies
designed to allow efficient data mining and information fusion from social media, and the new
applications and frameworks currently appearing under the “umbrella” of the social networks, social media,
and big data paradigms.
Unit 1 Introduction to Data Analytics .pptx | vipulkondekar
The document provides an introduction to the concepts of data analytics including:
- It outlines the course outcomes for ET424.1 Data Analytics including discussing challenges in big data analytics and applying techniques for data analysis.
- It discusses what can be done with data including extracting knowledge from large datasets using techniques like analytics, data mining, machine learning, and more.
- It introduces concepts related to big data like the three V's of volume, variety and velocity as well as data science and common big data architectures like MapReduce and Hadoop.
An Encyclopedic Overview Of Big Data Analytics | Audrey Britton
This document provides an overview of big data analytics. It discusses the characteristics of big data, known as the 5 V's: volume, velocity, variety, veracity, and value. It describes how Hadoop has become the standard for storing and processing large datasets across clusters of servers. The challenges of big data are also summarized, such as dealing with the speed, scale, and inconsistencies of data from a variety of structured and unstructured sources.
Big data is high-volume, high-velocity, and high-variety data that is difficult to process using traditional data management tools. It is characterized by 3Vs: volume of data is growing exponentially, velocity as data streams in real-time, and variety as data comes from many different sources and formats. The document discusses big data analytics techniques to gain insights from large and complex datasets and provides examples of big data sources and applications.
This document discusses network security and provides an overview of common security threats and countermeasures. It defines security, explains why security is needed to protect information and resources, and identifies entities that are vulnerable to attacks. It then describes common attacks, such as denial of service, TCP attacks, packet sniffing, and social engineering, alongside defenses such as firewalls and intrusion detection systems. For each threat, it outlines associated countermeasures to mitigate risks and improve security.
Logistics and Managing Transportion.pptx | calf_ville86
Transportation is the backbone of logistics and accounts for 40-50% of total logistics costs. It facilitates the movement of goods and connects production facilities. The key modes of transportation are roadways, railways, waterways, airways, and pipelines. Choosing the right mode or combination depends on factors like type of goods, distance, costs etc. Effective transportation requires applying principles like economy of scale and distance to reduce costs. Containerization, network design, and route planning techniques further optimize the transportation system.
An Comprehensive Study of Big Data Environment and its Challenges.ijceronline
Big Data is a data analysis methodology enabled by recent advances in technologies and Architecture. Big data is a massive volume of both structured and unstructured data, which is so large that it's difficult to process with traditional database and software techniques. This paper provides insight to Big data and discusses its nature, definition that include such features as Volume, Velocity, and Variety .This paper also provides insight to source of big data generation, tools available for processing large volume of variety of data, applications of big data and challenges involved in handling big data
Introduction to big data – convergences.saranya270513
Big data is high-volume, high-velocity, and high-variety data that is too large for traditional databases to handle. The volume of data is growing exponentially due to more data sources like social media, sensors, and customer transactions. Data now streams in continuously in real-time rather than in batches. Data also comes in more varieties of structured and unstructured formats. Companies use big data to gain deeper insights into customers and optimize business processes like supply chains through predictive analytics.
This document provides information about big data analytics. It defines what data and big data are, explaining that big data refers to extremely large data sets that are difficult to process using traditional data management tools. It discusses the volume, variety, velocity, and veracity characteristics of big data. Examples of big data sources and sizes are provided, such as the terabytes of data generated each day by the New York Stock Exchange and Facebook. The document also covers structured, unstructured, and semi-structured data types; advantages of big data processing; and types of digital advertising.
This document discusses big data, including what it is, its characteristics, advantages, and challenges. Big data refers to extremely large data sets that cannot be processed with traditional data processing tools. It is characterized by its volume, variety, velocity, variability, and veracity. Big data has advantages in fields like predicting diseases and improving transportation safety. However, challenges include storing large amounts of data from various sources and processing it quickly. The document outlines tools used for big data like Hadoop and MongoDB and concludes that big data plays a vital role in today's world.
Big data is a collection of large and complex data sets that are difficult to process using traditional data processing applications. It is characterized by high volume, velocity, and variety of data. Big data is stored in data lakes and processed using technologies like Hadoop, Spark, and cloud platforms. While big data enables new insights and opportunities, it also presents challenges around data management, integration, and developing skills to work with diverse data types and systems.
Bda assignment can also be used for BDA notes and concept understanding.Aditya205306
Big data refers to large and complex datasets that are difficult to analyze using traditional methods. It is characterized by high volume, velocity, and variety of data from numerous sources. Big data analytics uses tools like Hadoop and Spark to extract meaningful insights from large, unstructured datasets in real-time. This allows companies to gain valuable business insights, reduce costs, enhance customer experience, innovate products, and make faster decisions.
Overview of mit sloan case study on ge data and analytics initiative titled g...Gregg Barrett
GE collects sensor data from industrial equipment to analyze equipment performance and predict failures. It created a "data lake" to integrate raw flight data from 3.4 million flights with other data sources. This allows data scientists to identify issues reducing equipment uptime for customers. However, GE faces challenges in finding qualified analytics talent and establishing effective data governance as it scales its data and analytics efforts.
1. Data refers to raw facts and statistics that are stored or transmitted, while information is data that has been organized and structured to be useful.
2. Big data is a large collection of data that cannot be processed by traditional data management tools due to its huge size and complexity.
3. Big data has three key characteristics - volume, referring to the large amount of data; velocity, referring to the speed at which new data is generated and collected; and variety, referring to the different types and sources of data.
This document provides an overview of big data, including its definition, size and growth, characteristics, analytics uses and challenges. It discusses operational vs analytical big data systems and technologies like NoSQL databases, Hadoop and MapReduce. Considerations for selecting big data technologies include whether they support online vs offline use cases, licensing models, community support, developer appeal, and enabling agility.
This document discusses data mining techniques for big data. It defines big data as large, complex collections of data from various sources that contain both structured and unstructured data. Big data is growing rapidly due to data from sources like social media, sensors, and digital content. Data mining can extract useful insights from big data by discovering patterns and relationships. The document outlines common data mining techniques like classification, prediction, clustering and association rule mining that can be applied to big data. It also discusses challenges of big data like its huge volume, variety of data types, and rapid growth that require new data management approaches.
Intro to big data and applications - day 1 (Parviz Vakili)
This document provides an overview and introduction to big data and its applications. It defines key concepts related to big data, including the five V's of big data (volume, velocity, variety, veracity, and value). It also discusses where big data comes from, different data types (structured, semi-structured, unstructured), and common applications of big data across different industries. Finally, it introduces concepts of data governance, data strategy, and how big data can support digital transformation.
The document discusses big data challenges faced by organizations. It identifies several key challenges: heterogeneity and incompleteness of data, issues of scale as data volumes increase, timeliness in processing large datasets, privacy concerns, and the need for human collaboration in analyzing data. The document describes surveying various organizations in Pakistan, including educational institutions, telecommunications companies, hospitals, and electrical utilities, to understand the big data problems they face. Common challenges included data errors, missing or incomplete data, lack of data management tools, and issues integrating different data sources. The survey found that while some organizations used big data tools, many educational institutions in particular did not, limiting their ability to effectively manage and analyze their large and growing datasets.
This document discusses big data analytics and how it is transforming business intelligence. It defines big data analytics as combining large datasets ("big data") with advanced analytic techniques. It describes big data using the three V's: volume, variety, and velocity. Volume refers to the large size of datasets. Variety means data comes from many different sources and formats. Velocity means data streams in continuously and in real-time. The document provides examples of how companies are using big data analytics to discover new insights and track changing customer behavior.
This document provides an overview of big data, including definitions, characteristics, and technologies. It defines big data as large datasets that cannot be processed by traditional databases due to size and complexity. It describes the key aspects of big data as volume, variety, velocity, and veracity. The document also discusses how big data differs from traditional transaction systems, the promise and challenges of big data, and Hadoop as a framework for distributed processing of big data.
Communications of the Association for Information SystemsV.docx (monicafrancis71118)
Communications of the Association for Information Systems
Volume 34 Article 65
5-2014
Tutorial: Big Data Analytics: Concepts,
Technologies, and Applications
Hugh J. Watson
University of Georgia, [email protected]
Recommended Citation
Watson, Hugh J. (2014) "Tutorial: Big Data Analytics: Concepts, Technologies, and Applications," Communications of the Association
for Information Systems: Vol. 34, Article 65.
Available at: http://aisel.aisnet.org/cais/vol34/iss1/65
Volume 34 Article 65
Tutorial: Big Data Analytics: Concepts, Technologies, and Applications
Hugh J. Watson
Department of MIS, University of Georgia
[email protected]
We have entered the big data era. Organizations are capturing, storing, and analyzing data that has high volume,
velocity, and variety and comes from a variety of new sources, including social media, machines, log files, video,
text, image, RFID, and GPS. These sources have strained the capabilities of traditional relational database
management systems and spawned a host of new technologies, approaches, and platforms. The potential value of
big data analytics is great and is clearly established by a growing number of studies. The keys to success with big
data analytics include a clear business need, strong committed sponsorship, alignment between the business and
IT strategies, a fact-based decision-making culture, a strong data infrastructure, the right analytical tools, and people
skilled in the use of analytics. Because of the paradigm shift in the kinds of data being analyzed and how this data is
used, big data can be considered to be a new, fourth generation of decision support data management. Though the
business value from big data is great, especially for online companies like Google and Facebook, how it is being
used is raising significant privacy concerns.
Keywords: big data, analytics, benefits, architecture, platforms, privacy
Big data is a huge collection of structured, semi-structured, and unstructured data that grows exponentially over time. It is too large and complex for traditional data management tools to process efficiently. Big data comes from a variety of sources and is characterized by its volume, variety, velocity, and variability. Tools like Hadoop, Spark, and Cassandra are used to analyze big data to provide businesses insights for improved decision making, customer service, and operational efficiency.
Big Data Analytics: Recent Achievements and New Challenges (Editor IJCATR)
Big data is being generated by everything around us at all times. Every digital process and social media
exchange produces it. Systems, sensors, and mobile devices transmit it. Big data arrives from multiple sources at an alarming
velocity, volume, and variety. To extract meaningful value from big data, you need optimal processing power, analytics
capabilities, and skills. Big data has become an important issue for a large number of research areas such as data mining,
machine learning, computational intelligence, information fusion, the semantic Web, and social networks. The combination of
big data technologies and traditional machine learning algorithms has generated new and interesting challenges in other areas
such as social media and social networks. These challenges center on problems such as data processing, data
storage, data representation, and how data can be used for pattern mining, analysing user behaviours, and visualizing and
tracking data, among others. This paper discusses the concepts of big data and data analytics, the tools
and methodologies designed to allow efficient data mining and information fusion from social media, and
the new applications and frameworks currently appearing under the “umbrella” of the social networks, social media,
and big data paradigms.
Unit 1 Introduction to Data Analytics .pptx (vipulkondekar)
The document provides an introduction to the concepts of data analytics including:
- It outlines the course outcomes for ET424.1 Data Analytics including discussing challenges in big data analytics and applying techniques for data analysis.
- It discusses what can be done with data including extracting knowledge from large datasets using techniques like analytics, data mining, machine learning, and more.
- It introduces concepts related to big data like the three V's of volume, variety and velocity as well as data science and common big data architectures like MapReduce and Hadoop.
An Encyclopedic Overview Of Big Data Analytics (Audrey Britton)
This document provides an overview of big data analytics. It discusses the characteristics of big data, known as the 5 V's: volume, velocity, variety, veracity, and value. It describes how Hadoop has become the standard for storing and processing large datasets across clusters of servers. The challenges of big data are also summarized, such as dealing with the speed, scale, and inconsistencies of data from a variety of structured and unstructured sources.
Big data is high-volume, high-velocity, and high-variety data that is difficult to process using traditional data management tools. It is characterized by 3Vs: volume of data is growing exponentially, velocity as data streams in real-time, and variety as data comes from many different sources and formats. The document discusses big data analytics techniques to gain insights from large and complex datasets and provides examples of big data sources and applications.
sybca-bigdata-ppt.pptx
1.
2. Introduction to Big Data
What is Data?
The quantities, characters, or symbols on which operations are performed by a computer,
which may be stored and transmitted in the form of electrical signals and recorded on
magnetic, optical, or mechanical recording media.
What is Big Data?
Big Data is data of enormous size. The term describes a
collection of data that is huge in volume and still growing exponentially with time. In
short, such data is so large and complex that no traditional data management
tool can store or process it efficiently.
“Extremely large data sets that may be analyzed computationally to reveal patterns,
trends, and associations, especially relating to human behavior and interaction, are
known as Big Data.”
3. Examples Of Big Data
The following are some examples of Big Data:
The New York Stock Exchange generates about one terabyte of new trade data per day.
4. Social Media
Statistics show that 500+ terabytes of new data are ingested into the databases of the social
media site Facebook every day. This data is mainly generated through photo and video
uploads, message exchanges, comments, etc.
A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time. With many
thousands of flights per day, data generation reaches many petabytes.
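The jet engine figure can be turned into a rough daily estimate. The sketch below is only illustrative: the 10 TB per 30 minutes rate comes from the slide, while the number of flights, the average flight length, and the two engines per aircraft are assumptions chosen for the example, not measured values.

```python
TB_PER_PB = 1_000  # decimal units: 1 petabyte = 1,000 terabytes


def daily_engine_data_pb(flights_per_day, avg_flight_minutes,
                         engines_per_aircraft=2, tb_per_engine_per_30_min=10):
    """Estimate sensor data generated per day, in petabytes.

    All arguments except the 10 TB / 30 min rate are hypothetical inputs.
    """
    tb_per_flight = (engines_per_aircraft
                     * tb_per_engine_per_30_min
                     * (avg_flight_minutes / 30))
    return flights_per_day * tb_per_flight / TB_PER_PB


# Assumed scenario: 5,000 flights per day, 2 hours each, 2 engines per aircraft.
print(daily_engine_data_pb(flights_per_day=5_000, avg_flight_minutes=120))
```

Under these assumed inputs the estimate comes to hundreds of petabytes per day, which is consistent with the slide's claim that the total reaches many petabytes.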
TWITTER
6. Characteristics Of Big Data
• The following are known as “Big Data Characteristics”.
1. Volume
2. Velocity
3. Variety
4. Veracity
1. Volume:
Volume means “How much Data is generated”. Now-a-days,
Organizations or Human Beings or Systems are generating or getting
very vast amount of Data say TB(Tera Bytes) to PB(Peta Bytes) to Exa
Byte(EB) and more.
7. 2. Velocity:
Velocity means “how fast data is produced”. Nowadays, organizations,
people, and systems generate huge amounts of data at a very fast rate.
3. Variety:
Variety means “different forms of data”. Nowadays, organizations,
people, and systems generate huge amounts of data at a very fast
rate and in many different formats. The different formats of data are
discussed under “Types of Digital Data”.
8. 4. Veracity:
Veracity means “the quality, correctness, or accuracy of captured data”.
Of the 4 Vs, it is the most important for any Big Data solution, because without
correct data there is no point in storing large amounts of data at a
fast rate and in different formats. The data must deliver correct business value.
9. Types of Digital Data
1. Structured
2. Unstructured
3. Semi-structured
Structured
Any data that can be stored, accessed, and processed in a fixed format is
termed 'structured' data.
Over time, computer science has had great success in
developing techniques for working with this kind of data (where the format is well
known in advance) and for deriving value from it.
However, issues now arise when the size of such data grows to a huge
extent, with typical sizes in the range of multiple zettabytes.
Do you know? 10^21 bytes equal 1 zettabyte; that is, one billion terabytes form a zettabyte.
Looking at these figures, one can easily understand why the name Big Data is
given and imagine the challenges involved in its storage and processing.
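The unit arithmetic above can be checked with a short sketch. It assumes decimal (SI) units, where each step is a factor of 1,000; the `UNITS` table and the `convert` helper are illustrative names, not part of any library.

```python
# Decimal (SI) storage units expressed in bytes (assumption: powers of ten,
# not the binary powers of 1024 that some tools report).
UNITS = {
    "byte": 1,
    "kilobyte": 10**3,
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount of data from one unit to another."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


if __name__ == "__main__":
    # One zettabyte expressed in terabytes: one billion.
    print(convert(1, "zettabyte", "terabyte"))
    # The 500 terabytes per day quoted for Facebook, expressed in petabytes.
    print(convert(500, "terabyte", "petabyte"))
```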
10. Do you know? Data stored in a relational database management system is one
example of 'structured' data.
• Examples Of Structured Data
An 'Employee' table in a database is an example of Structured Data
Employee_ID  Employee_Name    Gender  Department  Salary_In_lacs
2365         Rajesh Kulkarni  Male    Finance     650000
3398         Pratibha Joshi   Female  Admin       650000
7465         Shushil Roy      Male    Admin       500000
7500         Shubhojit Das    Male    Finance     500000
7699         Priya Sane       Female  Finance     550000
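Because the format of this table is known in advance, it can be loaded into a relational database and queried directly. The following is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database; the table and column names are lower-cased versions of the slide's headers, and the `salary` column simply holds the figures as given in the slide.

```python
import sqlite3

# Rows copied from the 'Employee' table above.
EMPLOYEES = [
    (2365, "Rajesh Kulkarni", "Male", "Finance", 650000),
    (3398, "Pratibha Joshi", "Female", "Admin", 650000),
    (7465, "Shushil Roy", "Male", "Admin", 500000),
    (7500, "Shubhojit Das", "Male", "Finance", 500000),
    (7699, "Priya Sane", "Female", "Finance", 550000),
]

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        employee_name TEXT,
        gender        TEXT,
        department    TEXT,
        salary        INTEGER
    )
    """
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", EMPLOYEES)

# The fixed schema lets us ask precise questions, e.g. average salary per department.
avg_by_dept = dict(
    conn.execute("SELECT department, AVG(salary) FROM employee GROUP BY department")
)

if __name__ == "__main__":
    for department, average in sorted(avg_by_dept.items()):
        print(department, round(average, 2))
```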
11. Unstructured
Any data whose form or structure is unknown is classified as unstructured data.
In addition to its huge size, unstructured data poses multiple challenges in terms
of processing it to derive value.
A typical example of unstructured data is a heterogeneous data source containing a
combination of simple text files, images, videos, etc.
Nowadays, organizations have a wealth of data available to them, but unfortunately they
do not know how to derive value from it because the data is in a raw or unstructured
format.
• Examples Of Un-structured Data
The output returned by 'Google Search'
12. Semi-structured
Semi-structured data can contain both forms of data.
Semi-structured data appears structured in form, but it is not actually defined
by, for example, a table definition in a relational DBMS.
An example of semi-structured data is data represented in an XML file.
Examples Of Semi-structured Data
Personal data stored in an XML file:
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
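The records above carry their own tags, so they can be read without a predefined table. The sketch below parses them with Python's standard `xml.etree.ElementTree` module. One assumption: the slide shows bare `<rec>` elements, so the code wraps them in a hypothetical `<people>` root element to form a well-formed XML document.

```python
import xml.etree.ElementTree as ET

# The five records from the slide, copied verbatim.
RECORDS = """
<rec><name>Prashant Rao</name><sex>Male</sex><age>35</age></rec>
<rec><name>Seema R.</name><sex>Female</sex><age>41</age></rec>
<rec><name>Satish Mane</name><sex>Male</sex><age>29</age></rec>
<rec><name>Subrato Roy</name><sex>Male</sex><age>26</age></rec>
<rec><name>Jeremiah J.</name><sex>Male</sex><age>35</age></rec>
"""

# XML needs a single root element; <people> is an assumed wrapper name.
root = ET.fromstring(f"<people>{RECORDS}</people>")

# Each record becomes a dict keyed by whatever tags it happens to contain.
people = [{field.tag: field.text for field in rec} for rec in root.findall("rec")]

# Values arrive as text, so numeric fields must be converted explicitly.
ages = [int(person["age"]) for person in people]
average_age = sum(ages) / len(ages)

if __name__ == "__main__":
    for person in people:
        print(person)
    print("average age:", average_age)
```

Unlike the relational example, nothing here declares in advance which fields a record must have; the structure is discovered from the tags while parsing.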
13. Big Data Analytics
Big Data Analytics:
Big Data analytics is the process of collecting, organizing and analyzing
large sets of data (called Big Data) to discover patterns and other useful
information.
Big Data analytics can help organizations to better understand the
information contained within the data and will also help identify the data
that is most important to the business and future business decisions.
Analysts working with Big Data typically want the knowledge that comes
from analyzing the data.
14. High-Performance Analytics Required:
To analyze such a large volume of data, Big Data analytics is typically
performed using specialized software tools and applications for predictive
analytics, data mining, text mining, forecasting and data optimization.
Collectively these processes are separate but highly integrated functions of
high-performance analytics.
Using Big Data tools and software enables an organization to process extremely
large volumes of data that a business has collected to determine which data is
relevant and can be analyzed to drive better business decisions in the future.
15. The Challenges:
For most organizations, Big Data analysis is a challenge. Consider the sheer
volume of data and the different formats of the
data (both structured and unstructured) collected across the entire
organization, and the many ways in which different types of data can be
combined, contrasted, and analyzed to find patterns and other useful business
information.
The first challenge is in breaking down data silos to access all data an
organization stores in different places and often in different systems.
A second challenge is in creating platforms that can pull in unstructured data as
easily as structured data.
This massive volume of data is typically so large that it's difficult to process
using traditional database and software methods.
16. How Big Data Analytics is Used Today:
As the technology that helps an organization to break down data silos and analyze
data improves, business can be transformed in all sorts of ways.
Today's advances in analyzing big data allow researchers to decode human DNA in
minutes, predict where terrorists plan to attack, determine which gene is most likely
to be responsible for certain diseases and, of course, which ads you are most likely to
respond to on Facebook.
Another example comes from one of the biggest mobile carriers in the world.
France's Orange launched its Data for Development project by releasing subscriber
data for customers in the Ivory Coast.
The 2.5 billion records, which were made anonymous, included details on calls and
text messages exchanged between 5 million users.
Researchers accessed the data and sent Orange proposals for how the data could serve
as the foundation for development projects to improve public health and safety.
Proposed projects included one that showed how to improve public safety by tracking
cell phone data to map where people went after emergencies; another showed how to
use cellular data for disease containment.
17. The Benefits of Big Data Analytics:
Enterprises are increasingly looking to find actionable insights into their
data. Many big data projects originate from the need to answer specific
business questions. With the right big data analytics platforms in place, an
enterprise can boost sales, increase efficiency, and improve operations,
customer service and risk management.
Webopedia's parent company, QuinStreet, surveyed 540 enterprise decision-makers
involved in big data purchases to learn in which business areas
companies plan to use Big Data analytics to improve operations. About half
of all respondents said they were applying big data analytics to improve
customer retention, help with product development, and gain a competitive
advantage.
Notably, the business area getting the most attention relates to increasing
efficiency and optimizing operations. Specifically, 62 percent of respondents
said that they use big data analytics to improve speed and reduce complexity.
19. Here is the list of top Big Data applications in today’s world:
Big Data in Healthcare
Big Data in Education
Big Data in E-commerce
Big Data in Media and Entertainment
Big Data in Finance
Big Data in Travel Industry
Big Data in Telecom
Big Data in Automobile
20. Let’s discuss the applications of Big Data in detail.
1. Big Data in Retail
The retail industry faces some of the fiercest competition of all. Retailers
constantly hunt for ways to gain a competitive edge over others.
The saying “the customer is king” holds especially true for the retail industry.
For retailers to thrive in this competitive world, they need to understand their
customers better. If they know their customers' needs and how to
fulfill those needs in the best possible way, they have what they need to succeed.
Through advanced analysis of their customers' data, retailers are now able to
understand them from every angle possible. They gather this data from various
sources such as social media, loyalty programs, etc.
21. Even a minute detail about a customer has now become significant for retailers. They are
now closer to their customers than they have ever been. This empowers them to provide
customers with more personalized services and to predict their demands in advance.
This helps them build a loyal customer base. Some of the biggest names in the retail
world, like Walmart, Sears Holdings, Costco, Walgreens, and many more, now have Big
Data as an integral part of their organizations.
A study by the National Retail Federation estimated that sales in November and December
are responsible for as much as 30% of annual retail sales.
22. 2. Big Data in Healthcare
Big Data and healthcare are an ideal match. Big Data suits the healthcare industry
particularly well because the amount of data the industry has to deal with is
enormous.
Gone are the days when healthcare practitioners were incapable of harnessing this data.
From cancer research to detecting Ebola and much more, Big Data has been put to work,
and researchers have seen some life-saving outcomes through it.
Big Data and analytics have enabled practitioners to build more personalized
medications. Data analysts are harnessing this data to develop more effective
treatments. Identifying unusual patterns in the use of certain medicines to discover
more economical solutions is common practice these days.
23. Smart wearables have gradually gained popularity and are the latest trend among
people of all age groups. They generate massive amounts of real-time data in the
form of alerts, which can help save lives.
24. 3. Big Data in Education
When you ask people about the use of the data that an educational institute gathers, the
majority will give the same answer: the institute or the student might
need it for future reference.
Perhaps you had the same perception of this data. But in fact, this data
holds enormous importance. Big Data is key to shaping people's futures and
has the power to transform the education system for the better.
Some of the top universities are using Big Data as a tool to renovate their academic
curriculum. Additionally, universities can track student dropout rates
and take the required measures to reduce them as much as possible.
25. 4. Big Data in E-commerce
One of the greatest revolutions this generation has seen is that of E-commerce. It is now part
and parcel of our daily life. Whenever we need to buy something, the first thought that
comes to mind is E-commerce. And, not surprisingly, Big Data has been the face of it.
That some of the biggest E-commerce companies in the world, like Amazon, Flipkart, Alibaba, and
many more, now depend on Big Data and analytics is itself evidence of the level of
popularity Big Data has gained in recent times.
Big Data is now as important as any other asset in these organizations. Amazon, the biggest
E-commerce firm in the world and one of the pioneers of Big Data and analytics, has Big Data as
the backbone of its system. Flipkart, the biggest E-commerce firm in India, has one of the most
robust data platforms in the country.
The recommendation engine is one of the most striking applications the Big Data world
has produced. It gives companies a 360-degree view of their customers.
Companies then make suggestions to customers accordingly. Customers now experience more
personalized services than they have ever had. Big Data has completely redefined people's
online shopping experiences.
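To make the idea of a recommendation engine concrete, here is a minimal sketch of one simple approach: item co-occurrence counting ("customers who bought X also bought Y"). It is not how any particular retailer's system works; the baskets are made-up sample data, and `build_cooccurrence` and `recommend` are illustrative helper names.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    cooccurrence = {}
    for basket in baskets:
        # Deduplicate and sort so each unordered pair is counted once per basket.
        for a, b in combinations(sorted(set(basket)), 2):
            cooccurrence.setdefault(a, Counter())[b] += 1
            cooccurrence.setdefault(b, Counter())[a] += 1
    return cooccurrence


def recommend(cooccurrence, item, k=2):
    """Return up to k items most often bought together with `item`."""
    return [other for other, _ in cooccurrence.get(item, Counter()).most_common(k)]


# Hypothetical purchase histories, one list per order.
SAMPLE_BASKETS = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["laptop", "mouse"],
    ["phone", "charger", "earbuds"],
    ["phone", "case", "earbuds"],
]

if __name__ == "__main__":
    counts = build_cooccurrence(SAMPLE_BASKETS)
    print(recommend(counts, "phone", k=1))
    print(recommend(counts, "laptop"))
```

Production systems typically combine many more signals (browsing history, ratings, time of purchase), but the core step of turning past behavior into ranked suggestions is the same.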
26. 5. Big Data in Media and Entertainment
The media and entertainment industry is all about art, and employing Big Data in it is
itself a piece of art. Art and science are often considered two completely
contrasting domains, but when employed together they make a powerful duo, and Big
Data's endeavors in the media industry are a perfect example of it.
Viewers these days want content that matches their preferences and that is
relatively new compared with what they saw last time. Earlier, companies
broadcast ads randomly without any kind of analysis.
But since the advent of Big Data analytics in the industry, companies are now
aware of the kinds of ads that attract a customer and the most appropriate time to
broadcast them for maximum attention.
Customers are now the real heroes of the media and entertainment industry,
courtesy of Big Data and Analytics.
27. 6. Big Data in Finance
The functioning of any financial organization depends heavily on its data, and safeguarding that
data is one of the toughest challenges any financial firm faces. Data has been the second most
important commodity for them after money.
Even before Big Data gained popularity, the finance industry was already a leader in the
technical field. In addition, financial firms were among the earliest adopters of Big Data
and Analytics.
Digital banking and payments are two of the most popular buzzwords around, and Big Data
has been at the heart of both. Big Data now drives key areas of financial firms such as fraud
detection, risk analysis, algorithmic trading, and customer satisfaction.
This has brought much-needed fluency to their systems. They are now empowered to focus
more on providing better services to their customers rather than focusing on security issues.
Big Data has provided the financial system with answers to its hardest challenges.
28. 7. Big Data in Travel Industry
While Big Data has been spreading like wildfire and various industries have been benefiting
from it, the travel industry was a bit late to realize its worth. Better late than never, though.
A stress-free traveling experience is still like a daydream for many.
Now Big Data's arrival is like a ray of hope that will mark the departure of the
hindrances to a smooth traveling experience.
Through Big Data and analytics, travel companies are now able to offer a more
customized traveling experience. They are now able to understand their customers'
requirements much better.
From providing travelers with the best offers to making suggestions in real time,
Big Data is certainly a perfect guide for any traveler. Big Data is gradually taking the
window seat in the travel industry.
29. 8. Big Data in Telecom
The telecom industry is the soul of every digital revolution that takes place around the world.
The ever-increasing popularity of smartphones has flooded the telecom industry with
massive amounts of data.
This data is like a goldmine; telecom companies just need to know how to dig it properly.
Through Big Data and analytics, companies are able to provide customers with smooth
connectivity, removing the network barriers that customers have to deal with.
With the help of Big Data and analytics, companies can now track the areas with the lowest as
well as the highest network traffic and take the necessary steps to ensure hassle-free network
connectivity.
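Tracking the areas with the highest and lowest network traffic is, at its core, an aggregation over usage records. A minimal sketch, assuming each record is a hypothetical `(area, megabytes)` pair; real telecom data would be far larger and processed on a distributed platform, but the logic is the same.

```python
from collections import defaultdict


def traffic_by_area(records):
    """Sum the traffic volume (in megabytes) recorded for each area."""
    totals = defaultdict(int)
    for area, megabytes in records:
        totals[area] += megabytes
    return dict(totals)


def busiest_and_quietest(totals):
    """Return the areas with the highest and the lowest total traffic."""
    busiest = max(totals, key=totals.get)
    quietest = min(totals, key=totals.get)
    return busiest, quietest


# Made-up usage records: (area name, megabytes transferred).
SAMPLE_RECORDS = [
    ("north", 120),
    ("south", 40),
    ("north", 300),
    ("east", 75),
    ("south", 10),
]

if __name__ == "__main__":
    totals = traffic_by_area(SAMPLE_RECORDS)
    print(totals)
    print(busiest_and_quietest(totals))
```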
As in other industries, Big Data has helped the telecom industry to understand its customers
well.
Telecom companies now provide customers with offers that are as customized as possible.
Big Data has been behind the data revolution we are currently experiencing.
30. 9. Big Data in Automobile
“A business, like an automobile, has to be driven in order to get results.” (B.C. Forbes)
Big Data has now taken the wheel of the automobile industry and is driving it
smoothly, towards results that were previously hard to imagine.
The automobile industry is on a roll, and Big Data is its wheels; one might even say Big Data
has given it wings. Big Data has helped the automobile industry achieve things that were
beyond our imagination.
From analyzing trends to understanding supply chain management, from taking care
of customers to turning the dream of connected cars into a reality, Big Data is well
and truly driving the automobile industry forward.