This document discusses the characteristics of big data and the big data stack. It describes the evolution of data from the 1970s to today's large volumes of structured, unstructured and multimedia data. Big data is defined as data that is too large and complex for traditional data processing systems to handle. The document then outlines the challenges of big data and its characteristics: volume, velocity and variety. It also compares the typical data warehouse environment with the typical Hadoop environment. Finally, the five layers of the big data stack are described: redundant physical infrastructure, security infrastructure, operational databases, organizing data services and tools, and analytical data warehouses.
1. UNIT : II
Characteristics of Data
Composition: deals with the structure of data, i.e. the sources, types and nature of data.
Condition: deals with the state of data.
Context: deals with the generation of data and the sensitivity of data.
2. Evolution of Big Data
In the 1970s: data was essentially primitive and structured.
In the 1980s and 1990s: relational databases evolved, so this was the era of data-intensive applications.
In 2000 and beyond: the WWW and IoT have led to structured, unstructured and multimedia data.
3. Big Data
How do we define Big Data?
It's anything beyond imagination.
Today's BIG may be tomorrow's NORMAL.
Terabytes, petabytes or zettabytes of data.
It's about the 3Vs.
4. Big Data
In 2001, industry analyst Doug Laney defined "Big Data" by the three Vs (3Vs): Volume, Velocity and Variety.
In 2012, Gartner updated this definition: "Big Data" is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Big data is an evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information.
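The 3Vs definition above can be sketched as a toy checklist. Note that the `Dataset` type, the `is_big_data` helper and the threshold values below are invented purely for illustration; there are no standard numeric cut-offs for what counts as "big".

```python
# Toy sketch of the 3Vs as a checklist; thresholds are hypothetical
# illustrations, not industry-standard cut-offs.
from dataclasses import dataclass

@dataclass
class Dataset:
    size_tb: float        # Volume: total size in terabytes
    events_per_sec: int   # Velocity: rate of arriving records
    formats: set          # Variety: e.g. {"relational", "log", "video"}

def is_big_data(ds: Dataset) -> bool:
    """Flag a data set that is extreme on any one of the 3Vs."""
    high_volume = ds.size_tb >= 100            # hypothetical threshold
    high_velocity = ds.events_per_sec >= 10_000
    high_variety = len(ds.formats) >= 3        # structured + semi + unstructured
    return high_volume or high_velocity or high_variety

clickstream = Dataset(size_tb=250.0, events_per_sec=50_000,
                      formats={"log", "json", "image"})
print(is_big_data(clickstream))  # True: extreme on all three Vs
```

The point of the sketch is Laney's "or": a data set qualifies by being extreme on any single V, which is why "today's BIG may be tomorrow's NORMAL" as the thresholds shift.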
5. Challenges with Big Data
Capture
Storage
Curation
Search
Analysis
Transfer
Visualization
Privacy
6. Characteristics of Big Data
Big data is described by three characteristics:
Extremely large Volume of data
Extremely high Velocity of data
Extremely wide Variety of data
8. Other characteristics of data which are not definitional for Big Data
Veracity and Validity: deal with abnormality, accuracy and correctness.
Volatility: deals with how long the data remains valid.
Variability: deals with data flow, which is highly inconsistent.
9. Why Big Data?
More data
More accurate analysis
More confidence in decision making
Impact in terms of enhancing operational efficiency, reducing cost and time, and innovating new products, new services, optimized offerings, etc.
10. Are we only consumers, or information producers too?
Consider one scenario:
11. 1. A text message inviting you to a party.
2. Use of a credit/debit card at the petrol pump.
3. A point-of-sale system at Archie's shop.
4. Photographs and posts on social networking sites.
5. Likes and comments on your posts.
12. BI Versus Big Data
Business Intelligence (BI)
1. All of the enterprise's data is housed in a central server.
2. A typical database server scales data vertically.
3. BI data is analyzed in an offline mode.
4. BI is about structured data.
5. Move data to code.
Big Data
1. Data resides in a distributed file system.
2. A distributed file system scales data horizontally.
3. Big Data is analyzed in both real-time and offline modes.
4. Big Data is about a variety of data.
5. Move code to data.
13. Typical Data Warehouse Environment
Sources: ERP (Enterprise Resource Planning), CRM (Customer Relationship Management), third-party apps and legacy systems.
These sources feed the Data Warehouse, which supports reporting/dashboarding, OLAP, ad hoc querying and modeling.
14. Typical Hadoop Environment
Sources: web logs, images and videos, docs and PDFs, and social media.
These sources feed HDFS, where Hadoop MapReduce processes the data; the results flow to operational systems, the data warehouse, data marts and the ODS (Operational Data Store).
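The MapReduce step in this environment can be imitated in miniature. The sketch below mimics only the map and reduce phases of the classic word-count example, running in a single process; real Hadoop would distribute the map tasks to the cluster nodes holding the data (moving code to data). The function names and documents are made up for illustration.

```python
# Minimal in-process imitation of a MapReduce word count.
from collections import defaultdict

def map_phase(docs):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in docs:
        for word in doc.lower().split():
            yield word, 1

def reduce_phase(pairs):
    """Shuffle + reduce: group the pairs by key and sum the counts."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big data is big", "hadoop processes big data"]
print(reduce_phase(map_phase(docs)))
# {'big': 3, 'data': 2, 'is': 1, 'hadoop': 1, 'processes': 1}
```

Because each map call touches only its own document and each reduce key is independent, both phases parallelize naturally across commodity machines, which is the core idea behind the Hadoop environment on this slide.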
15. Functional Requirements of Big Data
Big data moves through four stages: (1) Collection, (2) Integration, (3) Analysis, and (4) Actions and Decisions.
16. Big Data Stack
The Big Data technical stack is explained as a layered architecture.
It is a way to think about Big Data.
It deals with:
– Storage
– Analytics
– Reporting
– Applications
18. Big Data Stack
Layer 0 (Redundant Physical Infrastructure):
Deals with hardware, the network and so on.
Performance: How responsive does the system need to be? Very fast infrastructure tends to be very expensive.
Availability: Do you need a 100% uptime guarantee of service? Highly available infrastructure is very expensive.
Scalability: How big does your infrastructure need to be? How much disk space is needed?
Flexibility: How quickly can you add more resources to the infrastructure?
Cost: What can you afford?
19. Big Data Stack
Layer 1 (Security Infrastructure):
Security and privacy requirements for big data are similar to the requirements for conventional data environments.
Data Access: Data should be available only to authorized persons.
Application Access: Most APIs offer protection from unauthorized usage or access.
Data Encryption: The most challenging aspect of security in a Big Data environment.
Threat Detection: The inclusion of mobile devices and social networks exponentially increases both the amount of data and the opportunities for security threats.
20. Big Data Stack
Layer 2 (Operational Databases):
A Big Data environment needs a fast and scalable database engine.
Using an RDBMS alone for Big Data is not a practical solution.
Choose the proper database.
Your database must support the ACID properties (Atomicity, Consistency, Isolation, Durability).
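The ACID requirement can be illustrated with Python's built-in sqlite3 module, which wraps statements in transactions. The accounts table and the transfer scenario below are hypothetical, and an operational database in a real Big Data deployment would be a far heavier engine; the point is only atomicity: either both halves of the transfer commit, or neither does.

```python
# Sketch of ACID atomicity using the standard-library sqlite3 module.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()

try:
    with conn:  # transaction: commits on success, rolls back on error
        conn.execute(
            "UPDATE accounts SET balance = balance - 50 WHERE name = 'alice'")
        raise RuntimeError("crash mid-transfer")  # simulate a failure
        conn.execute(
            "UPDATE accounts SET balance = balance + 50 WHERE name = 'bob'")
except RuntimeError:
    pass

# Atomicity: alice's debit was rolled back with the failed transfer.
print(conn.execute(
    "SELECT balance FROM accounts WHERE name = 'alice'").fetchone())  # (100,)
```

Many distributed NoSQL stores relax some of these guarantees for scalability, which is exactly why the slide says to choose the proper database for the workload.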
21. Big Data Stack
Layer 3 (Organizing Data Services and Tools):
Organizing data services and tools capture, validate and assemble various big data elements into contextually relevant collections.
Because big data is massive, tools need to provide integration, translation, normalization and scale.
Technologies in this layer include:
A distributed file system
Serialization services
Coordination services
Extract, Transform and Load (ETL) tools
Workflow services
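What the ETL tools in this layer automate can be sketched in miniature: pull raw records from a source, validate and normalize them, and load them somewhere analyzable. The CSV source, the record shape and the helper names below are invented for illustration; real ETL tools do this at cluster scale.

```python
# Toy Extract-Transform-Load pass over an invented clickstream source.
import csv
import io
import json

raw_csv = "user,clicks\nalice,3\nbob,\ncarol,7\n"  # raw source with a bad row

def extract(text):
    """Extract: parse the raw source into dict records."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with missing counts, cast types."""
    return [{"user": r["user"], "clicks": int(r["clicks"])}
            for r in rows if r["clicks"]]

def load(rows):
    """Load: stand-in for a warehouse load; serialize to JSON lines."""
    return "\n".join(json.dumps(r) for r in rows)

print(load(transform(extract(raw_csv))))
```

The validation step (dropping bob's incomplete row) is the "validate and assemble into contextually relevant collections" duty this slide assigns to the layer.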
22. Big Data Stack
Layer 4 (Analytical Data Warehouses):
The data warehouse and data marts contain normalized data gathered from a variety of sources and assembled to facilitate analysis of the business.
They support the creation of reports and the visualization of disparate data items.
23. Big Data Analytics:
It requires proper analytical tools.
This architecture lists three classes of tools:
Reporting and dashboards: these tools provide a "user-friendly" representation of information.
Visualization:
Analytics and Advanced Analytics: