Discussion Board 1 – 2
Within the Discussion Board area, write 400-600 words that
respond to the following questions with your thoughts, ideas,
and comments. This will be the foundation for future
discussions by your classmates. Be substantive and clear, and
use examples to reinforce your ideas.
The architecture of Web 1.0 consists of the following three
components (Jacobs & Walsh, 2004):
· Web resources identification: Uniform Resource Identifier
(URI)
· Interaction protocol: HyperText Transfer Protocol (HTTP)
· Data formats: HyperText Markup Language (HTML)
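As a small illustration of how the three components fit together, the sketch below uses only the Python standard library. The URI, the request text, and the HTML snippet are made-up examples and are not taken from the reading; no network access is performed.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# URI: a hypothetical identifier, split into its parts.
uri = "http://www.example.com/products/index.html?page=2"
parts = urlsplit(uri)

# HTTP: the text of a minimal GET request a client could send for that URI.
target = parts.path + ("?" + parts.query if parts.query else "")
request = f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# HTML: a made-up response body; each link it contains is another URI.
html_body = '<html><body><a href="/a.html">A</a> <a href="/b.html">B</a></body></html>'
collector = LinkCollector()
collector.feed(html_body)

print(parts.scheme, parts.netloc, parts.path)
print(request.splitlines()[0])
print(collector.links)
```

Every link found in one HTML document is a further URI that can be requested over HTTP, which is one way to think about how the three components together let the reachable data keep growing.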
Over the last 25 years, the Web has experienced several
evolutions, which have been called Web 1.0, Web 2.0, Web 3.0,
Web 4.0, and Web 5.0. Each of the evolutions has brought in
more types of data sources, along with more advanced
functional capability to the Internet infrastructure to make the
Web the central place to see the convergence of many existing
and new technologies. These new capabilities, in turn, support
many innovative business processes and practices through
the Web. Therefore, it is important to know the basic concepts
and applications of the Web, starting from its first generation.
Knowing the root of the Web technology will help you to
understand the reasons and consequences of the current and
future changes to the Web technology, as well as the challenges
of accessing the ever-growing Web data.
Complete the reading assignment, and search the Library and
Internet to find and study at least 2 more references that discuss
the concepts and applications of the Web. Based on the results
of your research, discuss the following questions:
· What role has each of the 3 components of the architecture of
Web 1.0 (URI, HTTP, and HTML) played in making the Web
one of the main sources of ever-growing big data?
· What will be the trend in terms of "performance bottleneck" to
access large-scale Web data as the Web technology evolves?
· Justify your point of view, and provide examples as necessary.
Unit 2 – 1
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
As the core component of Web 4.0, the Internet of Things (IoT)
has become a reality after many years of development. Distinct
from all previous generations of the Web, in which all the data are
generated by people, Web 4.0 data are generated by both
humans and embedded computing devices (Atzori, 2010). The
number of sources of Web data has greatly increased
because billions of uniquely identifiable embedded
computing devices are connected through the Internet
infrastructure and various types of wireless networks. Because
most IoT devices have only limited computing resources,
they play the role of raw data collectors and initial data
preprocessors. These devices have to send the lower-level data to
various data processing centers, where computers with more
powerful computing resources perform the heavier-duty tasks. The
IoT-based Web 4.0 has not only increased the data growth rate,
but has also shifted the performance bottlenecks of accessing Web
data to many new places in the Internet infrastructure. It is very
important to fully understand where these new performance
bottlenecks are and the root causes of their existence so that you
can more effectively manage your computing resources when
accessing various types of Web data for your large-scale Web
data-based applications.
Complete the reading assignment, and search the Library and
Internet to find and study at least 2 references that discuss the
concepts and applications of the IoT. Based on the results of
your research, discuss the following questions:
· Where will the new performance bottlenecks be when
accessing large-scale Web data generated by IoT?
· What is the new challenge for developing an indexing scheme
used to assist accessing large-scale Web data generated by IoT?
· Justify your point of view and provide examples as necessary.
Unit 3 – 1
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
MapReduce was originally developed for cost-efficient use of
large clusters of commodity computers to achieve scalable and
reliable data processing. It consistently applies two simple but
powerful functions—Map and Reduce—in parallel. Along
with Hadoop, which is an open-source implementation of
MapReduce, MapReduce has become one of the most popular
and practical technical solutions to deal with big data analytic
tasks. However, like any technical solution, the initial
MapReduce and Hadoop also have quite a few weaknesses when
applied to handle certain types of data processing applications.
Therefore, there is a need to thoroughly study the basic
concepts of MapReduce and its Hadoop implementation to fully
understand their pros and cons so that when applying them in
big data analytic tasks, you will be able to make the right
decisions and achieve the desired results.
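To make the two functions concrete, here is a minimal single-machine sketch of the Map, shuffle, and Reduce steps for a word count. It is an illustration only and does not use the Hadoop API; the function names map_fn, shuffle, reduce_fn, and run_job are hypothetical.

```python
from collections import defaultdict


def map_fn(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce: combine all values that share one key."""
    return key, sum(values)


def run_job(documents):
    intermediate = []
    for doc in documents:  # on a cluster, records could be mapped in parallel
        intermediate.extend(map_fn(doc))
    grouped = shuffle(intermediate)
    return dict(reduce_fn(k, v) for k, v in grouped.items())


counts = run_job(["the web grows", "the data grows"])
print(counts)
```

Because each call to map_fn depends only on its own record and each call to reduce_fn only on its own key, both phases can in principle be spread over many machines.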
Complete the reading assignment, and search the Library and
Internet to find and study more references that discuss the
concepts and applications of MapReduce and Hadoop as needed.
Based on the results of your research, discuss the following
questions:
· What are the basic concepts of MapReduce?
· What are the top 3 features of Hadoop?
· What are the pros and cons of MapReduce?
Justify your point of view and provide examples, as necessary.
===============================================
=====================================
Unit 3 – 2
Within the Discussion Board area, write 400-600 words that
respond to the following questions with your thoughts, ideas,
and comments. This will be the foundation for future
discussions by your classmates. Be substantive and clear, and
use examples to reinforce your ideas.
Many data analytic tasks in commonly used Web applications,
such as page ranking and social network analysis, are processed
iteratively until the computation meets the given condition.
However, the original MapReduce framework does not support
iterative computation directly. Iterative tasks have to be
developed manually in a separate program that uses
multiple MapReduce jobs to emulate the iteration process. The
unchanged data from the previous iteration are reloaded and
reprocessed in the next iteration. This approach incurs
a performance penalty on computing resources because it does
not exploit the fact that most of the data are unchanged
between iterations and therefore do not need to be reloaded and
reprocessed in subsequent iterations. Another
problem with the manual approach is that it depends on
detecting the termination condition at each iteration. This
requires an extra MapReduce job, which adds scheduling
overhead, I/O, and network traffic. Obviously, a
better solution is required to address these performance
penalties.
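The manual approach described above can be sketched as a driver loop. This is a single-machine simulation under stated assumptions: the link graph, the simplified PageRank-style update, and the names load_static_data, rank_job, and convergence_job are all hypothetical, and plain function calls stand in for MapReduce jobs. The counters show that the unchanged data are loaded once per iteration and that one extra check runs per iteration.

```python
def load_static_data(stats):
    """Stands in for reading unchanged input (the link graph) from storage."""
    stats["static_loads"] += 1
    return {"a": ["b"], "b": ["a", "c"], "c": ["a"]}  # hypothetical graph


def rank_job(links, ranks, damping=0.85):
    """One simplified PageRank-style update, standing in for one job."""
    n = len(links)
    new_ranks = {page: (1.0 - damping) / n for page in links}
    for page, targets in links.items():
        share = ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += damping * share
    return new_ranks


def convergence_job(old, new, stats):
    """Stands in for the extra job that checks the termination condition."""
    stats["check_jobs"] += 1
    return max(abs(new[p] - old[p]) for p in new)


def run_iterations(tolerance=1e-6, max_iterations=100):
    stats = {"static_loads": 0, "check_jobs": 0, "iterations": 0}
    ranks = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
    for _ in range(max_iterations):
        links = load_static_data(stats)  # reloaded although it never changes
        new_ranks = rank_job(links, ranks)
        stats["iterations"] += 1
        delta = convergence_job(ranks, new_ranks, stats)
        ranks = new_ranks
        if delta < tolerance:
            break
    return ranks, stats


ranks, stats = run_iterations()
print(stats)
```

In this sketch the number of static loads and the number of check jobs both equal the number of iterations, which is the overhead the paragraph above refers to.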
Complete the reading assignment, and search the Library and
Internet to find and study more references that discuss how to
address the weakness of MapReduce and Hadoop on iterative
computation. Based on the results of your research, discuss the
following questions:
· What are the weaknesses of the initial MapReduce framework
in iterative computation?
· What are the root causes of the weakness?
· What are the key technical steps to solve the weakness?
Justify your point of view and provide examples, as necessary.
===============================================
=====================================
Unit 4 – 1
Within the Discussion Board area, write 400-600 words that
respond to the following questions with your thoughts, ideas,
and comments. This will be the foundation for future
discussions by your classmates. Be substantive and clear, and
use examples to reinforce your ideas.
Most data analytic tasks in commonly used large-scale Web
data processing applications, such as Web crawls and Web page
indexing, are not iterative but incremental. The application
usually runs one time as needed. However, incremental
computation on most large-scale Web data shares a common
characteristic: most of the data do not change
between two runs. This is an obvious opportunity to
improve the data processing performance of MapReduce and its
Hadoop implementation because their design and development
did not take this data characteristic into account. For example, if
99% of a large-scale data set is unchanged and if there is a
method to allow the MapReduce-based Web application to reuse
that data directly in the next run without reprocessing, the data
processing performance on this data set will increase greatly.
Therefore, it is very important to acquire the knowledge and
skills on how to achieve this process with existing MapReduce
and Hadoop.
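One way to picture the reuse idea is a result cache that sits outside the processing step, keyed by a hash of each record's content. The sketch below is an assumption made for illustration, not the method from the reading and not a Hadoop feature; expensive_map, fingerprint, and incremental_run are hypothetical names.

```python
import hashlib


def expensive_map(record):
    """Stands in for costly per-record processing (here: count words)."""
    return len(record.split())


def fingerprint(record):
    """Content hash used to recognise a record that has not changed."""
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def incremental_run(records, cache):
    """Process records, reusing cached results for unchanged content.

    Returns the per-record results and the number of records that
    actually had to be processed in this run.
    """
    results = {}
    processed = 0
    for key, record in records.items():
        digest = fingerprint(record)
        cached = cache.get(key)
        if cached is not None and cached[0] == digest:
            results[key] = cached[1]  # unchanged: reuse the earlier result
        else:
            results[key] = expensive_map(record)
            cache[key] = (digest, results[key])
            processed += 1
    return results, processed


cache = {}
first = {"page1": "a b c", "page2": "d e", "page3": "f"}
_, processed_first = incremental_run(first, cache)

second = dict(first, page2="d e g")  # only one page changed
results_second, processed_second = incremental_run(second, cache)
print(processed_first, processed_second, results_second)
```

The first run processes all three records; the second run processes only the one record whose content changed, while the processing function itself is left untouched.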
Complete the reading assignment and search the Library and
Internet to find and study more references that discuss how to
implement incremental computation with existing MapReduce
and Hadoop. Based on the results of your research, discuss the
following questions:
· What are the principles of using the initial MapReduce
framework and Hadoop to improve performance of incremental
computation?
· How can these principles be designed and implemented
without requiring any major change to the initial MapReduce
framework and Hadoop?
Justify your point of view and provide examples, as necessary.
===============================================
===============================
Unit 5 – 1
Within the Discussion Board area, write 400-600 words that
respond to the following questions with your thoughts, ideas,
and comments. This will be the foundation for future
discussions by your classmates. Be substantive and clear, and
use examples to reinforce your ideas.
Distributed programming is a computation method in which
software will run on separate cores in multiple networked
computers. It is a true parallel computation model because it
can provide fully supported computing resources for
multitasking. Based on different criteria, distributed
programming models can be classified differently. If the
term distributed system is defined as “a system consisting of
networked computers and communicating through either
message passing or shared distributed memory to coordinate
the software functions to solve a problem or provide a service,”
then you can divide distributed programming into the
following two models:
· Shared memory distributed programming
· Message-passing distributed programming
According to the definition of a distributed system, a cloud
computing environment is considered a distributed system.
Therefore, both the shared memory distributed programming
and the message-passing distributed programming can be
applied in the cloud. However, if you use cloud computing to
conduct large-scale data processing, the existing capabilities of
both shared memory distributed programming and
message-passing distributed programming are not sufficient
(Sakr & Gaber, 2014).
Complete the reading assignment, and search the Library and
Internet to find and study additional references that discuss the
concepts and applications of the distributed programming
models. Based on the results of your research, discuss the
following question:
· Why are both the shared memory distributed programming and
the message-passing distributed programming insufficient when
processing the large-scale data in cloud computing
environment?
Please justify your point of view and provide examples, as
necessary.
===============================================
==================================
Unit 5 – 2
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
The CAP theorem was originally proposed by Dr. E. Brewer at a
symposium on distributed computing, and he stated that “in any
highly distributed data system, there are three commonly
desirable properties: consistency, availability, and partition
tolerance. However, it is impossible for a system to provide all
three properties at the same time” (2000). This theorem was
later proven by S. Gilbert and N. Lynch. The CAP theorem has
had great impact on the design of distributed systems and
services, including distributed database management systems
(DDBMS). Web-based applications have posed new
requirements that traditional database systems, such as SQL-
based relational database systems (RDBs), cannot fully satisfy.
This has led to the emergence of a new type of data storage system,
called NoSQL systems, which has gradually become a
dominant alternative solution for data storage and management.
One popular practice among NoSQL data storage systems is
to use the CAP theorem to make trade-offs among the
three properties. Because of the high performance cost of
maintaining strong consistency based on the atomicity,
consistency, isolation, and durability (ACID) semantics held by
RDBs, NoSQL systems often apply the weak consistency model
in exchange for the great reduction of performance overhead
involved in enforcing strong consistency (Sakr & Gaber, 2014).
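The trade-off can be illustrated with a toy replicated key-value store. This is a simplified, single-process sketch under stated assumptions: the classes Replica and Store, the counter sync_waits, and the explicit propagate step are hypothetical and do not model any particular NoSQL product.

```python
class Replica:
    def __init__(self):
        self.data = {}


class Store:
    """Toy replicated key-value store with two write modes."""

    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []    # updates not yet applied to every replica
        self.sync_waits = 0  # replica updates the writer had to wait for

    def write_strong(self, key, value):
        # Strong mode: return only after every replica has applied the write.
        for replica in self.replicas:
            replica.data[key] = value
            self.sync_waits += 1

    def write_weak(self, key, value):
        # Weak mode: apply to one replica, return, and propagate later.
        self.replicas[0].data[key] = value
        self.sync_waits += 1
        self.pending.append((key, value))

    def propagate(self):
        # Background step that brings the other replicas up to date.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica.data[key] = value
        self.pending.clear()

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)


store = Store()
store.write_weak("x", 1)
stale = store.read("x", replica_index=2)  # read before propagation
store.propagate()
fresh = store.read("x", replica_index=2)
print(stale, fresh, store.sync_waits)
```

In this sketch the weak write waits for one replica instead of three, at the price that a read from another replica can return an old value until propagation has run.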
Complete the reading assignment, and search the Library and
Internet to find and study additional references that discuss the
concepts of strong and weak consistency. Based on the results
of your research, discuss the following questions:
· Why must a NoSQL data storage system based on the cloud
computing environment make trade-offs between consistency
and availability?
· Where do the savings on the consistency handling overhead
come from in a NoSQL data storage system that applies weak
consistency?
Please justify your point of view and provide examples, as
necessary.
===============================================
==================================
Unit 6 – 1
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
Any application that is data- and computing-intensive will be a
good candidate for a service based on cloud computing.
Visualizing large-scale data sets is one such application.
Visualizing large-scale data sets involves two aspects—large-
scale data processing and a visualization interface. To use the
cloud computing environment for large-scale data processing,
you need to consider the network performance issues such as
“the unevenness of bandwidth of computer pairs” that will be
discussed in another assignment (Sakr & Gaber, 2014). Now
you are focusing on the visualization interface aspect. Researchers
have proposed a prototype framework for the design of a
visualization service for big data coming from the cloud
computing environment (Tanahashi et al., 2010). The authors
discuss the end-user functionality supported by the framework
and their technical decisions on how to implement the
framework.
Complete the reading assignment, and search the Library and
Internet to find and study additional references that discuss how
to visualize large-scale data sets in a cloud computing
environment. Based on the results of your research, discuss the
following questions:
· What is the end-user functionality of the framework reported
in Tanahashi et al.?
· What are the technical design decisions that have an impact on
the performance of the framework?
Justify your point of view and provide examples, as necessary.
===============================================
==============================
Unit 7 – 1
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
Big data analytics can help solve some very hard problems. One
example is to detect network traffic anomalies caused by
diverse machine-generated traffic attacks (known as hit
inflation attacks, which “refer to the fraudulent activities of
generating charges for online advertisers without a real interest
in the product advertised”) by detecting the anomalous
deviation from the expected Internet Protocol (IP) size
distribution, where the term IP size is defined as “the number
of users sharing the same source IP” (Sakr & Gaber, 2014). The
ability to detect hit inflation attacks is critical to the well-being
of online advertising because it ensures the healthy
operation of many popular, daily used public Web-based
services, such as search engines, e-mail, maps, and other Web-
based applications. However, the network traffic data itself is
also a type of large-scale data set. Processing such a data set
efficiently to discover the corresponding IP size distributions
for all publishers’ Web sites, in order to detect network traffic
anomalies, is a very challenging task.
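The IP size definition quoted above can be turned into a small computation. The sketch below is an illustration only: the click records, the baseline distribution, and the use of total variation distance as the deviation measure are assumptions, not the method from the reading.

```python
from collections import Counter, defaultdict


def ip_sizes(clicks):
    """Return {ip: number of distinct users seen behind that source IP}."""
    users_by_ip = defaultdict(set)
    for _publisher, ip, user in clicks:
        users_by_ip[ip].add(user)
    return {ip: len(users) for ip, users in users_by_ip.items()}


def size_distribution(clicks, publisher, sizes):
    """Fraction of a publisher's clicks that come from each IP size."""
    counts = Counter(sizes[ip] for pub, ip, _user in clicks if pub == publisher)
    total = sum(counts.values())
    return {size: n / total for size, n in counts.items()}


def deviation(observed, expected):
    """Total variation distance between two distributions (assumed measure)."""
    keys = set(observed) | set(expected)
    return 0.5 * sum(abs(observed.get(k, 0.0) - expected.get(k, 0.0)) for k in keys)


# Hypothetical click log: (publisher, source IP, user id).
clicks = [
    ("pubA", "10.0.0.1", "u1"),
    ("pubA", "10.0.0.2", "u2"),
    ("pubA", "10.0.0.3", "u3"),
    ("pubA", "10.0.0.3", "u4"),
    ("pubB", "10.0.0.3", "u3"),
    ("pubB", "10.0.0.3", "u4"),
    ("pubB", "10.0.0.3", "u3"),
    ("pubB", "10.0.0.3", "u4"),
]
sizes = ip_sizes(clicks)
dist_a = size_distribution(clicks, "pubA", sizes)
dist_b = size_distribution(clicks, "pubB", sizes)
expected = {1: 0.5, 2: 0.5}  # made-up baseline for illustration
print(dist_a, dist_b, deviation(dist_a, expected), deviation(dist_b, expected))
```

Here the second publisher receives all of its clicks from a single shared IP, so its distribution deviates from the made-up baseline while the first publisher's does not.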
Complete the reading assignment, and search the Library and
Internet to find and study more references that discuss detecting
network traffic anomalies based on the IP size distribution.
Based on the results of your research, discuss the following
concepts:
· Identify 1 method of detecting network traffic anomalies based
on the IP size distribution.
· What are the design principles of the method?
· How does the method address the performance issue of
processing such large-scale network traffic data?
Justify your point of view and provide examples, as necessary.
===============================================
============================
Unit 7 – 2
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
Different from the conventional distributed systems such as
those supercomputer-based client-server systems or small-scale
cluster systems, the network performance of the cloud
computing system has a unique characteristic. The network
bandwidth among different pairs of computers in the cloud can
vary significantly, and it is called “the bandwidth unevenness
among different machine pairs” (Sakr & Gaber, 2014). When a
very large-scale data set, such as a social network, Web
graph, or information network (known as a large-scale
graph data set), needs to be partitioned across many machines in a
cloud computing system before it can be processed, the network
performance problem caused by the bandwidth unevenness
among different machine pairs needs to be seriously considered
and addressed. It will impact the entire data processing
performance because the partitioning of the very large-scale
data set will generate a very large amount of network traffic and
will impact a very large number of machines. Network
performance is a critical parameter in the design of any cloud
computing-based large-scale graph data set partitioning and
processing method.
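A rough way to see why bandwidth unevenness matters is to price each cut edge by the bandwidth between the two machines that hold its endpoints. The cost model (one unit of data per cut edge, cost equal to 1/bandwidth), the graph, and the bandwidth figures below are all assumptions made for illustration.

```python
def transfer_cost(edges, placement, bandwidth):
    """Sum of 1/bandwidth over edges whose endpoints sit on different machines."""
    cost = 0.0
    for u, v in edges:
        mu, mv = placement[u], placement[v]
        if mu != mv:
            pair = (min(mu, mv), max(mu, mv))
            cost += 1.0 / bandwidth[pair]
    return cost


# Hypothetical graph and bandwidths (arbitrary units) between machine pairs.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
bandwidth = {(0, 1): 10.0, (0, 2): 1.0, (1, 2): 10.0}

# Both placements cut the same two edges, but across different machine pairs.
fast_pair = {"a": 0, "b": 0, "c": 1, "d": 1}
slow_pair = {"a": 0, "b": 0, "c": 2, "d": 2}

cost_fast = transfer_cost(edges, fast_pair, bandwidth)
cost_slow = transfer_cost(edges, slow_pair, bandwidth)
print(cost_fast, cost_slow)
```

The two placements have the same number of cut edges, yet under this cost model the one that crosses the low-bandwidth machine pair is ten times more expensive, which is why counting cut edges alone can be misleading in the cloud.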
Complete the reading assignment, and search the Library and
Internet to find and study more references that discuss the cloud
computing-based large-scale graph dataset partitioning and
processing. Based on the results of your research, discuss the
following topics:
· Identify 2 large-scale graph data set partitioning methods used
in a cloud computing system.
· What are the design principles of each method?
· How does each method address the network performance issue
caused by the bandwidth unevenness among different machine
pairs?
Justify your point of view and provide examples, as necessary.
===============================================
=======================
Unit 8 – 1
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
One of the main purposes of processing big data is to extract
knowledge (so-called big knowledge) from the big data
set. “Knowledge is the meaningfulness about the data” (Sakr &
Gaber, 2014). Knowledge representation is usually associated
with the problem-solving task’s specific requirements. For
example, if a problem-solving task involves time order, then a
list may be a suitable data structure for the knowledge
representation; if a problem-solving task involves no time order,
then a set may be a suitable data structure for the knowledge
representation. However, there is a universal standard for
knowledge representation proposed by the World Wide Web
Consortium (W3C), called the Resource Description
Framework (RDF) (2014). It is the standard model for machine-
readable data representation, which is now commonly
used to hold the knowledge representation in applications that
process big data sets. The Resource Description Framework is
very helpful when you need to integrate the results of several
big data set processing applications. It can facilitate knowledge
integration even when the underlying data schemas differ in the
original data storage systems.
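The integration point can be illustrated with plain Python tuples standing in for RDF triples. This is a sketch only: it does not use an RDF library, and all identifiers such as ex:gene42 and ex:treatedBy are hypothetical.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
source_a = {
    ("ex:gene42", "ex:associatedWith", "ex:diseaseX"),
    ("ex:gene42", "ex:label", "GENE42"),
}
source_b = {
    ("ex:diseaseX", "ex:treatedBy", "ex:drugY"),
    ("ex:gene42", "ex:associatedWith", "ex:diseaseX"),  # same fact, second source
}

# Integration is a set union: no shared table schema is required.
merged = source_a | source_b


def objects(graph, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# Follow a two-step path that spans both sources.
drugs = {
    drug
    for disease in objects(merged, "ex:gene42", "ex:associatedWith")
    for drug in objects(merged, disease, "ex:treatedBy")
}
print(len(merged), drugs)
```

Because every source reduces to the same three-part statement form, results from applications with different underlying schemas can be combined and queried together, and a fact reported by both sources is stored only once.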
Complete the reading assignment, and search the Library and
Internet to find and study more references that discuss how to
extract knowledge from a large-scale data set by applying
machine learning. Based on the results of your research, discuss
the following:
· Identify 1 research work that involves how to extract
knowledge from a large-scale data set with specific real-world
semantics (e.g., an informatics system for biomedical research)
by applying machine learning.
· How is the machine learning applied in this research work?
· How is the extracted knowledge represented?
· How does the research work address the performance issue of
processing such large-scale data sets?
Justify your point of view and provide examples, as necessary.
===============================================
==============================
Unit 9 – 1
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
One way to research the security issues associated with big data
is to look into every stage of the life cycle of big data. The
entire data life cycle consists of the following 8 stages (Khan et
al., 2014):
· Stage 1: Raw data
· Stage 2: Collection
· Stage 3: Filtering and classification
· Stage 4: Data analysis
· Stage 5: Storing
· Stage 6: Sharing and publishing
· Stage 7: Security
· Stage 8: Retrieval, reuse, and discovery
There are two items to point out, as follows:
· Stage 7 is an abstract stage.
· Only three stages (5, 6, and 8) are involved with security.
In Stage 5, the security issues associated with this stage are
mainly caused by two aspects—the size of data and the place to
store the data. Because the size of the data is too big, many
companies have to store their data in the cloud. However,
because the data are so big, it is very hard to verify whether cloud
vendors have indeed stored all the data. Because the cloud operates
as a black box, customers have no way to
know where the data are stored, how they are stored, and
whether the integrity of the data is preserved. Because of the
cost of local storage and network bandwidth, customers cannot
even afford to use any simple approach, such as downloading
the entire data set, to verify if the data have been stored
properly in the cloud.
Complete the reading assignment, and search the Library and
Internet to find and study more references that discuss the
security issues associated with big data and how to solve them.
Based on the results of your research, discuss the following
tasks:
· Identify 2 security issues associated with big data.
· What are the root causes of these 2 security issues?
· How can each of these 2 security issues be solved?
Justify your point of view and provide examples, as necessary.
===============================================
=================================
Unit 10 – 1
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
Cloud computing systems are exposed to the general security
attacks that are common to all types of distributed
computing systems, such as the following (Prakash & Darbari,
2012):
· Eavesdropping (gaining secret information)
· Masquerading (assuming the identity of another user)
· Message tampering (changing the content of the message)
· Replaying the message
· Denial of services
However, because of several special system features of the
cloud computing systems, such as virtual machines (VM), trust
asymmetry, semitransparent system architecture, and so forth,
the cloud computing system has a few special security issues.
These are summarized into the following 10 technical aspects
(Sakr & Gaber, 2014):
· Exploitation of co-tenancy
· Secure architecture for the cloud
· Accountability for outsourced data
· Confidentiality of data and computation
· Privacy
· Verifying outsourced computation
· Verifying capability
· Cloud forensics
· Misuse detection
· Resource accounting and economic attacks
Additionally, even some non-technical areas (e.g., regulatory
compliance and legal jurisdiction) and many assumptions made in
security research pose security challenges for which no meaningful
solutions have yet been found (Prakash & Darbari, 2012). All of
these have kept security in cloud computing systems a current
and hot research subject.
Complete the reading assignment, and search the Library and
Internet to find and study more references that discuss the
security issues of cloud computing systems and how to solve
them. Based on the results of your research, discuss the
following tasks:
· Identify 2 security issues associated with cloud computing
systems.
· What are the root causes of these 2 security issues?
· How can these 2 security issues be solved?
Justify your point of view and provide examples, as necessary.
===============================================
==============================
Unit 10 – 2
Primary Task Response: Within the Discussion Board area,
write 400-600 words that respond to the following questions
with your thoughts, ideas, and comments. This will be the
foundation for future discussions by your classmates. Be
substantive and clear, and use examples to reinforce your ideas.
Throughout this class, you have touched on many current hot
research subjects in big data analytics. You are familiar with
some of the problems that are still waiting for a better solution in
big data analytics. Assume that you will write a research paper
on a subject related to big data analytics. Discuss the following:
· Present your paper’s title, the motivation in a maximum of 3
sentences, the problem statement in 1 sentence, and the
hypothesis statement in 1 sentence.
· The hypothesis statement will include a proposed solution to
address the root cause of the problem presented in your problem
statement.
· Present 2 research questions, and discuss your thought process
on how you came up with your research questions based on the
motivations, the problem statement, and the hypothesis
statement.
· Use 1 sentence to specify the new contribution made to the
body of knowledge by your proposed solution.
===============================================
=================================

More Related Content

Similar to Discussion Board 1 – 2 Within the Discussion Board area, write 4

How to build and run a big data platform in the 21st century
How to build and run a big data platform in the 21st centuryHow to build and run a big data platform in the 21st century
How to build and run a big data platform in the 21st centuryAli Dasdan
 
Kellogg XML Holland Speech
Kellogg XML Holland SpeechKellogg XML Holland Speech
Kellogg XML Holland SpeechDave Kellogg
 
WebE_chapter_16.ppt
WebE_chapter_16.pptWebE_chapter_16.ppt
WebE_chapter_16.pptUsamaPatel9
 
The Guide to becoming a full stack developer in 2018
The Guide to becoming a full stack developer in 2018The Guide to becoming a full stack developer in 2018
The Guide to becoming a full stack developer in 2018Amit Ashwini
 
Document Based Data Modeling Technique
Document Based Data Modeling TechniqueDocument Based Data Modeling Technique
Document Based Data Modeling TechniqueCarmen Sanborn
 
data science chapter-4,5,6
data science chapter-4,5,6data science chapter-4,5,6
data science chapter-4,5,6varshakumar21
 
Exploration of Call Transcripts with MapReduce and Zipf’s Law
Exploration of Call Transcripts with MapReduce and Zipf’s LawExploration of Call Transcripts with MapReduce and Zipf’s Law
Exploration of Call Transcripts with MapReduce and Zipf’s LawTom Donoghue
 
Business Intelligence Solution Using Search Engine
Business Intelligence Solution Using Search EngineBusiness Intelligence Solution Using Search Engine
Business Intelligence Solution Using Search Engineankur881120
 
Hadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewHadoop Master Class : A concise overview
Hadoop Master Class : A concise overviewAbhishek Roy
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolutionmark madsen
 
Introduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS PractitionersIntroduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS PractitionersEmanuele Della Valle
 
IJSRED-V2I3P43
IJSRED-V2I3P43IJSRED-V2I3P43
IJSRED-V2I3P43IJSRED
 
Intern Project Showcase.pptx
Intern Project Showcase.pptxIntern Project Showcase.pptx
Intern Project Showcase.pptxritikgarg48
 
Discussion post· The proper implementation of a database is es.docx
Discussion post· The proper implementation of a database is es.docxDiscussion post· The proper implementation of a database is es.docx
Discussion post· The proper implementation of a database is es.docxmadlynplamondon
 
11 Strategic Considerations for SharePoint Migrations #SPSVB
11 Strategic Considerations for SharePoint Migrations #SPSVB11 Strategic Considerations for SharePoint Migrations #SPSVB
11 Strategic Considerations for SharePoint Migrations #SPSVBChristian Buckley
 

Discussion Board 1 – 2

Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas.

The architecture of Web 1.0 consists of the following three components (Jacobs & Walsh, 2004):
· Web resources identification: Uniform Resource Identifier (URI)
· Interaction protocol: HyperText Transfer Protocol (HTTP)
· Data formats: HyperText Markup Language (HTML)

Over the last 25 years, the Web has experienced several evolutions, which have been called Web 1.0, Web 2.0, Web 3.0, Web 4.0, and Web 5.0. Each of the evolutions has brought in more types of data sources, along with more advanced functional capability to the Internet infrastructure to make the Web the central place to see the convergence of many existing and new technologies. These new capabilities, in turn, support many new innovative business processes and practices through the Web. Therefore, it is important to know the basic concepts and applications of the Web, starting from its first generation. Knowing the root of the Web technology will help you to understand the reasons and consequences of the current and future changes to the Web technology, as well as the challenges of accessing the ever-growing Web data.

Complete the reading assignment, and search the Library and Internet to find and study at least 2 more references that discuss the concepts and applications of the Web. Based on the results of your research, discuss the following questions:
· What role has each of the 3 components of the architecture of Web 1.0 (URI, HTTP, and HTML) played in making the Web one of the main sources of ever-growing big data?
· What will be the trend in terms of "performance bottleneck" to access large-scale Web data as the Web technology evolves?
· Justify your point of view, and provide examples as necessary.

Unit 2 - 1

Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas.

As the core component of Web 4.0, the Internet of Things (IoT) has become a reality after many years of development. Distinct from all previous generations of the Web, where all the data are generated by people, the Web 4.0 data are generated by both humans and embedded computing devices (Atzori, 2010). The number of sources for Web data has greatly increased because billions of uniquely identifiable embedded computing devices are connected through the Internet infrastructure and various types of wireless networks. Because most IoT devices have only limited computing resources, they play the role of raw data collector and initial data preprocessor. These devices have to send the lower-level data to various data processing centers, where computers with higher-order computing resources perform the heavier-duty tasks. The IoT-based Web 4.0 has not only increased the data growth rate, but it has also shifted the performance bottlenecks of accessing Web data to many new places in the Internet infrastructure. It is very important to fully understand where these new performance bottlenecks are and the root causes of their existence so that you can more effectively manage your computing resources when accessing various types of Web data for your large-scale Web data-based applications.

Complete the reading assignment, and search the Library and Internet to find and study at least 2 references that discuss the concepts and applications of the IoT. Based on the results of your research, discuss the following questions:
· Where will the new performance bottlenecks be when accessing large-scale Web data generated by IoT?
· What is the new challenge for developing an indexing scheme used to assist in accessing large-scale Web data generated by IoT?
· Justify your point of view and provide examples as necessary.

Unit 3 – 1

Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas.

MapReduce was originally developed for cost-efficient use of large clusters of commodity computers to achieve scalable and reliable data processing. It consistently applies two simple but powerful functions, Map and Reduce, in parallel. Along with Hadoop, which is an open-source implementation of MapReduce, MapReduce has become one of the most popular and practical technical solutions for big data analytic tasks. However, like any technical solution, the initial MapReduce and Hadoop also have quite a few weaknesses when applied to certain types of data processing applications. Therefore, there is a need to thoroughly study the basic concepts of MapReduce and its Hadoop implementation to fully understand their pros and cons so that when applying them in big data analytic tasks, you will be able to make the right decisions and achieve the desired results.

Complete the reading assignment, and search the Library and Internet to find and study more references that discuss the concepts and applications of MapReduce and Hadoop as needed. Based on the results of your research, discuss the following questions:
· What are the basic concepts of MapReduce?
· What are the top 3 features of Hadoop?
· What are the pros and cons of MapReduce?
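As a point of reference for the Map and Reduce functions described in this unit, the following is a minimal single-process sketch in Python of the classic word-count example. It is not Hadoop code and does not run in parallel; it only illustrates the data flow (map, group by key, reduce). The function names map_phase, shuffle, reduce_phase, and word_count are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key (done by the framework in a real cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values that were emitted for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map over every document, group by word, then reduce each group."""
    intermediate = chain.from_iterable(map_phase(d) for d in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


if __name__ == "__main__":
    print(word_count(["the web grows", "the data grows"]))
```

In a real deployment, the map and reduce calls would run on different machines and the grouping step would move data across the network, which is one place to look when thinking about the pros and cons asked about above.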
Justify your point of view and provide examples, as necessary.

===============================================

Unit 3 – 2

Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas.

Many data analytic tasks in commonly used Web applications, such as page ranking and social network analysis, are processed iteratively until the computation meets a given condition. However, the original MapReduce framework does not support iterative computation directly. The iterative tasks have to be developed manually through separate software that uses multiple MapReduce jobs to emulate the iteration process. The unchanged data from the previous iteration is reloaded and reprocessed in the next iteration. This approach imposes a performance penalty on computing resources because it does not take advantage of the fact that most of the data is unchanged between iterations and therefore does not need to be reloaded and reprocessed in subsequent iterations. Another problem with the manual approach is that it depends on detecting the termination condition at each iteration. This requires an extra MapReduce job, which causes extra scheduling and I/O and increases network traffic. Obviously, a better solution is required to address these performance penalties.

Complete the reading assignment, and search the Library and Internet to find and study more references that discuss how to address the weakness of MapReduce and Hadoop on iterative computation. Based on the results of your research, discuss the following questions:
· What are the weaknesses of the initial MapReduce framework in iterative computation?
· What are the root causes of the weakness?
  • 5. · What are the key technical steps to solve the weakness? Justify your point of view and provide examples, as necessary. =============================================== ===================================== Unit 4 – 1 Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. Most of data analytic tasks in commonly used large-scale Web data processing applications, such as Web crawls and Web page indexing, are not iterative but incremental. The application usually runs one time as needed. However, there is a common characteristic of data shown in the incremental computation on most large-scale Web data. Most of the data do not change between two different runs. This obviously is an opportunity to improve the data processing performance for MapReduce and its Hadoop implementation because they did not consider this data characteristic in their design and development. For example, if 99% of a large-scale data set is unchanged and if there is a method to allow the MapReduce-based Web application to reuse that data directly in the next run without reprocessing, the data processing performance on this data set will increase greatly. Therefore, it is very important to acquire the knowledge and skills on how to achieve this process with existing MapReduce and Hadoop. Complete the reading assignment and search the Library and Internet to find and study more references that discuss how to implement incremental computation with existing MapReduce and Hadoop. Based on the results of your research, discuss the following questions: · What are the principles of using the initial MapReduce framework and Hadoop to improve performance of incremental computation? · How can these principles be designed and implemented
  • 6. without the need for any big change to the initial MapReduce framework and Hadoop? Justify your point of view and provide examples, as necessary. =============================================== =============================== Unit 5 – 1 Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. Distributed programming is a computation method in which software runs on separate cores in multiple networked computers. It is a true parallel computation model because it can provide fully supported computing resources for multitasking. Based on different criteria, distributed programming models can be classified differently. If the term distributed system is defined as “a system consisting of networked computers and communicating through either messaging passing or shared distributed memory to coordinate the software functions to solve a problem or provide a service,” then you can divide distributed programming into the following two models: · Shared memory distributed programming · Message-passing distributed programming According to the definition of a distributed system, a cloud computing environment is considered a distributed system. Therefore, both shared memory distributed programming and message-passing distributed programming can be applied in the cloud. However, if you use cloud computing to conduct large-scale data processing, the existing capabilities of both shared memory distributed programming and message-passing distributed programming are not sufficient (Sakr & Gaber, 2014). Complete the reading assignment, and search the Library and Internet to find and study additional references that discuss the
  • 7. concepts and applications of the distributed programming models. Based on the results of your research, discuss the following question: · Why are both shared memory distributed programming and message-passing distributed programming insufficient when processing large-scale data in a cloud computing environment? Please justify your point of view and provide examples, as necessary. =============================================== ================================== Unit 5 – 2 Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. The CAP theorem was originally proposed by Dr. E. Brewer at a symposium on distributed computing, and he stated that “in any highly distributed data system, there are three commonly desirable properties: consistency, availability, and partition tolerance. However, it is impossible for a system to provide all three properties at the same time” (2000). This theorem was later proven by S. Gilbert and N. Lynch. The CAP theorem has had great impact on the design of distributed systems and services, including distributed database management systems (DDBMS). Web-based applications have posed new requirements that traditional database systems such as SQL-based relational database systems (RDBs) cannot fully satisfy. This has led to the emergence of a new type of data storage system, called NoSQL systems, which has gradually become a dominant alternative for data storage and management. One popular practice among NoSQL data storage systems is to use the CAP theorem to make trade-offs among the three properties. Because of the high performance cost of maintaining strong consistency based on the atomicity,
  • 8. consistency, isolation, and durability (ACID) semantics held by RDBs, NoSQL systems often apply the weak consistency model in exchange for a great reduction in the performance overhead involved in enforcing strong consistency (Sakr & Gaber, 2014). Complete the reading assignment, and search the Library and Internet to find and study additional references that discuss the concepts of strong and weak consistency. Based on the results of your research, discuss the following questions: · Why must a NoSQL data storage system based on the cloud computing environment make trade-offs between consistency and availability? · Where do the savings on the consistency handling overhead come from in a NoSQL data storage system that uses weak consistency? Please justify your point of view and provide examples, as necessary. =============================================== ================================== Unit 6 – 1 Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. Any application that is data- and computing-intensive will be a good candidate for a service based on cloud computing. Visualizing large-scale data sets is one such application. Visualizing large-scale data sets involves two aspects: large-scale data processing and a visualization interface. To use the cloud computing environment for large-scale data processing, you need to consider network performance issues such as “the unevenness of bandwidth of computer pairs” that will be discussed in another assignment (Sakr & Gaber, 2014). Now you are focusing on the visualization interface aspect. Authors
  • 9. have proposed a prototype framework for the design of a visualization service for big data coming from the cloud computing environment (Tanahashi et al., 2010). The authors discuss the end-user functionality supported by the framework and their technical decisions on how to implement the framework. Complete the reading assignment, and search the Library and Internet to find and study additional references that discuss how to visualize large-scale data sets in a cloud computing environment. Based on the results of your research, discuss the following questions: · What is the end-user functionality of the framework reported in Tanahashi et al.? · What are the technical design decisions that have an impact on the performance of the framework? Justify your point of view and provide examples, as necessary. =============================================== ============================== Unit 7 – 1 Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. Big data analytics can help solve some very hard problems. One example is to detect network traffic anomalies caused by diverse machine-generated traffic attacks (known as hit inflation attacks, which “refer to the fraudulent activities of generating charges for online advertisers without a real interest in the product advertised”) by detecting the anomalous deviation from the expected Internet Protocol (IP) size distribution, where the term IP size is defined as “the number of users sharing the same source IP” (Sakr & Gaber, 2014). The ability to detect hit inflation attacks is critical to the well-being of online advertisement because it will ensure the healthy operations of many daily used popular public Web-based
  • 10. services, such as search engines, e-mail, maps, and other Web-based applications. However, the network traffic data itself is also a type of large-scale data set. Processing such a data set efficiently to discover the corresponding IP size distributions for all publishers’ Web sites in order to detect network traffic anomalies is a very challenging task. Complete the reading assignment, and search the Library and Internet to find and study more references that discuss detecting network traffic anomalies based on the IP size distribution. Based on the results of your research, discuss the following concepts: · Identify 1 method of detecting network traffic anomalies based on the IP size distribution. · What are the design principles of the method? · How does the method address the performance issue of processing such large-scale network traffic data? Justify your point of view and provide examples, as necessary. =============================================== ============================ Unit 7 – 2 Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. Unlike conventional distributed systems, such as supercomputer-based client-server systems or small-scale cluster systems, the network performance of the cloud computing system has a unique characteristic. The network bandwidth among different pairs of computers in the cloud can vary significantly, which is called “the bandwidth unevenness among different machine pairs” (Sakr & Gaber, 2014). When a very large-scale data set, such as a social network, Web graph, or information network (known as large-scale graph data sets), needs to be partitioned across many machines in a cloud computing system before it can be processed, the network
  • 11. performance problem caused by the bandwidth unevenness among different machine pairs needs to be seriously considered and addressed. It will impact the entire data processing performance because the partitioning of the very large-scale data set will generate a very large amount of network traffic and will impact a very large number of machines. Network performance is a critical parameter in the design of any cloud computing-based large-scale graph data set partitioning and processing method. Complete the reading assignment, and search the Library and Internet to find and study more references that discuss cloud computing-based large-scale graph data set partitioning and processing. Based on the results of your research, discuss the following topics: · Identify 2 large-scale graph data set partitioning methods used in a cloud computing system. · What are the design principles of each method? · How does each method address the network performance issue caused by the bandwidth unevenness among different machine pairs? Justify your point of view and provide examples, as necessary. =============================================== ======================= Unit 8 – 1 Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. One of the main purposes of processing big data is to extract knowledge (so-called big knowledge) from the big data set. “Knowledge is the meaningfulness about the data” (Sakr & Gaber, 2014). Knowledge representation is usually associated with the problem-solving task’s specific requirements. For example, if a problem-solving task involves time order, then a
  • 12. list may be a suitable data structure for the knowledge representation; if a problem-solving task involves no time order, then a set may be a suitable data structure for the knowledge representation. However, there is a universal standard for knowledge representation proposed by the World Wide Web Consortium (W3C) called Resource Description Framework [RDF] (2014). It is the standard model for machine-readable data representation, which is now commonly used to hold the knowledge representation in applications that process big data sets. Resource Description Framework is very helpful when you need to integrate the results of several big data set processing applications. It can facilitate knowledge integration even when the underlying data schemas differ in the original data storage systems. Complete the reading assignment, and search the Library and Internet to find and study more references that discuss how to extract knowledge from a large-scale data set by applying machine learning. Based on the results of your research, discuss the following: · Identify 1 research work that involves how to extract knowledge from a large-scale data set with specific real-world semantics (e.g., an informatics system for biomedical research) by applying machine learning. · How is the machine learning applied in this research work? · How is the extracted knowledge represented? · How does the research work address the performance issue of processing such large-scale data sets? Justify your point of view and provide examples, as necessary. =============================================== ============================== Unit 9 – 1 Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas.
  • 13. One way to research the security issues associated with big data is to look into every stage of the life cycle of big data. The entire data life cycle consists of the following 8 stages (Khan et al., 2014): · Stage 1: Raw data · Stage 2: Collection · Stage 3: Filtering and classification · Stage 4: Data analysis · Stage 5: Storing · Stage 6: Sharing and publishing · Stage 7: Security · Stage 8: Retrieval, reuse, and discovery There are two items to point out, as follows: · Stage 7 is an abstract stage. · Only three stages (5, 6, and 8) are involved with security. In Stage 5, the security issues are mainly caused by two aspects: the size of the data and the place where the data are stored. Because the data are too big, many companies have to store their data in the cloud. However, because the data are so big, it is really hard to verify whether cloud vendors have indeed stored all the data. Because the cloud runs under black box mode, the customers really have no way to know where the data are stored, how they are stored, and whether the integrity of the data is preserved. Because of the cost of local storage and network bandwidth, customers cannot even afford to use a simple approach, such as downloading the entire data set, to verify whether the data have been stored properly in the cloud. Complete the reading assignment, and search the Library and Internet to find and study more references that discuss the security issues associated with big data and how to solve them. Based on the results of your research, discuss the following tasks: · Identify 2 security issues associated with big data. · What are the root causes of these 2 security issues? · How can each of these 2 security issues be solved?
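To make the Stage 5 verification problem concrete, the following is a minimal Python sketch of one possible idea, spot-checking randomly sampled blocks instead of downloading the entire data set. It is not taken from the reading or from Khan et al.; the names ToyCloudStore, make_digests, and spot_check are hypothetical, the in-memory store only stands in for a real cloud vendor, and a real scheme would need to be far more careful.

```python
import hashlib
import random

BLOCK_SIZE = 4  # bytes per block; kept tiny so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut the data set into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_digest(index: int, block: bytes) -> str:
    """Hash a block together with its position so blocks cannot be swapped."""
    return hashlib.sha256(index.to_bytes(8, "big") + block).hexdigest()


def make_digests(data: bytes) -> list:
    """Digests the customer keeps locally before uploading (assumption)."""
    return [block_digest(i, b) for i, b in enumerate(split_blocks(data))]


class ToyCloudStore:
    """Hypothetical stand-in for a cloud vendor; not a real cloud API."""

    def __init__(self, data: bytes):
        self._blocks = split_blocks(data)

    def read_block(self, index: int) -> bytes:
        return self._blocks[index]

    def corrupt_block(self, index: int) -> None:
        # Simulate silent data loss on the vendor side.
        self._blocks[index] = b"\x00" * len(self._blocks[index])


def spot_check(store, digests, sample_size, rng) -> bool:
    """Fetch only a random sample of blocks and compare them to local digests."""
    indices = rng.sample(range(len(digests)), min(sample_size, len(digests)))
    return all(block_digest(i, store.read_block(i)) == digests[i] for i in indices)


if __name__ == "__main__":
    data = b"abcdefghijklmnop"
    digests = make_digests(data)
    store = ToyCloudStore(data)
    print(spot_check(store, digests, 2, random.Random()))
```

With a sample smaller than the number of blocks, the check transfers only part of the data but detects damage only with some probability, so the sample size trades bandwidth against confidence.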
  • 14. Justify your point of view and provide examples, as necessary. =============================================== ================================= Unit 10 – 1 Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. Cloud computing systems are subject to the general security attacks common to all types of distributed computing systems, such as the following (Prakash & Darbari, 2012): · Eavesdropping (gaining secret information) · Masquerading (assuming the identity of other users) · Message tampering (changing the content of the message) · Replaying the message · Denial of services However, because of several special system features of cloud computing systems, such as virtual machines (VM), trust asymmetry, semitransparent system architecture, and so forth, the cloud computing system has a few special security issues. These are summarized into the following 10 technical aspects (Sakr & Gaber, 2014): · Exploitation of co-tenancy · Secure architecture for the cloud · Accountability for outsourced data · Confidentiality of data and computation · Privacy · Verifying outsourced computation · Verifying capability · Cloud forensics · Misuse detection · Resource accounting and economic attacks Additionally, even some non-technical areas (e.g., regulatory compliance, legal jurisdiction) and many security research
  • 15. assumptions pose security challenges for which no meaningful solutions have yet been found (Prakash & Darbari, 2012). All of these have kept security in cloud computing systems a current and hot research subject. Complete the reading assignment, and search the Library and Internet to find and study more references that discuss the security issues of cloud computing systems and how to solve them. Based on the results of your research, discuss the following tasks: · Identify 2 security issues associated with cloud computing systems. · What are the root causes of these 2 security issues? · How can these 2 security issues be solved? Justify your point of view and provide examples, as necessary. =============================================== ============================== Unit 10 – 2 Primary Task Response: Within the Discussion Board area, write 400-600 words that respond to the following questions with your thoughts, ideas, and comments. This will be the foundation for future discussions by your classmates. Be substantive and clear, and use examples to reinforce your ideas. Throughout this class, you have touched on many current hot research subjects in big data analytics. You are familiar with some of the problems that are still waiting for a better solution in big data analytics. Assume that you will write a research paper on a big data analytics-related subject. Discuss the following: · Present your paper’s title, the motivation in a maximum of 3 sentences, the problem statement in 1 sentence, and the hypothesis statement in 1 sentence. · The hypothesis statement will include a proposed solution to address the root cause of the problem presented in your problem statement. · Present 2 research questions, and discuss your thought process on how you came up with your research questions based on the
  • 16. motivations, the problem statement, and the hypothesis statement. · Use 1 sentence to specify the new contribution made to the body of knowledge by your proposed solution. =============================================== =================================