1.2 Motivating Challenges
As mentioned earlier, traditional data analysis techniques have often encountered practical difficulties in meeting the challenges posed by big data applications. The following are some of the specific challenges that motivated the development of data mining.
Scalability
Because of advances in data generation and collection, data sets with sizes of terabytes, petabytes, or even exabytes are becoming common. If data mining algorithms are to handle these massive data sets, they must be scalable. Many data mining algorithms employ special search strategies to handle exponential search problems. Scalability may also require the implementation of novel data structures to access individual records in an efficient manner. For instance, out-of-core algorithms may be necessary when processing data sets that cannot fit into main memory. Scalability can also be improved by using sampling or developing parallel and distributed algorithms. A general overview of techniques for scaling up data mining algorithms is given in Appendix F.
High Dimensionality
It is now common to encounter data sets with hundreds or thousands of attributes instead of the handful common a few decades ago. In bioinformatics, progress in microarray technology has produced gene expression data involving thousands of features. Data sets with temporal or spatial components also tend to have high dimensionality. For example,
consider a data set that contains measurements of temperature at various locations. If the temperature measurements are taken repeatedly for an extended period, the number of dimensions (features) increases in proportion to the number of measurements taken. Traditional data analysis techniques that were developed for low-dimensional data often do not work well for such high-dimensional data due to issues such as curse of dimensionality (to be discussed in Chapter 2 ). Also, for some data analysis algorithms, the computational complexity increases rapidly as the dimensionality (the number of features) increases.
Heterogeneous and Complex Data
Traditional data analysis methods often deal with data sets containing attributes of the same type, either continuous or categorical. As the role of data mining in business, science, medicine, and other fields has grown, so has the need for techniques that can handle heterogeneous attributes. Recent years have also seen the emergence of more complex data objects. Examples of such non-traditional types of data include web and social media data containing text, hyperlinks, images, audio, and videos; DNA data with sequential and three-dimensional structure; and climate data that consists of measurements (temperature, pressure, etc.) at various times and locations on the Earth’s surface. Techniques developed for mining such complex objects should take into consideration relationships in the data, such as temporal and spatial autocorrelation, graph connectivity, and parent-child relationships between ...
1.2 Motivating Challenges As mentioned earlier, traditional data
1. 1.2 Motivating Challenges
As mentioned earlier, traditional data analysis techniques have
often encountered practical difficulties in meeting the
challenges posed by big data applications. The following are
some of the specific challenges that motivated the development
of data mining.
Scalability
Because of advances in data generation and collection, data sets
with sizes of terabytes, petabytes, or even exabytes are
becoming common. If data mining algorithms are to handle
these massive data sets, they must be scalable. Many data
mining algorithms employ special search strategies to handle
exponential search problems. Scalability may also require the
implementation of novel data structures to access individual
records in an efficient manner. For instance, out-of-core
algorithms may be necessary when processing data sets that
cannot fit into main memory. Scalability can also be improved
by using sampling or developing parallel and distributed
algorithms. A general overview of techniques for scaling up
data mining algorithms is given in Appendix F.
High Dimensionality
It is now common to encounter data sets with hundreds or
thousands of attributes instead of the handful common a few
decades ago. In bioinformatics, progress in microarray
technology has produced gene expression data involving
thousands of features. Data sets with temporal or spatial
components also tend to have high dimensionality. For example,
consider a data set that contains measurements of temperature at
various locations. If the temperature measurements are taken
repeatedly for an extended period, the number of dimensions
(features) increases in proportion to the number of
measurements taken. Traditional data analysis techniques that
were developed for low-dimensional data often do not work
well for such high-dimensional data due to issues such as curse
2. of dimensionality (to be discussed in Chapter 2 ). Also, for
some data analysis algorithms, the computational complexity
increases rapidly as the dimensionality (the number of features)
increases.
Heterogeneous and Complex Data
Traditional data analysis methods often deal with data sets
containing attributes of the same type, either continuous or
categorical. As the role of data mining in business, science,
medicine, and other fields has grown, so has the need for
techniques that can handle heterogeneous attributes. Recent
years have also seen the emergence of more complex data
objects. Examples of such non-traditional types of data include
web and social media data containing text, hyperlinks, images,
audio, and videos; DNA data with sequential and three-
dimensional structure; and climate data that consists of
measurements (temperature, pressure, etc.) at various times and
locations on the Earth’s surface. Techniques developed for
mining such complex objects should take into consideration
relationships in the data, such as temporal and spatial
autocorrelation, graph connectivity, and parent-child
relationships between the elements in semi-structured text and
XML documents.
Data Ownership and Distribution
Sometimes, the data needed for an analysis is not stored in one
location or owned by one organization. Instead, the data is
geographically distributed among resources belonging to
multiple entities. This requires the development
of distributed data mining techniques. The key challenges faced
by distributed data mining algorithms include the following: (1)
how to reduce the amount of communication needed to perform
the distributed computation, (2) how to effectively consolidate
the data mining results obtained from multiple sources, and (3)
how to address data security and privacy issues.
Non-traditional Analysis
The traditional statistical approach is based on a hypothesize -
and-test paradigm. In other words, a hypothesis is proposed, an
3. experiment is designed to gather the data, and then the data is
analyzed with respect to the hypothesis. Unfortunately, this
process is extremely labor-intensive. Current data analysis tasks
often require the generation and evaluation of thousands of
hypotheses, and consequently, the development of some data
mining techniques has been motivated by the desire to automate
the process of hypothesis generation and evaluation.
Furthermore, the data sets analyzed in data mining are typically
not the result of a carefully designed experiment and often
represent opportunistic samples of the data, rather than random
samples.
NAME
COURSE
DATE
INSTRUCTOR
Lesson Analysis
Instructional Model Exhibited
Identify the instructional model(s) exhibited by the teacher
during instruction and provide evidence that validates this
assumption
Lesson Engagement
Explain your thoughts on the teacher’s ability to engage
students. In your explanation, be sure to connect to examples
from the lesson.
Strengths of the Lesson
Describe up to three things you liked about the lesson.
4. Recommendations/Justifications
Recommend one thing you would have done differently than the
teacher in the video and why. If you wouldn’t change anything,
explain why you think the lesson should remain as it is.
References
View the video of the lesson you have chosen, and respond to
the following items.
· Identify the instructional models exhibited by the teacher
during instruction, and provide examples that validates this
assumption. Refer to the (Links to an external site.) document
you used in the Week 5 Instructional Models discussion forum
to refresh your memory of the four types of instructional models
we learned and cite it as a source to support your response.
(APA citation is shown in the required resources section for this
week).
· From your vantage point, determine the teacher’s ability to
engage students throughout the lesson. Were the students
engaged, attentive, and having fun learning or were there areas
that the teacher could have improved upon to make the lesson
more engaging?
· Describe up to three things you liked about the lesson.
· Recommend one thing you would have done differently than
the teacher in the video and why. If you would not change
anything, justify why you think the lesson should remain as it
is.
Compile the responses to the questions above in a way that it
will be easy for you to transfer them to a PowerPoint
5. presentation (e.g., bullet points would work best).
Create a PowerPoint with a slide for each of the items above
(see below for instructions on how to create each slide).
· Use the Lesson Analysis PowerPoint Template Download
Lesson Analysis PowerPoint Templateprovided to create a
visual of your lesson analysis.
· You will use the 7x7 rule to create your presentation. The 7x7
rule states that you use no more than seven bullet points per
slide and no more than seven words per bullet point. This way
your visual presentation will only show the main points on each
slide without overwhelming your viewers without too many
words. You still need to make your slides attractive by adding
images and colors to make it attractive.
· Add your voice to fill in the gaps between the main points on
your slides. Limit your narration to five minutes or less. Use
your narration to explain each of your answers. More
importantly, use it as an opportunity to share your passion about
what you liked in the lesson and how you might modify the
lesson to better engage students and make the learning
experience fun. (View Microsoft PowerPoint 2013 Tutorial |
Recording Narration (Links to an external site.) for instructions
on how insert voice narration into a PowerPoint presentation.)
· If you need help with creating an effective PowerPoint
presentation, please review the How to Make a PowerPoint
Presentation (Links to an external site.) webpage from the
UAGC Writing Center.
In your project,
· Create a PowerPoint with voice narration.
· Summarize the instructional models used in the lesson.
· Explain thoughts on the engagement level of the lesson.
· Describe up to three strengths of the lesson.
· Justify whether the lesson should be changed or stay the same.
The What Would You Do? final presentation
· Must use at least one scholarly source to complete this
assignment. The Evidence-Based Models of
6. Teaching Download Evidence-Based Models of
Teachingdocument you accessed in the Week 5 Instructional
Models discussion forum will meet this requirement.
· Must be document all sources according to APA Style (Links
to an external site.) as outlined in the Writing Center’s How to
Make a PowerPoint Presentation (Links to an external
site.)resource.
· Must include a separate title slide with the following:
· Title of the project in bold font
· Space should be between title and the rest of the information
on the title page.
· Student’s name
· Name of institution University of Arizona Global Campus)
· Course name and number
· Instructor’s name
· Due date
· Must use at least one scholarly source in addition to the course
text.
· The Scholarly, Peer-Reviewed, and Other Credible
Sources (Links to an external site.) table offers additional
guidance on appropriate source types. If you have questions
about whether a specific source is appropriate for this
assignment, please contact your instructor. Your instructor has
the final say about the appropriateness of a specific source for a
particular assignment.
· To assist you in completing the research required for this
assignment, view this UAGC Library Quick ‘n’ Dirty (Links to
an external site.) tutorial, which introduces the UAGC Library
and the research process, and provides some library search tips.
· Must document any information used from sources in APA
Style as outlined in the Writing Center’s APA: Citing Within
Your Paper (Links to an external site.)
· Must include a separate references slide that is formatted
according to APA Style as outlined in the Writing Center. See
the APA: Formatting Your References List (Links to an external
site.) resource in the Writing Center for specifications.