Order out of Chaos: Construction of Knowledge Models from PDF Textbooks
Isaac Alpizar-Chacon
Textbooks are educational documents created, structured and formatted by domain experts with the main purpose of explaining the knowledge in the domain to a novice. Authors use their understanding of the domain when structuring and formatting the content of a textbook to facilitate this explanation. As a result, the formatting and structural elements of textbooks carry elements of domain knowledge implicitly encoded by their authors. Our paper presents an extendable approach to the automated extraction of this knowledge from textbooks, taking into account their formatting rules and internal structure. We focus on PDF as the most common textbook representation format; however, the overall method is applicable to other formats as well. The evaluation experiments examine the accuracy of the approach, as well as the pragmatic quality of the obtained knowledge models, using one of their possible applications: semantic linking of textbooks in the same domain. The results indicate high accuracy of model construction on the symbolic, syntactic and structural levels across textbooks and domains, and demonstrate the added value of the extracted models on the semantic level.
Presented at Document Engineering 2020
Integrating Textbooks with Smart Interactive Content for Learning Programming
Isaac Alpizar-Chacon
Online textbooks with interactive content have emerged as a popular medium for learning programming and other computer science topics. While the textbook component supports the acquisition of programming concepts by reading, various types of "smart" interactive learning content, such as worked examples, code animations, Parsons puzzles, and coding problems, allow students to immediately practice and master the newly learned concepts. This paper attempts to automate the time-consuming manual process of augmenting textbooks with "smart" interactive content. We introduce an ontology-based approach that can link fragments of text with "smart" content activities, demonstrate its application to two practical linking cases, and present the results of its pilot evaluation.
video link => http://youtu.be/D9PBX8FmtpQ
Tweets Classifier which categorises tweets into these 6 categories:
Business
Politics
Music
Health
Sports
Technology
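The description above does not say how the classifier works. As an illustration only, the following minimal sketch assigns a tweet to one of the six categories by counting keyword matches. The keyword sets and the function names `tokenize` and `classify` are hypothetical and not taken from the original project; a real classifier would more likely learn from labelled tweets.

```python
from collections import Counter
import re

# Hypothetical seed keywords per category (assumption, for illustration only);
# a real classifier would learn these from labelled tweets.
KEYWORDS = {
    "Business": {"market", "stocks", "startup", "profit", "investor"},
    "Politics": {"election", "senate", "vote", "policy", "minister"},
    "Music": {"album", "song", "concert", "band", "playlist"},
    "Health": {"doctor", "vaccine", "fitness", "hospital", "diet"},
    "Sports": {"match", "goal", "league", "coach", "tournament"},
    "Technology": {"software", "smartphone", "app", "ai", "gadget"},
}


def tokenize(text):
    """Lowercase the tweet and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify(tweet):
    """Return the category with the most keyword hits, or None if nothing matches."""
    counts = Counter()
    for token in tokenize(tweet):
        for category, words in KEYWORDS.items():
            if token in words:
                counts[category] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

For example, `classify("The band dropped a new album and a concert date")` matches three Music keywords and no others, while a tweet with no known keywords yields `None`.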
Prepare the following documents and develop the software project startup, prototype model, using software engineering methodology for at least two real-time scenarios or for the sample experiments.
Content Wizard: Concept-Based Recommender System for Instructors of Programmi...Hung Chau
Authoring an adaptive educational system is a complex process that involves allocating a wide range of educational content within a fixed sequence of units. Given this scenario, in this paper we describe Content Wizard, a concept-based recommender system for recommending learning materials that meet the instructor’s pedagogical goals during the creation of an online programming course. Here, the instructors are asked to provide a set of code examples that jointly reflect the learning goals associated with each course unit. The Wizard is built on top of our course authoring tool, and it helps to decrease the time instructors spend on the task and to maintain the coherence of the sequential structure of the course. It also provides instructors with additional information to identify content that might not be appropriate for the unit they are creating. We conducted an off-line study with data collected from an introductory Java course previously taught at the University of Pittsburgh, in order to evaluate the practicality and effectiveness of the system. We found that the performance of the proposed recommendations is relatively close to the teacher’s expectations in creating a computer-based adaptive course.
COURSE DESCRIPTION
Department and Course Number: CS461
Course Coordinator: Russ Abbott
Course Title: Machine Learning
Total Credits: 4
Current Catalog Description:
Means that enable computers to perform tasks for which they were not explicitly
programmed; learning paradigms include inductive generalization from examples,
genetic algorithms, and connectionist systems such as neural nets.
Textbook:
Mitchell, Tom, Machine Learning, McGraw-Hill, 1997.
References:
At the discretion of the instructor.
Course Goals:
• To introduce students to tools and techniques for modeling complex systems and
for the automatic creation of computer programs. Subsidiary goals will depend on
the approach(es) the instructor chooses to take.
o To introduce students to the theories, tools, and technologies used to
study complexity, including evolutionary computing and agent-based
modeling.
o To introduce students to inductive generalization from examples and
other traditional learning paradigms.
o To introduce students to the use of artificial neural nets for learning.
These course goals contribute to the success of Student Learning Outcomes 1.a, 1.d,
1.e, 2, 3, 4, 5, and 6.
Prerequisites by Topic:
• Fluent in at least one programming language
• Fluent in data structures and algorithms
• Computational complexity
Major Topics Covered in the Course:
This list represents the possible topics covered in this course. At the discretion of the
instructor, the course focuses on some of these topics.
• Agent-based modeling
• Modeling probability density functions and optimization in artificial neural
networks, decision trees, Gaussian process regression (k-Nearest Neighbor and
expectation-maximization algorithm), Bayesian networks, Markov Random
Fields, and support vector machines.
• Complex systems; the nature of emergence, evolutionary programming and
optimization through evolutionary programming
Laboratory Projects (specify number of weeks on each):
At the discretion of the instructor. Projects range from weekly assignments to three
more significant projects covering 3 weeks each over the course of the term.
Estimate Curriculum Category Content (Quarter Hours)
Area Core Advanced Area Core Advanced
Algorithms 1.0 Data Structures 1.0
Software Design 1.0 Prog. Languages 1.0
Comp. Arch.
Oral and Written Communications:
Students are required to submit and discuss the source code and documentation of the
work that they do.
Social and Ethical Issues:
No significant component.
Theoretical Content:
At the discretion of the instructor, possibly including an introduction to theoretical
foundations of agent-based modeling, types of learning algorithms, complex systems,
and evolutionary programming.
Problem Analysis:
Students are required to identify the issues involved in designing a system
that learns and evolves.
Solution Design:
Solution design involves developing programs that use techniques such as agent-based
modeling, learning from observation, artificial neural networks, and evolutionary
programming.