IWALS 2018
6th International Workshop on Advanced Learning Sciences
Perspectives on the Learner: Cognition, Brain, and Education
University of Pittsburgh, USA JUNE 6-8, 2018
Dynamic Search Using Semantics & Statistics (Paul Hofmann)
This presentation shows three applications that successfully combine semantics and statistics for text mining and interactive search.
1) We predict the Lehman bankruptcy using statistical topic modeling, SAP Business Objects entity extraction and associative memories (powered by Saffron Technologies).
2) We semi-automatically handle service requests at Cisco using knowledge extraction and knowledge reuse.
3) We discover user intent for interactive retrieval. User intent is defined as a latent state. The observations of this latent state are the reformulated query sequence and the retrieved documents, together with the positive or negative feedback provided by the user. A demo shows recognition of the user’s intent in health care search.
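The latent-state view of user intent in item 3 can be illustrated with a small sketch. This is not the method used in the presentation, which is not described here; it is a plain Bayesian belief update under assumed inputs. The intent labels, the likelihood values and the function name `update_intent_belief` are all hypothetical.

```python
from typing import Dict


def update_intent_belief(
    belief: Dict[str, float],
    likelihood: Dict[str, float],
) -> Dict[str, float]:
    """One Bayesian update step: posterior is proportional to prior * likelihood."""
    unnormalized = {
        intent: prior * likelihood.get(intent, 0.0)
        for intent, prior in belief.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation rules out every intent; keep the prior unchanged.
        return dict(belief)
    return {intent: p / total for intent, p in unnormalized.items()}


# Hypothetical latent intents for a health care search session.
belief = {"symptoms": 0.5, "treatment": 0.5}

# Hypothetical likelihoods P(observation | intent), one per observation
# (e.g. a reformulated query, then positive feedback on a retrieved document).
observations = [
    {"symptoms": 0.2, "treatment": 0.8},
    {"symptoms": 0.4, "treatment": 0.6},
]

for likelihood in observations:
    belief = update_intent_belief(belief, likelihood)
```

Each observation shifts probability mass toward the intents that explain it best, so the belief sharpens as the session continues.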
There are many examples of text-based documents (all in ‘electronic’ format…)
e-mails, corporate Web pages, customer surveys, résumés, medical records, DNA sequences, technical papers, incident reports, news stories and more…
Not enough time or patience to read
Can we extract the most vital kernels of information…
So, we wish to find a way to gain knowledge (in summarised form) from all that text, without reading or examining them fully first…!
Some others (e.g. DNA seq.) are hard to comprehend!
A SURVEY ON QUESTION AND ANSWER SYSTEM BY RETRIEVING THE DESCRIPTIONS USING L... (IJARBEST JOURNAL)
Question answering is a modern type of information retrieval characterized by information needs that are at least partly expressed as natural language statements or questions, and it is one of the most natural forms of human-computer interaction. This article gives an extensive and comparative review of Question Answering Technology (QAT). Question retrieval in current community-based question answering (CQA) services does not, in general, work well for long and complex questions. This paper introduces quality question and answer (QA) pairs accumulated as comprehensive knowledge bases of human knowledge. They help users find exact information by obtaining correct answers directly, as opposed to browsing through large ranked lists of results. Hence, retrieving relevant questions and their corresponding answers becomes an important task for information acquisition. This paper discusses the changing focus of the QA task, which has shifted from answer extraction, answer matching and answer ranking to searching for relevant questions with good ready answers.
In this talk we outline some of the key challenges in text analytics, describe some of Endeca's current research work in this area, examine the current state of the text analytics market and explore some of the prospects for the future.
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies (Max Irwin)
Presentation as given to the Haystack Conference, which outlines research and techniques for automatic extraction of keywords, concepts, and vocabularies from text corpora.
This presentation introduces text analytics, its applications and various tools/algorithms used for this process. Given below are some of the important tools:
- Decision trees
- SVM
- Naive-Bayes
- K-nearest neighbours
- Artificial Neural Networks
- Fuzzy C-Means
- Latent Dirichlet Allocation
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study (Andre Freitas)
The growing size, heterogeneity and complexity of databases demand the creation of strategies that help users and systems consume data. Ideally, query mechanisms should be schema-agnostic or vocabulary-independent, i.e. they should be able to match user queries, expressed in the users' own vocabulary and syntax, to the data, abstracting data consumers from the representation of the data. Although schema-agnosticism is a central requirement across natural language interfaces and entity search, there is a lack of conceptual analysis of it and of the associated semantic differences between queries and databases. This work aims to provide an initial conceptualization of schema-agnostic queries, with a fine-grained classification that can support the scoping, evaluation and development of semantic matching approaches for schema-agnostic queries.
The increase in the amount of structured data published using the principles of Linked Data means that it is now more likely to find resources on the Web of Data that describe real-life concepts. However, discovering resources related to any given resource is still an open research area. This thesis studies recommender systems that use Linked Data as a source for generating recommendations, exploiting the large number of available resources and the relationships between them. Accordingly, a framework named AlLied for executing recommendation algorithms is proposed. This framework can be used as the main component for recommendations in a given architecture because it allows application developers to execute and evaluate recommendation algorithms in different contexts. Two implementations of this framework are presented and compared. The first one relies on graph-based algorithms and the second one on machine learning algorithms. Finally, a new recommendation algorithm that adapts dynamically to the linking features of the datasets used is also proposed.
Different Semantic Perspectives for Question Answering Systems (Andre Freitas)
Question Answering systems define one of the most complex tasks in computational semantics. The intrinsic complexity of the QA task allows researchers of QA systems to investigate and explore different perspectives of semantics. However, this complexity also induces a bias towards a systems perspective, where researchers are alienated from a deeper reasoning on the semantic principles that are in place within the different components of the system. In this talk we will explore the semantic challenges, principles and perspectives behind the components of QA systems, aiming at providing a principled map and overview on the contribution of each component within the QA semantic interpretation goal.
Best Practices for Large Scale Text Mining Processing (Ontotext)
Q&A:
NOW facilitates semantic search by having annotations attached to search strings. How complex does that get, e.g. with wildcards between annotated strings?
NOW’s searchbox is quite basic at the moment, but still supports a few scenarios.
1. Pure concept/faceted search - search for all documents containing a concept or where a set of concepts co-occur. Ranking is based on frequency of occurrence.
2. Concept/faceted + Full Text search - search for both concepts and a particular textual term or phrase.
3. Full text search
With search, pretty much anything can be done to customise it. For the NOW showcase we’ve kept it fairly simple, as usually every client has a slightly different case and wants to tune search in a slightly different direction.
The search in NOW is faceted which means that you search with concepts (facets) and you retrieve all documents which contain mentions of the searched concept. If you search by more than one facet the engine retrieves documents which contain mentions of both concepts but there is no restriction that they occur next to each other.
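The faceted behaviour described above (documents must mention every searched concept, with no adjacency requirement, ranked by frequency of occurrence) can be sketched as follows. This is only an illustration, not Ontotext's implementation: the in-memory index, the concept names and the function `faceted_search` are assumptions, and summing mention counts is one possible reading of "frequency of occurrence".

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def faceted_search(
    index: Dict[str, Counter],
    facets: Iterable[str],
) -> List[Tuple[str, int]]:
    """Return (doc_id, score) for documents that mention every facet.

    The score is the total number of mentions of the searched concepts;
    where in the document the mentions occur is ignored.
    """
    wanted = list(facets)
    hits = []
    for doc_id, mentions in index.items():
        if all(mentions[concept] > 0 for concept in wanted):
            hits.append((doc_id, sum(mentions[concept] for concept in wanted)))
    # Highest score first; ties are broken by document id for stable output.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Hypothetical index: concept mention counts per annotated document.
index = {
    "doc1": Counter({"London": 3, "Brexit": 1}),
    "doc2": Counter({"London": 1}),
    "doc3": Counter({"London": 1, "Brexit": 4}),
}

results = faceted_search(index, ["London", "Brexit"])
```

Here `doc2` is excluded because it mentions only one of the two concepts, which mirrors the "both concepts" rule in the answer above.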
Is the tagging service expandable (say, with custom ontologies)? Also, is it something you offer as a service? It is unclear to me from the website.
The TAG service is used for demonstration purposes only. The models behind it are trained for annotating news articles. The pipeline is customizable for every concrete scenario, for different domains and entities of interest. You can access several of our pipelines as a service through the S4 platform, or you can have them hosted as an on-premise solution. In some cases our clients want domain adaptation or improvements in a particular area, or want to tag with their internal dataset. In this case we again offer an on-premise deployment, and also a managed service hosted on our hardware.
Does your system accommodate cluster analysis using unsupervised keyword/phrase annotation for knowledge discovery?
Inasmuch as patterns of user behaviour are also considered knowledge discovery, we employ these for suggesting related reads. Apart from these, we have experience tailoring custom clustering pipelines, which also rely on features like keywords and named entities.
For topic extraction, how many topics can we extract? From the Twitter corpus, what can we infer?
For topic extraction we have determined that we obtain the best results when suggesting 3 categories. These are taken from IPTC, but only from the uppermost levels, of which there are fewer than 20.
The Twitter corpus example is from a project Ontotext participates in called Pheme. The goal of the project is to detect rumours and to check their veracity, thus helping journalists in their hunt for attractive news.
Do you provide Processing Resources and JAPE rules for the GATE framework that can be used with GATE Embedded?
We are contributing to the GATE framework, and everything which has been wrapped up as PRs has been included in the corresponding GATE distributions.
Interleaving, Evaluation to Self-learning Search @904Labs (John T. Kane)
Presented at the Open Source Connections Haystack relevance conference: 904Labs' "Interleaving: from Evaluation to Self-Learning". 904Labs is the first to commercialize "Online Learning to Rank", a state-of-the-art technique for self-learning search ranking that automatically takes your customers' behavior into account to personalize search results.
Discovering User's Topics of Interest in Recommender Systems @ Meetup Machine... (Gabriel Moreira)
This talk introduces the main techniques of Recommender Systems and Topic Modeling. Then, we present a case of how we've combined those techniques to build Smart Canvas, a SaaS that allows people to bring, create and curate content relevant to their organization, and also helps to tear down knowledge silos.
We give a deep dive into the design of our large-scale recommendation algorithms, giving special attention to a content-based approach that uses topic modeling techniques (like LDA and NMF) to discover people’s topics of interest from unstructured text, and social-based algorithms using a graph database connecting content, people and teams around topics.
We describe our typical data pipeline, which includes the ingestion of millions of user events (using Google PubSub and BigQuery), the batch processing of the models (with PySpark, MLlib, and Scikit-learn), the online recommendations (with Google App Engine, Titan Graph Database and Elasticsearch), and the data-driven evaluation of UX and algorithms through A/B testing. We also touch on the non-functional requirements of a software-as-a-service, such as scalability, performance, availability, reliability and multi-tenancy, and how we addressed them in a robust architecture deployed on Google Cloud Platform.
Short bio: Gabriel Moreira is a scientist passionate about solving problems with data. He is Head of Machine Learning at CI&T and a doctoral student at Instituto Tecnológico de Aeronáutica (ITA), where he also earned his Master of Science degree. His current research interests are recommender systems and deep learning.
https://www.meetup.com/pt-BR/machine-learning-big-data-engenharia/events/239037949/
Engaging Information Professionals in the Process of Authoritative Interlinki... (Lucy McKenna)
Through the use of Linked Data (LD), Libraries, Archives and Museums (LAMs) have the potential to expose their collections to a larger audience and to allow for more efficient user searches. Despite this, relatively few LAMs have invested in LD projects and the majority of these display limited interlinking across datasets and institutions. A survey was conducted to understand Information Professionals' (IPs') position with regards to LD, with a particular focus on the interlinking problem. The survey was completed by 185 librarians, archivists, metadata cataloguers and researchers. Results indicated that, when interlinking, IPs find the process of ontology and property selection to be particularly challenging, and LD tooling to be technologically complex and unsuitable for their needs.
Our research is focused on developing an authoritative interlinking framework for LAMs with a view to increasing IP engagement in the linking process. Our framework will provide a set of standards to facilitate IPs in the selection of link types, specifically when linking local resources to authorities. The framework will include guidelines for authority, ontology and property selection, and for adding provenance data. A user-interface will be developed which will direct IPs through the resource interlinking process as per our framework. Although there are existing tools in this domain, our framework differs in that it will be designed with the needs and expertise of IPs in mind. This will be achieved by involving IPs in the design and evaluation of the framework. A mock-up of the interface has already been tested and adjustments have been made based on results. We are currently working on developing a minimal viable product so as to allow for further testing of the framework. We will present our updated framework, interface, and proposed interlinking solutions.
Discovering User's Topics of Interest in Recommender Systems (Gabriel Moreira)
This talk introduces the main techniques of Recommender Systems and Topic Modeling.
Then, we present a case of how we've combined those techniques to build Smart Canvas (www.smartcanvas.com), a service that allows people to bring, create and curate content relevant to their organization, and also helps to tear down knowledge silos.
We present some of Smart Canvas features powered by its recommender system, such as:
- Highlighting relevant content, explaining to users which of their topics of interest generated each recommendation.
- Associating tags with users’ profiles based on topics discovered from content they have contributed. These tags become searchable, allowing users to find experts or people with specific interests.
- Recommending people with similar interests, explaining which topics bring them together.
We give a deep dive into the design of our large-scale recommendation algorithms, giving special attention to our content-based approach that uses topic modeling techniques (like LDA and NMF) to discover people’s topics of interest from unstructured text, and social-based algorithms using a graph database connecting content, people and teams around topics.
We describe our typical data pipeline, which includes the ingestion of millions of user events (using Google PubSub and BigQuery), the batch processing of the models (with PySpark, MLlib, and Scikit-learn), the online recommendations (with Google App Engine, Titan Graph Database and Elasticsearch), and the data-driven evaluation of UX and algorithms through A/B testing. We also touch on the non-functional requirements of a software-as-a-service, such as scalability, performance, availability, reliability and multi-tenancy, and how we addressed them in a robust architecture deployed on Google Cloud Platform.
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner (Francesco Osborne)
The process of classifying scholarly outputs is crucial to ensure timely access to knowledge. However, this process is typically carried out manually by expert editors, leading to high costs and slow throughput. In this paper we present Smart Topic Miner (STM), a novel solution which uses semantic web technologies to classify scholarly publications on the basis of a very large automatically generated ontology of research areas. STM was developed to support the Springer Nature Computer Science editorial team in classifying proceedings in the LNCS family. It analyses in real time a set of publications provided by an editor and produces a structured set of topics and a number of Springer Nature classification tags, which best characterise the given input. In this paper we present the architecture of the system and report on an evaluation study conducted with a team of Springer Nature editors. The results of the evaluation, which showed that STM classifies publications with a high degree of accuracy, are very encouraging and as a result we are currently discussing the required next steps to ensure large-scale deployment within the company.
Annotation_M1.docx Subject Information Systems Joshi, G. (2.docx (justine1simpson78276)
Annotation_M1.docx
Subject: Information Systems
Joshi, G. (2013). Management information systems (Oxford Higher Education). New Delhi: Oxford University Press.
This work is an informational piece written to provide knowledge about management information systems for individuals interested in management. The author is a very intelligent and reliable source for this topic with a vast knowledge base of all things entailed in information systems. His in-depth coverage of the structures and concepts of this topic contribute greatly to the excellence of this book. Infrastructure information found in the work provides great insight into things such as databases, hardware, software, and other components of information systems. Talented author Joshi wraps up his ideas by highlighting the importance of the development, management, and challenges of management information systems. This is a great work that serves as an extremely reliable source of reference for those interested in this topic.
Gregor, S. (2006). The Nature of Theory in Information Systems. MIS Quarterly, 30(3), 611-642.
In The Nature of Theory in Information Systems, Gregor sheds light on research regarding the nature of theory in information systems. Through this article the author addresses things such as prediction, generalization, and explanation. While the treatment is well researched and knowledgeable, it neglects a few points about information systems, such as their structure. The concentration on prediction, explanation, and related topics is to be praised, as the author provides an intelligent take on them. As for the article as a whole, the author could have done a better job by providing more details about the aforementioned, and other, neglected topics.
Varajão, J. (2013). Enterprise information systems. The Learning Organization, 20(6) doi:10.1108/TLO-10-2013-0059.
In this work, author Varajão gives a great account of the significance of enterprise information systems and their importance to individuals in fields in which they frequently use information systems to do their jobs. The author brings together five articles that all support his topic of enterprise information systems. He is able to successfully cover every aspect of the topic and leaves nothing to the imagination. His expertise serves as a reliable source of information for this topic, as well as all of its supporting information. His vast knowledge of enterprise information systems and how they can benefit many different areas is extremely commendable and enables him to serve as an expert in the field.
Wang, C., Wu, C., Wu, C., Chen, D., & Hu, Q. (2008). Communicating Between Information Systems. Information Sciences, 178(16), 3228-3239. doi:10.1016/j.ins.2008.03.017.
This article serves as an intelligent source of reference for anyone interested in the art of communicating between information systems, highlighting how to do so and why it is important.
Assignment 2 LASA Research Proposal Submit your final research (BenitoSumpter862)
Assignment 2: LASA: Research Proposal
Submit your final research paper to the Submissions Area by the due date assigned.
It should include a cover page, an abstract, an introduction, a literature review, a methodology, and a reference page.
Your final paper should be double-spaced, 8–10 pages in length, and properly edited.
Please use the following outline:
Introduction (2–3 pages)
Introduction (including the statement of the problem)
Purpose of the study
Research question and hypotheses
Theoretical framework
Operational definitions
Literature review (3 pages)
Introduction
Review of research topic (as covered by the literature)
Conclusion
Methodology (3–4 pages)
Introduction
Research design
Participants
Instruments
Procedures
Data analysis
Limitations of the study (i.e., threats to validity)
Ethical issues
Dissemination strategy
Summary
Reference page
All written assignments and responses should follow APA rules for attributing sources.
Submission Details:
By the due date assigned, save your document as M5_A2_Lastname_Firstname.doc and submit it to the Submissions Area.
This LASA is worth 300 points and will be graded according to the following rubric.
Assignment Component; Proficient; Maximum Points Possible

Articulate the problem to be researched, purpose of the study, the research question and hypotheses in operational terms aligned with the theoretical framework of the research. States the research question in operational terms that make the question measurable, but neglects to articulate the primary hypothesis and the null hypothesis in operational terms or the relationship between them. Addresses the importance of the research with limited examples of appropriate scholarly support. Mentions the theoretical framework, but it is only superficially developed.
Maximum points: 40

Presents a comprehensive literature review in support of the proposed research question. Presents and defines the research design. Presents limited scholarly research to support the selected research design.
Maximum points: 40

Identify and define all relevant variables (e.g., participants). Present procedures for obtaining informed consent. States most appropriate variables with the appropriate statistical research questions for each variable. Provides a general description of informed consent.
Maximum points: 40

Present a systematic description of the methodology to be used in the proposed research. States the type of data being collected. Partially defines how that data would be collected. Addresses some limitations, but neglected others.
Maximum points: 40

Identify and discuss the assessment instruments to be administered and rationale. Present the empirical support for the assessments you have suggested. Stated tests or assessment procedures proposed to address forensic issues are accurate based on the information provided in the vignette and empirically supported, but underdeveloped. Accurate but incomplete description of how these tests woul ...
Improve your Searches, Get Trained up on Expernova!Expernova
Access the Best Experts Worldwide and Manage your Company's Networks thanks to Expernova.
Discover in this presentation helpful tips and examples on how to carry out more complex searches using the operators available with the solution.
Obtain even more relevant results!
How Anchoring Concepts Influence Essay Conceptual Structure And Test PerformanceRoy Clariana
Presented October 21 at CELDA 2023 in Madeira Portugal, https://www.celda-conf.org/
Abstract: This quasi-experimental study seeks to improve the conceptual quality of summary essays by comparing two conditions, essay prompts with or without a list of 13 broad concepts, the concepts were selected across a continuum of the 100 most frequent words in the lesson materials. It is anticipated that only the most central concepts will be used as “anchors” when writing. Participants (n = 90) in an Architectural Engineering undergraduate course read the assigned lesson textbook chapter and attended lectures and labs, then in a final lab session were asked to write a 300-word summary of the lesson content. Data consists of the essays converted to networks and the end-of-unit multiple choice test. Compared to the expert network benchmark, the essay networks of those receiving the broad concepts in the writing prompt were not significantly different from those who did not receive these concepts. However those receiving the broad concepts were significantly more like peer essay networks (mental model convergence) and like the networks of the two PowerPoint lectures but neither were like the textbook chapter. Further, those receiving the broad concepts performed significantly better on the end-of-unit test than those not receiving the concepts. Term frequency analysis of the essays indicates as expected that the most network-central concepts had a greater frequency in essays, the other terms frequencies were remarkably the same for both the terms and no terms groups, suggesting a similar underlying conceptual mental model of this lesson content. To further explore the influence of anchoring concepts in summary writing prompts, essays were generated with the same two summary writing prompts using OpenAI (ChatGPT) and Google Bard, plus a new prompt that used the 13 most central concepts from the expert’s network. 
The quality of the essay networks for both AI systems were equivalent to the students' essay networks for the broad concepts and for the no concept treatments. However the AI essays derived with the 13 most central concepts were significantly better (more like the expert network) than the students and AI essays derived with broad concepts or no concepts treatments. In addition, Bard and OpenAI used several of the same concepts at a higher frequency than the students suggesting that the two AI systems have more similar knowledge graphs of this content. In sum, adding 13 broad conceptual terms to a summary writing prompt improved both structural and declarative knowledge outcomes, but adding 13 most central concepts may be even better. More research is needed to understand how including concepts and other terms in a writing prompt influences students’ essay conceptual structure and subsequent test performance.
Presentation at AERA 2023 --
Investigation that considered the effect of adding key terms to an essay writing prompt. Funding from the Division of Undergraduate Education of the National Science Foundation (Award Abstract #2215807), Roy B. Clariana (PI).
Sentence versus Paragraph Processing: Linear and relational knowledge structu...Roy Clariana
Clariana, R. B., Follmer, D. J., & Li, P. (2019). Sentence versus paragraph processing: Linear and relational knowledge structure measures. Presented at the 7th International Workshop on Advanced Learning Sciences (IWALS 2019), June 17-19, 2019, University of Jyväskylä, Finland
Macroeconomics- Movie Location
This will be used as part of your Personal Professional Portfolio once graded.
Objective:
Prepare a presentation or a paper using research, basic comparative analysis, data organization and application of economic information. You will make an informed assessment of an economic climate outside of the United States to accomplish an entertainment industry objective.
Synthetic Fiber Construction in lab .pptxPavel ( NSTU)
Synthetic fiber production is a fascinating and complex field that blends chemistry, engineering, and environmental science. By understanding these aspects, students can gain a comprehensive view of synthetic fiber production, its impact on society and the environment, and the potential for future innovations. Synthetic fibers play a crucial role in modern society, impacting various aspects of daily life, industry, and the environment. ynthetic fibers are integral to modern life, offering a range of benefits from cost-effectiveness and versatility to innovative applications and performance characteristics. While they pose environmental challenges, ongoing research and development aim to create more sustainable and eco-friendly alternatives. Understanding the importance of synthetic fibers helps in appreciating their role in the economy, industry, and daily life, while also emphasizing the need for sustainable practices and innovation.
Unit 8 - Information and Communication Technology (Paper I).pdfThiyagu K
This slides describes the basic concepts of ICT, basics of Email, Emerging Technology and Digital Initiatives in Education. This presentations aligns with the UGC Paper I syllabus.
Safalta Digital marketing institute in Noida, provide complete applications that encompass a huge range of virtual advertising and marketing additives, which includes search engine optimization, virtual communication advertising, pay-per-click on marketing, content material advertising, internet analytics, and greater. These university courses are designed for students who possess a comprehensive understanding of virtual marketing strategies and attributes.Safalta Digital Marketing Institute in Noida is a first choice for young individuals or students who are looking to start their careers in the field of digital advertising. The institute gives specialized courses designed and certification.
for beginners, providing thorough training in areas such as SEO, digital communication marketing, and PPC training in Noida. After finishing the program, students receive the certifications recognised by top different universitie, setting a strong foundation for a successful career in digital marketing.
Introduction to AI for Nonprofits with Tapp NetworkTechSoup
Dive into the world of AI! Experts Jon Hill and Tareq Monaur will guide you through AI's role in enhancing nonprofit websites and basic marketing strategies, making it easy to understand and apply.
Read| The latest issue of The Challenger is here! We are thrilled to announce that our school paper has qualified for the NATIONAL SCHOOLS PRESS CONFERENCE (NSPC) 2024. Thank you for your unwavering support and trust. Dive into the stories that made us stand out!
Normal Labour/ Stages of Labour/ Mechanism of LabourWasim Ak
Normal labor is also termed spontaneous labor, defined as the natural physiological process through which the fetus, placenta, and membranes are expelled from the uterus through the birth canal at term (37 to 42 weeks
Honest Reviews of Tim Han LMA Course Program.pptxtimhan337
Personal development courses are widely available today, with each one promising life-changing outcomes. Tim Han’s Life Mastery Achievers (LMA) Course has drawn a lot of interest. In addition to offering my frank assessment of Success Insider’s LMA Course, this piece examines the course’s effects via a variety of Tim Han LMA course reviews and Success Insider comments.
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdfTechSoup
In this webinar you will learn how your organization can access TechSoup's wide variety of product discount and donation programs. From hardware to software, we'll give you a tour of the tools available to help your nonprofit with productivity, collaboration, financial management, donor tracking, security, and more.
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Directed versus undirected network analysis of student essays
1. Directed versus
undirected network
analysis of student essays
Roy Clariana (RClariana@psu.edu)
Penn State
IWALS 2018
6th International Workshop on
Advanced Learning Sciences
Perspectives on the Learner:
Cognition, Brain, and Education
University of Pittsburgh, USA
JUNE 6-8, 2018
2. Directed versus undirected
network analysis of student essays
Abstract
Knowledge structure (KS) is an expansive and expanding area
of research with a rich set of theoretical and software tools
for eliciting, representing, and analyzing KS. KS is especially
amenable to network graph methods.
Based on our work with analysis of concept maps, in 2003 I
developed a text-to-network aggregation of lexical aggregates
(ALA) approach that uses Pathfinder network scaling.
ALA pattern matches for preselected key terms (synonyms
and metonyms) in a sequential forward pass through the text;
pairs of terms discovered are entered as links into a
symmetric n x n array that is then analyzed using Pathfinder
analysis. For theoretical and pragmatic reasons at that time,
ALA was based on the representation of text as undirected
networks.
Slide 2 of 20
Research question: For analysis of student essays using ALA-
Reader, which is better, undirected networks or directed networks?
3. Elicit, represent, compare
[Figure: Jonassen, Beissner, & Yacci (1993), page 22]
Knowledge elicitation: graph building, similarity ratings, semantic proximity, word associations, ordered recall, free recall, concept maps, written text, card sort
Knowledge representation: Trees, Networks, Dimensional (additive trees, hierarchical clustering, ordered trees, minimum spanning trees, link weighted, Pathfinder nets, principal components, MDS – multidimensional scaling, cluster analysis)
Knowledge comparison: expert/novice, qualitative graph comparisons, quantitative graph comparisons, relatedness coefficients, scaling solutions, C of PFNets
Other labels on the figure: raw distance correl, listwise, pairwise
Callout: Origin of using undirected
4. Why has ALA used undirected?
n = 4 terms
n x n = 16 cells in the full array
(n x n) - n = 12 cells w/o diagonal (asymmetric, directed)
((n x n) - n)/2 = 6 cells (symmetric, undirected)
Example: salt – pepper and pepper – salt are separate cells in an asymmetric array but a single cell in a symmetric array.

terms   asymmetric   symmetric
(n)     (n x n)      ((n x n) - n)/2
  0          0           0
  3          9           3
  6         36          15
  9         81          36
 12        144          66
 15        225         105
 18        324         153
 21        441         210
 24        576         276
 27        729         351
 30        900         435
 33       1089         528

Pragmatics! How big is a concept map and do students use arrows? And how
many pair-wise comparisons can you make before you go daft?
More data
Origin of using undirected
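The cell counts above follow directly from the number of terms. A minimal Python sketch (the function name `cell_counts` is an illustrative assumption, not part of ALA-Reader) that reproduces the n = 4 example:

```python
def cell_counts(n):
    """Return (full array, directed cells w/o diagonal, undirected cells) for n terms."""
    full = n * n                  # every cell of the n x n array
    directed = full - n           # asymmetric array, diagonal removed
    undirected = directed // 2    # symmetric array, one cell per unordered pair
    return full, directed, undirected


for n in (4, 15, 30):
    print(n, cell_counts(n))
```

For n = 4 this gives 16, 12, and 6, matching the slide; the "symmetric" column of the table is the third value.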
5. ALA-Reader example: Undirected
vs. directed Networks
Symmetric array (undirected Pfnet)
                  humanists  job satisfaction  productivity  employees  empowered
humanists            --            1                0            0          0
job satisfaction      1           --                1            0          0
productivity          0            1               --            1          1
employees             0            0                1           --          1
empowered             0            0                1            1         --

Asymmetric array (directed Pfnet)
                  humanists  job satisfaction  productivity  employees  empowered
humanists            --            1                0            0          0
job satisfaction      0           --                1            0          0
productivity          0            0               --            1          0
employees             0            0                0           --          1
empowered             0            0                1            0         --

Text example: “Humanists believed that job satisfaction is related to productivity. They
found that if employees were given more freedom and power, then they produced more.”
[Figure: the undirected and the directed network of these five terms (humanists, job satisfaction, productivity, employees, empowered), drawn with the Pathfinder software, w/o sentence break]
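The forward pass behind these two arrays can be sketched in Python. This is only an illustration under stated assumptions: the synonym lists in `TERMS` (for example "produced more" for productivity) are guesses made for this example, the function names are hypothetical, and the Pathfinder scaling step that ALA-Reader applies afterwards is not included.

```python
import re

# Key terms with assumed synonym patterns (hypothetical, for illustration only).
TERMS = {
    "humanists": ["humanists"],
    "job satisfaction": ["job satisfaction"],
    "productivity": ["productivity", "produced more"],
    "employees": ["employees"],
    "empowered": ["freedom and power"],
}

TEXT = ("Humanists believed that job satisfaction is related to productivity. "
        "They found that if employees were given more freedom and power, "
        "then they produced more.")


def term_sequence(text, terms):
    """Return the key terms in the order they occur in the text."""
    lowered = text.lower()
    hits = []
    for term, patterns in terms.items():
        for pattern in patterns:
            for match in re.finditer(re.escape(pattern), lowered):
                hits.append((match.start(), term))
    return [term for _, term in sorted(hits)]


def link_arrays(sequence, terms):
    """Link each term to the next one found; return (directed, undirected) arrays."""
    names = list(terms)
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    directed = [[0] * n for _ in range(n)]
    for first, second in zip(sequence, sequence[1:]):
        if first != second:
            directed[index[first]][index[second]] = 1
    # The undirected array keeps a link if it occurs in either direction.
    undirected = [[max(directed[i][j], directed[j][i]) for j in range(n)]
                  for i in range(n)]
    return directed, undirected


SEQUENCE = term_sequence(TEXT, TERMS)
DIRECTED, UNDIRECTED = link_arrays(SEQUENCE, TERMS)
for name, row in zip(TERMS, DIRECTED):
    print(f"{name:17s}", row)
```

The sentence boundary is ignored here (document-wise), so productivity links to employees across the two sentences, as in the arrays above. The diagonal is stored as 0 rather than "--".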
6. Contrast raw data from ALA-Reader
vs. Document-term matrix (i.e., LSA)
DATA: Expert link.txt
similarities
17 items
1 decimals
0.1 min
1 max
matrix:
1 1 0 1 0 1 1 1 1 0 0 1 0 1 1 0 0
1 1 1 0 1 0 0 0 0 0 0 1 1 0 0 0 0
0 0 1 0 1 1 0 1 1 1 0 1 0 0 0 1 1
1 0 0 1 1 0 0 0 0 0 1 0 1 0 1 0 0
0 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 0
1 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0
1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
Expert Essay ALA-Reader software (n x n data points)
Expert Essay doc-term matrix (n data points)
terms --> management, employee, product, situation, work, TQM, customers, organization, quality, scientific_mxnagement, efficiency, humanistic, contingency, feelings, needs, service, plan
Expert: The most basic is the classical theory, which also includes scientific mxnagement. Managers who follow this theory focus more on business efficiency tha...
7 8 7 6 4 0 1 2 4 1 1 1 2 1 2 2 0
b01: The Classical style of management includes an old school and scientific approach to organization. Originally relationship between employer and employee...
3 7 1 2 5 0 2 1 0 0 2 0 1 1 3 0 0
b02: These four management theories can all be found in today’s workforce and each of them has their own specific way of managing employees and accomplis...
8 8 3 1 4 0 3 2 2 1 0 1 1 0 1 0 0
b03: Classical/ scientific mxnagement focuses on pay incentives and external rewards for job completion. Efficiency is the goal of the organization. Empowerme...
0 5 2 1 0 0 1 2 0 1 0 0 1 1 0 2 0
b04: Classical Management, with innovators such as Henri Fayol had a lot to do with maximizing productivity for the business’s sake. It has strict division of labo...
4 4 3 1 3 0 0 0 4 0 0 1 2 1 1 0 0
b05: Various management theories have been explored throughout history, each with its own benefits and disadvantages. The first of these was classical manag...
9 4 2 1 4 1 0 1 1 1 2 4 1 1 5 0 0
b06: Management is an evolving science that has taken many different perspectives: One of the first people to lead an organized study on management was Tay...
13 6 1 0 1 2 1 3 3 1 2 3 1 1 1 0 1
Expert Essay [7, 8, 7, 6, 4, 0, 1, 2, 4, 1, 1, 1, 2, 2, 2, 2, 0]
Linear order is preserved by ALA-Reader (not just
bag-of-words)
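To illustrate the contrast, here is a small Python sketch with two made-up essays (the term list, the essays, and the function names are assumptions for this example, not data from the study). Both essays produce the same term-count vector, yet their sequential links differ:

```python
from collections import Counter

TERMS = ["management", "employee", "quality"]  # hypothetical key terms


def term_counts(tokens):
    """Bag-of-words view: one count per key term, order discarded."""
    counts = Counter(token for token in tokens if token in TERMS)
    return [counts[term] for term in TERMS]


def directed_links(tokens):
    """Sequential view: ordered pairs of adjacent, distinct key terms."""
    sequence = [token for token in tokens if token in TERMS]
    return {(a, b) for a, b in zip(sequence, sequence[1:]) if a != b}


ESSAY_A = "management supports the employee and the employee improves quality".split()
ESSAY_B = "quality supports the employee and the employee improves management".split()

print(term_counts(ESSAY_A), term_counts(ESSAY_B))
print(sorted(directed_links(ESSAY_A)))
print(sorted(directed_links(ESSAY_B)))
```

A document-term row cannot tell the two essays apart, while the link sets can, which is the sense in which linear order is preserved.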
9. Expert essay referent network (same raw
data as directed and as undirected networks)
Symmetric array
(undirected Pfnet)
Asymmetric array
(directed Pfnet)
25 links in common
45 (90) links
57 links
Note: central and peripheral terms are the same in both networks
10. Directed versus undirected
network analysis of student essays
• Participants are 45 undergraduates enrolled in a business course
• During the regularly scheduled examination week at the end of
the semester, all students completed the customary final
examination for the course (worth 25% of their final course
grade) and also answered an extended-response essay question
for extra credit.
• Writing prompt: “Describe and contrast in an essay of 300 words
or less the following four Management theories: Classical/
scientific management, Humanistic/Human Resources,
Contingency, and Total Quality Management.”
• The essays were scored by three human raters and by the
ALA-Reader software (using Pathfinder Network analysis)
11. Directed versus undirected
network analysis of student essays
The essays were scored by three human raters and by ALA-Reader
software (then using Pathfinder Network analysis to count links in common)
Research questions: For analysis of student essays using ALA-Reader,
which is better:
1. Analysis of undirected networks or of directed networks?
2. Pattern analysis across sentence boundaries (document-wise)
or NOT across sentence boundaries (sentence-wise)?
[Figure: expert and student networks with the values 34, 23, 12 and the label “Links in common”]
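A links-in-common score can be sketched as a set intersection. The link lists below are invented for illustration and `links_in_common` is a hypothetical helper, not the ALA-Reader or Pathfinder implementation:

```python
def links_in_common(expert_links, student_links, directed=True):
    """Count links shared by two networks given as lists of (from, to) pairs."""
    if directed:
        expert = set(expert_links)
        student = set(student_links)
    else:
        # Ignore direction: (a, b) and (b, a) count as the same link.
        expert = {frozenset(link) for link in expert_links}
        student = {frozenset(link) for link in student_links}
    return len(expert & student)


EXPERT = [("humanists", "job satisfaction"),
          ("job satisfaction", "productivity"),
          ("empowered", "productivity")]
STUDENT = [("job satisfaction", "humanists"),
           ("job satisfaction", "productivity"),
           ("employees", "empowered")]

print(links_in_common(EXPERT, STUDENT, directed=True))
print(links_in_common(EXPERT, STUDENT, directed=False))
```

With these made-up lists the directed score is lower than the undirected score, because a link that runs in the opposite direction only counts as a match when direction is ignored.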
12. Correlation Results
Variable                                   1     2     3     4     5     6     7
1. Raters (3, median)                      1   .517  .600  .732  .655  .733  .615
2. Final Exam                            .517    1   .372  .342  .298  .332  .283
3. # of words                            .600  .372    1   .404  .319  .390  .307
Document-wise (no breaks between sentences)
4. Directed Pfnet CMN (to expert)        .732  .342  .404    1   .923  .986  .881
5. Undirected Pfnet CMN (to expert)      .655  .298  .319  .923    1   .919  .977
Sentence-wise (breaks between sentences)
6. Directed Pfnet CMN (to expert)        .733  .332  .390  .986  .919    1   .899
7. Undirected Pfnet CMN (to expert)      .615  .283  .307  .881  .977  .899    1
r > .290, p < .05 and r > .400, p < .01
• Directed > undirected (but only a small difference)
• Document-wise analysis = sentence-wise analysis; for these essays,
sentence boundaries don’t matter
• ALA-Reader data inter-correlations all high
• Number of words in the student essays strongly correlated with
raters’ scores (r = .60)
• Stepwise multiple regression analysis was used to test if the essay
features significantly predicted human essay rater scores. The
results of the regression indicated two predictors explained 64.6% of
the variance (F(2,42)=38.395, p<.0001). It was found that the
directed network common scores significantly predicted rater scores
(β = .585, p<.0001), as did essay word count (β = .364, p=.001).
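As a rough arithmetic check on the regression result, the F statistic can be recomputed from the reported variance explained. The sketch below assumes n = 45 (the number of participants) and k = 2 predictors; the function name is illustrative.

```python
def f_from_r_squared(r_squared, n, k):
    """Overall F test for a regression with k predictors and n cases."""
    df_model = k
    df_error = n - k - 1
    f_value = (r_squared / df_model) / ((1 - r_squared) / df_error)
    return f_value, df_model, df_error


f_value, df_model, df_error = f_from_r_squared(0.646, n=45, k=2)
print(f"F({df_model},{df_error}) = {f_value:.2f}")
```

This yields degrees of freedom (2, 42) and an F of about 38.3, close to the reported 38.395; the small gap is consistent with R² having been rounded to three decimals.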
15. Next steps
• Working with Ping Li’s neuroimaging lab at Penn
State to consider the possible neural influence of
the text’s knowledge structure and of the reader’s
knowledge structure of those texts
• Further develop the GIKS “universal” feedback
writing tool
16. Central vs. peripheral neural
correlates
“Central ideas are functionally distinct from peripheral ideas,
showing greater activation in the PCC and PCU, while over the
course of passage comprehension, central and peripheral
ideas increasingly recruit different parts of the semantic
control network. The finding that central information elicits
greater response in mental model updating regions than
peripheral ideas supports previous behavioral models on the
cognitive importance of distinguishing textual centrality.” (p.
853)
Swett, K., Miller, A.C., Burns, S., Hoeft, F., Davis, N., Petrill,
S.A., & Cutting, L.E. (2013). Comprehending expository texts:
the dynamic neurobiological correlates of building a coherent
text representation. Frontiers in Human Neuroscience, 7, 853-
867. doi:10.3389/fnhum.2013.00853
17. GIKS
1. Enter list of key terms with synonyms and metonyms
2. Enter the writing prompt
3. Enter expert referent essay or term-term data
4. Set the expert network layout positions
5. Distribute ID#s for the study
http://giks.herokuapp.com/
21. How many terms in the ALA-
Reader pattern matching?
• The optimal number of terms for ALA-Reader is
unknown (note that many LSA studies use
300-element vectors). A recent dissertation indicates
about 20 (Fanella, 2015)
https://etda.libraries.psu.edu/catalog/26367
Expert essay and concept map of this content
surfaced 17 terms for this analysis
22. NodeXL network groups redrawn as a cmap
Could humans live on Mars some day?
Scientists ask this question because Earth and Mars are similar.
Similar to Earth’s day, Mars’s day is about 24 hours long.
Also, both planets are near the Sun in our solar system.
Earth is the 3rd planet and Mars the 4th planet from the Sun.
Mars also has an axial tilt similar to Earth's axial tilt.
An axial tilt gives both planets seasons with temperature changes.
Just like Earth, Mars has cold winters and warmer summers.
Like Earth, Mars has winds, weather, dust storms, and volcanoes.
But in some ways, Earth and Mars are different.
Differences include temperature, length of a year, and gravity.
The average temperature is –81 °F on Mars, but 57 °F on Earth.
A Martian year is almost twice as long as an Earth year.
Earth’s gravity is almost 3 times stronger than Martian gravity.
Given the similarities, can humans go to Mars and live there?
NASA scientists want to answer this question.
NASA oversees U.S. research on space exploration.
NASA scientists send devices called spacecraft to explore Mars.
The spacecraft carry rovers that can rove or move around.
These wheeled rovers can explore characteristics of the planet.
They can take pictures of mountains, plains, and dust storms on Mars.
One of these NASA rovers is named Curiosity.
Curiosity found evidence that soil on Mars contains 2% water.
NASA has planned a new mission called Mars 2020.
This mission will use a new car-sized rover to examine Mars.
The new rover will contain additional instruments to study Mars.
For example, one instrument will take images beneath Mars’s surface.
Another instrument will attempt to make oxygen from carbon dioxide.
Mars 2020 will help scientists answer important questions.
It will explore whether there has been life on Mars.
It will also answer whether humans can live on Mars in the future.
Mars lesson text (or eye track)
23. ALA-Reader articles
• Zimmerman, W. et al. (2018). Computer-automated approach for scoring short essays in an introductory statistics course.
Journal of Statistics Education, 25, in press.
• Kim, K., & Clariana, R. (2018). Applications of Pathfinder Network scaling for identifying the optimal use of a first
language to support second language text comprehension. Educational Technology Research and Development, in press.
• Kim, K., & Clariana, R. (2017). Text signals influence second language expository text comprehension: Knowledge
structure analysis. Educational Technology Research and Development, 65, 909-930. Online First,
http://link.springer.com/article/10.1007/s11423-016-9494-x.
• Kim, K., & Clariana, R.B. (2015). Knowledge structure measures of reader’s situation models across languages:
Translation engenders richer structure. Technology, Knowledge and Learning, 20, 249-268.
• Clariana, R.B., Wolfe, M. B., & Kim, K. (2014). The influence of narrative and expository text lesson text structures on
knowledge structures: alternate measures of knowledge structure. Educational Technology Research and Development,
62 (4), 601-616. doi:10.1007/s11423-014-9348-3
• Clariana, R.B. (2010). Deriving group knowledge structure from semantic maps and from essays. In D. Ifenthaler, P.
Pirnay-Dummer, & N.M. Seel (Eds.), Computer-Based Diagnostics and Systematic Analysis of Knowledge (Chapter 7, pp.
117-130). New York, NY: Springer.
• Clariana, R.B., Wallace, P.E., & Godshalk, V.M. (2009). Deriving and measuring group knowledge structure from essays:
The effects of anaphoric reference. Educational Technology Research and Development, 57, 725-737.
• Clariana, R.B., & Wallace, P. E. (2007). A computer-based approach for deriving and measuring individual and team
knowledge structure from essay questions. Journal of Educational Computing Research, 37 (3), 209-225.
• Koul, R., Clariana, R.B., & Salehi, R. (2005). Comparing several human and computer-based methods for scoring concept
maps and essays. Journal of Educational Computing Research, 32 (3), 261-273.