Presentation made on December 7th 2016 during ICADL'16
Full text can be found at http://link.springer.com/chapter/10.1007/978-3-319-49304-6_12
Extended version can be found at https://arxiv.org/abs/1609.01415
What papers should I cite from my reading list? User evaluation of a manuscri...Aravind Sesagiri Raamkumar
Long paper presented during the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2016)
A task-based scientific paper recommender system for literature review and ma...Aravind Sesagiri Raamkumar
My PhD oral defense presentation (as of Oct 3rd 2017)
The dissertation can be requested at this link https://www.researchgate.net/publication/323308750_A_task-based_scientific_paper_recommender_system_for_literature_review_and_manuscript_preparation
Navigation through citation network based on content similarity using cosine ...Salam Shah
The volume of scientific literature has grown rapidly over the past few decades; new topics and information are added in the form of articles, papers, text documents, web logs, and patents. This rapid growth has added tremendously to current and past knowledge; in the process, new topics have emerged, some topics have split into sub-topics, and other topics have merged into single topics. Manually searching for and selecting a topic in such a huge body of information is an expensive and labour-intensive task. To meet the emerging need for an automatic process to locate, organize, connect, and make associations among these sources, researchers have proposed different techniques that automatically extract components of information presented in various formats and organize or structure them. The data targeted for component extraction may take the form of text, video, or audio. Various algorithms structure the information, group similar information into clusters, and weight the clusters according to their importance. The organized, structured, and weighted data is then compared with other structures to find similarities. Semantic patterns can be found by employing visualization techniques that show similarity or relations between topics over time or with respect to a specific event. In this paper, we propose a model for citation networks based on the cosine similarity algorithm, which answers questions such as: how can documents be connected through citation and content similarity, and how can the resulting network be visualized and navigated?
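The content-similarity step the abstract refers to can be illustrated with a minimal cosine-similarity sketch over raw term-frequency vectors; this is a generic illustration, not the authors' implementation (which may use TF-IDF weighting or other preprocessing):

```python
from collections import Counter
from math import sqrt

def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine similarity between the raw term-frequency vectors of two texts."""
    a, b = Counter(doc_a.lower().split()), Counter(doc_b.lower().split())
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Two abstracts sharing terms score higher than unrelated ones.
sim = cosine_similarity("citation network analysis", "citation network visualization")
```

In a citation network, such pairwise scores can serve as edge weights between documents, complementing the explicit citation links.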
The goal of information retrieval (IR) is to provide users with those documents that will satisfy their information need. The information need can be understood as forming a pyramid, where only its peak is made visible by users in the form of a conceptual query.
Hot Topic Detection and Technology Trend Tracking for Patents utilizing Term ...Ly Nguyen
This paper proposes a methodology for identifying hot topics and tracking technology trends in the patent domain. The methodology uses frequency information in combination with the International Patent Classification (IPC) to capture semantic information on word categorization, in a way that has not previously been employed for topic detection and trend tracking. Term Frequency and Proportional Document Frequency (TF*PDF) is employed to detect hot topics from patents, and IPCs are used to calculate the semantic importance of terms based on the IPCs across which the terms are distributed. Aging Theory is also used to calculate the variation of trends over time. Four types of trends (very stable, stable, normal, and unstable) are defined and evaluated based on TF*PDF alone and on TF*PDF combined with Aging Theory. Experimental results show that for very stable trends, the combination of TF*PDF and Aging Theory achieves a precision of 0.976; for stable trends and all trends, TF*PDF achieves precisions of 0.959 and 0.84, respectively. By applying TF*PDF in consideration of semantic information, we also demonstrate a new criterion for weighting hot topics and tracking technology trends.
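The TF*PDF weighting named in the abstract (originally proposed by Bun and Ishizuka) can be sketched roughly as follows; this is a simplified illustration, assuming each "channel" is just a list of document strings, and it omits the paper's IPC-based semantic weighting and the Aging Theory component:

```python
from collections import Counter
from math import exp, sqrt

def tf_pdf(channels: list[list[str]]) -> dict[str, float]:
    """TF*PDF weight per term: over each channel, the channel-normalized
    term frequency times exp(proportional document frequency), summed
    across channels. Each channel is a list of document strings."""
    weights: Counter = Counter()
    for docs in channels:
        n_docs = len(docs)
        tf = Counter()  # term frequency within the channel
        df = Counter()  # number of documents in the channel containing the term
        for doc in docs:
            terms = doc.lower().split()
            tf.update(terms)
            df.update(set(terms))
        norm = sqrt(sum(f * f for f in tf.values())) or 1.0
        for term, f in tf.items():
            weights[term] += (f / norm) * exp(df[term] / n_docs)
    return dict(weights)

# Terms that are frequent and spread across many documents get boosted.
hot = tf_pdf([["patent topic trend", "patent analysis"], ["topic trend tracking"]])
```

The exponential term is what distinguishes TF*PDF from plain TF-IDF: document frequency *increases* a term's weight, so widely discussed terms surface as hot topics rather than being damped.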
An introduction to system-oriented evaluation in Information RetrievalMounia Lalmas-Roelleke
Slides for my lecture on IR evaluation, presented at 11th European Summer School in Information Retrieval (ESSIR 2017) at Universitat Pompeu Fabra, Barcelona.
These slides were based on:
1. Evaluation lecture @ QMUL; Thomas Roelleke & Mounia Lalmas
2. Lecture 8: Evaluation @ Stanford University; Pandu Nayak & Prabhakar Raghavan
3. Retrieval Evaluation @ University of Virginia; Hongning Wang
4. Lectures 11 and 12 on Evaluation @ Berkeley; Ray Larson
5. Evaluation of Information Retrieval Systems @ Penn State University; Lee Giles
Textbooks:
1. Information Retrieval, 2nd edition, C.J. van Rijsbergen (1979)
2. Introduction to Information Retrieval, C.D. Manning, P. Raghavan & H. Schuetze (2008)
3. Modern Information Retrieval: The Concepts and Technology behind Search, 2nd ed; R. Baeza-Yates & B. Ribeiro-Neto (2011)
A Federated Search Approach to Facilitate Systematic Literature Review in Sof...ijseajournal
To have an impact on industry, researchers developing technologies in academia need to provide tangible evidence of the advantages of using them. Systematic Literature Review (SLR) has become a prominent methodology in evidence-based research. Although the adoption of SLRs in software engineering is not yet widespread in practice, it has resulted in valuable research and is becoming more common. However, digital libraries and scientific databases, the best research resources, do not provide sufficient mechanisms for SLRs, especially in software engineering. Moreover, any loss of data may change the results of an SLR and lead to research bias. Accordingly, the search process and evidence collection are critical points of an SLR. This paper provides some tips to enhance the SLR process. The main contribution of this work is a federated search tool that provides an automatic, integrated search mechanism over well-known software engineering databases. The results of a case study show that this approach not only reduces the time required to do an SLR and facilitates its search process, but also improves its reliability, resulting in an increasing trend to use SLRs.
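The integrated search mechanism described above can be sketched in miniature; the source names and search functions below are hypothetical stand-ins, not the paper's actual tool:

```python
from concurrent.futures import ThreadPoolExecutor

def federated_search(query, sources):
    """Query several digital libraries in parallel and merge the results,
    deduplicating by normalized title. `sources` maps a source name to a
    search function that returns a list of {'title': ..., 'year': ...} dicts."""
    with ThreadPoolExecutor() as pool:
        result_lists = pool.map(lambda fn: fn(query), sources.values())
    merged, seen = [], set()
    for results in result_lists:
        for record in results:
            # Normalize whitespace and case so the same paper found in
            # two databases is counted once.
            key = " ".join(record["title"].lower().split())
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged
```

Deduplication matters for SLRs in particular: the same primary study indexed in several databases would otherwise be screened (and possibly counted) more than once.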
Europe PMC has implemented a section tagging pipeline that automatically classifies scientific article sections into predefined classes.
Şenay Kafkas will present this work during the ContentMine workshop at EBI on 6th October 2014.
PhD thesis defense.
This manuscript describes a methodology designed and implemented to realise the recommendation of vocabularies based on the content of a given website. The goal of the proposed approach is to generate vocabularies by reusing existing schemas. The automatic recommendation helps turn websites into self-described web entities in the Web of Data, understandable by both humans and machines. The implemented approach is wrapped within a broader methodology for turning a website into a machine-understandable node using technologies developed within the scope of the Semantic Web vision. Transforming a website into a machine-understandable entity is the first step required on the website's side to narrow the gap with web agents and enable structured content consumption without implementing an Application Programming Interface (API) that would provide read-write functionality. The motivation of the thesis stems from the fact that, in most cases, the data provided via an API is already presented on the corresponding website.
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly...Angelo Salatino
Classifying research papers according to their research topics is an important task to improve their retrievability, assist the creation of smart analytics, and support a variety of approaches for analysing and making sense of the research environment. In this paper, we present the CSO Classifier, a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive ontology of research areas in the field of Computer Science. The CSO Classifier takes as input the metadata associated with a research paper (title, abstract, keywords) and returns a selection of research concepts drawn from the ontology. The approach was evaluated on a gold standard of manually annotated articles yielding a significant improvement over alternative methods.
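The general idea of matching paper metadata against ontology concept labels can be illustrated with a toy n-gram matcher; this is not the actual CSO Classifier (which uses more sophisticated syntactic and semantic matching), just a hedged sketch of the input/output shape:

```python
def match_concepts(metadata: dict, ontology_labels: set[str], max_n: int = 3) -> set[str]:
    """Toy illustration: return ontology concept labels that appear verbatim
    as n-grams (up to max_n words) in a paper's title, abstract, or keywords."""
    text = " ".join(metadata.get(field, "") for field in ("title", "abstract", "keywords"))
    tokens = text.lower().split()
    found = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            if gram in ontology_labels:
                found.add(gram)
    return found

labels = {"deep learning", "ontology matching"}
paper = {"title": "Deep learning for ontology matching", "abstract": "", "keywords": ""}
concepts = match_concepts(paper, labels)  # finds {'deep learning', 'ontology matching'}
```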
Detection of Embryonic Research Topics by Analysing Semantic Topic NetworksAngelo Salatino
Being aware of new research topics is an important asset for anybody involved in the research environment, including researchers, academic publishers and institutional funding bodies. In recent years, the amount of scholarly data available on the web has increased steadily, allowing the development of several approaches for detecting emerging research topics and assessing their trends. However, current methods focus on the detection of topics which are already associated with a label or a substantial number of documents. In this paper, we address instead the issue of detecting embryonic topics, which do not yet possess these characteristics. We suggest that it is possible to forecast the emergence of novel research topics even at such an early stage and demonstrate that the emergence of a new topic can be anticipated by analysing the dynamics of pre-existing topics. We present an approach to evaluate such dynamics and an experiment on a sample of 3 million research papers, which confirms our hypothesis. In particular, we found that the pace of collaboration in sub-graphs of topics that will give rise to novel topics is significantly higher than in the control group.
Systematic Literature Reviews and Systematic Mapping Studiesalessio_ferrari
Lecture slides on Systematic Literature Reviews and Systematic Mapping Studies in software engineering. It describes the different steps, discusses differences between the two methods, and gives guidelines on how to conduct these types of study.
Managing Ireland's Research Data - 3 Research MethodsRebecca Grant
Slides providing an overview of the research methods used in the author's thesis, "Managing Ireland's Research Data: Recognising Roles for Recordkeepers". The methods discussed are online surveys, comparative case studies, and autoethnography.
Licensed as CC-BY.
Presentation - Systematic Review | Futures Studies TechniquesIgor Sampaio
Presentation of a rapid systematic review of futures-studies techniques, carried out for the course "Estudos do Futuro" (Futures Studies).
Centro de Informática - UFPE - 2016.2
A Model of Decision Support System for Research Topic Selection and Plagiaris...theijes
The paper proposes a model of a decision support system for choosing a research topic in academia. The biggest challenge for a student in the field of research is to identify an area and topic of research. The paper explains a model that helps students identify the most suitable area and/or topic for academic research. The model is also designed to assist supervisors in exploring the latest areas of research, as well as to avoid unintentional plagiarism. The model lets the user select either a keyword-based topic search or a questionnaire-based topic search, and uses a local database and the service of a metasearch engine in the decision-making activity.
This introductory lecture for IA377 will be devoted to the topic of “Literature Review”.
What is a literature review?
Methodology, best practices, tips, tools, etc.
Practical example
Application to IA377 seminar activities.
https://ia377-feec-unicamp.github.io/classes/2023/03/09/Literature-Review.html
Measuring the Outreach Efforts of Public Health Authorities and the Public Re...Aravind Sesagiri Raamkumar
JMIR paper presented during the Annual ID Symposium conducted by the Saw Swee Hock School of Public Health (National University of Singapore)
Main paper accessible at https://www.jmir.org/2020/5/e19334/
Presentation made during the Intelligent User-Adapted Interfaces: Design and Multi-Modal Evaluation (IUadaptME) workshop, conducted as part of UMAP 2018
Evolution and state-of-the art of Altmetric research: Insights from network a...Aravind Sesagiri Raamkumar
Evolution and state-of-the art of Altmetric research: Insights from network analysis and altmetric analysis
Authors: Hiran Lathabai, Thara Prabhakaran, Manoj Changat
Workshop Website: http://www.altmetrics.ntuchess.com/AROSIM2018/
Scientometric Analysis of Research Performance of African Countries in select...Aravind Sesagiri Raamkumar
Scientometric Analysis of Research Performance of African Countries in selected subjects within the field of Science and Technology
Author: Yusuff Utieyineshola
Workshop Website: http://www.altmetrics.ntuchess.com/AROSIM2018/
New Dialog, New Services with Altmetrics: Lingnan University Library ExperienceAravind Sesagiri Raamkumar
New Dialog, New Services with Altmetrics: Lingnan University Library Experience
Authors: Sze Lui, Sheila Cheung, Cindy Kot, Kammy Chan
Workshop Website: http://www.altmetrics.ntuchess.com/AROSIM2018/
Field-weighting readership: how does it compare to field-weighting citations?
Authors: Sarah Huggett, Eleonora Palmaro, Christopher James
Workshop Website: http://www.altmetrics.ntuchess.com/AROSIM2018/
How do Scholars Evaluate and Promote Research Outputs? An NTU Case Study
Authors: Han Zheng, Mojisola Erdt, Yin-Leng Theng
Workshop Website: http://www.altmetrics.ntuchess.com/AROSIM2018/
Monitoring the broad impact of the journal publication output on country leve...Aravind Sesagiri Raamkumar
Monitoring the broad impact of the journal publication output on country level: A case study for Austria
Authors: Juan Gorraiz, Benedikt Blahous, Martin Wieland
Workshop Website: http://www.altmetrics.ntuchess.com/AROSIM2018/
A Comparative Investigation on Citation Counts and Altmetrics between Papers ...Aravind Sesagiri Raamkumar
A Comparative Investigation on Citation Counts and Altmetrics between Papers Authored by Universities and Companies in the Research Field of Artificial Intelligence
Authors: Feiheng Luo, Han Zheng, Mojisola Erdt, Aravind Sesagiri Raamkumar, Yin-Leng Theng
Workshop Website: http://www.altmetrics.ntuchess.com/AROSIM2018/
Presentation deck prepared for the paper 'Object Recognition-based Mnemonics Mobile App for Senior Adults Communication', to be presented during the ICCCNT'15 conference
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
top nidhi software solution freedownloadvrstrong314
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which drives towards enhancing productivity. Here are a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
First Steps with Globus Compute Multi-User EndpointsGlobus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researcher's workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user globus compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges and share an update on our progress here.
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...informapgpstrackings
Keep tabs on your field staff effortlessly with Informap Technology Centre LLC. Real-time tracking, task assignment, and smart features for efficient management. Request a live demo today!
For more details, visit us : https://informapuae.com/field-staff-tracking/
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
Prosigns: Transforming Business with Tailored Technology Solutions (Prosigns)
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... (Globus)
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users still consists of transferring data and applying computations on a different system. As part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined, on-demand data workflows that apply data reduction and data analysis operations to the large ESGF data archives and transfer only the resultant analysis products (e.g., visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
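A Globus flow is defined as a JSON state machine of action steps. As a hypothetical sketch only (the endpoint UUIDs, paths, and action URLs below are placeholders, not the ESGF 2.0 project's actual flows), a transfer-then-analyze flow might be structured like this:

```python
# Hypothetical two-step flow definition: move a data subset to a compute
# system, then invoke an analysis action. All IDs/URLs are placeholders.
flow_definition = {
    "StartAt": "TransferSubset",
    "States": {
        "TransferSubset": {
            "Type": "Action",
            "ActionUrl": "https://transfer.actions.globus.org/transfer",
            "Parameters": {
                "source_endpoint": "SOURCE-ENDPOINT-UUID",
                "destination_endpoint": "COMPUTE-ENDPOINT-UUID",
                "DATA": [
                    {"source_path": "/esgf/subset/",
                     "destination_path": "/scratch/subset/"}
                ],
            },
            "ResultPath": "$.TransferResult",
            "Next": "RunAnalysis",
        },
        "RunAnalysis": {
            "Type": "Action",
            "ActionUrl": "https://compute.actions.globus.org/v2",
            "Parameters": {
                "endpoint": "COMPUTE-ENDPOINT-UUID",
                "function": "ANALYSIS-FUNCTION-UUID",
            },
            "ResultPath": "$.AnalysisResult",
            "End": True,
        },
    },
}
```

Once registered with the Flows service, such a definition can be run on demand against different archive paths, so only the small analysis outputs ever leave the data center.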
Proposing a Scientific Paper Retrieval and Recommender Framework
1. Proposing a Scientific Paper Retrieval and Recommender Framework
Aravind Sesagiri Raamkumar, Schubert Foo & Natalie Pang
Wee Kim Wee School of Communication and Information
Nanyang Technological University, Singapore
Presentation for ICADL’16, December 7th 2016
2. BACKGROUND
• Information Retrieval (IR) and Recommender Systems (RS) techniques have been used to find information objects for:
 - Scholarly Communication Lifecycle tasks
 - Literature Review (LR) search tasks
• Examples of such tasks include:
 - Building a reading list of research papers
 - Recommending similar papers based on seed papers
 - Recommending papers based on query logs
 - Serendipitous discovery of interesting papers
 - Recommending publication venues for manuscripts
 - Recommending papers based on citation context
 - Recommending co-authors for papers
 - and a few more…
3. BACKGROUND
Issues
• Proposed techniques and applications are piecemeal approaches
• A wide variety of algorithms and data fields were used in prior studies
What was done?
• A prototype system, Rec4LRW, was built to recommend papers for three tasks:
 1. Building a reading list of research papers
 2. Finding similar papers based on a set of papers
 3. Shortlisting papers from the final reading list for inclusion in a manuscript
• Task recommendation techniques were conceptualized on top of an identified set of base features
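Task 2 (finding similar papers based on a set of papers) is the kind of task commonly built on content-similarity measures such as cosine similarity. A minimal bag-of-words sketch for illustration only; the actual Rec4LRW techniques combine several base features and are described in the full paper:

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw term counts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0

def rank_by_seed(seed_text, candidates):
    """Rank candidate papers (id -> text) by similarity to a seed paper."""
    scored = [(pid, cosine_similarity(seed_text, text))
              for pid, text in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice the seed would be a set of papers (a "seed basket"), with scores aggregated across seeds, and term weighting (e.g. TF-IDF) used instead of raw counts.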
7. REC4LRW SYSTEM EVALUATION
• An offline evaluation experiment and a user evaluation study were conducted to evaluate the Rec4LRW system
• An ACM DL extract of 103,739 articles published between 1951 and 2011 was used as the system's corpus
• Postgraduate research students, research staff and academic staff were recruited for the user evaluation study
 - Main entry criterion: participants should have authored at least one research paper
• Participants evaluated the task recommendations and the overall Rec4LRW system, choosing from a list of 43 topics
 - Online questionnaires were provided at the end of each task
9. USER STUDY PARTICIPANTS

Demographic Variable                                     Number of Participants
Position
  Student                                                62 (47%)
  Staff                                                  70 (53%)
Experience Level [Self-Reported]
  Beginner                                               15 (11.4%)
  Intermediate                                           61 (46.2%)
  Advanced                                               34 (25.8%)
  Expert                                                 22 (16.7%)
Discipline Category
  Engineering & Technology                               87 (65.9%)
  Social Sciences                                        42 (31.8%)
  Life Sciences & Medicine                               3 (2.3%)
Discipline
  Computer Science & Information Systems                 51 (38.6%)
  Library and Information Studies                        30 (22.7%)
  Electrical & Electronic Engineering                    30 (22.7%)
  Communication & Media Studies                          8 (6.1%)
  Mechanical, Aeronautical & Manufacturing Engineering   5 (3.8%)
  Biological Sciences                                    2 (1.5%)
  Statistics & Operational Research                      1 (0.8%)
  Education                                              1 (0.8%)
  Politics & International Studies                       1 (0.8%)
  Economics & Econometrics                               1 (0.8%)
  Civil & Structural Engineering                         1 (0.8%)
  Psychology                                             1 (0.8%)
10. DATA ANALYSIS PROCEDURES
Quantitative Data
• Ascertain the agreement percentages of the evaluation measures
• Logistic regression, t-tests and correlation tests
Qualitative Data
• Identify the top preferred and critical aspects of the tasks and the overall system
• Feedback responses were coded by a single coder using an inductive approach
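As an illustration of the quantitative steps above (toy data and a hand-rolled Welch's t statistic; not the study's actual analysis scripts or thresholds):

```python
import math
from statistics import mean, variance

def agreement_percentage(likert_ratings, threshold=4):
    """Percentage of Likert ratings at or above the agreement threshold
    (here, 4 = 'agree' on a 5-point scale; the cutoff is an assumption)."""
    return 100.0 * sum(r >= threshold for r in likert_ratings) / len(likert_ratings)

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances,
    e.g. comparing student vs. staff ratings of a measure."""
    va, vb = variance(sample_a), variance(sample_b)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(
        va / len(sample_a) + vb / len(sample_b))
```

The resulting t statistic would then be compared against the t distribution (with Welch-Satterthwaite degrees of freedom) to obtain a p-value.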
11. EMERGENT THEMES AND A FRAMEWORK
• Certain dominant themes were apparent from the qualitative feedback
• These themes were consolidated into a single framework: the Scientific Paper Retrieval and Recommender Framework (SPRRF)
Why do we need a framework?
• Most RS and IR studies are single-dimensional, i.e. algorithmic
• The overall context needs to be considered to provide a meaningful experience
• Framework generation is based on empirical data
• It will guide the next round of evaluation of the Rec4LRW system
12. THEMES (1-2)
Theme 1: Distinct User Groups
• Users who want more control
 - Participants required control features in the UI and expressed preferences about the algorithms' logic
 "..Maybe a side window with categories like high reach, survey etc could be put up and upon clicking it, more papers in that category could be loaded."
• Users who tend to trust the system and its output
 - Participants were largely satisfied with the overall system
 "The idea of providing this system is quite good. Such a system if developed and prepared well, can help and speed up the process of literature survey by helping to find better papers…"
Theme 2: Information Cues
• Four cue labels were used in the system: Recent, Popular, High Reach, Survey/Review
• Cues positively impacted participants' perceptions of the system
 "I like the highlighted recommendations - for e.g. Popular, Recent etc. which greatly helps in distinguishing various references and catches the eye!"
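Cue labels like those in Theme 2 can be assigned with simple per-paper rules. A hypothetical sketch; the field names and thresholds below are made up for illustration and are not the actual Rec4LRW definitions ("High Reach" is omitted because its definition is system-specific):

```python
def cue_labels(paper, current_year=2016, recent_window=3, popular_citations=100):
    """Attach illustrative cue labels to a paper record (hypothetical rules)."""
    labels = []
    if paper["year"] >= current_year - recent_window:
        labels.append("Recent")
    if paper["citations"] >= popular_citations:
        labels.append("Popular")
    if paper.get("is_survey"):
        labels.append("Survey/Review")
    return labels
```

Because the labels are computed per paper, they can be rendered as UI badges on each recommendation, which is what participants said caught the eye.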
13. THEMES (3-4)
Theme 3: Forced Serendipity vs Natural Serendipity
• Prior studies have focused mainly on modelling serendipity
• The 'View Papers in the Parent Cluster' feature helped participants notice papers they had not read earlier
 "The view papers in the parent cluster function is very helpful to get a full picture of research field."
 "The user can view many papers in the parent cluster in addition to the shortlisted papers. Thus the user need not spend much time on finding related papers."
Theme 4: Learning Algorithms vs Fixed Algorithms
• Some participants suggested heuristics for identifying papers for Tasks 1 and 2
• These users expect a list of appropriate algorithms to be presented in the system
 "..Take a high impact paper (based on citation and may be exact keyword matching), then go through its own references to understand more about the research conducted. This is because, a good work generally cites other prominent works in the field…"
16. THEMES (5-6)
Theme 5: Inclusion of Control Features in the User Interface
• Many participants felt handicapped by the absence of control features in the Rec4LRW system
• Expected control features were sort options, topical facets and advanced search features
 "Really good for the initial review. It would be nice to see additional filters to focus on a specific topic"
 "More recent papers shall be included, and it is better if the user can sort the recommended paper by sequence such as sort times, date, relevance..."
Theme 6: Inclusion of Bibliometric Data
• Participants explicitly stated the need for metrics such as impact factor and h-index in the UI
• The main challenge is the computing overhead of calculating the new metrics
 "Categorizing the papers based on popularity, journal impact factor, and etc"
 "…In case that an item in the recommendation list is a journal paper, can we also know its impact factor and which databases indexes it?"
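Of the metrics requested under Theme 6, the h-index illustrates the computing-overhead concern: it requires the full citation-count vector for each author or venue, not a single lookup. The standard definition fits in a few lines:

```python
def h_index(citation_counts):
    """Largest h such that at least h papers have h or more citations each."""
    counts = sorted(citation_counts, reverse=True)
    # With counts descending and rank ascending, the predicate flips from
    # True to False exactly once, so the count of True values equals h.
    return sum(1 for rank, cites in enumerate(counts, start=1) if cites >= rank)
```

Scaling this to every author in a 100,000+ article corpus, and keeping it current as citations accrue, is what makes such metrics costly to serve live in a recommender UI.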
17. THEMES (7-8)
Theme 7: Diversification of Corpus
• In prior studies, the evaluation of algorithms has been restricted to datasets from certain disciplines such as computer science
• Future studies should include papers from "far-apart" disciplines in the evaluation
 "…Due to limitation of data sets (as only ACM papers) search result is not of decent quality."
 "But in general the main drawback is that "the papers in the corpus/dataset are from an extract of papers from ACM DL". As I work at the intersection of information systems and business many relevant papers are not included in the list."
Theme 8: Task Interconnectivity
• Participants appreciated the utility of the 'seed basket' and 'reading list' for managing papers across the three tasks
 "I like the idea of giving recommendations based on a seed group of articles, but there needs to be more facets to select from, there needs to be greater selection of seeding articles as well in terms of those facets."
 "The whole idea seems good for me, especially making seed of 5+ for expanding the bunch."
18. THE FRAMEWORK

SPRRF Feature                        Skill-Reliant User   System-Reliant User
UI Customization
  Sort options                               √
  Topical Facets                             √                    √
  Advanced search options                    √
Algorithmic Customization
  Setting the recommendations count          √                    √
  Selecting the retrieval algorithm          √
  Submitting external papers                 √                    √
User Personalization
  Paper collections                          √                    √
  Favourites specification                   √                    √
  Paper anchors                              √
  Relevance feedback                         √
19. FUTURE WORK
• SPRRF to be used in the second round of Rec4LRW evaluation studies
• SPRRF components to be statistically validated through hypothesis testing
• Expand the scope of SPRRF to other information objects in the Scholarly Communication Lifecycle
20. GET ACCESS TO REC4LRW…
Use the link http://goo.gl/XgynzY or scan the QR code on the slide