This document summarizes a presentation on metrics for evaluating scientific literature and research impact. It begins with traditional citation-based metrics such as the Impact Factor, then introduces alternative metrics (altmetrics) that measure impact through social media and other online mentions. It notes the promises of altmetrics, such as assessing impact more quickly and across more kinds of research output, and outlines criticisms of citation-based metrics, including that citations take a long time to accumulate and are mostly assigned to articles rather than to other research outputs. It closes with a call for more transparency and validation in the development of metrics.
3. The Hitchhiker's Guide to the Galaxy
4. How to Stay on Top of the Universe
5. How to Stay on Top of Scientific Literature
6. "One of the diseases of this age is the multiplicity of books; they doth so overcharge the world that it is not able to digest the abundance of idle matter that is every day hatched and brought forth into the world."
Attributed to Barnaby Rich in 1613
7. [Figure source: Price (1963)]
9. Eugene Garfield: Pathways through Science
Science Citation Index (Garfield 1955)
Web of Science/Web of Knowledge
An index of incoming citations: a paper lists its outgoing citations (its references); the index inverts these to record each paper's incoming citations.
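Conceptually, a citation index is the inversion of reference lists: papers record their outgoing citations, and the index aggregates the incoming ones. A minimal Python sketch (the paper identifiers and links are invented for illustration):

```python
# Minimal sketch of a citation index: invert outgoing citations
# (each paper's reference list) into incoming citations ("who cites me?").
from collections import defaultdict

# Invented records mapping a paper to the papers it cites.
outgoing = {
    "garfield1955": [],
    "price1963": ["garfield1955"],
    "kraker2014": ["garfield1955", "price1963"],
}

incoming = defaultdict(list)
for citing, cited_papers in outgoing.items():
    for cited in cited_papers:
        incoming[cited].append(citing)

print(incoming["garfield1955"])  # ['price1963', 'kraker2014']
```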
12. [Figure source: Van Eck and Waltman (2010)]
13. Evaluative Scientometrics: Impact Factor
A measure to quantify the relative importance of a scientific journal: the average number of citations in a given year y to papers the journal published in the years y-1 and y-2.

$$\mathrm{IF}(\text{Journal }X,\ 2015) = \frac{\text{citations in 2015 to articles published by Journal }X\text{ in 2013 and 2014}}{\text{number of articles published by Journal }X\text{ in 2013 and 2014}}$$
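In code, the definition above is a single ratio. A minimal Python sketch, with invented counts:

```python
# Two-year Impact Factor as defined above; the counts are invented.
def impact_factor(citations_to_prev_two_years: int,
                  articles_in_prev_two_years: int) -> float:
    return citations_to_prev_two_years / articles_in_prev_two_years

# e.g. 930 citations in 2015 to the 60 articles a journal
# published in 2013 and 2014:
print(impact_factor(930, 60))  # 15.5
```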
14. [Figure source: Thomson Reuters]
15.
                     Journal X   Journal Y
Paper 1 citations           15         100
Paper 2 citations           17           2
Paper 3 citations           14           1
Paper 4 citations           18           2
Paper 5 citations           15           1
Paper 6 citations           15           2
Impact Factor             15.5        18.0
Median                      15           2
Rank (by IF)                 2           1
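The rank flip in this table is the "blockbuster" effect: one highly cited paper pulls the mean up while the median stays put. A short Python check using the citation counts above (the simple mean of Journal X's counts is about 15.7, slightly above the 15.5 shown, presumably because the actual IF denominator counts citable items differently):

```python
# Mean (IF-style) vs. median for the two journals in the table above.
from statistics import mean, median

journal_x = [15, 17, 14, 18, 15, 15]
journal_y = [100, 2, 1, 2, 1, 2]  # one "blockbuster" paper dominates

print(mean(journal_x), median(journal_x))  # ~15.67 15.0
print(mean(journal_y), median(journal_y))  # 18.0 2.0
# Ranked by the mean, Journal Y wins; ranked by the median, Journal X does.
```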
16. Criticisms of the Impact Factor
[Figure source: Amin & Mabe (2000)]
17. Antidotes
"Blockbuster" papers can skew the IF: use robust statistics (e.g. the median) and report variation in the data (CWTS Journal Indicators).
Publication and citation behavior varies wildly between fields: use the source-normalised impact factor (SNIP).
The IF window is too short: use 3- and 5-year Impact Factors (see the sketch below).
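The windowed variant only changes which publication years enter the ratio. A minimal, hypothetical sketch with a configurable window (all per-year counts are invented):

```python
# Impact Factor with a configurable citation window (2-, 3- or 5-year).
# citations[y]: citations received in `year` to articles published in y.
# articles[y]:  number of articles published in y. All counts invented.
def windowed_impact_factor(citations, articles, year, window=2):
    years = range(year - window, year)
    return sum(citations[y] for y in years) / sum(articles[y] for y in years)

citations = {2010: 40, 2011: 55, 2012: 70, 2013: 90, 2014: 120}
articles = {2010: 30, 2011: 30, 2012: 35, 2013: 40, 2014: 45}

print(windowed_impact_factor(citations, articles, 2015, window=2))  # classic IF
print(windowed_impact_factor(citations, articles, 2015, window=5))  # 5-year IF
```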
18. Criticisms of citation-based metrics
Citations take a very long time to appear in meaningful quantities.
Citation-based metrics depend on the corpus used for their calculation.
Citations are mostly assigned to articles, leaving other products of research (data, code etc.) out of the picture.
Citation-based metrics are often opaque and irreproducible.
20. Setting the Stage for Alternative Metrics
Open Science and the publication of more research products
Increased use of online services in the scientific community
[Figure source: figshare]
21. Altmetrics
Altmetrics: metrics based on data generated in online systems; seeing academic literature through the eyes of the readers (Rowlands & Nicholas 2007).
Examples: usage data (downloads, views, readership); links, likes and shares; blogs and comments.
Promises of altmetrics: assess publications more quickly and on a broader scale, and consider all outputs of research, not just papers.
The altmetrics manifesto: http://altmetrics.org
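To make this concrete, altmetrics for a single paper can be fetched from a provider such as Altmetric. A minimal sketch, assuming the public v1 DOI endpoint (https://api.altmetric.com/v1/doi/<doi>) answers unauthenticated light requests; the response field names used below follow that API's documented output but should be verified against the current docs:

```python
# Fetching altmetrics for one paper from the public Altmetric endpoint.
# Assumption: the v1 DOI endpoint is reachable without an API key for
# light use, and the JSON fields named below exist (check the API docs).
import json
import urllib.request

def altmetrics_for_doi(doi):
    url = f"https://api.altmetric.com/v1/doi/{doi}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

if __name__ == "__main__":
    data = altmetrics_for_doi("10.1038/nature12373")  # example DOI
    print("Altmetric score:", data.get("score"))
    print("Posts:", data.get("cited_by_posts_count"))
    print("Tweeters:", data.get("cited_by_tweeters_count"))
```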
22. http://zenodo.org/record/34079
23. http://www.altmetric.com/details/4822452
24. [Source: https://impactstory.org/CarlBoettiger]
26. http://openknowledgemaps.org
Kraker et al. (2014)
27. Evolution of a Knowledge Domain
http://stellar.know-center.tugraz.at/umap
Kraker et al. (2014)
28. The Emerging Open Digital Science Ecosystem
Bibliographic data – bibliometric data – full texts – open source software
32. [Figure source: Kraker & Lex (2014)]
33. The Leiden Manifesto: Ten Principles for Research Metrics (Hicks et al. 2015)
1) Quantitative evaluation should support qualitative, expert assessment
…
4) Keep data collection and analytical processes open, transparent and simple
5) Allow those evaluated to verify data and analysis
…
9) Recognize the systemic effects of assessment and indicators
10) Scrutinize indicators regularly and update them