Replication of the Uzzi et al. (2013) Science study on atypical combinations, with additional work showing that journal and disciplinary effects are substantial.
The role of the Cochrane Collaboration, and specifically its Menstrual Disorders & Subfertility Group, is illustrated, along with a simple explanation of how to use Cochrane reviews.
This document provides an overview of key concepts in statistics, including hypothesis testing, null and alternative hypotheses, regression analysis, correlation, the exponential distribution, types of errors in hypothesis testing, central tendency, Bayes' theorem, Chebyshev's theorem, and simple random sampling. It defines these terms and provides examples to illustrate statistical concepts.
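As a quick illustration of one of the listed concepts, Bayes' theorem can be stated and applied in two lines (the screening-test numbers below are invented for the example):

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

With disease prevalence $P(D) = 0.01$, test sensitivity $P(+ \mid D) = 0.99$, and false-positive rate $P(+ \mid \bar{D}) = 0.05$:

$$P(D \mid +) = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.05 \times 0.99} \approx 0.17,$$

so even a positive result from an accurate test leaves the probability of disease below 20% when the condition is rare.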
1) The document discusses the challenge of studying rare diseases due to the small amount of available data. It proposes that N-of-1 trials, where individual patients are repeatedly randomized to treatment or control, could help address this issue.
2) It provides examples of how careful experimental design and statistical analysis are important even with small data sets. Factors like randomization, blocking, and replication can increase efficiency and validity.
3) Analyzing an N-of-1 trial for a rare disease, the document explores objectives like determining if one treatment is better, estimating average effects, and predicting effects for future patients. It discusses randomization and sampling philosophies and mixed effects models.
Riverpoint writer research statistics psy (Jody Marvin)
The document discusses statistical reasoning in psychology. It defines the scientific method as a standardized way to systematically acquire knowledge through observations, data collection, hypothesis formulation, testing through experiments, and results interpretation. Statistics are used in psychology to organize, analyze, and interpret numerical data to evaluate the reliability of data and research findings. The role of statistics is to solve problems by analyzing quantitative and qualitative data using tools like t-distributions and p-values to scientifically illustrate relationships in the data and differences over time. Both primary data that is original to a study and secondary data from other sources are important, though secondary data reliability depends on the organization collecting and reporting it.
This document summarizes a simulation study comparing the performance of different meta-analysis methods when assumptions of normality are violated. The study generated simulated datasets with various distributions for true effects and degrees of heterogeneity. It then compared methods like fixed effects, DerSimonian-Laird, maximum likelihood, and permutations in terms of coverage, power, and confidence interval estimation. The results showed that some methods are more robust to non-normal data, with profile likelihood and permutations generally performing best, while other methods like fixed effects and DerSimonian-Laird showed poorer performance.
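For orientation, the DerSimonian-Laird method that features in these comparisons reduces to a short computation; the sketch below is illustrative Python, not the study's simulation code:

# Minimal DerSimonian-Laird random-effects meta-analysis (illustrative sketch;
# not the simulation code from the study summarized above).
import numpy as np

def dersimonian_laird(effects, variances):
    """Pool per-study effects using the DL estimate of between-study variance."""
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # fixed-effect weights
    mu_fe = np.sum(w * y) / np.sum(w)             # fixed-effect pooled mean
    Q = np.sum(w * (y - mu_fe) ** 2)              # Cochran's Q statistic
    k = len(y)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (k - 1)) / c)            # DL between-study variance
    w_re = 1.0 / (v + tau2)                       # random-effects weights
    mu_re = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return mu_re, se, tau2

# Example with three invented log odds ratios and within-study variances:
print(dersimonian_laird([0.2, 0.5, -0.1], [0.04, 0.09, 0.05]))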
The statistical revolution of the 20th century was largely concerned with developing methods for analysing small datasets. Student’s paper of 1908 was the first in the English literature to address the problem of second order uncertainty (uncertainty about the measures of uncertainty) seriously and was hailed by Fisher as heralding a new age of statistics. Much of what Fisher did was concerned with problems of what might be called ‘small data’, not only as regards efficient analysis but also as regards efficient design and in addition paying close attention to what was necessary to measure uncertainty validly.
I shall consider the history of some of these developments, in particular those that are associated with what might be called the Rothamsted School, starting with Fisher and having its apotheosis in John Nelder’s theory of General Balance and see what lessons they hold for the supposed ‘big data’ revolution of the 21st century.
This document summarizes a re-analysis of meta-analysis data from the Cochrane Library. It examines the performance of different methods for estimating between-study heterogeneity and explores model selection in published meta-analyses. Simulation studies were conducted to compare heterogeneity estimators. Over 57,000 meta-analyses from the Cochrane Library were also analyzed. Results showed that the DerSimonian-Laird estimator often failed to detect high between-study heterogeneity, particularly in small meta-analyses. Bayesian methods performed well for very small meta-analyses. In the Cochrane data, over 30% of meta-analyses had only 2 studies and the random-effects model was more commonly used with larger numbers of studies.
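For reference, the between-study heterogeneity these estimators target is commonly reported as Higgins' $I^2$, derived from Cochran's $Q$ over $k$ studies:

$$I^2 = \max\!\left(0,\ \frac{Q - (k - 1)}{Q}\right) \times 100\%,$$

the percentage of total variability in effect estimates attributable to heterogeneity rather than chance.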
Common statistical pitfalls in basic science research (Ramachandra Barik)
This document discusses common statistical pitfalls in basic science research. It notes that while clinical studies undergo rigorous statistical review, basic science studies are often handled less uniformly. Some key issues it identifies include: treating repeated measurements of the same unit as independent observations, underestimating required sample sizes, lack of consideration for control groups and randomization in study design, and improper presentation of data through unclear reporting of sample sizes, use of standard deviations instead of standard errors, and inappropriate graphical displays. The document provides guidance on how to properly determine sample sizes, design studies, analyze data, and present results to address these common pitfalls.
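On the standard deviation versus standard error point: the SD describes the spread of individual observations, while the SE (SD divided by the square root of n) describes the precision of the sample mean; a minimal check in Python (values invented):

import numpy as np

x = np.array([4.1, 5.3, 4.8, 5.0, 4.6, 5.2])   # made-up measurements
sd = x.std(ddof=1)                # sample standard deviation (spread of data)
se = sd / np.sqrt(len(x))         # standard error of the mean (precision)
print(f"mean={x.mean():.2f}  SD={sd:.2f}  SE={se:.2f}")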
This document summarizes issues with observational studies replicating claims about human health. It notes that while randomized clinical trials replicate over 80% of the time, observational studies only replicate 10-20% of the time. The document discusses how data staging, lack of analysis protocols, multiple testing, multiple modeling, and uncorrected bias can lead observational studies to produce essentially all positive results. It argues that funding agencies and journal editors need to implement management solutions like requiring data and analysis protocols to be posted publicly to improve the reliability of claims from observational studies.
Poster: Equivalence of Electronic and Paper Administration of PRO (CRF Health)
This systematic review and meta-analysis found high levels of equivalence between electronic and paper administration of patient-reported outcome measures (PROs). 435 correlations between paper and electronic versions showed good agreement (pooled correlation = 0.88). 355 estimates of mean differences between versions were also small (mean = 1.8% of scale score). Moderator analyses found greater agreement for more recent studies, randomized designs, shorter time intervals between versions, and older participant ages. The review concludes that PRO data from electronic and paper versions are comparable, supporting use of electronic administration in clinical trials and research.
This document summarizes a meta-analysis of 206 studies on adventure therapy outcomes published between 1967 and 2012. The meta-analysis found that adventure therapy has a moderate positive effect on psychosocial outcomes, with an overall effect size of 0.50 for pre-post outcomes. Larger effects were found for outcomes related to self-concept, social development, and clinical measures. Moderator analyses found slightly larger effects for older participants and programs with an open group structure. The meta-analysis provides benchmarking data to evaluate adventure therapy program outcomes.
Leroy Hood biomedical challenges at Skolkovo (igorod)
This document discusses biomedical challenges at Skolkovo and outlines several questions around the types of science and partnerships that will be involved. It discusses whether the initiatives will include both non-profit and for-profit organizations, and how to attract existing companies and enable new company creation. It also questions whether the funds will compete with other Russian science programs and if academic science in Russia will be reformed to align with the Skolkovo initiative. Overall, the document explores how to structure the biomedical programs and partnerships at Skolkovo across non-profit and for-profit organizations.
The document provides guidelines for reporting animal research studies in a transparent manner. It outlines the ARRIVE guidelines, which include 10 essential items that should be reported in research papers involving animal subjects. The guidelines aim to improve reproducibility, transparency and quality of reporting. They include reporting the study's objectives and design, the animals used, experimental procedures, and the statistical analysis to allow rigorous assessment of the study. Adhering to these guidelines can help improve communication of research findings.
Network meta-analysis with integrated nested Laplace approximations (Burak Kürsad Günhan)
This document discusses network meta-analysis (NMA) models for combining data from multiple treatment comparisons. It provides an overview of NMA terminology and models, including the Lu-Ades and Jackson models. It also demonstrates the application of these models to sample datasets on tuberculosis vaccine trials and smoking cessation interventions using Bayesian inference with integrated nested Laplace approximations (INLA). The key contributions are the INLA implementation of the Jackson NMA model and an R function for fitting various pairwise and network meta-analysis models.
Lecture on causal inference to the pediatric hematology/oncology fellows at Texas Children's Hospital as part of their Biostatistics for Busy Clinicians lecture series.
This document provides guidance on reading and understanding medical research papers. It discusses the key elements of clinical trial papers and science papers, and emphasizes the importance of reading papers to gain knowledge and the ability to critically evaluate research. It also reviews guidelines for reporting clinical trials and animal studies, and provides tips on analyzing data and interpreting various types of charts, graphs, and statistics commonly found in medical research papers.
1. The document discusses the recommendations from the Royal Statistical Society Working Party report on ethical, practical and statistical considerations in designing first-in-man studies. The report made recommendations around generic issues, preparatory work, protocol content, risk sharing and reporting standards.
2. The report recommended that regulators provide more statistical expertise, that insurance for participants be mandatory, and that studies be conducted only at tertiary care hospitals whenever there is any risk of a cytokine storm. Protocols should provide quantitative justification of doses and risks, with uncertainty and study classification stated.
3. Subsequent work has further explored trial design considerations such as the variance of treatment contrasts and dynamic decision making based on interim results, but safety in first-in-man studies remains a challenge.
Controversy Over the Significance Test Controversy (jemille6)
Deborah Mayo (Professor of Philosophy, Virginia Tech, Blacksburg, Virginia) in PSA 2016 Symposium: Philosophy of Statistics in the Age of Big Data and Replication Crises
This document provides an overview of how to conduct a systematic review and meta-analysis. It describes the key steps: (1) asking a focused clinical question using PICO, (2) acquiring relevant studies through database searches, (3) appraising the quality of included studies, (4) analyzing the data using statistical methods to obtain an overall treatment effect size, and (5) reporting results typically in a forest plot. Meta-analyses provide increased statistical power over individual studies but are not without limitations such as potential bias that must be considered when interpreting results.
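Step (4) is usually inverse-variance weighting; a minimal fixed-effect sketch in Python (a random-effects analysis adds a between-study variance term to each weight, as in the DerSimonian-Laird example earlier):

import numpy as np

def pool_fixed_effect(effects, ses):
    """Inverse-variance (fixed-effect) pooled estimate with a 95% CI."""
    y, se = np.asarray(effects, float), np.asarray(ses, float)
    w = 1.0 / se ** 2                      # inverse-variance weights
    mu = np.sum(w * y) / np.sum(w)         # pooled treatment effect
    se_mu = np.sqrt(1.0 / np.sum(w))
    return mu, (mu - 1.96 * se_mu, mu + 1.96 * se_mu)

# Invented study effects (e.g., log risk ratios) and standard errors:
print(pool_fixed_effect([0.30, 0.15, 0.42], [0.10, 0.12, 0.20]))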
Clinical trials: quo vadis in the age of covid? (Stephen Senn)
A discussion of the role of clinical trials in the age of COVID. My contribution to the phastar 2020 life sciences summit https://phastar.com/phastar-life-science-summit
Elashoff: approach section in grant applications (UCLA CTSI)
This document provides guidance on how to write the "Approach" section of an R grant application. It discusses including preliminary data to demonstrate expertise and support for hypotheses. The study design should describe the overall design, endpoints, study population, inclusion/exclusion criteria, and measures. Sample size calculations must provide sufficient power and account for dropouts and multiple comparisons. Overall, the Approach section must convince reviewers that the study hypotheses could be true and the research team is capable of carrying out the study.
Measures of disease frequency include rates, ratios, and proportions. A ratio expresses the relation between two quantities where the numerator is not part of the denominator. A proportion indicates the relation of a part to the whole, with the numerator included in the denominator. A rate measures the occurrence of an event in a population during a time period. Other concepts discussed include incidence, prevalence, measures of central tendency (mean, median, mode), and measures of variation (range, standard deviation). Factors that can affect study outcomes include various types of biases such as selection, response, information, and confounding variables.
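The rate/ratio/proportion distinction is easy to pin down numerically; a toy Python example with invented counts:

# Toy numbers illustrating the definitions above (all values invented).
new_cases = 12
person_years = 3950.0        # follow-up time actually contributed
existing_cases = 60
population = 4000
males, females = 2800, 1200

incidence_rate = new_cases / person_years   # events per person-year (a rate)
prevalence = existing_cases / population    # numerator included in denominator (a proportion)
sex_ratio = males / females                 # numerator not part of denominator (a ratio)
print(incidence_rate, prevalence, sex_ratio)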
Surrogate Science: How Fisher, Neyman-Pearson, and Bayes Were Transformed int... (jemille6)
Gerd Gigerenzer (Director of Max Planck Institute for Human Development, Berlin, Germany) in the PSA 2016 Symposium:Philosophy of Statistics in the Age of Big Data and Replication Crises
Promoting Australia through Chinese social media (Karyn Lanthois)
The rise of WeChat is difficult to ignore. Relationships are as important as ever in China, but we can't ignore a low-cost opportunity to gain market research and get our brands in front of 1.2 billion users.
The document provides information and resources for preparing for a Sitka police department job interview. It lists top materials available at a website including 80 police interview questions and answers, tips for different types of interviews, cover letter and resume samples, and ways to search for jobs. Other tips include practicing different interview types and sending a thank you letter after an interview.
Rivière-Rouge police department interview questions (selinasimpson989)
The document provides resources for preparing for a police department interview with Rivière-Rouge, including 80 police interview questions and answers, tips on different types of interviews, cover letter and resume samples, and ways to search for jobs. It also gives sample answers to common interview questions and advises practicing different interview styles.
Be Remembered, Be Seen - Brand South Australia in China (Karyn Lanthois)
As International Marketing Manager for The Australia China Development Company, I share some lessons from marketing Brand South Australia to China. After a successful 250+ delegation to Shandong, what are the next steps to 'Be Seen and Be Remembered'?
This document discusses problem solving and guidance counselling at SMK Cor Jesu Malang. Problem solving is an important skill for resolving conflict and building positive relationships. Solving a problem involves several stages: defining the problem, finding its causes, weighing possible solutions, and selecting and applying the best one.
Colorado Springs police department interview questions (selinasimpson709)
The document provides resources for preparing for a Colorado Springs police department job interview, including 80 police interview questions and answers, top cover letter and resume samples, and tips on different types of interview questions. It offers example responses to common interview questions and advises practicing situational and behavioral interviews. Additional materials and tips are available at the listed website.
1. The document covers operating-system fundamentals, including components, services, system calls, systems programming, system structure, virtual machines, and the management of processes, memory, files, I/O, secondary storage, networking, and protection.
2. The operating system is responsible for managing computer resources and mediating interaction between hardware, software, and users.
3. There are various approaches to designing operating systems.
This document outlines a group project on Magnesia, a natural painkiller made from magnesium supplements. The group has five members: Shekhar Mahatre, Mayur Kate, Swapnil Shelke, John Gomes, and Shantaram Jadhav. It describes Magnesia as more effective than drugs for pain relief and lists its main competitors as Aspirin, Saridon, Crocine, and D'cold Total. Pricing details are given: Rs. 12 for a packet of 10 tablets, or Rs. 25 for a bottle of 20 tablets.
Northeastern Manitoulin and the Islands police department interview questions (selinasimpson989)
The document provides guidance and sample answers for common interview questions for the Northeastern Manitoulin and the Islands police department. It includes tips on how to answer questions about work experience, weaknesses, challenges, and criminal records. Sample answers are given for typical questions along with explanations of what makes a strong response. A list of additional free resources on the website includes guides on different interview types, cover letters, resumes, and ways to search for jobs.
The document provides resources for preparing for a police department job interview in Dothan, including 80 police interview questions and answers, tips on different types of interviews, and samples of cover letters, resumes, and thank you letters. Key materials available at policecareer123.com include police interview questions and answers, secrets to winning interviews, cover letter and resume samples, and ways to search for jobs.
This document describes the development of a handcrafted armchair made from recycled vehicle tyres. It explains that waste tyres represent an environmental and health problem. The armchair was designed to provide ergonomic comfort through a 3D design, a scale prototype, and the fabrication of a full-size model. Subjective tests showed that the armchair offers restful comfort to the user.
The document contains instructions for several English activities, including preparing a role play, writing a dialogue using the verb 'to be', creating a vocabulary list from the content of unit 1, and using expressions such as 'I'm fine but I could be better' and 'I like to practise in English classes'.
Indicators of Innovative Research (Klavans, Boyack, Small, Sorensen, Ioannidis), posted by Kevin Boyack
Most people assume that highly cited papers are "innovative". Using survey results we show that most highly cited papers exemplify normal progress rather than innovation. We also attempt to correlate various indicators with those papers classified as innovative by their authors. Most of these correlations are very weak.
The document discusses issues with computational scientific software and proposes a solution called Digital Scientific Notations. Current scientific software is difficult to test and validate due to a lack of specifications and documentation. This makes the software results unverifiable and prevents comparison of different models. The proposed Digital Scientific Notations would embed computational models and methods into scholarly documents using a formal programming language. This would allow models to be precisely defined, validated, and compared, addressing current verification and reproducibility problems in computational science.
This document provides guidance on writing effective abstracts. It discusses what abstracts are, why they are important, and different types of abstracts such as unstructured and structured. Key elements that should be included in abstracts are background, objectives, methods, results, and conclusions. Tips are provided such as explaining abbreviations, using synonyms, and refraining from citations. The importance of keywords for searchability is covered, including reviewing similar articles and MeSH terms. Overall, the document aims to help authors write abstracts that accurately summarize their work and allow other researchers to easily find the information.
The document describes a visualization of PLoS data created by mapping PLoS thesaurus terms to an existing map of all science created from Scopus data. It discusses mapping terms directly and indirectly, analyzing coverage by journal and year, and using the maps and underlying data to answer questions posed by PLoS about trends, relationships between fields, and how PLoS coverage compares to other databases. Visualizations are created in software like Pajek and Gephi and are meant to facilitate communication around science structure and metrics.
This document provides a summary and review of trends in translational bioinformatics in 2013 by Russ Altman. It begins with an overview and goals section, followed by sections highlighting important papers from 2013 in areas like omics medicine, cool new methods, cancer research, and drugs/delivery. The document reviews over 350 papers, focusing on 27 that are briefly summarized. It aims to provide colleagues in the field a "snapshot" of important progress and opportunities in using informatics approaches to link basic biological research to clinical applications in 2013.
Answering More Questions with Provenance and Query Patterns (Bertram Ludäscher)
This document discusses using provenance information to improve transparency and reproducibility in research. It begins by asking questions about the input data, methods, and parameter settings used in a study in order to assess its reliability. It then provides examples of how workflow systems can capture provenance at both the design level (prospective provenance) and runtime level (retrospective provenance). These include a Kepler workflow that simulates X-ray data collection and provenance traces captured by DataONE. The document argues that provenance is a critical link between workflow modeling and runtime traces that can increase trust in research findings.
QUANTEC helps us radiation oncologists when approving a treatment plan, with its tables of 'constraints' for organs at risk (the dose limits that the healthy organs surrounding the tumour we want to treat can receive).
PS: The tables are on pages 15-17.
The document outlines the mission and approach of AETIONOMY, a project aimed at generating a mechanism-based taxonomy for Alzheimer's and Parkinson's diseases. The project seeks to increase knowledge of the causes of these diseases by curating publicly available data, developing disease models and ontologies, and identifying testable hypotheses on disease mechanisms. These hypotheses will then be validated in a prospective clinical study to identify patient subgroups. The ultimate goal is to lay the foundation for improved disease classification and targeted treatment approaches.
Scientific research in a number of fields is in a state of crisis due to the discovery that many published results are non-reproducible, and applied statistics has been assigned a substantial share of the blame. Proposed solutions range from requiring independent statistical review of results for major journals to abolishing the use of certain methods entirely.
Lennox argues that the problem does not lie with statistical methods, but rather from misleading training for non-statisticians. The talk is intended to establish that statistics is not just a set of numerical procedures, but rather a distinctive way of thinking about and solving problems. Real-world examples demonstrate the pitfalls of "procedural" statistics, and that non-statisticians can be successful by approaching statistical challenges in the same way that they do problems in their field of expertise and by leveraging the statistical expertise available at the laboratory as necessary.
1) The path length from A to B in the following graph is .docx (monicafrancis71118)
1) The path length from A to B in the following graph is:
a- 2
b- 10
c- 22
d- There is no path
2) The minimum path weight from A to B in the following graph is:
a- 2
b- 10
c- 32
d- There is no path
3) The minimum path weight from A to E in the following graph is:
a- 1
b- 7
c- 67
d- There is no path
4) The longest cycle that starts at A and ends at A in the following graph is:
a- 104
b- 122
c- 42
d- There is no cycle
5) The entry AE in the length one adjacency matrix representation of the following graph is:
a- 7
b-
c- 0
d- None of the above
6) The entry AB in the length one adjacency matrix representation of the following graph is:
a- 10
b-
c- 22
d- 0
7) The entry AD in the length two adjacency matrix representation of the following graph is:
a- 60
b-
c- 44
d- 0
8) In the following graph, which of the following paths is considered a simple path?
a- AECAD
b- AEBFC
c- ADBFD
d- There is no simple path in the graph above
9) Some of the cliques the following graph has include: (A clique is a subgraph that is complete which means each node in the subgraph is connected to every other node n the subgraph). In the following graph, the subgraph AEBD is not a clique because A and B are not connected and E and D are not connected also, otherwise if they were connected it would be a clique.
a- ADBE, EBFC, EB, F, C
b- AEC, DBF
c- AEB, EBC
d- AECFBD
10) (TSP): Apply the nearest-neighbor algorithm to the complete weighted graph G in the following figure, beginning at vertex B. What is the path and the total weight? (A code sketch of this heuristic follows the quiz.)
a- BADECB with weight 725
b- BAEDCB with weight 775
c- TSP does not work with complete graph
d- None of the answers is true
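A minimal Python sketch of the nearest-neighbour heuristic from question 10. Since the quiz's figures are not reproduced here, the edge weights below are invented, so the printed tour will not match any of the answer choices:

def nearest_neighbour(weights, start):
    """Greedy TSP heuristic: always move to the cheapest unvisited vertex."""
    tour, total = [start], 0
    unvisited = set(weights) - {start}
    current = start
    while unvisited:
        nxt = min(unvisited, key=lambda v: weights[current][v])
        total += weights[current][nxt]
        tour.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    total += weights[current][start]       # close the cycle back to the start
    return tour + [start], total

W = {  # hypothetical symmetric weights on vertices A-E
    "A": {"B": 185, "C": 119, "D": 152, "E": 133},
    "B": {"A": 185, "C": 121, "D": 150, "E": 200},
    "C": {"A": 119, "B": 121, "D": 174, "E": 120},
    "D": {"A": 152, "B": 150, "C": 174, "E": 199},
    "E": {"A": 133, "B": 200, "C": 120, "D": 199},
}
print(nearest_neighbour(W, "B"))   # e.g. (['B','C','E','A','D','B'], 676)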
Running Head: EXPERIMENTAL DESIGN
Experimental Design and Some Threats to Experimental Validity: A Primer
Susan Skidmore
Texas A&M University
Paper presented at the annual meeting of the Southwest Educational Research Association, New Orleans, Louisiana, February 6, 2008.
Abstract
Experimental designs are distinguished as the best method to respond to questions involving causality. The purpose of the present paper is to explicate the logic of experimental design and why it is so vital to questions that demand causal conclusions. In addition, types of internal and external validity threats are discussed. To emphasize the current interest in experimental designs, Evidence-Based Practices (EBP) in medicine, psychology and education are highlighted. Finally, cautionary statements regarding experimental designs are elucidated with examples from the literature.
The No Child Left Behind Act (NCLB) demands "scientifically based research" as the basis for awarding many grants in education (2001). Specifically, the 107th Congress (2001) delineated scientifically-based research as that which "is evaluated using experimen.
This document provides an overview of biostatistics and the role of biostatisticians. It discusses how biostatistics applies statistical methods to address questions in public health, medicine, and other biological fields. The document outlines some of the key roles of biostatisticians, which include identifying disease risk factors and treatments, designing and analyzing clinical studies, and developing new statistical methods for analyzing medical data. It also notes some of the challenges of biostatistics, such as separating systematic effects from random noise in data and making inferences.
The document discusses hypothesis testing and the scientific research process. It begins by defining a hypothesis as a tentative statement about the relationship between two or more variables that can be tested. It then outlines the typical steps in the scientific research process, which includes forming a question, background research, creating a hypothesis, experiment design, data collection, analysis, conclusions, and communicating results. Finally, it provides details on characteristics of a strong hypothesis, the process of hypothesis testing through statistical analysis, and setting up an experiment for hypothesis testing, including defining hypotheses, significance levels, sample size determination, and calculating standard deviation.
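The testing workflow described above can be made concrete in a few lines; an illustrative two-sample t-test in Python with SciPy (data invented):

import numpy as np
from scipy import stats

control = np.array([72, 75, 70, 74, 73, 71, 76])   # made-up scores
treated = np.array([78, 74, 79, 77, 80, 75, 76])

# H0: equal means; H1: means differ. Welch's t-test at alpha = 0.05.
t_stat, p_value = stats.ttest_ind(treated, control, equal_var=False)
alpha = 0.05
print(f"t = {t_stat:.2f}, p = {p_value:.3f};",
      "reject H0" if p_value < alpha else "fail to reject H0")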
This document discusses issues with reproducibility in EEG research and proposes solutions. It notes that flexible choices in EEG methodology and exploratory analyses can lead to false positives. Simulations demonstrate how double dipping, multiple comparisons, and lack of independent replication can produce significant effects from noise alone. The document advocates for preregistering analysis plans, including dummy effects in studies, subdividing data for exploration and replication, and using registered reports to improve reproducibility in EEG research.
This document discusses evidence-based medicine (EBM) and key concepts in evaluating medical evidence. It defines EBM as the conscientious use of current best evidence in patient care. Randomized controlled trials are considered the gold standard for evaluating new therapies or tests. However, observational studies can also provide valuable evidence when RCTs are not possible or ethical. Systematic reviews provide a critical summary of all relevant randomized trials on a topic to determine the state of evidence and guide clinical practice and policy.
1) The document discusses normality tests that are used to check the assumption of normal distribution for many statistical analyses. It focuses on how to check for normality using SPSS.
2) It provides examples of using SPSS to check for normality on two datasets - serum magnesium levels were normally distributed while serum TSH levels were not normally distributed.
3) The Kolmogorov-Smirnov and Shapiro-Wilk normality tests in SPSS showed that the magnesium levels data passed normality tests while the TSH levels data failed normality tests, indicating the correct statistical analyses to use.
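The document's checks are run in SPSS; the same idea in Python with SciPy (a swapped-in tool for illustration, using simulated rather than the document's clinical data):

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
magnesium_like = rng.normal(2.0, 0.2, size=50)   # roughly normal, like the magnesium data
tsh_like = rng.lognormal(0.5, 0.8, size=50)      # skewed, like the TSH data

for name, x in [("magnesium-like", magnesium_like), ("TSH-like", tsh_like)]:
    w_stat, p = stats.shapiro(x)                 # Shapiro-Wilk normality test
    print(f"{name}: Shapiro-Wilk p = {p:.3f} ->",
          "consistent with normality" if p > 0.05 else "not normal")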
The document discusses various types of research studies and common problems in research reporting. It describes basic and applied research, as well as animal studies, case studies, clinical trials, correlational studies, cross-sectional surveys, epidemiological studies, experimental studies, literature reviews, longitudinal studies, meta-analyses, and problems that can occur in writing research proposals and reports. Common issues include plagiarism, poor formatting, weak structure of sentences, and improperly organizing the different sections of a research report.
Here are the responses to the questions:
1. A statistical population is the entire set of individuals or objects of interest. A sample is a subset of the population selected to represent it; the sample is used to infer the characteristics, attributes, and properties of the entire population.
2. Variance is the average of the squared deviations from the mean, calculated as the sum of squared deviations divided by the number of values in the data set minus 1. Standard deviation is the square root of the variance; it measures how far data values spread out from the mean. (A worked example follows these responses.)
3. No data was provided to create graphs. Additional data on the number of fish in each age group would be needed.
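A worked version of the variance and standard deviation definitions in response 2 (toy data):

data = [4, 8, 6, 5, 7]                 # invented observations
n = len(data)
mean = sum(data) / n                   # 6.0
variance = sum((x - mean) ** 2 for x in data) / (n - 1)   # 10 / 4 = 2.5
sd = variance ** 0.5                   # about 1.58
print(mean, variance, sd)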
Open Science Better Science? Steyerberg 2June2022.pptx (Ewout Steyerberg)
Is Open Science Better Science?
Ewout W. Steyerberg, PhD
Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands
Abstract
The Open Science movement has many components, including Open Access to scientific publications, sharing of research data, and providing open source software. These components are expected to contribute to better science. In this seminar I aim to reflect on the strength and limitations of Open Science in the context of epidemiological research.
First, I note that by making research more open, the scale of research increases; this might enable addressing some research questions better. It also allows us to recognize that different researchers use different scientific approaches: Open Science makes us increasingly aware of different styles in research.
Second, we may hope to learn more about the value of modern approaches to data analysis such as machine learning. Indeed, neutral comparison studies benefit from the open availability of multiple data sets that can be analyzed with standardized approaches to the analysis, adding realism compared to analytical and simulation studies.
Third, I note that more data sharing is a positive development, especially to highlight heterogeneity between settings. In sum, I remain optimistic that open science will lead to better science, with the caveat that we recognize complexities that limit the interpretation of increasing amounts of data, such as the medical context, study design, measurement and data analysis.
These slides were presented in a series of lectures organized by Prof Marianna Huebner, June 2, 2022
Open Science and Ecological meta-analysis (Antica Culina)
This document discusses using open data and meta-analysis to help with ecological and evolutionary synthesis. It describes how data from various sources like published studies, unpublished datasets, and metadata can be gathered and synthesized. Challenges include incomplete or unavailable data as well as differences in data collection and reporting. Case studies on topics like genetic change rates, divorce in birds, microbe communities, and soil carbon stocks demonstrate searching for relevant open data, screening datasets for usability, and analyzing data to answer research questions. The document advocates for open science to improve data sharing and the robustness of synthesis results.
The binding of cosmological structures by massless topological defects (Sérgio Sacani)
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is mitigated, at least in part.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige... (University of Maribor)
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
Immersive Learning That Works: Research Grounding and Paths Forward (Leonel Morgado)
We will metaverse into the essence of immersive learning, into its three dimensions and conceptual models. This approach encompasses elements from teaching methodologies to social involvement, through organizational concerns and technologies. Challenging the perception of learning as knowledge transfer, we introduce a 'Uses, Practices & Strategies' model operationalized by the 'Immersive Learning Brain' and ‘Immersion Cube’ frameworks. This approach offers a comprehensive guide through the intricacies of immersive educational experiences and spotlighting research frontiers, along the immersion dimensions of system, narrative, and agency. Our discourse extends to stakeholders beyond the academic sphere, addressing the interests of technologists, instructional designers, and policymakers. We span various contexts, from formal education to organizational transformation to the new horizon of an AI-pervasive society. This keynote aims to unite the iLRN community in a collaborative journey towards a future where immersive learning research and practice coalesce, paving the way for innovative educational research and practice landscapes.
ESR spectroscopy in liquid food and beverages.pptx (PRIYANKA PATEL)
With an increasing population, people need to rely on packaged foodstuffs, and packaging requires preservation of the food. Among the various treatment methods used to preserve food, irradiation is one of the most common and most harmless, as it does not alter the essential micronutrients of the food. Although irradiated food does not harm human health, quality assessment is still required to provide consumers with the necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of food and the free radicals induced during its processing. The ESR spin-trapping technique is useful for detecting highly unstable radicals in food, and the antioxidant capability of liquid food and beverages is mainly assessed by spin trapping.
Authoring a personal GPT for your research and practice: How we created the Q... (Leonel Morgado)
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
Phenomics-assisted breeding in crop improvement (IshaGoswami9)
As the global population grows toward roughly 9 billion by 2050, and as the climate changes, meeting the food requirements of such a large population is becoming difficult. Facing the challenges presented by resource shortages, climate change, and an increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding the complex characteristics of multiple genes, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data linkable to genomics information for crop improvement at all growth stages have become as important as genotyping; high-throughput phenotyping has thus become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology, and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati... (AbdullaAlAsif1)
The pygmy halfbeak Dermogenys colletei, known for its viviparous nature, presents an intriguing case of relatively low fecundity, raising questions about potential compensatory reproductive strategies employed by this species. Our study examines fecundity and the Gonadosomatic Index (GSI) in the pygmy halfbeak, D. colletei (Meisner, 2001), an intriguing viviparous fish indigenous to Sarawak, Borneo. We hypothesize that D. colletei may exhibit unique reproductive adaptations to offset its low fecundity, thus enhancing its survival and fitness. To address this, we conducted a comprehensive study utilizing 28 mature female specimens of D. colletei, carefully measuring fecundity and GSI to shed light on the reproductive adaptations of this species. Our findings reveal that D. colletei indeed exhibits low fecundity, with a mean of 16.76 ± 2.01, and a mean GSI of 12.83 ± 1.27, providing crucial insights into the reproductive mechanisms at play in this species. These results underscore the existence of unique reproductive strategies in D. colletei, enabling its adaptation and persistence in Borneo's diverse aquatic ecosystems, and call for further ecological research to elucidate these mechanisms. This study contributes to a better understanding of viviparous fish in Borneo and to the broader field of aquatic ecology, enhancing our knowledge of species adaptations to unique ecological challenges.
The debris of the ‘last major merger’ is dynamically young (Sérgio Sacani)
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the ‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space, because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia DR3 have positive caustic velocities, making them fundamentally different from the phase-mixed chevrons found in simulations at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based on a simple phase-mixing model, the observed number of caustics is consistent with a merger that occurred 1–2 Gyr ago. We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data 1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’ did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within the last few Gyr, consistent with the body of work surrounding the VRM.
This MS Word-generated PowerPoint presentation covers the major details of the micronucleus test: its significance and the assays used to conduct it. The test is used to detect micronucleus formation inside the cells of nearly every multicellular organism. Micronuclei form during chromosome segregation at anaphase.
Atypical combinations are confounded by disciplinary effects (Boyack & Klavans)
1. Atypical combinations are confounded by disciplinary effects
SciTech Strategies, Inc. (Better Maps ● Better Solutions)
STI 2014, Leiden, The Netherlands, Sept. 3-5, 2014
Kevin W. Boyack & Richard Klavans
www.mapofscience.com
2. BACKGROUND
We have long been interested in indicators of innovative research.
Uzzi et al. (UMSJ) recently published an article correlating high-impact papers (innovation) with "atypical combinations" (novelty) of referenced journals.
The results were intriguing; we decided to investigate further: to replicate the study and then explore this idea of novelty.
3. UZZI STUDY
Hypothesis: "The highest-impact science is primarily grounded in exceptionally conventional combinations of prior work yet simultaneously features an intrusion of unusual combinations"
Data: 17.9M articles (1950-2000) from WOS, containing 302M references to 15,613 cited journals
Method:
» Journals are used as a proxy for "areas of knowledge"
» Determine which co-cited journal combinations are "conventional" and which are "unusual" or "novel"
» Develop indicators of "convention" and "novelty" from co-citation statistics
» Calculate "convention" and "novelty" for each paper using those indicators
» Test the indicators to see how they correlate with highly cited papers
Finding: Papers with high convention AND high novelty are twice as likely to be highly cited as the average paper
4. UMSJ METHOD (1)
To determine which co-cited journal combinations are "conventional" and which are "novel", UMSJ calculated Z-scores for each co-cited journal pair, where Z is defined as:
Z = (Nact - Nexp) / Nvar
where Nact is the actual number of journal co-citation counts, Nexp is the expected number of journal co-citation counts, and Nvar is the variance of Nexp.
Nexp and Nvar were estimated by calculating 10 randomized citation networks in which all citation links were switched using a Monte Carlo technique, keeping the citing/cited distributions constant at the paper level.
A negative Z-score indicates that a journal pair is co-cited less often than expected, and thus is an "atypical combination" of journals.
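To make the pipeline concrete, here is a minimal Python sketch of the Z-score step. It is an illustration, not UMSJ's implementation: in place of their link-switching Monte Carlo it uses a simpler pooled shuffle of cited-journal slots (which still preserves each paper's reference count and each journal's total citation count), and it divides by the standard deviation of the null counts, the conventional z-score denominator, where the slide writes "variance". All names are illustrative.

```python
import itertools
import random
from collections import Counter

import numpy as np

def cocitation_counts(papers):
    """Count how often each journal pair co-occurs in a reference list.
    `papers` is a list of lists of cited journal names."""
    counts = Counter()
    for refs in papers:
        for pair in itertools.combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts

def randomized_network(papers, rng):
    """Shuffle the pooled cited-journal slots across all papers, keeping
    each paper's reference count and each journal's citation total fixed."""
    pool = [j for refs in papers for j in refs]
    rng.shuffle(pool)
    out, i = [], 0
    for refs in papers:
        out.append(pool[i:i + len(refs)])
        i += len(refs)
    return out

def pair_z_scores(papers, n_runs=10, seed=0):
    """Z = (Nact - mean of null counts) / std of null counts, estimated
    from n_runs randomized citation networks."""
    rng = random.Random(seed)
    observed = cocitation_counts(papers)
    null_counts = {pair: [] for pair in observed}
    for _ in range(n_runs):
        null = cocitation_counts(randomized_network(papers, rng))
        for pair in null_counts:
            null_counts[pair].append(null.get(pair, 0))
    # `or 1.0` guards against zero spread in the null counts
    return {pair: (observed[pair] - np.mean(vals)) / (np.std(vals) or 1.0)
            for pair, vals in null_counts.items()}
```

Under this setup, a strongly negative Z marks a pair co-cited far less often than the null model predicts, i.e. an "atypical combination".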
5. UMSJ METHOD (2)
Using the computed Z-scores for each co-cited journal pair, the set of Z-scores can then be assembled for each paper.
Two summary statistics were calculated for each paper from its Z-score distribution:
» Median Z-score – to characterize central tendency, or "convention"
» 10th percentile (left tail) Z-score – to characterize "novelty"
Distributions of these summary statistics were analyzed.
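A sketch of these per-paper summaries, assuming the `z_scores` dictionary from the previous sketch (names illustrative):

```python
import itertools

import numpy as np

def paper_summary(refs, z_scores):
    """Reduce one paper's journal-pair Z-scores to the two UMSJ summary
    statistics: the median (convention) and the left tail (novelty)."""
    pairs = itertools.combinations(sorted(set(refs)), 2)
    zs = [z_scores[p] for p in pairs if p in z_scores]
    if not zs:  # fewer than two scored journals cited
        return None
    return {"median_z": float(np.median(zs)),        # "convention"
            "tail_z": float(np.percentile(zs, 10))}  # 10th percentile: "novelty"
```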
6. UMSJ METHOD (3)
Indicators based on these paper-level summary statistics were created:
» Novelty: HIGH if the 10th percentile Z-score < 0; LOW if the 10th percentile Z-score > 0
» Conventionality: HIGH if the median Z-score > the average median; LOW if the median Z-score < the average median
Each paper was thus classified in terms of convention and novelty, as in the sketch below.
[2x2 quadrant chart: Low/High Convention x Low/High Novelty]
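A sketch of the 2x2 classification, taking the "average" to be the mean of the per-paper median Z-scores (an assumption; the slide does not spell out the baseline):

```python
def classify(stats):
    """Assign each paper one of the four UMSJ novelty/convention labels.
    `stats` is a list of paper_summary() dicts."""
    avg_median = sum(s["median_z"] for s in stats) / len(stats)
    labels = []
    for s in stats:
        novelty = "N+" if s["tail_z"] < 0 else "N-"  # left tail below zero
        convention = "C+" if s["median_z"] > avg_median else "C-"
        labels.append(novelty + convention)
    return labels
```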
7. UMSJ RESULTS
"Hit" papers are defined as the top-5% most highly cited papers.
Using the indicators:
» The probability of a HIGH NOVELTY, HIGH CONVENTION (N+C+) paper being a hit paper is 0.0911
» The probability of a LOW NOVELTY, LOW CONVENTION (N-C-) paper being a hit paper is 0.0205
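These probabilities are simply per-bin hit rates; a sketch, with `labels` from the previous sketch and `is_hit` a boolean top-5% flag per paper:

```python
from collections import Counter

def hit_probabilities(labels, is_hit):
    """P(hit | bin): the fraction of papers in each 2x2 bin that are hits."""
    totals = Counter(labels)
    hits = Counter(label for label, hit in zip(labels, is_hit) if hit)
    return {b: hits[b] / totals[b] for b in totals}
```

Against the 5% baseline implied by the hit definition, 0.0911 for N+C+ is roughly twice the average hit rate, which is the headline finding.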
8. UMSJ ISSUES
"Analyses in the supplementary materials (fig. S6) show that these empirical regularities for the WOS taken as a whole are largely replicated on a field-by-field basis and across time"
» Across time – YES
» Across fields or disciplines – NOT REALLY! The UMSJ supplemental results show that the N+C+ bin has the highest probability (of the 4 bins) of containing a hit paper for only 64% of the 243 subject categories
The fact that the N+C+ bin is not ranked first in 36% of subject categories is troubling, suggesting potentially large field effects, or even individual journal effects.
The top-5% highly cited papers were not sampled by field.
Journals may not be the right proxy for "areas of knowledge".
9. REPLICATION
We used a different, but parallel, methodology to replicate the UMSJ distributions and results:
Scopus data (2001-2010): 12M articles, 226M references
Conference papers were included along with articles
K50 statistics were used for co-cited journal pairs, rather than Z-scores and Monte Carlo simulations
» K50 has the same conceptual formulation as the Z-score: (Nact - Nexp) / Normalization
» Expected values and the normalization are based on row and column sums
UMSJ procedures for calculating distributions, etc. were all followed
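The slide gives only the general form of K50, so the sketch below fills in chi-square-style stand-ins: an expected count E_ij = (row sum x column sum) / grand total, and sqrt(row sum x column sum) as the normalization. These choices are assumptions for illustration; the published K50 differs in detail.

```python
import numpy as np

def k50_like(cocite):
    """K50-style statistic on a journal co-citation matrix:
    (Nact - Nexp) / Normalization, with both terms built from marginals."""
    cocite = np.asarray(cocite, dtype=float)
    row = cocite.sum(axis=1, keepdims=True)  # (n, 1) row sums
    col = cocite.sum(axis=0, keepdims=True)  # (1, n) column sums
    expected = row @ col / cocite.sum()      # assumed E_ij from marginals
    norm = np.sqrt(row @ col)                # assumed normalization
    return (cocite - expected) / np.where(norm == 0, 1, norm)
```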
10. REPLICATION
For the left tail, we used the 5th percentile rather than the 10th percentile, to more closely match the UMSJ distributions.
Indicator distributions for the median and left-tail percentile values are very similar to the UMSJ distributions.
» Differences in the tail percentile curves have no effect on the indicators, since the fractions of articles at the zero point of all curves are the same
[2x2 quadrant chart: Low/High Convention x Low/High Novelty]
11. REPLICATION
Probabilities of hit papers 2001-2005 (top-5% highly cited) as of 2011:

                                       UMSJ (1990-2000)   This study (2001-2005)
                                       % sample   Prob      % sample   Prob
High Novelty, High Convention (N+C+)     6.7%    0.0911       9.5%    0.0959
High Novelty, Low Convention (N+C-)     26%      0.0533      30.6%    0.0659
Low Novelty, High Convention (N-C+)     44%      0.0582      40.5%    0.0433
Low Novelty, Low Convention (N-C-)      23%      0.0205      19.4%    0.0205

Our results are similar to the UMSJ results:
» The higher probability for N+C+ (0.0959 vs. 0.0911), coupled with a higher fraction of papers within that bin (9.5% vs. 6.7%), suggests that our method does even a bit better at locating highly cited papers
» High novelty is accentuated overall using our method (N+C- is 0.0659 rather than 0.0533)
The replication was successful and reproduces the major features of the UMSJ study.
12. FIELD EFFECTS?
The 2x2 matrix probabilities for the top-5% sampled by field were compared to the 2x2 matrix probabilities using the top-5% overall (see the sketch below).
The bins are in the same order using the top-5% by field, but the differences between bins are smaller:
» N+C+ (0.0834 vs. 0.0959)
» N-C- (0.0335 vs. 0.0205)
This suggests that "atypical combinations" are influenced by field effects.
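A sketch of the by-field sampling step, assuming a pandas DataFrame with illustrative `field` and `citations` columns (not the authors' actual data layout):

```python
import pandas as pd

def hits_by_field(df, frac=0.05):
    """Flag the top `frac` most-cited papers within each field, rather than
    applying one global citation cutoff."""
    cutoffs = df.groupby("field")["citations"].transform(
        lambda c: c.quantile(1 - frac))
    return df["citations"] >= cutoffs
```

Feeding these field-normalized hit flags into the per-bin hit rates sketched earlier is what produces the 0.0834 and 0.0335 figures quoted on this slide.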
13. FIELD EFFECTS?
The top 20 largest journals (by number of co-citations) are plotted in terms of convention and novelty.
» These 20 journals account for 15.9% of all co-citations
Reminder: journals are plotted here based on how they are co-cited, not on what is published in them!
[Scatter plot; axes: % co-citations above the overall median (convention) vs. % co-citations below zero (novelty)]
14. FIELD EFFECTS?
Three groups appear:
» PHYSICS (6 journals) – cited as conventional, but not novel
» BIOMED (9 journals) – cited as both conventional and novel
» MULTI (5 journals) – cited as novel, but not conventional
Nature, Science, and PNAS account for 9.4% of ALL atypical co-citation pairs.
» Multidisciplinary journals are obviously not good proxies for "areas of knowledge"
» They contribute the most to the notion of "atypical", suggesting that journals are a poor basis for this study
[Scatter plot; axes: % co-citations above the overall median (convention) vs. % co-citations below zero (novelty)]
15. SUMMARY
We have replicated the UMSJ study and its primary finding:
» Papers with high convention AND high novelty are twice as likely to be highly cited as the average paper
This is a real finding! There seems to be something to the notion of "atypical combinations" that is meaningful and could be predictive.
However...
Field and journal effects are not insignificant, and given that these studies were based on journal co-citation, journals and fields may be driving "atypical combinations".
Journals are the wrong proxy for "areas of knowledge"; an alternative proxy is needed.
Other potential measurements of "atypical-ness" or "novelty" that are relatively independent of field or journal effects should be proposed and tested.
16. QUESTIONS
Thank you for your attention!