1. The document discusses generalized linear mixed models (GLMMs), which are statistical models that combine linear predictors, non-normal response distributions, link functions, and random effects.
2. It outlines some of the statistical, computational, and sociological challenges in using GLMMs, such as estimating models with large matrices and interpreting results accurately.
3. The conclusion emphasizes next steps like improving correlation structures and inference methods in GLMMs while addressing issues like proper interpretation and use by non-experts.
1. Definitions Statistics Computation Sociological Conclusions References
General-purpose tools for generalized linear mixed models
Ben Bolker
McMaster University, Mathematics & Statistics and Biology
13 September 2013
Ben Bolker
GLMMs
4. Definitions Statistics Computation Sociological Conclusions References
Generalized linear mixed models
GLMMs: a statistical modeling framework incorporating:
Linear combinations of categorical and continuous predictors, and interactions
Response distributions in the exponential family (binomial, Poisson, and extensions)
Any smooth, monotonic link function (e.g. logistic, exponential models)
Flexible combinations of blocking factors (clustering; random effects)
Applications in ecology, neurobiology, behaviour, epidemiology, real estate, . . .
8. Definitions Statistics Computation Sociological Conclusions References
Technical definition

$Y_i \sim \mathrm{Distr}(g^{-1}(\eta_i), \phi)$
(response $Y_i$; conditional distribution $\mathrm{Distr}$; inverse link function $g^{-1}$; scale parameter $\phi$)

$\eta = X\beta + Zb$
(linear predictor $\eta$ = fixed effects $X\beta$ + random effects $Zb$)

$b \sim \mathrm{MVN}(0, \Sigma(\theta))$
(conditional modes $b$; variance-covariance matrix $\Sigma(\theta)$)
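The definition above can be made concrete by simulating from it. The sketch below is not from the talk (which works in R with lme4); it is a minimal stdlib-Python illustration with assumed parameter values, drawing from a Poisson GLMM with a log link: one random intercept per block combines with the fixed effects into the linear predictor, and responses come from the conditional Poisson distribution.

```python
import math
import random

random.seed(42)

def rpois(lam):
    # Poisson draw via Knuth's algorithm: multiply uniforms until the
    # running product drops below exp(-lam)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p < L:
            return k
        k += 1

n_blocks, per_block = 10, 20   # 10 clusters, 20 observations each (illustrative)
beta0, beta1 = 1.0, 0.5        # fixed effects: intercept and slope (assumed)
sigma_b = 0.7                  # random-intercept standard deviation (assumed)

y = []
for j in range(n_blocks):
    b_j = random.gauss(0.0, sigma_b)        # b ~ N(0, sigma_b^2), one per block
    for _ in range(per_block):
        x = random.gauss(0.0, 1.0)          # continuous predictor
        eta = beta0 + beta1 * x + b_j       # linear predictor: X*beta + Z*b
        y.append(rpois(math.exp(eta)))      # response ~ Poisson(g^{-1}(eta)), log link

print(len(y), sum(y) / len(y))
```

In the talk's own setting the analogous fit would be a one-line lme4 call of the form `glmer(y ~ x + (1 | block), family = poisson)`.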
12. Definitions Statistics Computation Sociological Conclusions References
Inference
Big problem.
Inferential tools: either asymptotic or taken from classical linear models
boundary solutions (Stram and Lee, 1994)
the great p-value/degrees-of-freedom debate
small numbers of clusters
solutions: computational and/or Bayesian (parametric bootstrap, MCMC)
[Figure: inferred vs. true p-values (0.02-0.08) for four stressor treatments: Osm, Cu, H2S, Anoxia]
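The parametric bootstrap mentioned above sidesteps asymptotic distribution theory: fit the null model, simulate many datasets from that fit, refit on each, and compare the observed likelihood-ratio statistic to the simulated reference distribution. The sketch below is not the talk's code; it is a minimal stdlib-Python illustration of the idea on a deliberately simple case (a two-sample Poisson comparison, with made-up counts), where the maximum-likelihood fits are just sample means.

```python
import math
import random

random.seed(1)

def loglik(ys, lam):
    # Poisson log-likelihood, dropping log(y!) terms (they cancel in the LRT);
    # the floor on lam guards against log(0) for an all-zero simulated sample
    lam = max(lam, 1e-12)
    return sum(y * math.log(lam) - lam for y in ys)

def lrt_stat(a, b):
    # 2 * (log-lik under separate group means - log-lik under one common mean)
    la, lb = sum(a) / len(a), sum(b) / len(b)
    l0 = (sum(a) + sum(b)) / (len(a) + len(b))
    return 2 * (loglik(a, la) + loglik(b, lb) - loglik(a, l0) - loglik(b, l0))

def rpois(lam):
    # Poisson draw via Knuth's algorithm
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p < L:
            return k
        k += 1

# observed data (illustrative counts for two groups)
group_a = [3, 5, 4, 6, 2, 5, 4, 3]
group_b = [6, 7, 5, 8, 9, 6, 7, 8]

observed = lrt_stat(group_a, group_b)
null_mean = (sum(group_a) + sum(group_b)) / (len(group_a) + len(group_b))

# parametric bootstrap: simulate from the fitted null model, refit, compare
n_sim = 1000
exceed = 0
for _ in range(n_sim):
    sim_a = [rpois(null_mean) for _ in group_a]
    sim_b = [rpois(null_mean) for _ in group_b]
    if lrt_stat(sim_a, sim_b) >= observed:
        exceed += 1

p_value = (exceed + 1) / (n_sim + 1)   # add-one correction avoids p = 0
print(round(observed, 2), p_value)
```

For GLMMs the same recipe applies with each "fit" replaced by a full mixed-model fit, which is exactly why the approach is computationally expensive.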
17. Definitions Statistics Computation Sociological Conclusions References
Sociological issues
The curse of neophilia
Wide user base:
"As usual when software for complicated statistical inference procedures is broadly disseminated, there is potential for abuse and misinterpretation." (Breslow, 2004)
What if there is no good answer? "do no harm" vs. "better me than someone else"
Diagnostics and warning messages
End users vs. downstream developers
19. Definitions Statistics Computation Sociological Conclusions References
Next steps
Alternative platforms/languages
Flexible correlation structures: spatial, temporal, phylogenetic . . .
Improved MCMC methods?
Simulation tests of inferential tools (sigh)
20. Definitions Statistics Computation Sociological Conclusions References
Is it science?
"Science is what we understand well enough to explain to a computer. Art is everything else we do." (Donald Knuth)
[Figure: articles per month matching "glmm" vs. "lme4", 2006-2012]
21. Definitions Statistics Computation Sociological Conclusions References
Acknowledgments
lme4: Doug Bates, Martin Mächler, Steve Walker
Data: Adrian Stier (UBC/OSU), Sea McKeon (Smithsonian), David Julian (UF)
NSERC (Discovery)
SHARCnet
22. Definitions Statistics Computation Sociological Conclusions References
Booth, J.G. and Hobert, J.P., 1999. Journal of the Royal Statistical Society, Series B, 61(1):265-285. doi:10.1111/1467-9868.00176.
Breslow, N.E., 2004. In D.Y. Lin and P.J. Heagerty, editors, Proceedings of the Second Seattle Symposium in Biostatistics: Analysis of Correlated Data, pages 1-22. Springer. ISBN 0387208623.
McKeon, C.S., Stier, A., et al., 2012. Oecologia, 169(4):1095-1103. ISSN 0029-8549. doi:10.1007/s00442-012-2275-2.
Pinheiro, J.C. and Bates, D.M., 1996. Statistics and Computing, 6(3):289-296. doi:10.1007/BF00140873.
Ponciano, J.M., Taper, M.L., et al., 2009. Ecology, 90(2):356-362. ISSN 0012-9658.
Stram, D.O. and Lee, J.W., 1994. Biometrics, 50(4):1171-1177.
Sung, Y.J., 2007. The Annals of Statistics, 35(3):990-1011. ISSN 0090-5364. doi:10.1214/009053606000001389.