Machine Learning (ML) and Artificial Intelligence (AI) have made great strides in the past decade. We have a plethora of ML algorithms that can be used to perform a given task, be it face recognition, image classification or natural language processing. However, the explainability of ML/AI algorithms remains a major problem. Explainable AI (XAI) is a branch of ML devoted to unravelling the black-box nature of AI so that we can understand the reasons behind its decisions and outputs. There are concerns, however, that XAI sometimes produces "tools for computer scientists to explain things to other computer scientists", which defeats its purpose. To this end, a growing number of researchers have called for integration with the social sciences to build truly explainable and trustworthy AI, because philosophy and the social sciences have debated the meaning and function of an explanation for millennia and offer deeper insights [1].

In this talk, we present such an integration [2]. Our problem domain is algorithm evaluation, which considers a portfolio of algorithms (for example, a portfolio of regression algorithms) and its performance on a set of problems. The goal is to extract meaningful, explainable insights about the algorithms from the performance results. As the social-science linkage, we use Item Response Theory (IRT), a methodology from educational psychometrics. IRT is traditionally used to evaluate the difficulty and discrimination of test questions and the ability of students, and it admits causal interpretations. Using IRT, we obtain explainable insights about algorithms: their stability and consistency, the difficulty level of the problems they can handle, and their behaviour. In addition, we visualise the problem spectrum and find regions of the spectrum where particular algorithms exhibit strengths. The causal interpretations of IRT transfer to the algorithm-evaluation domain, giving us a deeper understanding of algorithms.

References
1. Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267, 1–38 (2019).
2. Kandanaarachchi, S. & Smith-Miles, K. Comprehensive algorithm portfolio evaluation using Item Response Theory. Journal of Machine Learning Research 24, 1–52 (2023).
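To give a flavour of the machinery behind the talk, the core of traditional IRT can be sketched with the standard two-parameter logistic (2PL) model: the probability that an examinee of ability theta answers an item of difficulty b and discrimination a correctly. This is a minimal illustrative sketch only; the names and parameter values below are hypothetical, and the portfolio-evaluation framework in [2] uses a modified IRT formulation rather than this textbook form.

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability that an examinee with
    ability theta succeeds on an item with discrimination a and
    difficulty b. Higher b means a harder item; higher a means the
    item separates low- and high-ability examinees more sharply."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical items: an easy problem and a hard one (same discrimination).
easy = {"a": 1.5, "b": -1.0}
hard = {"a": 1.5, "b": 2.0}

# A hypothetical examinee of middling ability.
theta = 0.5
p_easy = p_correct(theta, **easy)  # high: the easy item is below this ability
p_hard = p_correct(theta, **hard)  # low: the hard item is above this ability
```

In the algorithm-evaluation setting, the roles of examinees and items are played by algorithms and problems, so the fitted difficulty and discrimination parameters become interpretable statements about the problem set, and the latent ability trait becomes a statement about the algorithms.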