This document discusses previous research on the "hot hand fallacy" in basketball and presents a new analysis using advanced tracking data. The authors find no evidence that making previous shots increases the likelihood of making subsequent shots, with some exceptions. They also find that both offensive and defensive players act as if the hot hand exists, possibly explaining previous null findings. This behavior warrants further investigation in future studies of any potential hot hand effect.
Free software and programming models in research with supercomputa... Andrés Gómez
Presentation given at the II Congreso de Software Libre para Educación in July 2013, reporting the results of a survey of CESGA users on their computational needs and the programming tools they use.
The document defines software as the set of logical components needed to carry out specific tasks, in contrast to hardware, which comprises the physical components. It explains that software refers to the programs and data stored on a computer and includes the instructions that allow the hardware to perform its tasks. It also discusses the similarities and differences between software and hardware development.
The document describes analysis of variance (ANOVA), a statistical method for determining whether there are significant differences among the means of three or more groups. It presents an ANOVA example analyzing the effects of four calcification substances (A, B, C, D) on calcification thickness in the femur. The ANOVA results show no significant differences among the means of the four groups.
The document presents information on four animal species: the lion, the tiger, the vicuña, and the Sechura fox. For each species it gives the family, genus, and species, along with further details on habitat, physical characteristics, and in some cases cultural importance.
This document gives instructions for inserting and manipulating various elements in Microsoft Word 2007, including shapes, arrows, lines, text boxes, flowcharts, and lightning bolts. It explains how to insert, copy, move, and delete these elements through actions such as clicking the Insert tab, selecting an element type in the shapes gallery, right-clicking to delete, and dragging and dropping to move.
This document covers several topics in corporate network security. First, it defines the main threats and attacks, such as sniffing, man-in-the-middle, spoofing, and pharming. Next, it explains intrusion detection systems and port analysis as ways to improve security. Finally, it covers secure communications over protocols such as SSH, SSL, and VPN, and the risks and security recommendations for Wi-Fi networks.
This document argues that animal abuse is an expression of fascism and speciesism. It defines animal abuse as the acceptance of violence and the suppression of the life of any species. It explains that the bombardment with images of violent acts against animals is a fascist strategy for normalizing violence. It concludes that animal abuse rarely stops there and can lead to a broader acceptance of violence.
This document presents a game in which the player must guess a number between 1 and 60. The system checks whether the player's number is in a list of numbers and reports whether the guess was correct. The player can try again several times.
The document discusses the symbolic and figurative language that Christ used to reveal the word of God. It explains that this language should neither be taken literally nor dismissed as absurd, but must be understood in order to illuminate the future of humanity. It also shows a diagram of how God has revealed his message progressively through various religious figures over the course of history.
The document describes positivism as a philosophical doctrine that emerged in the late 19th and early 20th centuries amid technological change and the decline of the metaphysical and the religious. Positivism holds that only scientific knowledge obtained through experimentation and the scientific method is valid for understanding the world objectively. Auguste Comte coined the term, and David Hume and other philosophers anticipated some of its concepts based on experience
The document presents the media and ICT use management plan of the Institución Educativa Alfonso López Pumarejo. The plan was prepared with guidance from the Ministerio de Educación Nacional and the Universidad del Cauca, and aims to improve educational processes by incorporating ICT. It includes a mission, a vision, and goals related to ICT use, as well as strategies for implementing ICT across the institution's management areas.
This document provides information about the artist Michael Bublé, including a list of some of his popular songs from 2005-2014. It then discusses how Bublé's image is currently promoted through his unique style of music that draws on his Canadian/Italian background. Bublé is described as one of the most famous "crooners" who sings in a soft, low voice and replicates the styles of artists from the 20th century like Frank Sinatra. The document also examines whether Bublé's image is inverted or ready-made and how his commercial imperative allows him to generate popularity through his iconic style without needing extravagant promotional methods.
This document presents the introduction to a book that analyzes the relationship between wages and the cost of living in Peru between 1821 and 1879. It explains that after independence the colonial economy remained in force, condemning the rural and urban masses to poverty. Over the period, real wages fell drastically because of inflation and currency devaluation, while landowners and merchants increased their profits. The book will examine how these policies affected
Debate Gabriel Salazar - Alfredo Jocelyn-Holt - Rolf Lüders - Centro de Estudi... Historia Salinas Sánchez
This document presents a review of the book "Mercaderes, Empresarios y Capitalistas (Chile, Siglo XIX)" by Gabriel Salazar. The book analyzes the history of capitalism in Chile from independence to the end of the 19th century. The review praises the breadth and depth of Salazar's research, but also criticizes some of his theses as excessively deterministic and monocausal. Even so, it acknowledges the importance and value of the book for recovering forgotten historical subjects and
The document describes the anatomy and physiology of the nervous system. It explains that the system consists of the central nervous system (brain and spinal cord) and the peripheral nervous system (nerves and ganglia). It also describes the main parts of the brain, such as the thalamus, hypothalamus, and brainstem, as well as the spinal cord, the nerve plexuses, and the autonomic nervous system.
Panagiotou's mathematical model describes water loss and solids gain during the immersion of a sample as a function of solution concentration, temperature, time, agitation speed, and sample size. The model includes equations for water loss and soluble solids gain, which are solved using the problem data to determine the coefficients A and B and obtain equations of the form y = A + Bx.
This document presents an introduction to social networks, including their definition, main types such as Facebook and Twitter, common uses such as sharing photos and keeping informed, and advice on security and privacy when using them. It also briefly describes the origin and evolution of social networking sites since Classmates.com in 1995 and their enormous success, with more than 940 million users in only 7 years.
This document presents a guide to improving self-esteem. It explains that self-esteem refers to how we value and see ourselves. Low self-esteem can be recognized by negative thoughts and statements about oneself. It also describes techniques for improving self-esteem, such as identifying automatic negative thoughts and replacing them with more rational responses. Finally, it emphasizes the importance of changing behaviors to break vicious circles of negativity
The document describes a typographic design project that analyzes legibility and typographic contrast. Different typefaces, such as Calisto MT and Pristina, are used in sizes ranging from 12pt to 36pt for a title, subtitle, body text, and caption, with the purpose of describing a dish of Creole food.
This document contains advertisements and a price list for spring beds from the Guhdo spring bed store in Surabaya. It gives information on various spring bed brands and models along with their prices. The store's full contact details are listed, together with a website address for the latest complete pricing information.
Antihemoroid DOEN (Suppositoria) AMBEJOSS. Do you suffer from pain in the buttocks caused by hemorrhoids? Ambejoss De Nature herbal medicine for hemorrhoids is a herbal remedy made from extracts of daun ungu, mahkota dewa, and white turmeric (kunyit putih) that is safe in helping recovery from hemorrhoids. Below is a brief explanation of hemorrhoids. Hemorrhoids are a very common medical problem experienced by many men and women. They occur when the blood vessels in the anal area swell. When the swelling is inside the rectum, they are internal hemorrhoids; outside the rectum, the swollen blood vessels are called external hemorrhoids. Common causes of swollen veins include constipation and straining during bowel movements, pregnancy, and other factors that put pressure on the blood vessels in the rectal area.
This paper develops a model to analyze how information can be revealed over time to maximize expected suspense or surprise experienced by an audience. Suspense is defined as uncertainty about the next period's beliefs, measured by the variance. Surprise is defined as the difference between the current and previous period's beliefs. The optimal policies for suspense and surprise are derived. For suspense, uncertainty decreases over time through asymmetric "plot twists." For surprise, uncertainty may increase or decrease, and beliefs change gradually with many periods.
The document discusses the myth of clutch hitting in baseball. Several studies have found no evidence that professional baseball players can consistently perform better in high-pressure situations. Despite this, the myth of clutch hitting persists due to factors like selective memory, the media spotlighting memorable clutch performances, and the desire of fans and players to believe in it. The concept of clutch hitting is difficult to define statistically, and different analyses have found no correlation between a player's clutch performance in one season and the next.
This document provides an overview of American football and discusses opportunities for statistical analysis and research. It describes the basic structure of the game and notes that while detailed play-by-play data exists, it has not been easily accessible for academic research. The document then discusses several areas where statistical methods could be applied, including evaluating individual player and team performance, developing models to assess strategy around decisions like fourth downs and extra points, and creating statistical ratings of teams. It argues more research is needed to better understand the game and inform coaching decisions.
This document is a dissertation that examines the determinants of NHL goalies' salaries. It aims to extend previous research by considering factors related to a player's popularity in addition to on-ice performance statistics. The author argues that after the 2004-2005 NHL lockout, which increased league profitability and popularity, goalies' wages became dependent on both on-ice production and off-ice popularity measures. Using regression analysis, the paper finds that including variables related to popularity significantly improves the model's ability to explain variation in goalies' salaries compared to only using performance statistics. The document provides context on previous literature, discusses the impact of the lockout, and outlines the data and methodology used in the empirical analysis.
The document discusses the role and importance of sports statisticians in analyzing game statistics. It explains that statisticians record various metrics during games like field goals, assists and rebounds. They then analyze the statistics to provide insights for coaches, players and fans. Studies have shown that teams performing better in assists, steals and defensive rebounds tend to be more successful. Statisticians also examine how factors like home/away games and starters/non-starters influence statistics and outcomes. Their analysis is valuable for improving team strategies and preparation.
This document provides an introduction to American football and a discussion of the history and current state of statistical analysis and player evaluation in the sport. It notes that while detailed play-by-play data exists, it has not always been easily accessible to academic researchers. As a result, statistical and analytical research in football lags behind other sports like baseball. The document outlines some of the challenges in analyzing football statistically, given factors like discrete scoring increments and the continuous movement of play over a large field. It then reviews the current basic approaches to evaluating different positions but notes the limitations, before discussing opportunities for more sophisticated statistical analysis and modeling.
This document provides an overview of American football and discusses opportunities for statistical analysis and research. It covers the following key points:
- American football involves two teams trying to score points by advancing the ball down the field. Scoring can occur via touchdowns, field goals, or safeties.
- While basic statistics are collected, opportunities exist to more formally evaluate players and analyze strategy using statistical methods. Areas that could be improved include evaluating positions like quarterback and evaluating based on points scored rather than just yardage.
- Open questions remain around how to partition credit for plays among multiple players, and how to explicitly link player performance to points scored and games won. Advanced statistical analysis could help address strategy questions around extra points,
NEW DELHI: For those obsessed with size zero, here's a phone as thin as paper. Called PaperPhone, the smartphone is presently in prototype stage. It uses the latest printing technologies to print copper circuits and wiring onto a 9.5-centimetre surface. A layer of E Ink, used in Amazon Inc's Kindle eReader, is applied to act as a display. As for the OS, it is powered by Google Android. To be unveiled at the forthcoming Association for Computing Machinery's CHI conference in Canada, the PaperPhone has been developed by a team of researchers from Arizona State University, Queen's University, and E Ink Corporation. The 'flexible' phone can store books, play music and make phone calls. According to the researchers, bend gestures are fed into a gesture-recognition engine, which can associate certain movements with certain instructions. As creator Roel Vertegaal, the director of Queen's University Human Media Lab, told The Vancouver Sun, "So you can bend the top in order to page forward or make a bookmark, you can navigate left and right on your home screen in order to open an icon, and you can make a call by squeezing the paper so that it curves, and then if you want to stop the call you pop it back into shape." "This is the future. Everything is going to look and feel like this within five years," he said. This computer looks, feels and operates like a small sheet of interactive paper. You interact with it by bending it into a cell phone, flipping the corner to turn pages, or writing on it with a pen, Vertegaal reportedly added. As for the pricing, while the prototype costs as much as $6,000 to $7,000, the device is likely to be priced at less than $100.
http://qa.us/aaaaG9 is a link Multi channel content from new page nikhilawareness
This document provides an overview of American football and discusses opportunities for statistical analysis and research. It covers the following key points:
- American football involves two teams trying to score points by advancing the ball down the field. Scoring can occur via touchdowns, field goals, or safeties.
- While basic statistics are collected, opportunities exist to more formally evaluate players and analyze strategy using statistical methods. Areas that could be improved include evaluating positions like quarterback and evaluating based on points scored rather than just yardage.
- Open questions remain around how to partition credit for plays among multiple players, and how to explicitly link player performance to points scored and games won. Advanced statistical analysis of strategy around fourth downs, extra points,
Go to all channels so that I may test your stats tom nikhilawareness
This document provides an introduction to analyzing American football statistically. It discusses how early statistical analysis focused on computerized systems to track opponents' tendencies. However, academic research has lagged compared to other sports due to issues with data availability, the complex nature of the game, and proprietary research related to gambling. The document then outlines open problems in statistically evaluating individual players at different positions, developing models to assess strategy, and rating teams.
This document provides an overview of American football and discusses opportunities for statistical analysis in evaluating players and strategies. It notes that while extensive play-by-play data exists, it has not been easily accessible for academic research. Key areas discussed include evaluating kickers and quarterbacks, partitioning credit among contributing players, and using points scored and games won in evaluations rather than just yardage. The document also discusses different types of strategic questions that could be analyzed, like extra point and fourth down decision making.
This paper examines the relationship between MLB players' salaries and various performance statistics from the 2013 season. The authors regress salary data against age, games played, home runs, slugging percentage, hits, at bats, and on-base percentage for 447 players after removing pitchers. Their model explains 51.39% of salary variation, suggesting these statistics significantly influence pay. Home runs, hits, at bats, and on-base percentage positively impact salary, while slugging percentage has a negative effect. The paper concludes player salaries can be reasonably predicted using performance data.
The document summarizes a study examining whether race is a factor in determining salaries for NFL players. The study collected data on 130 players' salaries, statistics, race, position, and years played. Regression analysis found that race was not a statistically significant predictor of salary. However, the study notes that positions in the NFL are dominated by certain races, with whites more likely to play quarterback and blacks more likely to play running back or wide receiver. This correlation between race and position raises questions about whether racial stereotypes influence player evaluations and positioning. Further research is needed to understand potential discrimination in how players are categorized by position.
This document summarizes the author's proposal for a new advanced lacrosse statistic called LAX IMPACT! to better evaluate player performance in Major League Lacrosse. It finds current lacrosse stats are limited and don't account for context. LAX IMPACT! calculates a player's points per team possession by considering shots, ground balls, and faceoff wins. An analysis of 2015 MLL data ranks players and finds top teams don't always have highest LAX IMPACT! due to style of play differences. The author believes advanced stats can help teams evaluate players and improve as salaries rise.
This document provides an analysis of MLB player valuation from 2010-2015 based on Marginal Revenue Product (MRP). The author developed models for team winning percentage and revenue. The winning percentage model uses wRC, UZR and xFIP as variables. The revenue model uses winning percentage, population, average ticket price and income per capita. These models are then used to determine which players were overvalued or undervalued based on their contribution to winning percentage and team revenue. The author aims to see if defensive metrics are more accurately valued now compared to past studies.
Big Data and The Hot Hand Fallacy:
A Nonparametric Approach
Jacob Dorn, Lukas Hager, David Mendelssohn∗
University of Chicago
March 2016
Abstract
While many fans believe in the concept of a “Hot Hand,” where a player who has
made many shots in close succession will have more success in the next attempt than
normal, statisticians have argued that shooting success is independent of whether or
not a player is indeed “hot.” We add to this discussion by using a dataset which has
only been in existence since 2010 and novel, nonparametric methods for defining heat.
In general, we find no effect of heat on shot outcomes after accounting for location. In
some special cases, we find that hot players in fact fare worse on their next attempt
by 2.5 to 5 percentage points. Additionally, we find that both offensive and defensive
players act as if there were such a thing as the Hot Hand, possibly explaining our lack of
Hot Hand findings; the apparent worse performance of hot players may, instead, reflect endogenous
unobserved defender characteristics. This behavior is an important consideration in
future evaluations of any Hot Hand effect.
1 Introduction and Review of Relevant Literature
One of the most compelling spectacles in a basketball game is a ‘hot streak’, where a player
seemingly cannot miss. This effect is well documented in the more poetic portions of the
annals of basketball; many players, whether amateur or professional, have described them-
selves as hot when racking up high scoring totals. Purvis Short of the Golden State Warriors
famously declared that the Hot Hand “...[is] hard to describe. But the basket seems to be
so wide. No matter what you do, you know the ball is going to go in” [6, p. 158]. Based
on this statement, the Hot Hand effect can be conceptualized as immediately-prior success
positively correlating with immediately-subsequent success.
By and large, researchers have disagreed with Short’s contention of the Hot Hand’s existence.
The classic explanation follows the premise of Kahneman and Tversky’s “Law of Small
Numbers,” that humans are pattern-seeking beings, attributing meaning to random sampling
variation [8]. Essentially, they argue that, when a fan thinks a player is hot, they are
over-interpreting a chance occurrence.
∗ With thanks to Winnie van Dijk, Ken DeGennaro, Jordan Solomon, and Charlie Rohlf
While this disagreement might not look like an economic question at first glance, it can
be interpreted not only statistically, but also economically in many ways. The first such
interpretation is practical: An NBA championship may bring perhaps tens of millions of
dollars in revenue to a local economy [3]. While much of that income ends up in the hands of
already-wealthy team owners, and some of it may come at the expense of other parts of the
local economy, a true (or false) Hot Hand could present an inefficiency in player evaluation,
and thus be a factor in the allocation of a nontrivial amount of money. If there are hot
players, a coach who lets such a player take a break in the midst of a streak is a foolish
one. Further, if only certain players have the ability to get “hot”, then a coach ought to
consciously play them more when the team is losing with little time left, as the Hot Hand
becomes particularly beneficial if the player manages to make the game close by playing
well.¹
The second interpretation views the response to perceived hot streaks as a particularly plain
instance of classic economic theory. A generalized corporation has the goal of maximizing
shareholder profit. Firms theoretically optimize profits, but the agents doing so do not always
have aligned incentives. A team is a specific corporation where “profit” is likely some mix of
the prestige of victory and the joy of hard cash. During a game, these two presumably align
and agents – players and coaches – should, at least in theory, aim to optimize the likelihood
of victory.
We may speculate on the implications of possible outcomes. If there are hot streaks and
coaches act as if there are not, it suggests some bit of information failure. But if there are no
real hot streaks, and agents act as if there are, it suggests some other incentives may be at
play. Perhaps coaches and defenders maneuver to avoid appearing, respectively, as oblivious
chumps who spurn a well-positioned asset by not playing a hot player or as absentminded
marks who fail to adjust to the time-specific threat of a hot shooter. Perhaps shooters take
advantage of the widespread belief that they have temporary special powers in order to take
more shots and experience a bit more of the limelight. More shots might even translate to
higher salaries, yielding mismatched incentives. While we do not consider these implications
in our paper, they are certainly relevant for future consideration.
¹ It is worth noting that we do not consider time effects in “cooling” a player’s hot streak, but our inclusion
of only the prior four shots in evaluating heat should, hopefully, make this omission’s effects negligible.
The third interpretation is to view our paper as an investigation of a peculiar cultural
situation. The NBA can be viewed as a series of high-stakes Bernoulli trials – effectively,
coin flips with differing probabilities. Organizations spend hundreds of millions of dollars in
the hopes that, on any given night, they will have more coins come up heads than another
organization. The efficiency of these choices of metaphorical coins, of throwing style, and
of ordering of throws, gets debated for hours each day before and during the season on
commute-time radio and Internet forums (e.g. [5]). Many of these commentators think there
is such a thing as the Hot Hand; in these terms, a persistent belief in correlation across
Bernoulli trials. We would like to know if they are correct.
In 1985, Gilovich and Tversky wrote a seminal paper on this subject entitled “The Hot
Hand in Basketball: On the Misperception of Random Sequences.” After defining hotness
as a player hitting multiple shots in a row, they found (for a Philadelphia 76ers team over
48 home games) no statistical difference between the observed number of hot streaks and the
number expected if shots were independent Bernoulli trials. This paper has been highly
regarded for a long time. In fact, the Harvard basketball team was criticized on the basis of
this paper’s findings when they professed belief in the Hot Hand [7]. However, Gilovich and
Tversky have a notable assumption, commendably mentioned in the beginning of the paper:
“Each player has an ensemble of shots that vary in difficulty (depending, for
example, on the distance from the basket and on defensive pressure), and each
shot is randomly selected from this ensemble.” [2, p. 297]
This statement, while perhaps necessary when using only data available in 1985, represents
a fundamental oversimplification of the problem. Player shot choice is non-constant. And,
in fact, we find that their choice depends on their heat level, as hot players take worse shots.
As such, treating the overall shooting percentage as constant across shots is fundamentally
misguided, as the very instance of a hot streak makes its continuation less likely.
Their approach was improved upon somewhat by Bocskocsky et al.’s “The Hot Hand: A New
Approach to an Old ‘Fallacy.’” The authors set about to ameliorate issues with Gilovich and
Tversky’s approach by defining heat as exceeding some expectation of performance, rather
than raw streaks of hits and misses. This required the creation of some way of defining the
expectation of a certain shot on the court. To achieve this end, Bocskocsky et al. regressed
shot outcomes on dummy variables for position on court, by dividing the floor into 2-by-2
foot boxes with assumed homogeneous effects across players, and regressing on dummies for
box and player [1]. Thus, their model of the expected probability of making a shot took distance
into account only insofar as discrete squares can be a proxy for location. Further, they used
a linear regression model because of a lack of convergence of their Probit models; we consider
our KNN approach, using a similar data source, preferable.
2 Data
The SportVU system is an arrangement of cameras hung from the rafters of arenas for all
NBA teams. It collects 25 readings per second to provide statistics on player position, ball
position, and play information. STATS LLC provides an Application Programmer Interface
(API) to access the data, beginning in the 2010-2011 NBA season. We use two API sources:
shot chart data and shot log data. Shot chart data offers position, time, and outcome
information. Shot type (layup, jump shot, etc.) and some other provided attributes were
discarded from our analysis. We also removed the provided distance value in favor of a more
exact measure we calculated using (X,Y) data.² We were able to retrieve approximately
205,000 shots worth of data from the shot charts from the 2014-2015 season. However,
because shots on which the shooter was fouled are only considered shots if the basket is
made, we limited our study to the 160,000 shots where no foul occurred. An example of all
of the remaining individual (X,Y) data points for Stephen Curry is provided as Figure 1.
Unfortunately, the shot log dataset became unavailable after we began this project. By
communicating directly with STATS, we were able to procure fewer than 10,000 shots worth
of data from the 2015-2016 season. This data is anonymized, meaning we could not integrate
it with the shot chart data we had already obtained, nor could we determine if a foul had
been committed on any shot. Not only did this mean that any analysis we could perform
would potentially suffer from omitted variable bias in the form of missing attributes, but it
also meant that we had very few shots worth of shot log data for any given player (and, at
most, four games per player). This issue is further accentuated by the fact that some players
shoot much more often than others, giving us even less data on those players who attempt
shots infrequently. Histograms of number of shots by a player in shot chart data and shot
log data are provided as Figure 2 and Figure 3 respectively.
As is clearly evident by comparing these figures, the shot chart data commonly has 1000 shots
per player, while, in the less complete shot log data, almost no one reaches even one tenth
of that frequency. Still, for analyzing this smaller shot log data, we used closest defender
height, closest defender distance, number of dribbles taken before the shot, an indicator
for scoring, (X,Y) location, (anonymized) game id number, (anonymized) player id number,
game period, game clock, and shot clock.³ Length of touch time was also provided, but we
exclude this from the analysis on theoretical grounds because it is potentially an effect of
player heat and thus should not be evaluated as a factor in order to prevent endogeneity,
with a justification we detail below. We also did not consider effects of game time, shot clock,
or number of dribbles, though these effects may turn out to be relevant in future analyses.
² We verified that our estimates were consistent with the given data. In fact, we found that the STATS
data consistently rounded in the same direction, so our imputation was a slight improvement.
Figure 1: All Stephen Curry shots. Green triangles are made shots. Red circles are missed
shots. (Axes: x and y court coordinates.)
Figure 2: Histogram of number of shots per player in the larger shot chart dataset. (Horizontal axis: number of shots by player, log scale; vertical axis: density.)
Figure 3: Histogram of number of shots per player in the smaller shot log dataset. (Horizontal axis: number of shots by player, log scale; vertical axis: density.)
Figure 4: Number of games by player in shot log dataset. (Horizontal axis: number of games in dataset; vertical axis: number of players.)
3 Methods
We consider each attempted shot to be, at the moment the offensive player decides to shoot,
a Bernoulli random variable with unknown probability. The probability is clearly in the
interval [0, 1], with a mix of factors in the probability generation model both unknown and
known to the econometrician. We may then write the outcome model for a shot by player i
in game g at time t as:
$$\text{make}_{(i,g,t)} = f\left(Z_{(g,t)}, Y_{(i,g,t)}\right) + \eta_{(i,g,t)} = f\left(X_{(i,g,t)}\right) + \eta_{(i,g,t)}$$
Here, f() is some unknown function of Z, factors which might apply identically to any other
player in the same game at the same time (e.g. if there is something odd about the arena at
that moment), and Y , factors which are time-, player-, and game-specific. It is worth noting
that, if the factors in Z can be consistently estimated for a player’s data, we can do just as
well by treating any factors in Z as factors in Y (though there may be losses of efficiency
in the finite case). We then combine these factors as X(i,g,t). As is well-known, since f is
necessarily in the interval [0, 1], η may be written as an exogenous (to f) variable to ensure
that the make function takes on the values 0 or 1, 0 for missing and 1 for making, and f()
is the probability of the shot being made. Our question of interest may then be restated as
a test for substantial evidence that heat belongs as an input to the functional form. We aim
to test the null hypothesis that most fans are wrong, i.e. our null hypothesis is that heat is
not a factor in f().
Before we may consider whether heat belongs as an argument to the functional form, we
must have some definition of heat. In turn, heat is defined as outperforming expectations
in prior shots, so to have a definition of heat, we must have a definition of expectations. In
our main dataset, we have only position on court and time. We chose not to pursue time
controls, so we are left with only two potential inputs to control for in expectations: distance
and position.
³ For a reader more interested in the econometric content than the basketball content of this paper, the
shot clock effectively refers to the time remaining for the offensive team to attempt a shot. More details can be
found through alternative sources.
We do not see any good method of parametrizing distance. Proportion of shots made by
nearest distance can be found in Figure 5.⁴ While, in general, shots from farther away are
less likely to go in, there is a large stretch of distances where farther shots are more likely
to go in, before percentages fall off quickly with distances beyond the three-point line. This
makes the parametrization of distance difficult with any polynomial.
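As a rough illustration of how a proportion-by-distance summary of this kind might be computed, the sketch below bins shots by rounding log distance to the nearest multiple of 0.2, exponentiates the bin label, and takes the share of makes in each bin. The function name, the use of the natural logarithm, and the choice to skip shots recorded at zero distance are assumptions made for illustration; the text does not state them.

```python
import math
from collections import defaultdict


def proportion_made_by_log_distance(distances, made, step=0.2):
    """Share of made shots per log-distance bin (hypothetical helper).

    distances: shot distances in feet; made: matching 0/1 outcomes.
    Returns {representative distance: proportion made}, sorted by distance.
    """
    bins = defaultdict(lambda: [0, 0])  # bin index -> [makes, attempts]
    for d, m in zip(distances, made):
        if d <= 0:
            continue  # log is undefined at zero distance (assumed: skip)
        key = round(math.log(d) / step)  # nearest multiple of `step` in log scale
        bins[key][0] += m
        bins[key][1] += 1
    return {
        math.exp(key * step): makes / attempts
        for key, (makes, attempts) in sorted(bins.items())
    }
```

Plotting the returned proportions against their keys on a log axis would give a curve of the same general kind as the distance figure, where any non-monotone stretch is easy to see.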
Indeed, we consider any attempt to parametrize these positional factors with a generalized
equation form difficult, because of the variety of player-specific positional factors for which
no econometrician can account. Consider Figure 6, a graph of Stephen Curry’s shooting
percentages by four-square-foot (“2-by-2”) box in our dataset. Curry shoots far better in
the lower-left corner than the lower-right corner, and has a large patch of particularly good
shooting near (-8, 8). While the first factor might be captured by side of court, the latter
cannot easily be included in any top-down analysis.
There is also a more pernicious threat to any analysis of our larger dataset: defenders.
Because of limitations on data availability, we cannot incorporate any defender information
into our shot chart analysis. If defenders play more closely when the offensive player is hot,
this is a threat to exogeneity. The defender problem is an unavoidable result of temporary
data limitations. The issues of parametrizations, though, can be addressed with a different
prediction method: K Nearest Neighbors (KNN).
Given some number of shots (K) and some cutoff value (a distance), KNN predicts a shot’s
likelihood of going in by taking the sample average of the outcomes of the first K shots that
are within the cutoff. As we increase K, we incorporate more data and, if the additional data
is sufficiently similar, achieve more-exact predictions. However, many players lack sufficient
shots in all locations, so increasing K means that we cannot predict some shots for some
players. To remedy this, we can increase our cutoff, but this leads to us predicting a shot
using the outcomes of attempts that are less and less similar to it in terms of location. As
such, deciding which K and which cutoff value to use is a sort of optimization: we want to
be able to predict as many shots as possible, while also making sure that those predictions
are as accurate as possible. KNN further represents an attractive choice of specification,
as it has the property that when K → ∞ and the ratio K/n → 0 as n → ∞, the bias of
the estimator goes to 0, and, under suitable moment conditions on f(), the variance of the
estimator tends towards 0 as well [4]. While our data is finite, a large K and small cutoff
should then yield low-variance and low-bias estimation of f().
⁴ Data is rounded in log scale to the nearest multiple of 0.2, then exponentiated, because of large swings
in probability among the many shots near the basket.
Figure 5: Shooting percentage, by distance, in larger dataset. Vertical line represents general
distance of three-point line. (Horizontal axis: distance, log scale; vertical axis: proportion made.)
We construct our KNN estimate as follows:
1. For each player’s shot, consider all of their other shots
2. If there are not at least K other shots by the player within the chosen cutoff distance
of the given shot (in our case, the cutoff is evaluated under the Euclidean metric, and
so can be measured in feet), make no prediction
3. If there are at least K other shots by the player within the chosen cutoff of the given
shot, use the proportion of those K which were successful
4. Optionally, among the shots where we can make a prediction, regress outcome on
prediction in a simple linear model to generate an improved “fitted” prediction
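The four steps above can be sketched in code. This is a minimal illustration rather than the implementation used for the results: the function names, the array layout, and the reading of step 3 as "the K nearest shots, all of which must lie within the cutoff" are assumptions. Passing `cutoff=float("inf")` corresponds to a specification with no cutoff.

```python
import numpy as np


def knn_predictions(xy, made, player, k=25, cutoff=5.0):
    """Leave-one-out KNN make probability for each shot, within player.

    xy: (n, 2) shot coordinates in feet; made: (n,) 0/1 outcomes;
    player: (n,) player ids. Returns NaN where no prediction is made.
    """
    xy = np.asarray(xy, dtype=float)
    made = np.asarray(made, dtype=float)
    player = np.asarray(player)
    pred = np.full(len(made), np.nan)
    for pid in np.unique(player):
        idx = np.flatnonzero(player == pid)
        if len(idx) <= k:
            continue  # fewer than K *other* shots: no prediction possible
        pts = xy[idx]
        # Pairwise Euclidean distances among this player's shots.
        dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        np.fill_diagonal(dist, np.inf)  # step 1: only the player's other shots
        for row, i in enumerate(idx):
            nearest = np.argsort(dist[row], kind="stable")[:k]
            # Step 2: skip unless the K-th nearest shot is within the cutoff.
            if dist[row, nearest[-1]] > cutoff:
                continue
            # Step 3: proportion of those K shots that were made.
            pred[i] = made[idx[nearest]].mean()
    return pred


def fitted_predictions(pred, made):
    """Step 4 (optional): regress outcome on the KNN prediction."""
    pred = np.asarray(pred, dtype=float)
    made = np.asarray(made, dtype=float)
    ok = ~np.isnan(pred)
    slope, intercept = np.polyfit(pred[ok], made[ok], 1)
    return intercept + slope * pred  # NaN stays NaN where no prediction exists
```

The pairwise distance matrix is quadratic in a player's shot count, which is manageable for roughly a thousand shots per player; a spatial index would be the natural substitute for much larger samples.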
Choosing a specification for KNN (values for K and cutoff) is a non-trivial optimization
problem. We want to be able to predict on as many shots as possible, while also making
good predictions. We considered these factors in picking a specification, based on R² values
for different specifications, visible in Figure 7. While R² is certainly problematic when used
alone as a metric, it gives us a good idea of the fit of the specifications used. The best
specification by this metric was K=25 with a cutoff of 5 feet. However, we could only
predict 38% of shots with this standard. These 38% may differ systematically from
the remainder of our dataset, so we also consider the R²-maximizing specification which covers at
least 80% of shots. This specification ended up being a K-value of 50 with no cutoff, for
which we can make a prediction on 99% of shots. We further split our sample arbitrarily in
half and applied KNN to ensure that no overfitting was occurring; the results are shown in
Figure 8. We see that the prediction is worse, but not so much worse as to suggest that KNN
has overfitted for any given specification.
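The trade-off between coverage and fit can be made concrete with a small helper. The sketch below assumes predictions are stored with NaN where no prediction could be made, and it assumes the "simple" R² is computed as one minus the ratio of squared prediction error to total variation among predicted shots, which can be negative for poor raw predictions. Both the function name and that definition are illustrative assumptions.

```python
import numpy as np


def coverage_and_r2(pred, made):
    """Share of shots with a prediction, and R^2 of the raw predictions.

    pred: predicted make probabilities, NaN where no prediction was made.
    made: matching 0/1 outcomes. Assumes the predicted shots are not all
    makes or all misses, so the total variation is nonzero.
    """
    pred = np.asarray(pred, dtype=float)
    made = np.asarray(made, dtype=float)
    ok = ~np.isnan(pred)
    coverage = ok.mean()
    resid = made[ok] - pred[ok]
    total = made[ok] - made[ok].mean()
    r2 = 1.0 - (resid @ resid) / (total @ total)
    return coverage, r2
```

Evaluating this over a grid of K and cutoff values, and keeping the best R² among specifications that clear a coverage floor such as 80%, mirrors the selection logic described above.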
While there are alternatives, such as Bocskocsky et al.’s division of the court into 2-by-2 foot
squares with player fixed effects (and other factors which we did not incorporate into our
analysis), we find that, after regressing on KNN predictions to produce “fitted” predictions,
our low-cutoff prediction method outperforms all other parametrizations tried (see Table 1),
with fewer degrees of freedom needed. Admittedly, our specification with no cutoff
underperforms other specifications, but, as it is solely used as a robustness check, this is of little
consequence.
Figure 8: Predictions based on half of the data. (Horizontal axis: k, with ticks at 1, 10, 100; vertical axis: knn_r_sq; one series per cutoff: 2, 5, 10, 25, Inf.)
Table 1: Comparison of R² for various prediction methods in NBA location-only dataset⁵
prediction                     regression r sq   adjusted r sq   simple knn r sq
box pred with fe               0.079             0.073
linear fe pred                 0.072             0.070
third deg fe pred              0.080             0.077
fifth deg fe pred              0.085             0.082
lin by player                  0.078             0.073
lin plus theta by player       0.080             0.075
lin plus sin by player         0.080             0.075
lin plus triangle by player    0.083             0.075
cubic by player                0.089             0.080
cubic plus theta by player     0.092             0.082
cubic plus sin by player       0.092             0.082
cubic plus triangle by player  0.094             0.082
k=50 cutoff=none               0.061                             0.054
k=25 cutoff=5                  0.095                             0.084
We then have a method to test the hypothesis that heat levels have no effect on shooting
percentage. If it were truly a part of our functional form, we could not give a specific level
of effect without parametrizing f(). But, to test our null (that each shot is independent),
we may consider the derivative of f() with respect to heat, however heat is defined. If f
is, in fact, differentiable with respect to heat, this function has a linear first order Taylor
Approximation, so we may include the average marginal effect of heat, however heat is
defined, as:
$$\text{make}_{(i,g,t)} = \beta\,\text{knn}_{(i,g,t)} + \gamma\,\text{heat}_{(i,g,t)} + \epsilon_{(i,g,t)}$$
Where knn(i,g,t) is our KNN prediction of the shot calculated using the player’s other shots,
as explained above.
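A minimal sketch of estimating this equation by ordinary least squares follows. It fits the two coefficients without an intercept, matching the equation as displayed, and reports classical (homoskedastic) standard errors. The function name and the choice of standard errors are assumptions for illustration; a linear probability model would usually call for robust or clustered errors.

```python
import numpy as np


def heat_regression(make, knn, heat):
    """OLS of make on knn and heat with no intercept.

    Returns (coef, se): coef[0] estimates beta, coef[1] estimates gamma.
    Standard errors assume homoskedastic, uncorrelated errors (an assumption).
    """
    X = np.column_stack([knn, heat]).astype(float)
    y = np.asarray(make, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = (resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, np.sqrt(np.diag(cov))
```

Under the null hypothesis that heat is not an argument of f(), the estimate of γ should be statistically indistinguishable from zero.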
Even if we were to find an effect of heat, the likely interpretation would be endogeneity.
Unobserved factors in shooting (e.g. defender quality, whether the shooter fought with their spouse
the prior night, the quality of the shooter’s lunch, time-specific injuries, etc.) are almost
certainly correlated among shots in the same game, and even more so, among shots in the
same game at similar times. Indeed, when viewed in this way, it would be shocking not to
find apparent positive effects of heat. It would suggest that the econometrician accounted
for all significant non-shot-specific factors!
⁵ Here, “box pred with fe” refers to our attempt to replicate Bocskocsky et al.’s method of division into
2-by-2 boxes, plus player fixed effects, though we did not have their other controls (namely their controls for time
remaining, score differential, play-by-play categorization, angle of defender, and height differential). “lin,”
“linear”, “third”, “cubic”, and “fifth” refer to, respectively, 1st, 1st, 3rd, 3rd, and 5th degree polynomials of
imputed distance. “theta” refers to regressing on angle between shooter and basket, relative to the symmetric
line from basket to basket, while “sin” refers to the sine of that angle, and “triangle” regressions divided the
court with the y = x and y = −x lines to produce triangles for left, right, top, and bottom portions of the
court (with the bottom and top portions combined to form a “middle”). “fe” regressions include player fixed
effects; “by player” regressions regressed on distance and angle separately by player.
We have thus far delayed any real definition of heat. This is, in part, because our choice of
definition requires navigating waves of ambiguity. It is not immediately clear what factors
would suggest one measure over the other. While we choose to use a novel definition that
seems at least as good as the others, we include other definitions for comparison. Before
embarking on these mathematical decisions, it would be good to have a more specific sense
of what the “Hot Hand” is.
The informal sense – a player having a temporary burst of near-magical powers – suffices
from a fan’s perspective. Let us develop a more formal sense with a brief thought experiment.
Presume, perhaps impossibly, that a hot streak does exist during a specific player’s shot,
and bestows positive effects. What can the hot streak not do?
Suppose, at the precise moment the hot player releases his shot, a physicist were to freeze
the arena in time. They may explore the action with a calculator, a supercomputer, and
as much time as they would like. The physicist can calculate the velocity of the ball, the
thickness of the air, the density of sweat on the ball. We presume they may predict, with
some high degree of precision, whether the ball, if unimpeded on its way, will fall through
the basket. What role, at this moment, could the shooter’s hot streak serve? Very nearly
none, as the result is quite literally out of their hands.
Consider now the moment the player decides to shoot. If they have normally disadvantageous
momentum, they may be inclined to shoot anyway because of their heat. But at this moment,
their momentum is fixed, so heat’s only apparent effect can be the change of choice function.
A well designed model may show selection bias later rooted in this stage. But if a hot streak,
in fact, were to do nothing to shot quality (nor rational players’ shot selection), our physicist,
perhaps in cahoots with an econometrician and using more extensive data than ours, could
control for these shot-specific effects, yielding no remaining role for heat.
So we have a specific period of time in which we may see heat: During the action of the shot.
The player’s momentum, position, and defender are all fixed at the moment of decision, so
we should, given perfect information, control for these.
We are then left with what we would characterize as a hot streak given perfect information: a
change in likelihood of shot success, controlling for all characteristics decided at the beginning
of the shot.
The defender’s precise actions are in part decided in the few moments after release, so even
if we could, we should not control for overly-specific defender actions. For different reasons,
even if we could observe the quality of the player’s shooting motion, we should not control
for this; the shooting motion would be a likely path for heat to have its effect. Thus, we
would prefer to not include touch time in our analysis.
This definition has an uncomfortable implication that we should not control for some defender
characteristics. Morally, it feels wrong to ascribe the actions of the defender to the shooter.
Indeed, a player with heat may shoot better, but if they are also defended better, we may
even see them shoot more poorly during a hot streak. While perhaps morally repugnant,
this is the economically-relevant perspective. If all positive offensive effects are cancelled out
by defensive effects, there is no reason to pass to a “hot shooter.”
We now proceed to our mathematical definition of statistical heat. We first revisit Bocskoc-
sky et al.’s paper. Particularly, we are interested in their specification of “complex heat.”
Their definition is
$$\text{Complex}_n = (\%\text{ of past } n \text{ shots made}) - (\text{a priori expected shooting } \% \text{ of those } n \text{ shots})$$
This specification is certainly viable, but we chose to alternatively define heat using what
we called a “Heat CDF,” the probability that the shooter would do at least as well as they
actually did on the n immediately-prior shots. We can express this value as:
$$\text{CDF}_n = P(X \ge \text{shots actually made}) = P(\text{makes at least as many shots as observed})$$
where n is the number of shots attempted and X is a variable referring to the number
of makes. Consistently with Bocskocsky et al., we choose n=4. Because X takes integer values, this value is, in fact,
$1 - \mathrm{CDF}_{\text{shooting percentage}}(\text{shots made} - \varepsilon)$ for any ε in (0, 1), and can equivalently be viewed as
$\mathrm{CDF}_{\text{“missing” percentage}}(\text{shots missed})$.
This statistic has two benefits. One benefit accrues to the researcher: We can compare
outcomes in a fair way. Not all players who make one more shot than expected – which is to
say, 25% more shots than expected – are equal. Instead, the language of probability allows
us to compare the marginal difficulty of those 25% additional makes, and to look at the
effects of the CDF being below any level, i.e. the effect of the immediately-prior four shots
being at least a certain level of improbability. The second benefit accrues to the reader: A
fan does not, in our experience, consider a player “hot” directly because of the additional
points they contribute. “Hot” is, instead, closer to a synonym of “impossible,” and so we
prefer to define heat as the impossibility of outcome; how likely a player was to do as well
as they did.
To construct the distribution, we use our KNN predictions to create a probability for each
of the prior four shots⁶, and then calculate the player’s CDF on those four shots.
As such, if the probability that they made at least as many shots as they did is extremely small
(i.e. there is a small probability of seeing results at least this extreme), we would call them hot. We can
additionally define coldness as the opposite: if the probability that they made so few shots
out of the n is very small, we can call them cold. We omit results for coldness from this
paper for brevity rather than for any logical reason.
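The Heat CDF can be sketched by enumerating every make/miss pattern over the prior shots. The code below assumes the shots are independent with the given predicted probabilities (the null hypothesis), so that the number of makes follows a Poisson binomial distribution. The function names are hypothetical, and `complex_heat` is included only for comparison with the difference-based definition.

```python
from itertools import product


def heat_cdf(probs, outcomes):
    """P(X >= observed makes), treating shots as independent Bernoulli draws.

    probs: predicted make probability for each prior shot (e.g. four values).
    outcomes: matching 0/1 results. Small values indicate a "hot" player.
    """
    n_made = sum(outcomes)
    total = 0.0
    # Enumerate all 2^n make/miss patterns; cheap for n = 4.
    for pattern in product((0, 1), repeat=len(probs)):
        if sum(pattern) >= n_made:
            weight = 1.0
            for p, hit in zip(probs, pattern):
                weight *= p if hit else 1.0 - p
            total += weight
    return total


def complex_heat(probs, outcomes):
    """Share of prior shots made minus the expected share of those shots."""
    n = len(probs)
    return sum(outcomes) / n - sum(probs) / n
```

For example, with four shots each predicted at 0.5, making all four gives a Heat CDF of 0.0625, while making three gives 0.3125. The same three makes on harder shots would yield a smaller value, which is the comparison across difficulty that the difference-based measure does not provide directly.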
As a robustness check, we also try Bocskocsky et al.’s definition of heat, after multiplying by n,
as “complex difference,” and the ratio of percentage of shots made to expected shots made as
“complex ratio.” We find similar results with these definitions, though we lose the natural
cutoffs for comparing heat levels that p-values provide.
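As a rough illustration of the two alternative measures, the sketch below assumes that Bocskocsky et al.'s heat is the actual shooting percentage minus the expected shooting percentage over the prior n shots, and that the ratio compares the same two percentages. Both readings, and the helper name `complex_heat`, are assumptions made for this example.

```python
from typing import Sequence, Tuple


def complex_heat(probs: Sequence[float], outcomes: Sequence[int]) -> Tuple[float, float]:
    """Return (complex difference, complex ratio) for the prior n shots.

    probs    -- predicted make-probability for each prior shot
    outcomes -- 1 if the shot was made, 0 otherwise
    """
    n = len(probs)
    made_pct = sum(outcomes) / n
    expected_pct = sum(probs) / n
    # Assumed definition: n * (actual % - expected %) = makes minus expected makes.
    difference = n * (made_pct - expected_pct)
    # Assumed definition: actual % divided by expected %.
    ratio = made_pct / expected_pct
    return difference, ratio


# Three makes out of four shots that were each expected to go in half the time.
print(complex_heat([0.5, 0.5, 0.5, 0.5], [1, 1, 1, 0]))  # (1.0, 1.5)
```

Unlike the Heat CDF, neither quantity is a probability, which is why conventional significance cutoffs do not carry over.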
We also explored the use of linear, cubic, and higher-order polynomials, with either homogeneous
effects in addition to player fixed effects or as independent variables with coefficients
allowed to vary for all players. Again, their R² values all fell in between those of a fitted
KNN prediction with K=50 and no cutoff and with K=25 and a cutoff at 5 feet (Table 1),
suggesting that the aforementioned convergence as K/n → 0 and K → ∞ is, to at least a loose
degree, holding in our finite sample for K=25.
As a side note, by and large, our predictions are in the interval [0, 1]. For direct KNN, a
prediction outside this interval is impossible. When regressing on the KNN prediction, it is
theoretically possible to find either negative or greater-than-1 fitted probabilities. For the
other methods of approximating distance and angle, it is certainly possible that the sum of
average effects may, in some cases, yield an implied probability outside the realm of
feasibility. In practice, these occurrences are rare for non-KNN methods, and were never
observed in our KNN methods. The non-KNN proportions are detailed in Table 2.
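The feasibility point can be illustrated on synthetic data. The sketch below is not the paper's estimator: it assumes a one-dimensional KNN on shot distance and a least-squares cubic in log distance, and all data are simulated. It shows why an average of 0/1 outcomes must lie in [0, 1] while a fitted polynomial need not.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic shots: distance in feet and a made/missed outcome (illustrative only).
dist = rng.uniform(0.5, 35.0, size=2000)
made = (rng.random(2000) < np.clip(0.65 - 0.012 * dist, 0.05, 0.95)).astype(float)


def knn_predict(i, k=25):
    """Mean outcome of the k shots closest in distance to shot i (excluding i)."""
    gaps = np.abs(dist - dist[i])
    gaps[i] = np.inf  # never use the shot itself as its own neighbor
    return made[np.argsort(gaps)[:k]].mean()


knn_fit = np.array([knn_predict(i) for i in range(len(dist))])

# Linear probability model: least squares on a cubic in log distance.
X = np.vander(np.log(dist), N=4, increasing=True)  # columns: 1, x, x^2, x^3
beta, *_ = np.linalg.lstsq(X, made, rcond=None)
cubic_fit = X @ beta


def share_feasible(p):
    """Proportion of predictions that are valid probabilities."""
    return float(np.mean((p >= 0.0) & (p <= 1.0)))


print(share_feasible(knn_fit))    # 1.0 by construction
print(share_feasible(cubic_fit))  # can fall below 1.0 for polynomial fits
```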
⁶ We do not consider the possibility that a player might be “hot” during the first four shots of the game,
nor do we consider heat if we cannot make a prediction during all four immediately preceding shots. The
latter is alleviated by the inclusion of the no-cutoff KNN specification, but the former remains as a case
where our findings may not have external validity.
Table 2: Proportion of shots feasible for non-KNN predictions

Prediction method                          Predictions in [0, 1]   Proportion of shots in dataset
Baseline, shot log                         9036                    0.994
Cubic & triangle by player, shot chart     188901                  0.997
Cubic by player, shot chart                188961                  0.998
Fifth degree & fixed effects, shot chart   189075                  0.998

4 Results and Discussion
Our main results will be based on the shot chart data, which lacks defender information.
We will also briefly look at shot log data to make conclusions about defender behavior, but
will be unable to look for heat for statistical reasons rooted in our lack of data. We ran
many alternative regressions which were omitted because of space but which, in general,
were consistent with the results given here.
Within the shot chart data, we will look for, and in general fail to find, heat, in our main
specification, K=25 with a cutoff of 5 feet and using our CDF of heat. We will then search,
analogously, for heat in the larger sample size where K=50 and no cutoff, to check for
selection bias among the shots with sufficient similar shots. As a robustness check, we will
also consider other methods of predicting shots and defining heat.
For all of these shot chart results, besides effects of heat on shooting percentage, we will
also consider apparent effects of heat on shooter behavior. This is done by regressing our
prediction itself on heat among the prior four shots. If a hotter player takes worse shots, we
will see a decrease in predicted quality of shot associated with hotter players.
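The two regressions described above can be sketched with ordinary least squares. The data below are simulated with no hot hand built in, and the variable names are stand-ins chosen for this example rather than the paper's own code.

```python
import numpy as np


def ols(y, *regressors):
    """Least-squares coefficients for y on a constant plus the regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [constant, then one coefficient per regressor, in order]


rng = np.random.default_rng(1)
n = 20000
prediction = rng.uniform(0.2, 0.7, size=n)  # stand-in for fitted shot predictions
heat = rng.uniform(0.0, 1.0, size=n)        # stand-in for Heat CDF on prior four shots
made = (rng.random(n) < prediction).astype(float)  # outcomes drawn with no hot hand

# Outcome regression: shot made on prediction and heat.
const, b_pred, b_heat = ols(made, prediction, heat)
# Shot-selection regression: predicted shot quality on heat.
const_q, b_quality = ols(prediction, heat)
print(b_pred, b_heat, b_quality)
```

In this simulation the coefficient on the prediction should sit near one and both heat coefficients near zero; a negative coefficient in the second regression is what "hotter players take worse shots" would look like.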
Let us begin with the shot chart data. Consider Table 3. An increasing Heat CDF (i.e. a
player who was more likely to do at least as well as they did on their prior four shots, which
is to say one who performed worse) is associated with worse outcomes, and the other two measures
of heat show slight improvement in shooting from hot players. While all these effects are in
the direction of increased performance by “hot” players, none of these effects is significant,
either statistically or practically.
Table 3: Dependent variable: Shot Made (k=25, cutoff=5 ft, fitted)

                         Definition of heat:
                         Heat CDF     complex difference   complex ratio
Fitted KNN prediction    1.008***     1.048***             1.048***
                         (0.033)      (0.044)              (0.044)
Heat                     −0.015       0.003                0.007
                         (0.020)      (0.008)              (0.015)
Constant                 0.008        −0.023               −0.030
                         (0.024)      (0.023)              (0.028)
Observations             8,786        4,407                4,407
R²                       0.098        0.116                0.116
Adjusted R²              0.098        0.116                0.116
Note: *p<0.1; **p<0.05; ***p<0.01

There is some effect of statistically significantly hot players, but it runs in the opposite
direction. As seen in Table 4, players with a Heat CDF below .05, i.e. players whose play
on the prior four shots would be seen as statistically-significantly better than expected from
a one-sided test, actually perform five percentage points worse. While the number of shots
where this occurs is not large enough to find statistical significance, similar effects are found
by using K=50 with no cutoff, as in Table 5. The magnitude of effect is smaller – only 2.5
percentage points, perhaps because of imprecision in prediction estimates – but the effect is
clear: Extremely hot players shoot worse.
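The threshold specifications replace the continuous Heat CDF with an indicator for falling at or below a cutoff. A minimal sketch of building those indicators follows; the helper name `heat_indicators` and the example values are made up for illustration.

```python
import numpy as np


def heat_indicators(heat_cdf, cutoffs=(0.05, 0.1, 0.5)):
    """Map each cutoff to a 0/1 array marking shots whose Heat CDF is <= cutoff."""
    heat_cdf = np.asarray(heat_cdf, dtype=float)
    return {c: (heat_cdf <= c).astype(float) for c in cutoffs}


# Made-up Heat CDF values for five shots (not from the paper's data).
example = [0.02, 0.05, 0.08, 0.40, 0.90]
for cutoff, dummy in heat_indicators(example).items():
    print(cutoff, dummy, "share flagged:", dummy.mean())
```

Each indicator would then take the place of the heat regressor in the outcome regression, so its coefficient reads as a shift in make probability for shots that follow a run at least that improbable.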
Table 4: Dependent variable: Shot Made (k=25, cutoff=5 ft, fitted)

Heat CDF...              ≤ 0.05       ≤ 0.1        ≤ 0.5
Fitted KNN prediction    1.008***     1.008***     1.008***
                         (0.033)      (0.033)      (0.033)
Heat CDF ≤ (x)           −0.052       −0.002       0.021
                         (0.070)      (0.042)      (0.015)
Constant                 −0.004       −0.004       −0.007
                         (0.017)      (0.017)      (0.017)
Observations             8,786        8,786        8,786
R²                       0.098        0.098        0.098
Adjusted R²              0.098        0.098        0.098
Note: *p<0.1; **p<0.05; ***p<0.01

Table 5: Dependent variable: Shot Made (k=50, no cutoff, fitted)

Heat CDF...              ≤ 0.05       ≤ 0.1        ≤ 0.5
Fitted KNN prediction    1.012***     1.012***     1.012***
                         (0.012)      (0.012)      (0.012)
Heat CDF ≤ (x)           −0.026**     −0.011       −0.001
                         (0.011)      (0.008)      (0.003)
Constant                 −0.008       −0.008       −0.008
                         (0.005)      (0.005)      (0.005)
Observations             100,726      100,726      100,726
R²                       0.062
Adjusted R²              0.062        0.062        0.062
Note: *p<0.1; **p<0.05; ***p<0.01

How can we make sense of this? We cannot write off these effects as reflecting some curious
result of KNN predictions. Our alternative methods of making predictions (here we note
regressing on a cubic polynomial of log shot distance, with the possible inclusion of a triangle
for shot angle, and a general fifth-degree polynomial of log distance with player fixed effects)
yield similar coefficients on Heat CDF, as in Table 6.

Table 6: Dependent variable: Shot Made

                         Prediction method:
                         Cubic of       Cubic with     Fifth-Degree Polynomial
                         Distance       Triangle       with Player Fixed Effects
Shot prediction          1.007***       1.006***       1.003***
                         (0.010)        (0.010)        (0.011)
Heat CDF ≤ CDF Max       −0.024**       −0.026**       −0.024**
                         (0.011)        (0.011)        (0.011)
Constant                 −0.008*        −0.008*        −0.006
                         (0.004)        (0.004)        (0.005)
Observations             100,994        100,994        100,994
R²                       0.085          0.089          0.082
Adjusted R²              0.085          0.089          0.082
Note: *p<0.1; **p<0.05; ***p<0.01

As an explanation, we suggest that unobserved defender choices are correlated with heat.
If, when a shooter plays exceptionally well, defenders put more effort into defense, we would
find that players who play extremely well on four shots will be guarded more carefully (or
even guarded by multiple players) on the fifth, yielding worse shooting percentages after
controlling for location. These effects would be most marked for extreme offensive perfor-
mances, yielding similar results. While we cannot test this hypothesis directly in the shot
chart data, we can find indirect evidence for it in offensive shot choice in the shot chart data
and direct evidence from the shot log data.
Analogous to Tables 3, 4, and 5 are Tables 7, 8, and 9. Here, we regress our predicted
likelihoods on heat: if a player takes shots that are more likely to go in, they are taking
better shots. Across measures in these three tables, we consistently find that hotter players
take worse shots (in the shot log data). While the general regression is not very significant,
the large-cutoff measures are.⁷

⁷ It is worth noting that there may be endogeneity, also caused by omitted defender information. It may
be that closely-defended shooters take farther shots to increase their distance from the defender. Removing
such endogeneity can be done once the shot log dataset becomes available again.

Table 9: Dependent variable: KNN Prediction (k=50, no cutoff, fitted)

Heat CDF...              ≤ 0.05       ≤ 0.1        ≤ 0.5
Heat CDF ≤ (x)           −0.013***    −0.015***    −0.014***
                         (0.003)      (0.002)      (0.001)
Constant                 0.402***     0.403***     0.407***
                         (0.0004)     (0.0004)     (0.0005)
Observations             100,726      100,726      100,726
R²                       0.0002       0.001        0.003
Adjusted R²              0.0002       0.001        0.003
Note: *p<0.1; **p<0.05; ***p<0.01

So we have a clear narrative that “hot” players do not seem to do better than others if we do
not control for their defenders, and, if anything, do slightly worse in extreme cases. We can
test that this pattern does not reflect some sort of selection issues by regressing outcomes
and shot selection on heat, by each player, and comparing the distribution of coefficients.
To ensure we have enough shots, we used K=50 and no cutoff. The relevant graphs can
be found in Figure 9 and Figure 10, with median coefficients as a vertical line. Indeed, the
distribution of estimates appears roughly centered on our pooled OLS estimates, supporting
our findings.
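The player-by-player check can be sketched as one small regression per player followed by a look at the distribution of heat coefficients. The function below is illustrative only: `min_shots` is a hypothetical guard against tiny samples, and `controls` is where the fitted prediction would enter for the outcome regressions.

```python
import numpy as np


def heat_coef_by_player(player, y, heat, controls=None, min_shots=10):
    """Return {player: OLS coefficient on heat} from separate regressions of y
    on a constant, heat, and any controls, one regression per player."""
    player = np.asarray(player)
    y = np.asarray(y, dtype=float)
    heat = np.asarray(heat, dtype=float)
    coefs = {}
    for pid in np.unique(player):
        rows = player == pid
        h = heat[rows]
        if rows.sum() < min_shots or h.max() == h.min():
            continue  # too few shots, or heat never varies for this player
        cols = [np.ones(rows.sum()), h]
        if controls is not None:
            cols.append(np.asarray(controls, dtype=float)[rows])
        beta, *_ = np.linalg.lstsq(np.column_stack(cols), y[rows], rcond=None)
        coefs[pid] = float(beta[1])
    return coefs


# Tiny synthetic check: two players with known slopes on heat.
heat = np.tile(np.linspace(0.0, 1.0, 12), 2)
player = np.repeat(["A", "B"], 12)
y = np.where(player == "A", 1.0 + 2.0 * heat, -1.0 * heat)
coefs = heat_coef_by_player(player, y, heat)
print(coefs, "median:", np.median(list(coefs.values())))
```

Comparing the median of these coefficients with the pooled estimate is the informal check the text describes.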
We can also consider the shot log data. With this, we used only a “baseline” model, generat-
ing predictions from a player fixed effects model with a homogeneous third-degree polynomial
of log distance in addition to homogeneous defender distance and height effects, with the
results in Table 10. We found, unsurprisingly, that defender closeness (the distance between
shooter and defender when the shot is taken) and height both reduced the likelihood of shot
success, again suggesting that the omission of these defensive factors in our model of heat is
detrimental to our method.
We can do the analogous regressions of shot outcomes on heat. We find, consistently, that
hot players shoot worse (e.g. Table 11). Yet there is a clear explanation, which is rooted in
our player fixed effects.
Because residuals are orthogonal to the player indicators after the inclusion of player fixed
effects, the sum of residuals for each player must be zero. For a player with, say, five shots,
if the first four exceed expectations greatly, the fifth must be significantly below
expectations in order to balance out the residuals. Since we have at most four games of data
for any player, such effects persist in our sample. Indeed, we find that if we reduce our
regression to players where we can make predictions on at least 10 shots, the effect is markedly
reduced (see Table 11), strongly suggesting that these apparent findings reflect our statistical
approach, rather than anything which has to do with basketball itself. It is notable that KNN,
which only uses other shots to make a prediction on a given shot, would avoid these issues, even
though the shot log dataset is not large enough to make the approach useful.

[Figure 9: Coefficients on Heat CDF from regressing outcome on fitted KNN prediction and heat, by player. Horizontal axis: coefficient on heat; vertical axis: density (unweighted).]

[Figure 10: Coefficients on Heat CDF from regressing shot quality on heat, by player. Horizontal axis: coefficient on heat; vertical axis: density (unweighted).]

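The residual-balancing argument can be checked numerically. The sketch below simplifies the baseline model to player fixed effects only (an assumption for illustration; the paper's baseline also includes distance and defender terms) and simulates outcomes with no hot hand at all.

```python
import numpy as np

rng = np.random.default_rng(2)
n_players, shots_each = 500, 5
player = np.repeat(np.arange(n_players), shots_each)
made = (rng.random(n_players * shots_each) < 0.45).astype(float)  # i.i.d., no hot hand

# With only player fixed effects, the fitted value is each player's own mean.
player_mean = np.bincount(player, weights=made) / shots_each
resid = (made - player_mean[player]).reshape(n_players, shots_each)

first_four = resid[:, :4].sum(axis=1)
fifth = resid[:, 4]
print(np.allclose(first_four + fifth, 0.0))  # residuals sum to zero by player
print(np.corrcoef(first_four, fifth)[0, 1])  # fifth residual mirrors the first four
```

Even though the simulated shots are independent, a good run on the first four shots mechanically forces a poor residual on the fifth, which mimics a "hot players shoot worse" finding. With more shots per player the constraint binds less tightly, in line with the smaller coefficient among frequent shooters.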
Table 11: Dependent variable: Shooting Percentage (shot log data)

                         Group:
                         all shooters   frequent shooters
Baseline prediction      1.014***       0.995***
                         (0.047)        (0.052)
Heat CDF                 0.086***       0.059**
                         (0.024)        (0.026)
Constant                 −0.056**       −0.031
                         (0.027)        (0.029)
Observations             5,076          4,497
R²                       0.087          0.078
Adjusted R²              0.087          0.077
Note: *p<0.1; **p<0.05; ***p<0.01

On the other hand, hot players had defenders play more closely to the shooter, with similar
magnitudes of effect across both frequent and all shooters; defender height does not appear
to be affected (Tables 12 and 13, respectively). While we do not find statistical significance
in either, undoubtedly at least in part because of our small sample size, we expect that,
once the relevant data becomes available again at a comparable scale, this portion of our
analysis could be completed, perhaps using KNN with some defender distance metric. In
the meantime, the consistency between frequent shooters and all shooters is heartening but
unsurprising. We did not mandate that residuals in defender distance balance out to zero,
so we are no longer forcing canceling effects in subsequent shots.

Table 12: Dependent variable: Closest Defender Distance

                         Group:
                         all shooters   frequent shooters
Heat CDF                 0.102          0.104
                         (0.135)        (0.139)
Constant                 4.029***       3.997***
                         (0.094)        (0.097)
Observations             5,076          4,497
R²                       0.0001         0.0001
Adjusted R²              −0.0001        −0.0001
Note: *p<0.1; **p<0.05; ***p<0.01

Table 13: Dependent variable: Closest Defender Height

                         Group:
                         all shooters   frequent shooters
Heat CDF                 −0.001         −0.013
                         (0.015)        (0.016)
Constant                 6.612***       6.617***
                         (0.011)        (0.011)
Observations             5,076          4,497
R²                       0.00000        0.0001
Adjusted R²              −0.0002        −0.0001
Note: *p<0.1; **p<0.05; ***p<0.01

5 Conclusion
We find no real evidence that hot shooters shoot better. If anything, we find that hot shooters
shoot worse on their next shot. We suspect that this reflects endogeneity dictated by limited
data availability. This explanation is in agreement with Bocskocsky et al.’s finding that a
player shooting well caused a defender to move closer. We also find strong indication that
defender size and effort have significant effects on shot quality. Our clearest findings – a
weak indication that defender effort is impacted by heat, and a strong indication that shot
choice is impacted – are important for future studies of the Hot Hand. Our KNN and CDF
methodology will be a useful approach for further examinations as well.
References
[1] Bocskocsky, A., J. Ezekowitz, and C. Stein (2014). The hot hand: A new approach to an old “Fallacy”. In 8th Annual MIT Sloan Sports Analytics Conference.
http://www.sloansportsconference.com/wp-content/uploads/2014/02/2014 SSAC The-Hot-Hand-A-New-Approach.pdf.
[2] Gilovich, T., R. Vallone, and A. Tversky (1985). The hot hand in basketball: On the misperception of random
sequences. Cognitive Psychology 17(3), 295–314.
[3] Hemlock, D. (2013). What’s an NBA championship worth? at least tens of millions of
dollars, sports business experts say. SunSentinel.
http://articles.sun-sentinel.com/2013-05-26/news/fl-heat-nba-championship-worth-20130526 1 miami-heat-nba-championship-merchandise-sales.
[4] Mack, Y. and M. Rosenblatt (1979). Multivariate k-nearest neighbor density estimates.
Journal of Multivariate Analysis 9(1), 1–15.
[5] Paine, N. (2015). No matter how much they make, the best players in the NBA are vastly
underpaid. Fivethirtyeight.com.
http://fivethirtyeight.com/features/kawhi-leonard-like-all-the-best-nba-players-is-vastly-underpaid/.
[6] Poundstone, W. (2014). How to Predict the Unpredictable: The Art of Outsmarting
Almost Everyone. Oneworld Publications.
[7] Tepper, T. (2014). Why you shouldn’t overplay a hot hand — in basketball or investing.
Time Magazine. http://time.com/money/3145979/hot-hand-basketball-investing/.
[8] Tversky, A. and D. Kahneman (1971). Belief in the law of small numbers. Psychological
Bulletin 76(2), 105–110.