Statistical Analysis of PG&E electric SmartMeter accuracy Gaetan “Guy” Lion July 2010
Introduction <ul><li>PG&E, the utility serving Central and Northern California, is replacing all of its customers' analog meters with digital meters (SmartMeters) that transmit customers' energy consumption wirelessly. As a result, PG&E staff will no longer have to read millions of customer analog meters every month; </li></ul><ul><li>PG&E has disclosed on its website data from side-by-side testing of digital vs. analog meters measuring electric consumption only (gas meter reading data is not available); </li></ul><ul><li>This statistical analysis evaluates the accuracy of PG&E’s electric SmartMeter only. </li></ul>
The data <ul><li>PG&E is currently testing 91 SmartMeters matched with analog meters. </li></ul><ul><li>This testing started in mid-April and is ongoing. </li></ul><ul><li>The readings are updated weekly and are cumulative. The data used here is as disclosed on July 6, 2010. </li></ul><ul><li>Some of the meters have been running for just a week, while others have been running for over two months. As a result, the cumulative energy consumption readings cover a wide range, from a low of 44 kWh to a high of 3,294 kWh. </li></ul>
Benford’s Law to detect data manipulation The data for both the analog meters and the SmartMeters conform very closely, both visually and statistically, to the first-digit distribution predicted by Benford’s Law (Chi Square p value of 0.90). So, from this standpoint, there is no evidence of data manipulation.
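A minimal sketch of how such a first-digit check could be run, assuming the cumulative kWh readings are available as a plain list of positive numbers. The function names are illustrative and are not taken from the original analysis. The p value uses the closed form of the Chi Square survival function for 8 degrees of freedom (9 digits minus 1).

```python
import math

def first_digit(x):
    """Return the leading non-zero digit of a positive number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def chi2_sf_8dof(stat):
    """Upper-tail probability of a Chi Square variable with 8 degrees of freedom."""
    half = stat / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(4))

def benford_chi_square(readings):
    """Compare observed first-digit counts with Benford's Law expectations."""
    n = len(readings)
    observed = [0] * 9
    for r in readings:
        observed[first_digit(r) - 1] += 1
    # Benford's Law: P(first digit = d) = log10(1 + 1/d)
    expected = [n * math.log10(1 + 1 / d) for d in range(1, 10)]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, chi2_sf_8dof(stat)
```

A high p value (such as the 0.90 reported above) means the observed first digits are consistent with Benford’s Law.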
Measures of Central Tendency The average electricity consumption readings of the two meter types are almost identical.
Unpaired Student t test The unpaired Student t test is the most common statistical test to check whether a tested sample (SmartMeters) differs from a control group (analog meters). As shown, the p value of 98.9% is the probability of observing a difference in averages at least this large if the two samples came from the same population. A p value this high means that any difference between the two is trivial and consistent with randomness.
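A rough sketch of the pooled-variance (unpaired) t test, assuming the two sets of readings are plain Python lists. The function name is illustrative. To stay within the standard library, the two-sided p value uses a normal approximation to the t distribution. That should be reasonable with about 91 meters per group, but it is an assumption and not necessarily how the original figure was computed.

```python
import math
from statistics import NormalDist, mean, variance

def unpaired_t_test(a, b):
    """Pooled-variance t statistic and approximate two-sided p value."""
    na, nb = len(a), len(b)
    # Pooled sample variance across both groups
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    # Normal approximation to the t distribution (assumes large samples)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p
```

A p value close to 1 indicates that the two averages are nearly indistinguishable relative to the spread of the readings.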
Testing whether samples come from populations that are normally distributed The Jarque-Bera test estimates the probability (p value) of observing samples like these meter readings if they came from normally distributed populations. It does so using the skewness and kurtosis of the samples. In this case, it returns a p value of essentially 0%, so the assumption of normality is rejected. Thus, we can’t rely solely on the unpaired Student t test, which assumes that the samples come from normally distributed populations. We have to use a nonparametric test instead that relaxes the normal distribution assumption.
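The Jarque-Bera statistic can be sketched as follows, assuming a plain list of readings. It combines sample skewness S and kurtosis K as JB = n/6 × (S² + (K − 3)²/4) and compares the result with a Chi Square distribution with 2 degrees of freedom, whose upper-tail probability is exp(−JB/2). The function name is illustrative.

```python
import math

def jarque_bera(x):
    """Jarque-Bera statistic and its asymptotic p value (Chi Square, 2 dof)."""
    n = len(x)
    m = sum(x) / n
    # Central moments (population form, dividing by n)
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    p = math.exp(-jb / 2)
    return jb, p
```

Cumulative consumption readings that range from 44 kWh to 3,294 kWh are likely to be strongly right-skewed, which drives the statistic up and the p value toward zero.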
Mann-Whitney test Because the meter readings are not normally distributed, we have to use a nonparametric test, such as the Mann-Whitney test, that relaxes the normal distribution assumption. The main difference between the Mann-Whitney test and the unpaired Student t test is that the Mann-Whitney test compares average ranks instead of average values. The Mann-Whitney test gives directionally the same answer as the Student t test (p value 98.8%). Thus, any difference between the two types of meters is consistent with randomness.
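A simplified sketch of the rank-based calculation, assuming two plain lists of readings. Tied values receive their average rank. The p value uses the large-sample normal approximation, with no tie or continuity correction, so it may differ slightly from what a statistics package reports. The function name is illustrative.

```python
import math
from statistics import NormalDist

def mann_whitney_u(a, b):
    """U statistic for sample a and approximate two-sided p value."""
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Find the run of tied values starting at position i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    na, nb = len(a), len(b)
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u = rank_sum_a - na * (na + 1) / 2
    mu = na * nb / 2
    sigma = math.sqrt(na * nb * (na + nb + 1) / 12)  # no tie correction
    z = (u - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p
```

When U sits close to its expected value of na × nb / 2, the average ranks of the two groups are nearly equal and the p value approaches 1.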
Viewing differences on a scatter plot This scatter plot graphs the cumulative usage in kWh on the x-axis and the percentage difference between the two types of meters on the y-axis. This difference shrinks as cumulative usage increases. This suggests that the difference between the two meters may be driven by discrepancies occurring during the setup of the meters, and that the remainder of the period generates very accurate and comparable readings. Thus, the initial discrepancy shrinks naturally in percentage terms as the cumulative usage increases.
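A small illustration of this arithmetic: if the two meters differ only by a fixed offset introduced at setup, the percentage difference is that offset divided by cumulative usage, so it falls as usage grows. The 2 kWh offset below is purely hypothetical and is not taken from PG&E's data. Only the 44 kWh and 3,294 kWh endpoints come from the readings described above.

```python
def pct_difference(cumulative_kwh, setup_offset_kwh):
    """Percentage difference implied by a fixed offset at a given cumulative usage."""
    return 100 * setup_offset_kwh / cumulative_kwh

HYPOTHETICAL_OFFSET_KWH = 2.0  # assumed for illustration only

for usage in (44, 500, 3294):
    pct = pct_difference(usage, HYPOTHETICAL_OFFSET_KWH)
    print(f"{usage} kWh cumulative -> {pct:.2f}% difference")
```

Under this assumption, the same absolute discrepancy looks large for a meter that has run for a week and negligible for one that has run for two months.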
Conclusion <ul><li>The electric SmartMeters work well. Their readings do not diverge from the analog ones in any material way. </li></ul><ul><li>The readings of SmartMeters and analog meters might be even closer if any discrepancies arising during the initial setup were caught and resolved. </li></ul>