
SueCA2


  1. 1. Higher Diploma in Data Analytics Programming for Big Data Semester 2 Advanced Business Data Analysis Continuous Assessment Assignment 2 Sue Ryo Kim
  2. 2. 1. Abstract Dublin County consists of four local authorities: Dublin City, Dún Laoghaire-Rathdown, Fingal and South Dublin. Dublin County has a total area of 922 km² (356 sq mi) and a population of 1,273,069 (CSO 2011 [1]). The purpose of this analysis is to find out whether there are any differences between the local authorities in the Dublin region. Dublin is the capital of Ireland, so its local authorities have busy public transport, and travel times are expected to be influenced by traffic. The rush hour (between 8 and 9 am) is expected to have longer travel times. Travel times are also expected to be shorter early in the morning, before 07:30, and after 09:30, because most schools and workplaces start at 09:00. Among the four authorities, Dublin City Council was initially expected to have the longest travel times. However, considering the distance travelled from outside Dublin into Dublin City and the availability of transport and roads, Dublin City might instead have shorter travel times than the other authorities. The time of leaving home to travel to work was compared between the four authorities of Dublin City, Dún Laoghaire-Rathdown, Fingal and South Dublin in the Eastern and Midlands planning region. The nine time categories are: before 06:30, 06:30 to 07:00, 07:01 to 07:30, 07:31 to 08:00, 08:01 to 08:30, 08:31 to 09:00, 09:01 to 09:30, after 09:30, and not stated. 2. Analysis 2.1. Descriptive statistics In this section, the four Dublin authorities and the whole Dublin County area are analysed. Figure 1 shows the average travel times for the above time categories in the four Dublin authorities, together with the average for the whole Dublin County area. Overall, travel to school or work takes the least time in Dublin City among the four authorities. The travel time in Dublin City is also almost 20 minutes shorter than the average of the four Dublin authorities. Dublin City, South Dublin and Fingal show similar patterns in average travel time.
However, Dún Laoghaire-Rathdown has the shortest travel time before 06:30, excluding the not-stated category. On the other hand, Dún Laoghaire-Rathdown shows the longest travel time at 08:01 to 08:30. In the other County areas, travel time decreases gradually after peaking at 08:01 to 08:30, whereas in Dún Laoghaire-Rathdown it drops suddenly after the peak.
  3. 3. Figure 1. Average travel time to school or work in Dublin County areas. [Chart "Average Travel Time to school/work": axis in minutes, 0 to 50; series: Dublin City, Dún Laoghaire-Rathdown, South Dublin, Fingal, Dublin County average.] For all authorities, the busiest period was 08:01 to 08:30, and the travel time dropped rapidly at 09:01 to 09:30. There is also a sudden increase in travel time after 09:30. South Dublin and Fingal have very similar travel time patterns; their curves almost overlap from 07:01 onward. Fingal has the longest travel time overall, over 23 minutes longer than the Dublin County average. The time difference between Fingal and Dublin City is approximately 44 minutes. 2.2. Graphical display of distributions Histograms and scatter plots of travel time for Dublin City are shown below in Figure 2. Noticeably, all the histograms are skewed to the right: the median is smaller than the mean, and the mode is smaller than the median. The upper tails also lie above the line in the Q-Q plots, and the distance between data points increases toward the upper end of the range.
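The right-skew reading on this slide (median below mean) can be checked numerically. The following is a minimal Python sketch using only the standard library; the sample values and the helper name `moment_skewness` are hypothetical illustrations, not the study's data or the author's procedure.

```python
import statistics


def moment_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical travel times in minutes (assumption: NOT taken from the study).
travel_minutes = [2, 3, 3, 4, 4, 5, 6, 8, 15, 30]

mean_minutes = statistics.fmean(travel_minutes)
median_minutes = statistics.median(travel_minutes)
skewness = moment_skewness(travel_minutes)

# A long upper tail pulls the mean above the median and makes g1 positive.
print(f"mean={mean_minutes:.1f} median={median_minutes:.1f} skewness={skewness:.2f}")
```

A positive skewness together with a mean above the median is consistent with a right-skewed histogram, which is one reason a rank-based test may be preferred over a test that assumes normality.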
  4. 4. Figure 2. Histogram and scatterplot of Travel Time in Dublin City Figures 3 and 4 compare the average travel time to school or work in Dún Laoghaire-Rathdown before 06:30 and at 08:01 to 08:30. The average travel time in Dún Laoghaire-Rathdown is the closest to the average travel time across all four Dublin County areas. Dún Laoghaire-Rathdown also has both the shortest and the longest travel times among the Dublin County areas. In Figure 3, a binomial distribution is observed. In the box plot, outliers are observed above the upper quartile. Most travel times fall between 0 and 10 minutes. The average travel time before 06:30 is around 4 minutes.
  5. 5. Figure 3. Graphical display of distributions in Dún Laoghaire-Rathdown: Travel Time before 06:30. Figure 4. Graphical display of distributions in Dún Laoghaire-Rathdown: Travel Time at 08:01 to 08:30.
  6. 6. Travel time at 08:01 to 08:30 has a higher median than travel time before 06:30. Its range is also wider than the range of travel time before 06:30. In the box plot of Figure 4, most observations lie in the upper part of the quartile range, which indicates that the distribution is skewed to the right. Figure 5. A multiple scatterplot among four Dublin Counties. In Figure 5, a multiple scatter plot is used to look for relationships between the four Dublin County areas. As shown in the plot, no patterns appear. This indicates that there is no relationship between Dublin City, Dún Laoghaire-Rathdown, Fingal and South Dublin. Thus, the areas in Dublin County are treated as independent. 3. Kruskal-Wallis H Test The average travel times to work in the four Dublin County areas are compared using R. Null hypothesis: H0: μ1 = μ2 = μ3 = μ4 (There is no difference in the average travel time between the 4 Dublin County areas.) Alternative hypothesis: H1: not all of μ1, μ2, μ3, μ4 are equal (At least one of the 4 Dublin County areas differs from the others.)
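The Kruskal-Wallis H test set up on this slide can be sketched as follows. The study ran the test in R; this is only a standard-library Python illustration of the textbook H statistic with tie correction, and the sample values and all function names are hypothetical. The only numbers taken from the study are the reported test statistic (1.2503) and critical value (7.814728).

```python
import math


def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the usual correction for ties."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Hypothetical samples for four areas (assumption: NOT the study's data).
area_samples = [
    [12, 15, 18, 21, 30],
    [11, 16, 19, 22, 28],
    [13, 14, 20, 25, 27],
    [10, 17, 23, 24, 29],
]
h_stat = kruskal_wallis_h(area_samples)

CRITICAL_CHI2_DF3 = 7.814728  # critical value quoted in the study
reject_null = h_stat > CRITICAL_CHI2_DF3

# p-value implied by the study's reported statistic, with df = 4 groups - 1 = 3.
p_reported = chi2_sf_df3(1.2503)
print(f"H={h_stat:.4f} reject={reject_null} p(reported H)={p_reported:.3f}")
```

With four groups the statistic is compared against a chi-square distribution with 3 degrees of freedom, so a statistic below the critical value means the null hypothesis is not rejected.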
  7. 7. The value of the test statistic is 1.2503. The critical chi-square value is 7.814728 (df = 3, α = 0.05). Thus, the test statistic does not fall in the critical region. We fail to reject the null hypothesis at the 5% significance level (95% confidence). Therefore, we conclude that the average travel times of the four Dublin County areas are statistically equal. 4. One-way ANOVA analysis The three planning regions of Eastern and Midlands, North and West, and Southern are analysed to find differences in the number of people using public transportation to school or work. Null hypothesis: H0: μ1 = μ2 = μ3 (The average number of people using public transportation to school or work is the same in the three planning regions.) Alternative hypothesis: H1: not all of μ1, μ2, μ3 are equal (At least one region differs in the average number of people using public transportation.) The test result is obtained by ANOVA analysis with SPSS. From Figure 6, the value of the test statistic is 3207.3. The critical value of 6.96 is obtained from the F-distribution table [2] with df1 = 2 and df2 = 18485 (the table's largest tabulated df2, 1000, is used). The test statistic exceeds the critical value and falls in the critical region. We have enough evidence to reject the null hypothesis, with a p-value reported as 0.000 (p < 0.001). Therefore, we infer that there is at least one difference in the average number of people using public transportation to school or work among the Eastern and Midlands, North and West, and Southern planning regions. Figure 6. ANOVA analysis for the mean of people using public transportation to school or work.
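The F statistic behind a one-way ANOVA such as the one on this slide can be sketched as follows. The study obtained its result from SPSS; this is a standard-library Python illustration of the textbook between-group and within-group decomposition, and the sample values and the function name `one_way_anova_f` are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical counts for three regions (assumption: NOT the study's data).
region_samples = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]
f_stat, df1, df2 = one_way_anova_f(region_samples)
print(f"F={f_stat:.2f} with df1={df1}, df2={df2}")
```

With three regions df1 is 2, matching the df1 quoted on the slide; df2 is the total number of observations minus the number of groups.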
  8. 8. The difference in the number of people using private transportation to school or work among the three planning regions is also analysed by ANOVA. Null hypothesis: H0: μ1 = μ2 = μ3 (The average number of people using private transportation to school or work is the same in the Eastern and Midlands, North and West, and Southern planning regions.) Alternative hypothesis: H1: not all of μ1, μ2, μ3 are equal (At least one region differs in the average number of people using private transportation.) From Figure 7, the value of the test statistic, 50.19, is greater than the critical value of 2.996218. We have enough evidence to reject the null hypothesis, with a p-value reported as 0.000 (p < 0.001). Therefore, we conclude that there is at least one difference in the average number of people using private transportation to school or work among the three planning regions. Figure 7. ANOVA analysis for the mean of people using private transportation to school or work.
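The critical values quoted for the two ANOVA tests can be cross-checked without a table, because the F distribution has a closed-form upper tail when df1 = 2. The sketch below is a hedged illustration, not the study's procedure: the function names are hypothetical, and df2 = 18485 for the private-transport test is an assumption carried over from the public-transport test, since the slide does not state it.

```python
def f_sf_df1_2(f_value, df2):
    """P(F > f_value) for an F distribution with df1 = 2 and the given df2."""
    return (1 + 2 * f_value / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Upper-tail critical value for df1 = 2, found by inverting f_sf_df1_2."""
    return df2 / 2 * (alpha ** (-2 / df2) - 1)


# Assumption: df2 = 18485 is reused from the public-transport test.
crit_alpha_05 = f_critical_df1_2(0.05, 18485)

# A table capped at df2 = 1000, read at alpha = 0.001.
crit_alpha_001_table = f_critical_df1_2(0.001, 1000)

# p-value implied by the reported private-transport statistic of 50.19.
p_private = f_sf_df1_2(50.19, 18485)

print(f"critical (alpha=0.05, df2=18485): {crit_alpha_05:.6f}")
print(f"critical (alpha=0.001, df2=1000): {crit_alpha_001_table:.2f}")
print(f"p-value for F=50.19: {p_private:.3g}")
```

Under these assumptions, the value 2.996218 is reproduced at α = 0.05 with df2 = 18485, while 6.96 is reproduced at α = 0.001 with df2 = 1000, so the two quoted critical values appear to correspond to different significance levels. Both reported statistics exceed either value, so the decisions on the slides are unaffected.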
  9. 9. 5. Conclusion In this study, statistical tests for non-normal data are carried out with Excel, R and SPSS. Travel times to school or work in the four Dublin County areas of Dublin City, Dún Laoghaire-Rathdown, Fingal and South Dublin are filtered from the provided "Small Area Population Statistics" data. Descriptive statistics are used to summarize the data. Histograms, Q-Q plots and a multiple scatter plot are drawn to visualize the data for the four Dublin County areas. With the Kruskal-Wallis H test, no statistically significant differences in the average travel time of the four Dublin County areas are observed. ANOVA is applied to look for differences in the average number of people using public or private transportation to school or work between the Eastern and Midlands, North and West, and Southern planning regions. The test result shows that the average number of people using public transportation to school or work differs in at least one region. The same result is found for private transportation users. References [1] http://www.cso.ie/en/statistics/population/populationofeachprovincecountyandcity2011/ [2] http://www.sjsu.edu/faculty/gerstman/StatPrimer/F-table.pdf
