Regional Science Presentation


Published in: Education, Technology

  1. Surnames as Indicators of Cultural Regions
     James Cheshire
     PhD Supervisors: Prof. Paul Longley, Dr Pablo Mateos
     Department of Geography, University College London
     Research Blog: Email:
  2. Outline
     • Regional identity in Britain.
     • Surnames and regions.
     • Data and Lasker’s Distance.
     • Regionalization Results: Multidimensional Scaling, Clustering.
     • Corby: an interesting example.
     • Future Work.
     • Conclusions.
  3. Regional identity in Britain
  4. Surnames and Regions
     • Many surnames originate from a specific area.
     • The highest frequency of these names still exists in their place of origin.
     • We can therefore expect areas to possess unique combinations of names.
     • We can also expect certain types of surname to occur more frequently in some areas than in others.
     • This study draws on the above assertions to identify areas/populations that have similar surname structures within Great Britain.
  5. Some Examples: Lewis, Smith, Macleod, Buckley
  6. Data
     • 2001 Enhanced Electoral Roll: 45.6 million people; 1,597,805 surnames (1,457,681 with < 10 occurrences); 1.5 million postcodes; 436 districts.
     • 1881 Census: 29 million people; 425,793 surnames (345,781 with < 10 occurrences); 657 districts.
     • Worldnames Database: approx. 300 million individuals; 26 countries.
  7. Creating Regions: Aggregating Surname Data
     • Isonymy: the occurrence of the same name in marriage.
       • The smaller the surname ‘pool’, the greater the probability of isonymy.
     • Geneticists developed the Coefficient of Isonymy to estimate the probability of isonymy between two populations.
     • The Coefficient of Isonymy has been extended to a distance measure, Lasker’s Distance, for comparison between populations:
       $L_{x,y} = -\log_e(2R_{x,y})$
       where x and y are districts, i indexes surnames, $R_{x,y}$ is the Coefficient of Isonymy between x and y, and $x_i$ and $y_i$ are the frequencies of surname i as proportions of the total populations of x and y.
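
A minimal sketch of the distance calculation for two districts. The slide transcript gives only the relation between Lasker's Distance and the Coefficient of Isonymy; the definition used here, R = 0.5 * sum over surnames of x_i * y_i, is the commonly cited form and is an assumption, as are the function names and the toy surname lists.

```python
import math
from collections import Counter


def surname_proportions(surnames):
    """Map each surname to its share of the district's population."""
    counts = Counter(surnames)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def isonymy_coefficient(district_x, district_y):
    """Assumed definition: R = 0.5 * sum_i (x_i * y_i)."""
    px = surname_proportions(district_x)
    py = surname_proportions(district_y)
    shared = px.keys() & py.keys()  # surnames absent from either side contribute 0
    return 0.5 * sum(px[name] * py[name] for name in shared)


def lasker_distance(district_x, district_y):
    """Lasker's Distance L = -ln(2R); infinite if no surname is shared."""
    r = isonymy_coefficient(district_x, district_y)
    if r == 0:
        return math.inf
    return -math.log(2 * r)


# Hypothetical toy districts, one list entry per person.
district_a = ["Smith", "Smith", "Jones", "Lewis"]
district_b = ["Smith", "Macleod", "Macleod", "Buckley"]
district_c = ["Macleod", "Macleod", "Buckley", "Smith"]

print(round(lasker_distance(district_a, district_b), 3))  # 2.079
print(round(lasker_distance(district_b, district_c), 3))  # 0.981
```

In this toy case districts b and c share most of their surname mix, so they sit closer together in "surname space" than a and b. Repeating the calculation for every pair of districts would give a district-by-district matrix of distances.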
  8. Creating Regions: Aggregating Surname Data
     Each district in Britain is assigned a position in “surname space” based on a matrix of Lasker’s Distances.
     [Diagram label: District x Lasker’s Distance]

     2001 Matrix
                95Z      99ZZ      OOLN
     00BL  7.520982  7.336616  7.219516
     00BM  7.428889  7.315671  7.425037
     00BN  7.347616  7.356772  7.394888
     00BP  7.452982  7.299915  7.330886
     00BQ  7.410027  7.300150  7.387787

     1881 Matrix
                  Yarmouth    Yeovil      York
     Aberayron    6.389540  6.289929  6.438361
     Aberdeen     6.356152  7.019357  6.213222
     Abergavenny  6.412893  6.361753  6.566717
     Aberystwith  6.327093  6.319481  6.467985
     Abingdon     6.353814  6.559106  6.621873
  9. Creating Regions: Grouping Lasker’s Distance
     [Diagram label: District x Lasker’s Distance]
     • Multidimensional Scaling
     • Clustering: Ward’s Hierarchical Clustering, K-Means
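
As a rough illustration of the clustering step, the sketch below groups districts from a precomputed distance matrix using Ward's criterion, implemented with the Lance-Williams update on squared dissimilarities. It is not the presenter's code (the references point to R packages): the function name `ward_clusters` and the 4 x 4 matrix are hypothetical, and treating Lasker's Distances as if they were Euclidean distances is itself an assumption.

```python
def ward_clusters(dist, n_clusters):
    """Agglomerate items into n_clusters groups using Ward's criterion.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    n = len(dist)
    if not 1 <= n_clusters <= n:
        raise ValueError("n_clusters must be between 1 and the number of items")

    members = {i: [i] for i in range(n)}  # active cluster id -> item indices
    # Squared dissimilarities between active clusters, keyed by (low id, high id).
    d2 = {(i, j): dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}

    def get(a, b):
        return d2[(a, b) if a < b else (b, a)]

    next_id = n
    while len(members) > n_clusters:
        a, b = min(d2, key=d2.get)  # pair whose merge is cheapest
        dab = d2[(a, b)]
        na, nb = len(members[a]), len(members[b])

        # Lance-Williams update for Ward's method (on squared distances).
        updated = {}
        for k in members:
            if k in (a, b):
                continue
            nk = len(members[k])
            updated[k] = ((na + nk) * get(a, k) + (nb + nk) * get(b, k)
                          - nk * dab) / (na + nb + nk)

        merged = members.pop(a) + members.pop(b)
        d2 = {pair: v for pair, v in d2.items()
              if a not in pair and b not in pair}
        for k, v in updated.items():
            d2[(k, next_id)] = v  # existing ids are always below next_id
        members[next_id] = merged
        next_id += 1

    return sorted(sorted(m) for m in members.values())


# Hypothetical distances between four districts (not values from the slides).
toy_dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 4.0, 5.0],
    [4.0, 4.0, 0.0, 1.5],
    [5.0, 5.0, 1.5, 0.0],
]
print(ward_clusters(toy_dist, 2))  # [[0, 1], [2, 3]]
```

Multidimensional scaling and K-Means would start from the same matrix or from coordinates derived from it; they are left out here to keep the sketch short.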
  10. Creating Regions: Multidimensional Scaling (1881)
      [Figure legend: North East, North West, Yorkshire and the Humber, East Midlands, West Midlands, East of England, South East, South West, Wales, Scotland, Northern Ireland]
  11. Creating Regions: Multidimensional Scaling
      [Figure panels: 1881, 2001]
  12. Creating Regions: Ward’s Hierarchical Clustering
      [Figure panels: 1881, 2001]
  13. Creating Regions: Ward’s Hierarchical Clustering
      [Figure panels: 1881, 2001]
  14. Danish Rule
  15. Corby: A Scottish Town?
      [Figure panels: 1881, 2001; methods: MDS, Ward’s, K-Means]
  16. Corby: A Scottish Town?
      • In 1932 Stewarts and Lloyds built a new iron and steel works in Corby.
      • The workforce was sourced from closing Scottish steelworks, mainly in Lanarkshire.
      • Into the 1970s, 50% of the incoming population was Scottish.
      • The population was transformed from 1,500 to 34,000.
      • Annual Highland Games.
  17. Future Work
      • Methodological:
        • Different input geographies.
        • Narrow focus to specific areas/groups of names.
      • Validation:
        • Comparison with genetics data.
        • Telephone call flows.
      • Application:
        • Genetic sampling strategy: “local names”.
      • Expansion:
        • Incorporating Worldnames data for regionalisation of Europe.
  18. Audlem…Is it Welsh?
  19. Back to Audlem…Is it Welsh?
  20. Conclusions
      • Unprecedented mobility over the last century has failed to erase surname regions.
      • Clustering and MDS provide powerful methods for drawing out surname trends.
      • More research is required into the methods and scales to which they can be applied.
  21. References
      • Lasker Distance: Lasker, G. W. and C. G. N. Mascie-Taylor (2001). "The genetic structure of English villages: surname diversity changes between 1976 and 1997." Annals of Human Biology 28(5): 546-553.
      • K-Means: Adnan, M., Singleton, A. D., Brunsdon, C., Longley, P. A. (2009). "Moving to Real-Time Segmentation: Efficient Computation of Geodemographic Classification." GISRUK 2009.
      • Multidimensional Scaling Plots: Kleiweg, P. :
      • Monmonier Algorithm: Manni, F., E. Guerard, et al. (2004). "Geographic Patterns of (Genetic, Morphologic, Linguistic) Variation: How Barriers Can Be Detected by Using Monmonier’s Algorithm." Human Biology 76(2): 173-190.
      • KDE: Crimestat Workbook:
      • R Packages: Adegenet, cluster, maptools, rgl, sm, spdep, splancs from iL04_1.13 from
      • All boundary data from the maps Crown Copyright Ordnance Survey 2009.
  22. Please Visit for slides: