SlideShare a Scribd company logo
1 of 20
L’ONOMASTIQUE APPLIQUÉE AU
DÉCRYPTAGE DES ENJEUX IDENTITAIRES ET DE TERRITOIRE
Elian CARSENAT, NamSor Applied Onomastics
1
2015-09-12
2
 (En introduction : présentation du cas AKAF47)
Founder Bio
3
Elian CARSENAT, a computer scientist trained at
ENSIIE/INRIA, started his career at JP Morgan in
Paris in 1997. He later worked as consultant and
managed business & IT projects in London, Paris,
Moscow and Shanghai.
In 2012, Elian created NamSor, a piece of
sociolinguistics software to mine the 'Big Data' and
better understand international flows of money,
ideas and people.
http://fr.linkedin.com/in/eliancarsenat/en
NamSor sorts Names
4
 Names are meaningful : we use sociolinguistics to extract their
semantics and deliver actionable intelligence.
 Names reflect cultural Identity
 NamSor data mining software
recognizes the linguistic or cultural
origin of names in any alphabet /
language, with fine grain and high
accuracy.
Mining 3M twitter names to map Diasporas
Who are they, where are they and what are they doing?
5
Source: Twitter
Source: Twitter
Visualization : CartoDB
Data Mining: NamSor
Flow view – who travels where?
6
Source Target Type Id Onoma Weight
United Kingdom France Directed 16 Great Britain 37
Spain France Directed 55 Spain 14
United States France Directed 75 Great Britain 12
Turkey France Directed 79 Turkey 11
Brazil France Directed 87 Portugal 10
United Kingdom France Directed 112 Ireland 9
Italy France Directed 152 Italy 7
Switzerland France Directed 226 France 5
Belgium France Directed 247 France 5
United Kingdom France Directed 258 France 5
Mexico France Directed 287 Spain 4
Ireland France Directed 317 Great Britain 4
United Kingdom France Directed 333 Italy 4
United States France Directed 375 France 4
Source: Twitter
Visualization : Gephi
Data Mining: NamSor
Mapping Talents in Cancer Research
(in collaboration with French INSERM)
7
Thomson Reuters WebOfScience (6 countries, 250k scientists, 50k papers)
“Analysts uncovered amazing patterns in the way scientists’ names correlate with whom they publish, and who
they cite in their papers - not just in case of a particular country, but globally. Tania Vichnevskaia of the French
National Institute for Health (INSERM) presented the paper ‘Applying onomastics to scientometrics‘ at IREG
International symposium 2015 organised by University of Maribor and Shanghai Jiao Tong University. The
paper was prepared jointly with NamSor, a private start-up company specialized in mapping international
Diasporas.”
Source: WoS; Data Mining: INSERM with NamSor
Cancer Research in Poland and Slovenia
Examining the ‘brain drain’
8
In the Polish Corpus, we look at co-
authors with Polish names, affiliated
abroad.
Top countries:
1. US,
2. Great-Britain,
3. Germany.
In the Slovenian Corpus, we look at co-
authors with Slovenian names,
affiliated abroad.
Top countries:
1. Great-Britain,
2. US,
3. Germany.
Source: WoS; Data Mining: INSERM with NamSor
9
 WORK IN PROGRESS – INDIA
“Incredible India” – 1.2 BN People
Indian onomastics by State/Union Territory
10
Names in LATIN, BENGALI, DEVANAGARI, GUJARATI, GURMUKHI, KANNADA, MALAYALAM,
ORIYA, TAMIL, TELUGU, ARABIC
GUJARAT : mapping onomastics by district
11
Source: Voters List; Visualization : Google Fusion Tables; Data Mining: NamSor
7Ahmedabad
22Surat
2Banaskantha
4Mahesana
9Rajkot
16Kheda
14Bhavnagar
21Bharuch
15Anand
24Navsari
19Vadodara
5Sabarkantha
6Gandhinagar
25Valsad
17Panchmahal
8Surendranagar
3Patan
10Jamnagar
12Junagadh
13Amreli
1Kachchh
28Morbi
27Arvalli
30GirSomnath
32Mahisagar
18Dahod
31Botad
29DevbhumiDwarka
20Narmada
26Tapi
33Chhotaudepur
11Porbandar
23Dangs
L62454:497
L63405:400
L79998:394
L184529:18
L511960:527
L437236:1278
L256619:111
L294948:886
L85680:416
L184530:134
ASSAM: Karbi Anglong, within district
Inter-caste marriages ?
12
output Input Input
clusterId clusterParentId Firstname LastName parent is FirstParentLastParent
L25354:253L64958:2797 A¡à[¹ ¹}[ššã husband ¤àl¡ü[W¡³ [W¡}>à¹
L47490:1593L64958:2797 ¤àK[¹ [W¡}>๠father ¤àl¡ü[W¡³ [W¡}>à¹
L28582:1209L47490:1593 [³>à Òü}[t¡šã husband ¤àK[¹ [W¡}>à¹
L23643:669L35593:510 ™åKƒ}à [W¡}>๚ã father ¤ài¡[W¡³ [W¡}>à¹
L23643:669L35593:510 ³à>àÒü [W¡}>๚ã father ¤ài¡[W¡³ [W¡}>à¹
L47490:1593L35593:510 W¡àì=¢ [W¡}>๠father Wå¡ì¤ [W¡}>à¹
L23643:669L35593:510 A¡àì¹ t¡àì¹ïšã husband Wå¡ì¤ [W¡}>à¹
L35593:510L47490:1593 [ƒ[ºš [W¡}>๠father W¡àì¤ [W¡}>à¹
L23643:669L47490:1593 [¹>à [W¡}>๚ã father W¡àì¤ [W¡}>à¹
parent is husband
Count of serial Column Labels
Row Labels L47490:1593 L116370:3612 L54332:2031 L184096:2297 L35593:510 L168871:1819 L135664:4438 L51271:837
L23643:669 6931 84 5099 15 2069 28 791 1924
L151415:3559 18 212 11 6446 19 1217 55 6
L28582:1209 5132 68 3565 10 1494 17 592 1323
L116370:3612 66 10283 38 72 40 321 137 29
L9839:442 2491 60 1851 9 774 11 321 660
L168871:1819 7 263 6 361 8 2730 24 4
L23642:141 1198 8 822 2 375 4 156 332
L25354:253 1181 12 932 375 7 100 323
L135664:4438 20 154 5 22 19 44 2212 3
L87032:1210 11 315 13 51 14 141 37 9
L90333:3644 3 204 2 31 190 5
L184096:2297 13 1735 3 84 11 1
L87031:697 4 136 4 12 3 137 4 5
L14495:131 614 10 432 167 4 68 163
L63724:1422 17 83 10 34 34 28 96 6
L98994:891 31 161 46 21 19 59 21 5
ASSAM: Karbi Anlong district
names clustered L116370:3612
L23643:669
L151415:3559
L47490:1593
L28582:1209
L54332:2031
L184096:2297
L168871:1819
L9839:442
L135664:4438
L87032:1210
L90333:3644
L35593:510
L51271:837
L63724:1422
L154797:1168
L64959:1796
L23642:141
L87031:697
L6536:295
L98994:891
L25354:253
L64958:2797
L30570:2614
L90334:1189
L95839:287
L100510:366
L121390:783
Other
Source: Voters List; Data Mining: NamSor
13
 USE CASE – BOSTON CITY
US Census vs NamSor geo-demographics
14
 In July 2015, the US Government announced new
rules that will require all cities and towns receiving
federal housing funds to assess patterns of
segregation.
 The NY Times has published interactive maps of
Boston geo-demographics, which we can compare
with the information inferred by NamSor
US Census Race Map of Boston
15
http://www.nytimes.com/interactive/2015/07/08/us/census-race-map.html
Using Voters List
 US Census:
1pixel = 40 inhabitants
 Voters List:
1 pixel = 1 voter
16
Source: Boston Voters List
Visualization : ESRI
Data Mining: NamSor
Voter’s list: zooming further into 051200
 US Census
 Voters List + NamSor
17
Source: Boston Voters List
Visualization : ESRI
Data Mining: NamSor
Breaking down ‘White’ and ‘Asian’ into
Portuguese, Spanish, Italian, India, Pakistan, China, ...
18
Source: Boston Voters List
Visualization : ESRI
Data Mining: NamSor
Using NamSor API
19
Option 1/
Online
Option 2/
RapidMiner Extension
Merci !
Elian CARSENAT,
elian.carsenat@namsor.com
Phone : +33 6 52 77 99 07
20
Juillet 2013, Ambassade de Lituanie à Paris

More Related Content

Similar to NamSor for GEOINT

Mining names in the big data to map diasporas - NamSor
Mining names in the big data to map diasporas - NamSorMining names in the big data to map diasporas - NamSor
Mining names in the big data to map diasporas - NamSorICMPD
 
Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diver...
Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diver...Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diver...
Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diver...Instituto Diáspora Brasil (IDB)
 
12 anni di Notte Europea dei Ricercatori!
12 anni di Notte Europea dei Ricercatori!12 anni di Notte Europea dei Ricercatori!
12 anni di Notte Europea dei Ricercatori!Giovanni Mazzitelli
 
Diasporas Digital Développement
Diasporas Digital DéveloppementDiasporas Digital Développement
Diasporas Digital DéveloppementElian CARSENAT
 
Forensic odontology a peek in the future dr vs-rege
Forensic odontology a peek in the future dr vs-regeForensic odontology a peek in the future dr vs-rege
Forensic odontology a peek in the future dr vs-regeAmit Amit
 
Innovation and Future Thinking @ IED 2017
Innovation and Future Thinking @ IED 2017Innovation and Future Thinking @ IED 2017
Innovation and Future Thinking @ IED 2017John V Willshire
 
Science communication a new frontier of researcher’s job - part 1
Science communication a new frontier of researcher’s job - part 1Science communication a new frontier of researcher’s job - part 1
Science communication a new frontier of researcher’s job - part 1Giovanni Mazzitelli
 
Mind-reading computer seminar ppt
Mind-reading computer seminar pptMind-reading computer seminar ppt
Mind-reading computer seminar pptsanjeev kumar suman
 
Researchers Night Frascati Scienza
Researchers Night  Frascati ScienzaResearchers Night  Frascati Scienza
Researchers Night Frascati ScienzaGiovanni Mazzitelli
 
National Honors Society Essay.pdf
National Honors Society Essay.pdfNational Honors Society Essay.pdf
National Honors Society Essay.pdfJackie Rojas
 
Enabling information interoperability with identifiers (L. Haak)
 Enabling information interoperability with identifiers (L. Haak) Enabling information interoperability with identifiers (L. Haak)
Enabling information interoperability with identifiers (L. Haak)ORCID, Inc
 
Bigdataforesight
BigdataforesightBigdataforesight
Bigdataforesightsuresh sood
 
Sensing devices by Masha Ioveva
Sensing devices by Masha IovevaSensing devices by Masha Ioveva
Sensing devices by Masha IovevaJooYoun Paek
 
ECSA, the ECSA principles, and the ECSA Characteristics of Citizen Science
ECSA, the ECSA principles, and the ECSA Characteristics of Citizen ScienceECSA, the ECSA principles, and the ECSA Characteristics of Citizen Science
ECSA, the ECSA principles, and the ECSA Characteristics of Citizen ScienceMargaret Gold
 
Report on Finger print sensor and its application
Report on Finger print sensor and its applicationReport on Finger print sensor and its application
Report on Finger print sensor and its applicationArnab Podder
 

Similar to NamSor for GEOINT (15)

Mining names in the big data to map diasporas - NamSor
Mining names in the big data to map diasporas - NamSorMining names in the big data to map diasporas - NamSor
Mining names in the big data to map diasporas - NamSor
 
Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diver...
Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diver...Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diver...
Using Sociolinguistics to Enhance Customer Segmentation, Geomarketing & Diver...
 
12 anni di Notte Europea dei Ricercatori!
12 anni di Notte Europea dei Ricercatori!12 anni di Notte Europea dei Ricercatori!
12 anni di Notte Europea dei Ricercatori!
 
Diasporas Digital Développement
Diasporas Digital DéveloppementDiasporas Digital Développement
Diasporas Digital Développement
 
Forensic odontology a peek in the future dr vs-rege
Forensic odontology a peek in the future dr vs-regeForensic odontology a peek in the future dr vs-rege
Forensic odontology a peek in the future dr vs-rege
 
Innovation and Future Thinking @ IED 2017
Innovation and Future Thinking @ IED 2017Innovation and Future Thinking @ IED 2017
Innovation and Future Thinking @ IED 2017
 
Science communication a new frontier of researcher’s job - part 1
Science communication a new frontier of researcher’s job - part 1Science communication a new frontier of researcher’s job - part 1
Science communication a new frontier of researcher’s job - part 1
 
Mind-reading computer seminar ppt
Mind-reading computer seminar pptMind-reading computer seminar ppt
Mind-reading computer seminar ppt
 
Researchers Night Frascati Scienza
Researchers Night  Frascati ScienzaResearchers Night  Frascati Scienza
Researchers Night Frascati Scienza
 
National Honors Society Essay.pdf
National Honors Society Essay.pdfNational Honors Society Essay.pdf
National Honors Society Essay.pdf
 
Enabling information interoperability with identifiers (L. Haak)
 Enabling information interoperability with identifiers (L. Haak) Enabling information interoperability with identifiers (L. Haak)
Enabling information interoperability with identifiers (L. Haak)
 
Bigdataforesight
BigdataforesightBigdataforesight
Bigdataforesight
 
Sensing devices by Masha Ioveva
Sensing devices by Masha IovevaSensing devices by Masha Ioveva
Sensing devices by Masha Ioveva
 
ECSA, the ECSA principles, and the ECSA Characteristics of Citizen Science
ECSA, the ECSA principles, and the ECSA Characteristics of Citizen ScienceECSA, the ECSA principles, and the ECSA Characteristics of Citizen Science
ECSA, the ECSA principles, and the ECSA Characteristics of Citizen Science
 
Report on Finger print sensor and its application
Report on Finger print sensor and its applicationReport on Finger print sensor and its application
Report on Finger print sensor and its application
 

More from Elian CARSENAT

NamSor AI Bias Estimator, Gender, Racial and Ethnic Fairness Toolkit
NamSor AI Bias Estimator, Gender, Racial and Ethnic Fairness ToolkitNamSor AI Bias Estimator, Gender, Racial and Ethnic Fairness Toolkit
NamSor AI Bias Estimator, Gender, Racial and Ethnic Fairness ToolkitElian CARSENAT
 
Announcing NamSorML : AI classifiers for race, ethnicity and migration studies
Announcing NamSorML :  AI classifiers for race, ethnicity and migration studiesAnnouncing NamSorML :  AI classifiers for race, ethnicity and migration studies
Announcing NamSorML : AI classifiers for race, ethnicity and migration studiesElian CARSENAT
 
GEOINT visualization of the Tunisian Diaspora in Europe
GEOINT visualization of the Tunisian Diaspora in EuropeGEOINT visualization of the Tunisian Diaspora in Europe
GEOINT visualization of the Tunisian Diaspora in EuropeElian CARSENAT
 
Promouvoir l'investissement en Afrique
Promouvoir l'investissement en AfriquePromouvoir l'investissement en Afrique
Promouvoir l'investissement en AfriqueElian CARSENAT
 
Gender Gap in Corporate Governance : AFRICA
Gender Gap in Corporate Governance : AFRICAGender Gap in Corporate Governance : AFRICA
Gender Gap in Corporate Governance : AFRICAElian CARSENAT
 
FDI Magnet wishes you a happy 2016!
FDI Magnet wishes you a happy 2016!FDI Magnet wishes you a happy 2016!
FDI Magnet wishes you a happy 2016!Elian CARSENAT
 
Data Geeks Paris - Cherchez la Femme
Data Geeks Paris - Cherchez la FemmeData Geeks Paris - Cherchez la Femme
Data Geeks Paris - Cherchez la FemmeElian CARSENAT
 
BigData Paris 2014 - Enjeux Sociaux
BigData Paris 2014 - Enjeux SociauxBigData Paris 2014 - Enjeux Sociaux
BigData Paris 2014 - Enjeux SociauxElian CARSENAT
 
Rôle des français de l'étranger pour faire rayonner la 'Marque France', les M...
Rôle des français de l'étranger pour faire rayonner la 'Marque France', les M...Rôle des français de l'étranger pour faire rayonner la 'Marque France', les M...
Rôle des français de l'étranger pour faire rayonner la 'Marque France', les M...Elian CARSENAT
 

More from Elian CARSENAT (9)

NamSor AI Bias Estimator, Gender, Racial and Ethnic Fairness Toolkit
NamSor AI Bias Estimator, Gender, Racial and Ethnic Fairness ToolkitNamSor AI Bias Estimator, Gender, Racial and Ethnic Fairness Toolkit
NamSor AI Bias Estimator, Gender, Racial and Ethnic Fairness Toolkit
 
Announcing NamSorML : AI classifiers for race, ethnicity and migration studies
Announcing NamSorML :  AI classifiers for race, ethnicity and migration studiesAnnouncing NamSorML :  AI classifiers for race, ethnicity and migration studies
Announcing NamSorML : AI classifiers for race, ethnicity and migration studies
 
GEOINT visualization of the Tunisian Diaspora in Europe
GEOINT visualization of the Tunisian Diaspora in EuropeGEOINT visualization of the Tunisian Diaspora in Europe
GEOINT visualization of the Tunisian Diaspora in Europe
 
Promouvoir l'investissement en Afrique
Promouvoir l'investissement en AfriquePromouvoir l'investissement en Afrique
Promouvoir l'investissement en Afrique
 
Gender Gap in Corporate Governance : AFRICA
Gender Gap in Corporate Governance : AFRICAGender Gap in Corporate Governance : AFRICA
Gender Gap in Corporate Governance : AFRICA
 
FDI Magnet wishes you a happy 2016!
FDI Magnet wishes you a happy 2016!FDI Magnet wishes you a happy 2016!
FDI Magnet wishes you a happy 2016!
 
Data Geeks Paris - Cherchez la Femme
Data Geeks Paris - Cherchez la FemmeData Geeks Paris - Cherchez la Femme
Data Geeks Paris - Cherchez la Femme
 
BigData Paris 2014 - Enjeux Sociaux
BigData Paris 2014 - Enjeux SociauxBigData Paris 2014 - Enjeux Sociaux
BigData Paris 2014 - Enjeux Sociaux
 
Rôle des français de l'étranger pour faire rayonner la 'Marque France', les M...
Rôle des français de l'étranger pour faire rayonner la 'Marque France', les M...Rôle des français de l'étranger pour faire rayonner la 'Marque France', les M...
Rôle des français de l'étranger pour faire rayonner la 'Marque France', les M...
 

Recently uploaded

Philippine History cavite Mutiny Report.ppt
Philippine History cavite Mutiny Report.pptPhilippine History cavite Mutiny Report.ppt
Philippine History cavite Mutiny Report.pptssuser319dad
 
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...NETWAYS
 
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrSaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrsaastr
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesPooja Nehwal
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Hasting Chen
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfhenrik385807
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AITatiana Gurgel
 
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...NETWAYS
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Delhi Call girls
 
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...NETWAYS
 
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Kayode Fayemi
 
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...henrik385807
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024eCommerce Institute
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024eCommerce Institute
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@vikas rana
 
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Krijn Poppe
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...NETWAYS
 
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxGenesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxFamilyWorshipCenterD
 

Recently uploaded (20)

Philippine History cavite Mutiny Report.ppt
Philippine History cavite Mutiny Report.pptPhilippine History cavite Mutiny Report.ppt
Philippine History cavite Mutiny Report.ppt
 
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
 
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrSaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AI
 
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
 
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
 
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@
 
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
 
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxGenesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
 

NamSor for GEOINT

  • 1. L’ONOMASTIQUE APPLIQUÉE AU DÉCRYPTAGE DES ENJEUX IDENTITAIRES ET DE TERRITOIRE Elian CARSENAT, NamSor Applied Onomastics 1 2015-09-12
  • 2. 2  (En introduction : présentation du cas AKAF47)
  • 3. Founder Bio 3 Elian CARSENAT, a computer scientist trained at ENSIIE/INRIA, started his career at JP Morgan in Paris in 1997. He later worked as consultant and managed business & IT projects in London, Paris, Moscow and Shanghai. In 2012, Elian created NamSor, a piece of sociolinguistics software to mine the 'Big Data' and better understand international flows of money, ideas and people. http://fr.linkedin.com/in/eliancarsenat/en
  • 4. NamSor sorts Names 4  Names are meaningful : we use sociolinguistics to extract their semantics and deliver actionable intelligence.  Names reflect cultural Identity  NamSor data mining software recognizes the linguistic or cultural origin of names in any alphabet / language, with fine grain and high accuracy.
  • 5. Mining 3M twitter names to map Diasporas Who are they, where are they and what are they doing? 5 Source: Twitter Source: Twitter Visualization : CartoDB Data Mining: NamSor
  • 6. Flow view – who travels where? 6 Source Target Type Id Onoma Weight United Kingdom France Directed 16 Great Britain 37 Spain France Directed 55 Spain 14 United States France Directed 75 Great Britain 12 Turkey France Directed 79 Turkey 11 Brazil France Directed 87 Portugal 10 United Kingdom France Directed 112 Ireland 9 Italy France Directed 152 Italy 7 Switzerland France Directed 226 France 5 Belgium France Directed 247 France 5 United Kingdom France Directed 258 France 5 Mexico France Directed 287 Spain 4 Ireland France Directed 317 Great Britain 4 United Kingdom France Directed 333 Italy 4 United States France Directed 375 France 4 Source: Twitter Visualization : Gephi Data Mining: NamSor
  • 7. Mapping Talents in Cancer Research (in collaboration with French INSERM) 7 Thomson Reuters WebOfScience (6 countries, 250k scientists, 50k papers) “Analysts uncovered amazing patterns in the way scientists’ names correlate with whom they publish, and who they cite in their papers - not just in case of a particular country, but globally. Tania Vichnevskaia of the French National Institute for Health (INSERM) presented the paper ‘Applying onomastics to scientometrics‘ at IREG International symposium 2015 organised by University of Maribor and Shanghai Jiao Tong University. The paper was prepared jointly with NamSor, a private start-up company specialized in mapping international Diasporas.” Source: WoS; Data Mining: INSERM with NamSor
  • 8. Cancer Research in Poland and Slovenia Examining the ‘brain drain’ 8 In the Polish Corpus, we look at co- authors with Polish names, affiliated abroad. Top countries: 1. US, 2. Great-Britain, 3. Germany. In the Slovenian Corpus, we look at co- authors with Slovenian names, affiliated abroad. Top countries: 1. Great-Britain, 2. US, 3. Germany. Source: WoS; Data Mining: INSERM with NamSor
  • 9. 9  WORK IN PROGRESS – INDIA
  • 10. “Incredible India” – 1.2 BN People Indian onomastics by State/Union Territory 10 Names in LATIN, BENGALI, DEVANAGARI, GUJARATI, GURMUKHI, KANNADA, MALAYALAM, ORIYA, TAMIL, TELUGU, ARABIC
  • 11. GUJARAT : mapping onomastics by district 11 Source: Voters List; Visualization : Google Fusion Tables; Data Mining: NamSor 7Ahmedabad 22Surat 2Banaskantha 4Mahesana 9Rajkot 16Kheda 14Bhavnagar 21Bharuch 15Anand 24Navsari 19Vadodara 5Sabarkantha 6Gandhinagar 25Valsad 17Panchmahal 8Surendranagar 3Patan 10Jamnagar 12Junagadh 13Amreli 1Kachchh 28Morbi 27Arvalli 30GirSomnath 32Mahisagar 18Dahod 31Botad 29DevbhumiDwarka 20Narmada 26Tapi 33Chhotaudepur 11Porbandar 23Dangs L62454:497 L63405:400 L79998:394 L184529:18 L511960:527 L437236:1278 L256619:111 L294948:886 L85680:416 L184530:134
  • 12. ASSAM: Karbi Anglong, within district Inter-caste marriages ? 12 output Input Input clusterId clusterParentId Firstname LastName parent is FirstParentLastParent L25354:253L64958:2797 A¡à[¹ ¹}[ššã husband ¤àl¡ü[W¡³ [W¡}>๠L47490:1593L64958:2797 ¤àK[¹ [W¡}>๠father ¤àl¡ü[W¡³ [W¡}>๠L28582:1209L47490:1593 [³>à Òü}[t¡šã husband ¤àK[¹ [W¡}>๠L23643:669L35593:510 ™åKƒ}à [W¡}>๚ã father ¤ài¡[W¡³ [W¡}>๠L23643:669L35593:510 ³à>àÒü [W¡}>๚ã father ¤ài¡[W¡³ [W¡}>๠L47490:1593L35593:510 W¡àì=¢ [W¡}>๠father Wå¡ì¤ [W¡}>๠L23643:669L35593:510 A¡àì¹ t¡àì¹ïšã husband Wå¡ì¤ [W¡}>๠L35593:510L47490:1593 [ƒ[ºš [W¡}>๠father W¡àì¤ [W¡}>๠L23643:669L47490:1593 [¹>à [W¡}>๚ã father W¡àì¤ [W¡}>๠parent is husband Count of serial Column Labels Row Labels L47490:1593 L116370:3612 L54332:2031 L184096:2297 L35593:510 L168871:1819 L135664:4438 L51271:837 L23643:669 6931 84 5099 15 2069 28 791 1924 L151415:3559 18 212 11 6446 19 1217 55 6 L28582:1209 5132 68 3565 10 1494 17 592 1323 L116370:3612 66 10283 38 72 40 321 137 29 L9839:442 2491 60 1851 9 774 11 321 660 L168871:1819 7 263 6 361 8 2730 24 4 L23642:141 1198 8 822 2 375 4 156 332 L25354:253 1181 12 932 375 7 100 323 L135664:4438 20 154 5 22 19 44 2212 3 L87032:1210 11 315 13 51 14 141 37 9 L90333:3644 3 204 2 31 190 5 L184096:2297 13 1735 3 84 11 1 L87031:697 4 136 4 12 3 137 4 5 L14495:131 614 10 432 167 4 68 163 L63724:1422 17 83 10 34 34 28 96 6 L98994:891 31 161 46 21 19 59 21 5 ASSAM: Karbi Anlong district names clustered L116370:3612 L23643:669 L151415:3559 L47490:1593 L28582:1209 L54332:2031 L184096:2297 L168871:1819 L9839:442 L135664:4438 L87032:1210 L90333:3644 L35593:510 L51271:837 L63724:1422 L154797:1168 L64959:1796 L23642:141 L87031:697 L6536:295 L98994:891 L25354:253 L64958:2797 L30570:2614 L90334:1189 L95839:287 L100510:366 L121390:783 Other Source: Voters List; Data Mining: NamSor
  • 13. 13  USE CASE – BOSTON CITY
  • 14. US Census vs NamSor geo-demographics 14  In July 2015, the US Government announced new rules that will require all cities and towns receiving federal housing funds to assess patterns of segregation.  The NY Times has published interactive maps of Boston geo-demographics, which we can compare with the information inferred by NamSor
  • 15. US Census Race Map of Boston 15 http://www.nytimes.com/interactive/2015/07/08/us/census-race-map.html
  • 16. Using Voters List  US Census: 1pixel = 40 inhabitants  Voters List: 1 pixel = 1 voter 16 Source: Boston Voters List Visualization : ESRI Data Mining: NamSor
  • 17. Voter’s list: zooming further into 051200  US Census  Voters List + NamSor 17 Source: Boston Voters List Visualization : ESRI Data Mining: NamSor
  • 18. Breaking down ‘White’ and ‘Asian’ into Portuguese, Spanish, Italian, India, Pakistan, China, ... 18 Source: Boston Voters List Visualization : ESRI Data Mining: NamSor
  • 19. Using NamSor API 19 Option 1/ Online Option 2/ RapidMiner Extension
  • 20. Merci ! Elian CARSENAT, elian.carsenat@namsor.com Phone : +33 6 52 77 99 07 20 Juillet 2013, Ambassade de Lituanie à Paris