Mining personal names in the
‘Big Data’ to map Diasporas
Who are they, where are they and what are they doing?
Connecting, Communicating and Networking with Diasporas
4-6 May 2016 - Dublin Castle - Ireland
Elian CARSENAT, NamSor
Funded by the
European Union
#RMM4Dublin
NamSor sorts Names
• Personal names are meaningful: we use sociolinguistics to extract their semantics and deliver actionable intelligence.
• Names reflect cultural identity.
• NamSor data mining software recognizes the linguistic or cultural origin of names in any alphabet or language, with fine grain and high accuracy.
Mining 3M Twitter names to map Diasporas
Who are they, where are they and what are they doing?
Source: Twitter
Visualization : CartoDB
Data Mining: NamSor
Flow view: who travels where?
Source          Target  Type      Id   Onoma          Weight
United Kingdom  France  Directed  16   Great Britain  37
Spain           France  Directed  55   Spain          14
United States   France  Directed  75   Great Britain  12
Turkey          France  Directed  79   Turkey         11
Brazil          France  Directed  87   Portugal       10
United Kingdom  France  Directed  112  Ireland        9
Italy           France  Directed  152  Italy          7
Switzerland     France  Directed  226  France         5
Belgium         France  Directed  247  France         5
United Kingdom  France  Directed  258  France         5
Mexico          France  Directed  287  Spain          4
Ireland         France  Directed  317  Great Britain  4
United Kingdom  France  Directed  333  Italy          4
United States   France  Directed  375  France         4
Source: Twitter
Visualization : Gephi
Data Mining: NamSor
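The edge list on this slide can be read as a weighted directed graph. As a minimal sketch, assuming the rows are transcribed by hand into Python tuples (the names `EDGES` and `weight_by` are illustrative and not part of NamSor or Gephi), the weights can be summed per name origin or per source country to see which groups dominate the flows into France:

```python
from collections import defaultdict

# Rows transcribed from the slide as (source, target, onoma, weight).
EDGES = [
    ("United Kingdom", "France", "Great Britain", 37),
    ("Spain", "France", "Spain", 14),
    ("United States", "France", "Great Britain", 12),
    ("Turkey", "France", "Turkey", 11),
    ("Brazil", "France", "Portugal", 10),
    ("United Kingdom", "France", "Ireland", 9),
    ("Italy", "France", "Italy", 7),
    ("Switzerland", "France", "France", 5),
    ("Belgium", "France", "France", 5),
    ("United Kingdom", "France", "France", 5),
    ("Mexico", "France", "Spain", 4),
    ("Ireland", "France", "Great Britain", 4),
    ("United Kingdom", "France", "Italy", 4),
    ("United States", "France", "France", 4),
]


def weight_by(edges, column):
    """Sum edge weights grouped by one tuple column, largest total first."""
    totals = defaultdict(int)
    for row in edges:
        totals[row[column]] += row[3]
    return dict(sorted(totals.items(), key=lambda kv: -kv[1]))


by_onoma = weight_by(EDGES, 2)   # grouped by name origin
by_source = weight_by(EDGES, 0)  # grouped by source country
```

Grouping by `Onoma` rather than by `Source` separates, for instance, travellers from the United Kingdom with French names from those with British names.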
Flow view: who travels where?
Source: Twitter
Visualization : Gephi
Data Mining: NamSor
“Incredible India” – 1.2 Billion People
Indian onomastics by State/Union Territory
Names in LATIN, BENGALI, DEVANAGARI, GUJARATI, GURMUKHI, KANNADA, MALAYALAM, ORIYA, TAMIL, TELUGU, ARABIC
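Handling names across this many scripts usually starts with knowing which script a given name is written in. The sketch below is not NamSor's method, only an illustration using Python's standard `unicodedata` module: it takes the first word of each character's Unicode name (for example `DEVANAGARI LETTER RA`) as a rough script label. The function name `dominant_script` is hypothetical.

```python
import unicodedata
from collections import Counter


def dominant_script(name):
    """Return the most frequent script label among the letters and marks of `name`."""
    scripts = Counter()
    for ch in name:
        # Keep letters and combining marks (vowel signs etc.); skip spaces, digits, punctuation.
        if not ch.isalpha() and unicodedata.category(ch)[0] != "M":
            continue
        try:
            label = unicodedata.name(ch).split()[0]
        except ValueError:
            # Character has no Unicode name; ignore it.
            continue
        scripts[label] += 1
    return scripts.most_common(1)[0][0] if scripts else None


print(dominant_script("Elian Carsenat"))
print(dominant_script("राम"))
```

Taking the first word of the character name is a heuristic; a production system would more likely rely on the Unicode `Script` property, which the standard library does not expose directly.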
Applications to a global Airline’s customer intelligence
Example: Indian Diaspora / Non Resident Indians (NRI) based in the United States
‘It applies indeed to 93% of our customers: when NamSor recognizes an Indian name, the client has travelled to India in the past.’
At state level: ~50%
Finer-grained segmentation using names brings insights into the travel patterns of diasporas visiting family and friends in their home country, as well as their specific needs.
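The 93% figure quoted above reads as a conditional share: among customers whose name was recognized as Indian, the fraction who had travelled to India. A minimal sketch of that calculation follows, assuming a hypothetical record layout (`name_flagged_indian`, `travelled_to_india`) and made-up toy records that do not come from the airline's data:

```python
def share_travelled(customers):
    """Among customers flagged with an Indian name, the share who travelled to India."""
    flagged = [c for c in customers if c["name_flagged_indian"]]
    if not flagged:
        return None  # nothing to measure
    return sum(c["travelled_to_india"] for c in flagged) / len(flagged)


# Invented toy records, for illustration only.
SAMPLE = [
    {"name_flagged_indian": True, "travelled_to_india": True},
    {"name_flagged_indian": True, "travelled_to_india": True},
    {"name_flagged_indian": True, "travelled_to_india": False},
    {"name_flagged_indian": False, "travelled_to_india": True},
    {"name_flagged_indian": False, "travelled_to_india": False},
]

print(share_travelled(SAMPLE))
```

Customers who were not flagged are excluded from the denominator, so this share says nothing about travellers to India whose names were not recognized.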
Mapping Talents in Cancer Research
(in collaboration with French INSERM)
Thomson Reuters Web of Science
(6 countries, 250k scientists, 50k papers)
“Analysts uncovered amazing patterns in the way scientists’ names
correlate with whom they publish, and who they cite in their papers
- not just in case of a particular country, but globally. Tania
Vichnevskaia of the French National Institute for Health (INSERM)
presented the paper ‘Applying onomastics to scientometrics‘ at IREG
International symposium 2015 organised by University of Maribor
and Shanghai Jiao Tong University. The paper was prepared jointly
with NamSor, a private start-up company specialized in mapping
international Diasporas.”
Source: WoS; Data Mining: INSERM with NamSor
Cancer Research in Poland and Slovenia
Examining the ‘brain drain’
In the Polish corpus, we look at co-authors with Polish names who are affiliated abroad. Top countries:
1. USA
2. Great Britain
3. Germany
In the Slovenian corpus, we look at co-authors with Slovenian names who are affiliated abroad. Top countries:
1. Great Britain
2. USA
3. Germany
Source: WoS; Data Mining: INSERM with NamSor
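The ranking described on this slide amounts to filtering co-authors by name origin, dropping those affiliated in the home country, and counting the remaining affiliation countries. The sketch below assumes a hypothetical record layout (`name_origin`, `affiliation_country`) and uses invented toy records. It is not the INSERM/NamSor pipeline, and the counts are not taken from the Web of Science corpus.

```python
from collections import Counter


def top_countries_abroad(coauthors, name_origin, home_country, n=3):
    """Rank affiliation countries of co-authors with a given name origin working abroad."""
    counts = Counter(
        a["affiliation_country"]
        for a in coauthors
        if a["name_origin"] == name_origin
        and a["affiliation_country"] != home_country
    )
    return [country for country, _ in counts.most_common(n)]


# Invented toy records, for illustration only.
TOY = (
    [{"name_origin": "Poland", "affiliation_country": "USA"}] * 3
    + [{"name_origin": "Poland", "affiliation_country": "Great Britain"}] * 2
    + [{"name_origin": "Poland", "affiliation_country": "Germany"}]
    + [{"name_origin": "Poland", "affiliation_country": "Poland"}] * 2
    + [{"name_origin": "Slovenia", "affiliation_country": "USA"}]
)

print(top_countries_abroad(TOY, "Poland", "Poland"))
```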
Tunisia
Moroccans Residing Abroad (Marocains Résidant à l'Étranger, MRE)
Distribution among the main universities in Canada
Canadian Science Policy Conference - CSPC2015
Boston geo-demographics 1/2
Boston geo-demographics 2/2
Analysing patent data
Founder Bio
Elian CARSENAT, a computer scientist trained at ENSIIE/INRIA, started his career at JP Morgan in Paris in 1997. He later worked as a consultant and managed business and IT projects in London, Paris, Moscow and Shanghai.
In 2012, Elian created NamSor, a piece of sociolinguistics software to mine 'Big Data' and better understand international flows of money, ideas and people. NamSor helps answer the perennial question all countries ask about their diasporas: who are they, where are they and what are they doing?
NamSor has been used to attract Foreign Direct Investment (FDI), to build up international collaboration within scientific communities, and to attract and facilitate diaspora investment in start-ups, among other use cases.
http://fr.linkedin.com/in/eliancarsenat/en
Thank you!
Elian CARSENAT
elian.carsenat@namsor.com
Phone : +33 6 52 77 99 07
www.namsor.com
July 2013, Embassy of Lithuania in Paris
