Neo4Dogs
Innovation
Intelligent Systems
Software
Engineering
Graph Cafe, Teknologihuset, Oslo, 27.06.2014
Totto-14
@javatotto / totto@totto.org
A Global Leader
AMERICAS
EUROPE
ASIA
Bringing our customers'
projects to life and
boosting their performance
through technology and
innovation
Revenues in 2013: more than €1,633 m
Employees in 2013: more than 20,000
Countries: more than 20
R&D and Innovation
For 30 years, Altran has had a
close relationship with
innovation.
Where creative ideas become
a reality, Altran consultants
step up to transform ideas into
innovative solutions that can
enable technological progress.
In this way, Altran has
contributed to major
technological advances in
recent decades: speed,
precision, security,
communication, practicality,
interoperability, artificial
intelligence...
AEGT: the world's most
powerful electric car
Altran was responsible for
designing and engineering the
electric transmission on this
car, capable of reaching
speeds of 300 km/h.
Solar Impulse:
the first plane to fly on
solar energy alone
Since 2003, Altran experts
have dedicated their skills
to bringing about this
formidable technical and
human achievement.
The Airport of the Future:
outlining a ‘friend-lean’
space in 2040
Altran develops revolutionary
concepts for airports
responding to long-term
changes in the industry.
Agenda
● Situation analysis
● From dog register via case management
to dog-hub
● The platform
● Performance and some metrics
Initial analysis
● From register to case management – over 20 years of legacy..
– Dog information spread across 30+ relational tables
– 2-3 weeks of work to retrieve «a dog» with some info (every time)
– «impossible» to store new types of data/information on a dog
– Data was hidden/unavailable to people -> «data rot»
– Cascading costs of change and new features
● Recognized the need for a different approach
– But how to get out of the squeeze was not obvious..
– Limited technical skills, system knowledge and functional knowledge
– No time, capacity or money to do a «full rewrite»
● We selected a bottom-up, data-first platform approach, with strong capabilities for
continuous data quality processes and strong support for semi-structured data.
From dog register to case management to dog hub
● Quick and easy access to individual dogs
● Scale - 10 to 50 integrations with other systems (hub)
● Handling individual dogs of «questionable» data quality
● Easily extendable to store more data on any individual
– Semi-structured strategy for persistence/storage
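A minimal sketch of what a semi-structured dog record could look like, assuming a small fixed core plus a free-form `extra` section. The field names, the `make_dog` and `add_data` helpers and the id `NO-0001` are illustrative assumptions, not the platform's actual schema.

```python
import json

# Hypothetical core fields; the slides do not show the real schema.
CORE_FIELDS = ("id", "name", "breed")

def make_dog(dog_id, name, breed, **extra):
    """Build a dog document: fixed core fields plus free-form extras."""
    doc = {"id": dog_id, "name": name, "breed": breed}
    doc["extra"] = dict(extra)
    return doc

def add_data(doc, key, value):
    """Attach a new kind of data without any schema migration."""
    updated = dict(doc)
    updated["extra"] = {**doc["extra"], key: value}
    return updated

dog = make_dog("NO-0001", "Example", "Dunker", colour="tricolour")
dog = add_data(dog, "health", {"hip_score": "A"})

# The whole record serialises to one JSON document that can be stored or indexed as-is.
as_json = json.dumps(dog, sort_keys=True)
```

New kinds of information land under `extra` as plain keys, so extending a record does not require touching a relational schema.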
Top level architecture
The platform we built
● Dog search & lookup
– SolrCloud with "json_full"
● DogPopulationService
– Pedigree, population structure, breed data
– Data error, data deviation, data missing -> DogFixer
● DogIDMapper (multi-source, multi-master, map different ID-schemes)
● DogCrawler
– Is it possible to find additional data to fix this individual?
● DogFixer
– Is it possible to statistically find the right answer?
– Manual process in some corner-cases / difficult cases
● DogServiceREST
– verify & merge, writeback updates
– «tailing» datasources of dog information
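A rough sketch of how two of the components above might cooperate: an ID mapper that resolves ids from different sources to one canonical id, and a fixer that picks the statistically dominant value for a field. The class `DogIdMapper`, the function `fix_field`, the majority-vote rule and all ids and dates are assumptions made for illustration; the slides do not describe the real algorithms.

```python
from collections import Counter

class DogIdMapper:
    """Maps (source, local_id) pairs onto one canonical id (hypothetical design)."""

    def __init__(self):
        self._canonical = {}

    def link(self, source, local_id, canonical_id):
        self._canonical[(source, local_id)] = canonical_id

    def resolve(self, source, local_id):
        # Returns None when the id is unknown, so callers can flag it.
        return self._canonical.get((source, local_id))


def fix_field(values, min_share=0.5):
    """Return the most common non-empty value if it holds more than min_share of the votes.

    Returns None when no value is dominant, i.e. the case needs manual review.
    """
    votes = Counter(v for v in values if v not in (None, ""))
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    return value if count / sum(votes.values()) > min_share else None


mapper = DogIdMapper()
mapper.link("registry", "12345/08", "dog-1")
mapper.link("chip", "578000000000001", "dog-1")

# Three observations of the same dog, one with day and month swapped.
records = [
    ("registry", "12345/08", {"born": "2008-03-01"}),
    ("chip", "578000000000001", {"born": "2008-03-01"}),
    ("registry", "12345/08", {"born": "2008-01-03"}),
]
born_values = [r[2]["born"] for r in records if mapper.resolve(r[0], r[1]) == "dog-1"]
fixed_born = fix_field(born_values)
```

When no value wins a clear majority, `fix_field` returns `None`, which corresponds to the manual process for difficult cases.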
Some numbers
● 2 million requests/hour
● 10 million requests/24 hours
● Breed calculations went from taking «months» to «instant»
– 200-500 joins per individual, 1000/year, 10 years = 2-8 sec
● Latency: 0.2 sec for 99.7% of requests
● DogIDMapper: 4000 dogs/sec
● DogGraph: 3000 dogs/sec
● DogFixer: 10-15 dogs/sec
● DogCrawler: 100-200 dogs/sec
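To illustrate why pedigree work suits a graph better than hundreds of relational joins, here is a small sketch that walks a parent graph held in memory. The toy pedigree, the `PARENTS` layout and both function names are assumptions for illustration; the real DogGraph implementation is not shown in the slides.

```python
from collections import deque

# Hypothetical toy pedigree: child id -> (sire id, dam id); None means unknown.
PARENTS = {
    "pup": ("sire", "dam"),
    "sire": ("gs1", "gd1"),
    "dam": ("gs1", "gd2"),
    "gs1": (None, None),
    "gd1": (None, None),
    "gd2": (None, None),
}

def ancestors(dog_id, max_generations):
    """Breadth-first walk up the pedigree, returning {ancestor: nearest generation}."""
    found = {}
    queue = deque([(dog_id, 0)])
    while queue:
        current, generation = queue.popleft()
        if generation == max_generations:
            continue
        for parent in PARENTS.get(current, (None, None)):
            if parent is None:
                continue
            if parent not in found:
                found[parent] = generation + 1
                queue.append((parent, generation + 1))
    return found

def common_ancestors(sire_id, dam_id, max_generations):
    """Ancestors shared by both parents, a typical input to inbreeding calculations."""
    return set(ancestors(sire_id, max_generations)) & set(ancestors(dam_id, max_generations))

shared = common_ancestors("sire", "dam", max_generations=3)
```

Each step up the pedigree is a dictionary lookup here, where a relational model would need another self-join per generation.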
Handle huge spikes
And survive «issues» with low latency
Try it out:
* http://dogsearch.nkk.no
* http://dogpopulation.nkk.no/
* http://dogpopulation.nkk.no/ras/?breed=Dunker
* http://dogpopulation.nkk.no/dogpopulation/concurrent/executor/status
* Code: by request :)
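A small sketch for building a breed report URL in the same shape as the Dunker example above. Only the base URL and the `breed` parameter come from the slide; the helper name `ras_url` is hypothetical, and whether the service is still reachable or accepts other breed names is an assumption.

```python
from urllib.parse import urlencode

# Base URL taken from the slide; availability of the service is not guaranteed.
BASE = "http://dogpopulation.nkk.no/ras/"

def ras_url(breed):
    """Build the breed report URL, URL-encoding the breed name."""
    return BASE + "?" + urlencode({"breed": breed})

url = ras_url("Dunker")
```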

Neo4Dogs - a data quality platform approach with SolrCloud and graphs
