NLP and Graph Databases in Lumify
Charlie Greenbacker & Joe Kerner
Agenda
Introductions
Lumify Overview
Natural Language Processing
Graph Databases
photo: Columbia Pictures
About me: @greenbacker
Theories: popular tripe
Methods: sloppy
Conclusions: highly questionable
Best reason for not finishing PhD
@ExploreAltamira
Lumify is an open source big data analysis and visualization platform built by Altamira engineers.
Key Lumify Concepts
•  Ontology: structure for organizing information (i.e., your data model)
•  Entities: any “thing” you want to represent (e.g., person, place, event)
•  Relationships: a link between two entities (e.g., leader-of, works-for, sibling-of)
•  Properties: data about an entity (e.g., first name, last name, date of birth)
•  Graph: collection of entities and the relationships between them
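The five concepts above can be sketched as a small in-memory data model. This is only an illustration of how the terms relate to each other, not Lumify's actual API: the class names, the dictionary-based ontology, and the sample entities are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Any "thing" in the graph; `concept` must be defined in the ontology."""
    id: str
    concept: str
    properties: dict = field(default_factory=dict)  # data about the entity


@dataclass(frozen=True)
class Relationship:
    """A labeled link between two entities, referenced by id."""
    source: str
    label: str
    target: str


class Graph:
    """Collection of entities and the relationships between them."""

    def __init__(self, ontology):
        # Hypothetical ontology shape: {"concepts": set, "relationships": set}
        self.ontology = ontology
        self.entities = {}
        self.relationships = []

    def add_entity(self, entity):
        if entity.concept not in self.ontology["concepts"]:
            raise ValueError(f"unknown concept: {entity.concept!r}")
        self.entities[entity.id] = entity

    def relate(self, source, label, target):
        if label not in self.ontology["relationships"]:
            raise ValueError(f"unknown relationship: {label!r}")
        if source not in self.entities or target not in self.entities:
            raise KeyError("both endpoints must already be in the graph")
        self.relationships.append(Relationship(source, label, target))

    def neighbors(self, entity_id):
        """Return (label, other entity id) pairs for every link touching entity_id."""
        found = []
        for rel in self.relationships:
            if rel.source == entity_id:
                found.append((rel.label, rel.target))
            elif rel.target == entity_id:
                found.append((rel.label, rel.source))
        return found


# Illustrative data only.
ontology = {"concepts": {"person", "organization"}, "relationships": {"works-for"}}
graph = Graph(ontology)
graph.add_entity(Entity("p1", "person", {"first name": "Alice", "last name": "Example"}))
graph.add_entity(Entity("o1", "organization", {"name": "Example Corp"}))
graph.relate("p1", "works-for", "o1")
```

Keeping the ontology separate from the graph means the data model can be checked every time an entity or relationship is added, which is the role the slide assigns to the ontology.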
Live Demo
Who can Lumify help?
Intelligence Analyst
Lumify helps analysts fuse structured and unstructured data from myriad sources into actionable intelligence.
Police Investigator
Law enforcement personnel can use Lumify to explore criminal networks, uncover hidden connections, and develop leads.
Financial Analyst
Lumify analyzes financial data and transaction records to help detect fraud and identify possible insider threats.
photo: Ken Teegardin (https://flic.kr/p/9rn9Yh)
Research Staff
Scientists, law firms, news organizations, and others can track their research in Lumify to unearth latent knowledge and discover critical new insights.
photo: UK National Archives (http://bit.ly/1n9dhR8)
Why Lumify?
Free and Open Source
•  Distributed under the permissive Apache 2.0 license
•  No restrictions on modifications
•  No licensing or usage constraints
Built on Scalable Open Source Tech
Hadoop CDH 4, Accumulo, Elasticsearch, tesseract, CLAVIN, CMU Sphinx, OpenNLP, OpenCV, ffmpeg, Apache Storm, Secure Graph, custom code
Highly Secure
•  Separate security restrictions at the entity, property, and relationship level
•  Implemented in and enforced by Accumulo cell-level security
Example entities from the slide:
Joaquin Guzman Loera
DOB: 1957-04-04
POB: Badiraguato
Nationality: Mexican
Zarka de Mexico
Founded: 2010-01-11
Location: Mexico City
Employees: 121
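To make the idea of separate restrictions at the entity, property, and relationship level concrete, here is a minimal sketch. It is a deliberate simplification: Accumulo visibility expressions allow boolean combinations of labels, whereas this example treats each item's visibility as a plain set of labels that the reader must all hold. The function names, the label names ("analyst", "restricted"), and the sample data are hypothetical.

```python
def is_visible(required_labels, authorizations):
    """An item is visible only if the reader holds every label it requires."""
    return set(required_labels) <= set(authorizations)


def view_entity(entity, authorizations):
    """Return the entity as this reader may see it, or None if it is hidden."""
    if not is_visible(entity["visibility"], authorizations):
        return None
    # Each property carries its own labels, so it is filtered independently.
    properties = {
        name: value
        for name, (value, labels) in entity["properties"].items()
        if is_visible(labels, authorizations)
    }
    return {"name": entity["name"], "properties": properties}


def view_relationships(relationships, entities, authorizations):
    """Keep a relationship only if it and both of its endpoints are visible."""
    visible = []
    for rel in relationships:
        endpoints_ok = all(
            view_entity(entities[key], authorizations) is not None
            for key in (rel["source"], rel["target"])
        )
        if endpoints_ok and is_visible(rel["visibility"], authorizations):
            visible.append((rel["source"], rel["label"], rel["target"]))
    return visible


# Illustrative data only; labels and entities are made up for the example.
entities = {
    "person_a": {
        "name": "Person A",
        "visibility": {"analyst"},
        "properties": {
            "nationality": ("Exampleland", {"analyst"}),
            "date of birth": ("1970-01-01", {"analyst", "restricted"}),
        },
    },
    "company_b": {
        "name": "Company B",
        "visibility": {"analyst"},
        "properties": {"employees": (121, {"analyst"})},
    },
}
relationships = [
    {"source": "person_a", "label": "works-for", "target": "company_b",
     "visibility": {"restricted"}},
]
```

With only the "analyst" label a reader sees both entities but neither the restricted date of birth nor the link between them; adding "restricted" reveals both.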
Supported
•  Full-time development staff
•  Custom development and customization services
•  Commercial support offerings
AWS Compatible
•  Day-to-day development done on Amazon infrastructure
•  Primarily use EC2, VPC, S3, SES, CloudWatch
•  Altamira is an AWS consulting partner
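Natural Language Processing in Lumify
Text Extraction
[Diagram: video, text docs, structured data, images, and audio all feed a text extractor. Images pass through OCR (tesseract), audio through CMU Sphinx, and video through both CMU Sphinx and OCR (tesseract).]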
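The extraction step can be pictured as a dispatch on media type. The sketch below is an assumption about how such routing might look, not Lumify's implementation: the step names, the `plan_extraction` and `extract_text` functions, and the stub tools are hypothetical, and the choice to run both speech recognition and OCR on video is inferred from the diagram. No real tesseract or CMU Sphinx call is made; callers supply the tools as plain functions.

```python
# Which extraction steps apply to each kind of input (assumed mapping).
EXTRACTION_STEPS = {
    "text": ["read_text"],
    "structured": ["read_records"],
    "image": ["ocr"],                       # e.g., tesseract
    "audio": ["speech_to_text"],            # e.g., CMU Sphinx
    "video": ["speech_to_text", "ocr"],     # audio track plus on-screen text
}


def plan_extraction(media_type):
    """Return the ordered list of step names for a media type."""
    try:
        return list(EXTRACTION_STEPS[media_type])
    except KeyError:
        raise ValueError(f"unsupported media type: {media_type!r}") from None


def extract_text(media_type, payload, tools):
    """Run each planned step with the supplied tool and join the text results."""
    return "\n".join(tools[step](payload) for step in plan_extraction(media_type))


# Stub tools standing in for the real engines, so the example runs anywhere.
stub_tools = {
    "read_text": lambda payload: payload.decode("utf-8"),
    "read_records": lambda payload: payload.decode("utf-8"),
    "ocr": lambda payload: "words on screen",
    "speech_to_text": lambda payload: "spoken words",
}
```

Passing the tools in as a dictionary keeps the routing logic testable without installing any OCR or speech engine.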
Text Enrichment
•  Apache OpenNLP
    •  Named Entity Recognition
    •  Extracts names of entities from unstructured text
    •  Persons, Orgs, & Locations
    •  Highlighted in preview text
    •  User must confirm/resolve
•  CLAVIN
    •  Geospatial Entity Resolution
    •  Resolves extracted location names to gazetteer records
    •  Solves “Springfield problem”
    •  Disambiguates place names
    •  Turns text docs into maps!
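The “Springfield problem” is that one place name can match many gazetteer records. The toy resolver below shows the general shape of the task under loud assumptions: it is not CLAVIN's algorithm, the three-record gazetteer is hand-made with approximate coordinates, and the only disambiguation signal used is whether a candidate's region is also mentioned in the text. When the text gives no such clue, the function returns nothing, which mirrors the slide's point that a user must confirm or resolve the result.

```python
# Tiny hand-made gazetteer; coordinates are approximate and for illustration only.
GAZETTEER = [
    {"name": "Springfield", "region": "Illinois", "lat": 39.8, "lon": -89.6},
    {"name": "Springfield", "region": "Massachusetts", "lat": 42.1, "lon": -72.6},
    {"name": "Springfield", "region": "Missouri", "lat": 37.2, "lon": -93.3},
]


def resolve(place_name, text, gazetteer=GAZETTEER):
    """Pick the gazetteer record for place_name using context from the text.

    Returns the single candidate whose region is mentioned in the text,
    or None when there is no candidate or the choice stays ambiguous.
    """
    candidates = [
        record for record in gazetteer
        if record["name"].lower() == place_name.lower()
    ]
    lowered = text.lower()
    supported = [r for r in candidates if r["region"].lower() in lowered]
    if len(supported) == 1:
        return supported[0]
    return None  # leave it for a human to resolve


clear_text = "The delegation drove from Chicago to Springfield, Illinois."
vague_text = "She moved to Springfield last year."
```

A resolved record carries coordinates, so every resolved mention can be placed on a map.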
Enriched Text Documents
Machine-powered entity extraction and resolution, combined with human QA and supplementation, supports rich semantic analysis of raw text.
Example document: Drug Lord “El Chapo” Captured in Mexico
Published date: 2014/02/22
Source: Wikipedia
Although Guzman had long hidden successfully in remote areas of the
Sierra Madre mountains, the arrested members of his security team told
the military he had begun venturing out to Culiacan and the beach town of
Mazatlan. A week prior to his capture, Guzman and Zambada were
reported to have attended a family reunion in Sinaloa. The Mexican military
followed the bodyguards’ tips to Guzman’s ex-wife’s house, but they had
trouble ramming the steel-reinforced front door, which allowed Guzman to
escape through a system of secret tunnels that connected six houses,
eventually moving south to Mazatlan. He planned to stay a few days in
Mazatlan to see his twin baby daughters before retreating to the
mountains.

On 22 February 2014, at around 6:40 a.m., Mexican authorities arrested
Guzman at a hotel in a beach front area in Mazatlan, Sinaloa, following an
operation by the Mexican Navy, with joint intelligence from the DEA and
Benefits to Users
•  Increases Discoverability: quickly find relevant data without reading
•  Helps Deal with Information Overload: machines process text faster than humans
•  Uncovers Hidden Connections: enables object-based analysis & investigations
Future NLP Integration
•  Support other NER tools: e.g., Stanford NER, SUTime, MITIE
•  Event/Relationship Extraction: e.g., OpenIE (formerly ReVerb)
•  Coreference Resolution: augmenting/extending GATE/ANNIE
•  Additional Text Analytics: e.g., frequency analysis, topic modeling, sentiment analysis
•  Multilingual Support: use non-English language models for NER, etc.
Graph Databases in Lumify
view part 2 of the presentation here:
github.com/altamiracorp/secure-graph-presentation
Questions?
more info: lumify.io
