Real-time Natural Language Processing for
Crowdsourced Road Traffic Alerts
C.D. Athuraliya, M.K.H. Gunasekara,
Srinath Perera, Sriskandarajah Suhothayan
http://bit.ly/1NwBXTv
Overview
● Introduction
● Background
● Solution & Methodology
● Results & Conclusion
Introduction
● The success of modern enterprises and businesses relies heavily on
how they process massive amounts of data
● “Drowning in data yet starving for knowledge”
● With the emergence of social media, the public has gained the
ability to generate massive amounts of data
● But we still struggle to extract useful information from this data
Introduction
● Road traffic has become a major issue, mainly in developing
countries
● Directly affects a country’s economy and development through
wasted resources – fuel, time
● Technology-based solutions have proven successful in a number
of cases
● This study focused on one such solution, which emerged from the
use of social media
● Twitter – Popular for dynamic content publishing
○ Users publish on different topics such as current affairs, news, politics
and personal interests via 140-character messages called tweets
Background
● Road.lk – A website that provides localized traffic alerts from a
Twitter feed
● Experiencing road traffic or have information on road traffic? Tweet
about it!
● All users who follow @road_lk receive traffic alerts in near real time
● Identified as a potential source to extract information on road traffic
in real-time
● Reliability is maintained by the high number of publishers
Background – @road_lk Feed
Background
● The potential is significant for a country like Sri Lanka – due to the
unavailability of high-tech traffic monitoring systems
● Several limitations:
○ Connectivity requirement
○ No proper alert mechanism other than the Twitter feed or the
road.lk website
● Notable limitation – Users use natural language to post traffic
updates
● A fixed format would make processing tweets more straightforward,
but it would reduce the flexibility of sharing updates
Solution & Methodology
● A prototype solution was implemented by combining NLP and CEP
tools
● Accommodates three use cases:
○ Real-time road traffic feed and geo location map
○ Traffic search within an area
○ Traffic alert subscription
● Developed an architecture for these use cases
● Multiple tools were utilized to retrieve, process and present
information
Solution & Methodology – Architecture
Solution & Methodology – Feed
● Feed Retrieval – Access Twitter via its API
● Existing feed for model training dataset generation
○ REST API, Twitter4J
● Real-time feed stream for alert generation
○ Streaming API, WSO2 Enterprise Service Bus Twitter
connector
Solution & Methodology – NLP
● @road_lk Twitter feed
○ Reliable data source to generate real-time traffic alerts
○ Constrained by natural language representation
● Transforming this data into a machine-readable representation
unlocks the full potential of this source for a better solution
● Proposed an NLP model to address this problem
● Extracted two entities from a tweet – location and traffic level
● Before extracting these two entities:
○ A tweet needed to be classified – traffic alert or not?
○ Cleaning, preprocessing
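The slides do not list the actual cleaning steps, so the following is only a minimal sketch of what tweet preprocessing might look like before classification. The rules (dropping retweet markers, URLs and mentions, keeping hashtag words) and the sample tweet are assumptions for illustration, not the authors' pipeline.

```python
import re

# Hypothetical cleaning rules; the slides do not specify the real ones.
RETWEET_RE = re.compile(r"^RT\s+", re.IGNORECASE)
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+:?")


def clean_tweet(text: str) -> str:
    """Strip Twitter-specific noise and normalise whitespace."""
    text = RETWEET_RE.sub("", text.strip())
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")  # keep the hashtag word, drop the marker
    return re.sub(r"\s+", " ", text).strip()


# Made-up example tweet.
print(clean_tweet("RT @road_lk: Heavy traffic  at #Borella junction http://t.co/abc"))
# Heavy traffic at Borella junction
```

The cleaned text would then be passed to the categorization step, so that only tweets classified as traffic alerts reach entity extraction.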
Solution & Methodology – NLP
● NLP tasks required to classify and extract:
○ Tweet categorization
○ Location extraction
○ Traffic level extraction
● First task – Document categorization task
● Latter two – Named entity recognition (NER) tasks
● Apache OpenNLP toolkit was used
● Custom tokenizer for street names and city names
● Traffic level NER task – a predefined set of words was selected for
tagging
● Had to account for spelling mistakes, informal language and
abbreviations
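OpenNLP's name finder is trained on whitespace-tokenized sentences in which entities are wrapped in `<START:type> ... <END>` tags. As a rough sketch of how a predefined word list could be used to produce such training lines, assuming a hypothetical word list and the entity type name `trafficlevel` (neither is given in the slides):

```python
# Hypothetical word list; the slides do not give the actual predefined set.
TRAFFIC_LEVEL_WORDS = {"heavy", "moderate", "light", "slow", "blocked"}


def annotate_traffic_level(tweet: str, entity_type: str = "trafficlevel") -> str:
    """Wrap predefined traffic-level words in OpenNLP name-finder training tags.

    Assumes the tweet is already cleaned and tokens are separated by spaces.
    """
    out = []
    for token in tweet.split():
        if token.lower() in TRAFFIC_LEVEL_WORDS:
            out.append(f"<START:{entity_type}> {token} <END>")
        else:
            out.append(token)
    return " ".join(out)


# Made-up example tweet.
print(annotate_traffic_level("Heavy traffic near Borella junction"))
# <START:trafficlevel> Heavy <END> traffic near Borella junction
```

Exact matching like this misses misspellings and abbreviations, which is one reason a trained NER model is preferable to a plain dictionary lookup at prediction time.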
Solution & Methodology – CEP
● Another important property of this data source – the Twitter feed
has to be processed in real time
● Our approach was complex event processing (CEP)
● CEP is a field concerned with processing data from multiple sources
in real time
● Used WSO2 Complex Event Processor as the CEP tool to analyse
and process the Twitter feed input stream
● Siddhi Query Language (SiddhiQL) is at the core of WSO2 CEP
● Designed to process event streams and identify complex event
occurrences
Solution & Methodology – Siddhi Queries
from classifiedStream#transform.nlp:getEntities(convertedText,4,true,"/_system/governance/en-location.bin")
select *
insert into templocationStream;

from classifiedStream#transform.nlp:getEntities(convertedText,1,false,"/_system/governance/en-trafficlevel.bin")
select *
insert into temptrafficlevelStream;

from S1=classifiedStream, S2=temptrafficlevelStream, S3=templocationStream
select S1.createdAt as time, S2.nameElement1 as trafficLevel,
       S3.nameElement1 as location1, S3.nameElement2 as location2,
       S3.nameElement3 as location3, S3.nameElement4 as location4
insert into locationsStream;

from uiFeedStream#window.time(120 min) as trafficFeed join SearchEventStream as request
on (trafficFeed.latitude < request.latitude + 0.018 and trafficFeed.latitude > request.latitude - 0.018 and
    trafficFeed.longitude < request.longitude + 0.027 and trafficFeed.longitude > request.longitude - 0.027)
select trafficFeed.formattedAddress, trafficFeed.latitude, trafficFeed.longitude, trafficFeed.level, trafficFeed.time
insert into searchResult;
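The constants 0.018 and 0.027 in the search query are offsets in degrees of latitude and longitude. As a rough sanity check of how large the resulting search box is, the sketch below converts them to kilometres. The figure of about 111.32 km per degree of latitude and the reference latitude of 6.93° (roughly Colombo) are assumptions that do not appear in the slides.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude


def box_half_size_km(lat_deg: float, dlat: float = 0.018, dlon: float = 0.027):
    """Convert the query's degree offsets into approximate distances in km."""
    north_south = dlat * KM_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = dlon * KM_PER_DEGREE * math.cos(math.radians(lat_deg))
    return north_south, east_west


ns, ew = box_half_size_km(6.93)  # assumed latitude, roughly Colombo
print(f"{ns:.1f} km north-south, {ew:.1f} km east-west")
# 2.0 km north-south, 3.0 km east-west
```

Under these assumptions a search covers roughly 2 km north and south and 3 km east and west of the requested point.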
Solution & Methodology – CEP
● Siddhi queries define how to process and combine existing event
streams to create new event streams
● SiddhiQL was extended with custom extensions for:
○ Tweet categorization
○ Named entity recognition
○ Geocoding
● The geocoding extension converts the extracted locations into
geographic coordinates
● The search functionality used a time-based Siddhi window
○ To retrieve traffic in a nearby geographic area within a predefined
time period
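To make the idea of a time-based window combined with an area search concrete, here is a minimal plain-Python sketch. It is not how Siddhi is implemented. The class and field names are hypothetical, events are assumed to arrive in time order, and the sample coordinates are illustrative only.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TrafficEvent:
    formatted_address: str
    latitude: float
    longitude: float
    level: str
    time: float  # seconds, any monotonically increasing clock


class TrafficWindow:
    """Keeps events from the last `window_s` seconds (120 minutes by default)."""

    def __init__(self, window_s: float = 120 * 60):
        self.window_s = window_s
        self.events = deque()

    def add(self, event: TrafficEvent) -> None:
        self.events.append(event)  # assumes events arrive in time order

    def _expire(self, now: float) -> None:
        # Drop events that have fallen out of the time window.
        while self.events and now - self.events[0].time > self.window_s:
            self.events.popleft()

    def search(self, lat, lon, now, dlat=0.018, dlon=0.027):
        """Return windowed events inside the box around (lat, lon)."""
        self._expire(now)
        return [e for e in self.events
                if abs(e.latitude - lat) < dlat and abs(e.longitude - lon) < dlon]


window = TrafficWindow()
window.add(TrafficEvent("Borella", 6.9147, 79.8778, "heavy", time=0))
window.add(TrafficEvent("Kandy", 7.2906, 80.6337, "moderate", time=3600))
print([e.formatted_address for e in window.search(6.92, 79.88, now=4000)])
# ['Borella']
```

A search request only sees alerts that are both recent and nearby, so stale traffic reports drop out of the results on their own.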
Results & Conclusion
● Implemented a web-based interface to demonstrate the
functionality
● Users can interact with this interface to try out the use cases
● NLP accuracy was measured through the OpenNLP evaluation APIs
● A solution to extract useful information from a crowdsourced social
networking service, using a combined NLP/CEP approach
Results & Conclusion – Web UI
Results & Conclusion
● Results demonstrate the potential of such a model to tackle a
real-time natural language processing task
● This model can be extended to tackle any real-time unstructured
data stream
● Transforming human-readable data into a machine-readable format
enables deep processing of data to generate useful information and
insights
○ Trend analysis
○ Pattern detection and prediction
Thank you.
