Merging Interstate Transportation Networks for Routing Hazardous Materials
Andrew Emerson (Furman University, Greenville, SC 29613)
Ingrid Busch, PhD (Oak Ridge National Laboratory, Oak Ridge, TN 37830)
I. ABSTRACT
The Oak Ridge National Laboratory’s Center for Transportation Analysis (CTA) uses intermodal
transportation network models to analyze and organize methods for routing materials and
people. For example, one of the CTA’s ongoing projects is to develop plans and maps for the safe
transportation of hazardous materials. The CTA’s current transportation networks contain rich,
detailed information, such as the designation of roads for hazardous materials. However, the
intermodal network that the CTA currently uses is outdated and inaccurate. To improve the
network, we set out to merge the current CTA highway network with a newer, open-source
network called OpenStreetMap (OSM). OSM, like other crowd-sourced networks, contains much
more precise and accurate geospatial data and is updated daily. However, it lacks many of the
attributes that the CTA relies on in its projects.
First, we compared the OSM links and nodes to the links and nodes currently in the CTA database.
The nomenclature OSM uses to classify road types is similar, but not identical, to the
classifications that the CTA uses. Thus, finding the appropriate highways to update was the first
task. Second, we grouped the OSM data into individual datasets that we refer to as
“SuperLinks.” Each SuperLink contains the links and nodes for one section of an interstate.
By using these larger datasets, we could more easily find the corresponding CTA link containing
the attributes to be captured. We then identified specific starting nodes within the CTA network
and employed geographical techniques to merge the two networks. In doing so, we built an
improved CTA network and developed a reusable merging method, leading to a much more
accurate and up-to-date transportation model.
II. INTRODUCTION
The CTA uses transportation data to complete various projects ranging from fuel
efficiency to the routing of hazardous materials (HM-164). These projects produce research that
goes towards the “efficient, safe and free movement of people and goods in our Nation's
transportation systems.” [1] One of the major transportation networks that the CTA uses contains
multimodal routes: highway, railway, waterway, and airway. In projects relating to HM-164
routing, researchers rely on road designations contained in the CTA database. However, the data
in the CTA network is inaccurate and outdated. For instance, several of the interstates do not
quite follow mapped satellite images. Additionally, since the CTA network was created, many
roads and interstates have been extended or rerouted. An improved network would allow more
precise and efficient routing, and a method for updating the network would provide a
straightforward, reusable process for obtaining new data. To improve the network and provide
up-to-date data, our project aimed to merge the current CTA network with a newer, open-source
network called OpenStreetMap (OSM). OSM is “built by a community of mappers that contribute
and maintain data about roads, trails, cafés, railway stations, and much more, all over the
world.” [2] Our project was primarily concerned with interstates. To visualize results, we mapped
the link/node data on Google Earth.
In the process of investigating methods to merge the networks, we considered a couple of
different strategies. One immediate issue was that OSM classifies its data differently than the
current CTA network does. To accommodate this, we had to manipulate the downloaded OSM
data before we could begin the merge itself. In addition to data manipulation, we dealt with
other concepts, such as conflation. This paper discusses conflation, data manipulation, and
network merging, along with the methods we decided to use to merge the OSM and CTA
transportation networks.
III. METHODS
A. Conflation
When investigating methods to combine the two networks, one existing concept that we
found is called “conflation.” Conflation is broadly defined as “a set of procedures that aligns the
features of two geographic data layers and then transfers the attributes of one to the other.” [3]
This describes the general idea of our project, but no specific set of procedures immediately
fit our needs. Many current conflation software tools are complex and would have been more
time-consuming to use in our case. While we did not use conflation software in our project,
we followed the same general principles that these software packages follow, and we developed
a conflation technique specific to our networks.
B. Converting OSM Interstate Names
As stated previously, OSM organizes its data somewhat differently than the CTA does. The
first step that we took when approaching this problem was to scan through the OSM data to
determine what modifications we would need to make in order to compare it to the CTA data.
One area where the data did not compare well was the interstate name. In the CTA network,
all interstates are named in the same format: “I” followed by the identifying number. We wanted
to keep this uniform naming convention. Because OSM is open-source, the users who enter
information into the network may not name roads consistently. For instance, an interstate could
be named “I-10,” “I 10,” or “I10,” with all three referring to the same road. We therefore had
to find a way to convert all of the interstate names in OSM to the format that the CTA uses,
which in this example is “I10.”
Using Java and the Eclipse integrated development environment, we wrote a program
that accesses the database containing all of the OSM interstates. To connect our programs to
the CTA and OSM databases, we used Java Database Connectivity (JDBC), the standard Java API
for connecting to databases and executing queries from Java programs. The first program
stored all of the interstates and then parsed each individual link name. To parse correctly,
we treated as a delimiter any character that was not an “I” or a digit of the identifying
number. When done parsing, the program saved the new name and wrote it back to the database
as that link’s name. While this was an efficient solution for most links, a few links in the OSM
network had unusual names. For example, some interstate segments technically have two
names, such as I40/I75. In the CTA database, only one name is associated with a given
interstate, so we had to go into the database manually and fix these names individually.
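The name conversion described above can be sketched in a few lines of Java. This is a minimal illustration rather than the program used in the project: the class and method names are hypothetical, the JDBC read and write steps are omitted, and the rule for flagging names for manual review is an assumption based on the I40/I75 example.

```java
/** Hypothetical helper illustrating the name normalization; not the project's actual code. */
final class InterstateNames {

    /** Drops every character that is not an uppercase "I" or a digit, e.g. "I-10" becomes "I10". */
    static String normalize(String osmName) {
        return osmName.replaceAll("[^I0-9]", "");
    }

    /**
     * Assumed review rule: anything that does not reduce to a single "I" followed by
     * digits (for example the double name "I40/I75") is left for manual correction.
     */
    static boolean needsManualReview(String osmName) {
        return !normalize(osmName).matches("I[0-9]+");
    }

    public static void main(String[] args) {
        for (String name : new String[] {"I-10", "I 10", "I10", "I40/I75"}) {
            System.out.println(name + " -> " + normalize(name)
                    + (needsManualReview(name) ? " (manual review)" : ""));
        }
    }
}
```

Stripping characters with a single regular expression is equivalent to treating every other character as a delimiter and concatenating the remaining tokens.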
C. Creating “SuperLink” Datasets
The second step was to organize the OSM data in a way that would make it easier to compare
to the CTA data. Because the networks themselves are very large, we chose to compare larger
units than individual links. To do this, we created what we refer to as “SuperLinks,”
which are large segments of an individual interstate. SuperLinks start and end based on a few
guidelines: they start where a starting node is not attached to another interstate, where a starting
node is attached to an interstate with a different name, where a starting node is attached to a
merge between two links, and where an interstate splits into two or more links (each link is the
start of a SuperLink). The endpoint of each SuperLink is at the end of an interstate,
where two or more links merge together, or where two or more links split apart. With these
guidelines, several SuperLinks make up a single interstate, and the SuperLinks
more closely resemble the links in the current CTA datasets.
Again using Java within Eclipse, we wrote a program that generates SuperLinks for
each existing interstate. The program loops through each interstate and stores all of its links.
The link attributes that we stored were the wayID (link identification), partID (a link can have
multiple parts), aNode (start node), bNode (end node), distance (in miles), and name. Next, the
program finds every start point on the interstate according to the guidelines in the previous
paragraph. The program creates a SuperLink for each of these start points and inserts it into the
SuperLink table in the OSM database. Then, for each SuperLink, the program begins at the
starting link and finds each succeeding link in order, updating the stored endpoint and
distance as it goes. Once the program finds the end of one SuperLink, it proceeds to the next
one. When the program finishes one interstate, it moves to the next until all interstates are
composed of SuperLinks. The result is that every interstate link in the OSM network is
contained in a SuperLink, making the OSM network easily comparable to the CTA network.
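The chaining step can be illustrated with a simplified, in-memory sketch (Java 16 or later for records). It is not the project's program: the class and record names are hypothetical, database access is left out, and the start and end guidelines are approximated by a single assumed rule, namely that a node is a break point unless exactly one link of the same interstate enters it and exactly one leaves it. The rule about neighboring interstates with different names is not modeled, and a closed loop with no break point would be skipped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory sketch of SuperLink construction for one interstate. */
final class SuperLinkBuilder {

    /** One directed OSM link; the fields mirror the attributes listed in the text. */
    record Link(long wayId, int partId, long aNode, long bNode, double miles) {}

    /** An ordered chain of links and its accumulated length in miles. */
    record SuperLink(List<Link> links, double miles) {}

    static List<SuperLink> build(List<Link> interstateLinks) {
        Map<Long, List<Link>> outgoing = new HashMap<>();
        Map<Long, Integer> inDegree = new HashMap<>();
        for (Link l : interstateLinks) {
            outgoing.computeIfAbsent(l.aNode(), k -> new ArrayList<>()).add(l);
            inDegree.merge(l.bNode(), 1, Integer::sum);
        }
        List<SuperLink> result = new ArrayList<>();
        for (Link first : interstateLinks) {
            // A SuperLink starts only at a break point (interstate start, merge, or split).
            if (!isBreak(first.aNode(), outgoing, inDegree)) continue;
            List<Link> chain = new ArrayList<>();
            double miles = 0.0;
            Link current = first;
            while (true) {
                chain.add(current);
                miles += current.miles();
                // Stop at the next break point (interstate end, merge, or split).
                if (isBreak(current.bNode(), outgoing, inDegree)) break;
                current = outgoing.get(current.bNode()).get(0); // exactly one successor here
            }
            result.add(new SuperLink(chain, miles));
        }
        return result;
    }

    /** Assumed rule: a node is ordinary only if it has exactly one incoming and one outgoing link. */
    static boolean isBreak(long node, Map<Long, List<Link>> outgoing, Map<Long, Integer> inDegree) {
        int in = inDegree.getOrDefault(node, 0);
        int out = outgoing.getOrDefault(node, List.of()).size();
        return in != 1 || out != 1;
    }
}
```

For links 1→2, 2→3, 3→4, and 3→5, the sketch yields three SuperLinks: one covering 1→2→3, and one for each branch leaving the split at node 3.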
D. Generating SuperLink Geometry Unions
The biggest problem that we faced at this point was that none of the SuperLinks had a
geometry of its own, meaning they could not be mapped. The SuperLinks were
just collections of links and not geometric entities themselves. Each link in the OSM network
has an attribute labeled “coordinates,” which is a Line String [4] that represents the
geometry of the link. In the CTA network, the geometry is stored in the “geometryItem”
attribute column. To give each SuperLink its own geometry, we decided to use SQL
Server’s built-in STUnion method, which returns the union of two geometry objects.
However, this method only combines two objects (in our case, links) at a time.
To account for this, we wrote another Java program to process each SuperLink and combine
its links in the correct order. The program selects each SuperLink and stores each of its
links’ information: wayID, partID, aNode, and bNode. Then, the program finds the first link
within the SuperLink and the link that follows it. Next, the program executes STUnion
within the OSM database on the two links, combining their “coordinates.” The program repeats
this for each successive link within every SuperLink until no more links remain for a given
interstate. This gave us a geometrical representation of each SuperLink as an individual entity
that could be mapped on Google Earth and more easily compared to the existing CTA links.
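Because the union operates on two geometries at a time, building a SuperLink geometry amounts to folding a binary operation over the ordered links. The sketch below shows that fold in plain Java. It is only an illustration: the names are hypothetical, and the pairwise step is a simple concatenation standing in for the database's STUnion call, on the assumption that consecutive links share an endpoint.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of folding a two-argument union over the ordered links of a SuperLink. */
final class GeometryFold {

    /** A single vertex of a Line String. */
    record Point(double lon, double lat) {}

    /**
     * Stand-in for the pairwise union: appends b to a, dropping b's first vertex
     * when it repeats a's last vertex. This is a simplification, not STUnion itself.
     */
    static List<Point> unionPair(List<Point> a, List<Point> b) {
        List<Point> out = new ArrayList<>(a);
        boolean shared = !a.isEmpty() && !b.isEmpty() && a.get(a.size() - 1).equals(b.get(0));
        out.addAll(b.subList(shared ? 1 : 0, b.size()));
        return out;
    }

    /** Applies the pairwise step link by link, in SuperLink order. */
    static List<Point> unionAll(List<List<Point>> orderedLinks) {
        List<Point> accumulated = new ArrayList<>();
        for (List<Point> link : orderedLinks) {
            accumulated = unionPair(accumulated, link);
        }
        return accumulated;
    }
}
```

Keeping the accumulated geometry as the left operand means each link is visited once, regardless of how many links the SuperLink contains.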
E. Setting HM-164 Designations
With the SuperLinks in the right format to find the corresponding CTA links, our next
goal was to find the right places to begin comparing the links. Since our project handles
hazardous material (HM-164) designated routes, we decided that it would be best to find places in
the CTA network where HM-164 routes intersect non-HM-164 routes. We then planned
to take the non-HM-164 CTA route and find the corresponding OSM SuperLink. Once we had
the corresponding SuperLink, we set its HM-164 attribute to “false” in the database. After
designating all of the corresponding OSM routes as non-HM-164, we set the remaining OSM
SuperLinks as HM-164.
To handle each of these OSM routes, we used a Java program to loop through the
intersection points. To produce the intersection points in the CTA network, we ran a query in the
database to find where an HM-164 link meets a non-HM-164 link. After finding these points, the
program stored the latitude and longitude of each point. Next, we created a stored procedure to
find the OSM SuperLinks closest to the CTA node within a specific radius. For each of the
returned SuperLinks, the program found the distance from its aNode and its bNode
to the CTA node. If neither the aNode nor the bNode of the SuperLink fell within the
specified radius, the SuperLink needed to be split before it could be assigned the proper HM-164
designation. The program split such a SuperLink into two, starting the new SuperLink at the
individual link closest to the CTA intersection point.
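The split test can be sketched as follows. The text does not say how distances were computed or what radius was used, so the haversine great-circle formula, the mean Earth radius of 3958.8 miles, and all names here are assumptions for illustration. A SuperLink is represented by its ordered node coordinates, and the new SuperLink is assumed to begin at the interior node nearest the CTA point.

```java
/** Hypothetical sketch of the end-node radius check and split-point selection. */
final class SuperLinkSplitter {

    static final double EARTH_RADIUS_MILES = 3958.8; // assumed mean Earth radius

    /** Great-circle distance in miles between two latitude/longitude points given in degrees. */
    static double haversineMiles(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a));
    }

    private static double dist(double[] p, double[] q) {
        return haversineMiles(p[0], p[1], q[0], q[1]);
    }

    /**
     * nodes[i] = {lat, lon}; link i runs from nodes[i] to nodes[i + 1].
     * A split is needed when neither the aNode nor the bNode is within the radius.
     */
    static boolean needsSplit(double[][] nodes, double[] ctaNode, double radiusMiles) {
        return dist(nodes[0], ctaNode) > radiusMiles
                && dist(nodes[nodes.length - 1], ctaNode) > radiusMiles;
    }

    /** Index of the interior node nearest the CTA point; requires at least three nodes. */
    static int splitIndex(double[][] nodes, double[] ctaNode) {
        int best = 1;
        for (int i = 2; i < nodes.length - 1; i++) {
            if (dist(nodes[i], ctaNode) < dist(nodes[best], ctaNode)) best = i;
        }
        return best;
    }
}
```

After the split, links 0 through splitIndex - 1 stay in the original SuperLink and the remaining links form the new one.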
[4] A Line String is a sequence of points representing a linear object.
Once each appropriate SuperLink was split, we could assign HM-164 designations. To
do this, we first ran a Java program that takes the non-HM-164 CTA link at each intersection
and finds its geospatial bearing. The bearing allowed us to find the corresponding
OSM SuperLinks and assign them attributes. Once we had the CTA link’s bearing, we
found the bearing of each of the closest OSM SuperLinks, based on its direction. Because the
SuperLinks and CTA links do not match up exactly, we allowed a tolerance of 15 degrees
for the difference in bearings. When we found the appropriate SuperLinks, we gave them all
non-HM-164 designations (“false” in the database), matching the CTA links. Then,
every remaining SuperLink received an HM-164 designation (“true” in the database).
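The bearing comparison can be sketched as follows. The text does not give the formula used, so the standard initial great-circle bearing is assumed here, and the class and method names are hypothetical. The difference is taken around the compass so that bearings of 355 and 5 degrees count as 10 degrees apart.

```java
/** Hypothetical sketch of matching a CTA link to a SuperLink by bearing. */
final class BearingMatch {

    static final double TOLERANCE_DEGREES = 15.0; // tolerance stated in the text

    /** Initial great-circle bearing from point 1 to point 2, in degrees within [0, 360). */
    static double bearingDegrees(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dLon = Math.toRadians(lon2 - lon1);
        double y = Math.sin(dLon) * Math.cos(phi2);
        double x = Math.cos(phi1) * Math.sin(phi2)
                - Math.sin(phi1) * Math.cos(phi2) * Math.cos(dLon);
        return (Math.toDegrees(Math.atan2(y, x)) + 360.0) % 360.0;
    }

    /** Smallest angle between two bearings, in degrees within [0, 180]. */
    static double bearingDifference(double a, double b) {
        double d = Math.abs(a - b) % 360.0;
        return d > 180.0 ? 360.0 - d : d;
    }

    /** True when the two bearings agree within the 15-degree tolerance. */
    static boolean matches(double ctaBearing, double superLinkBearing) {
        return bearingDifference(ctaBearing, superLinkBearing) <= TOLERANCE_DEGREES;
    }
}
```

Comparing wrapped differences rather than raw values avoids rejecting near-north links whose bearings fall on opposite sides of 0 degrees.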
IV. CONCLUSION
Although merging the two networks took many steps, the process led to an
accurate, updated version of the CTA network. We manipulated the OSM data so that we could
compare it to the corresponding CTA data, starting with the OSM nomenclature. Then, we created
larger datasets of links and nodes called “SuperLinks,” for which we generated geometries.
After finding HM-164/non-HM-164 intersections in the CTA database, we split the SuperLinks
appropriately. At each split, we determined which SuperLinks corresponded to the CTA
non-HM-164 link by comparing bearings. Last, once we assigned the correct HM-164 route
designations, the merge was complete. Both directions of travel were accounted for, and we
achieved the result that commercial conflation software is intended to provide. However,
even with the more accurate OSM geospatial data and preserved HM-164 designations, there is
room for more research. One possible path would be to reduce the time complexity of the steps
we took: while the steps worked, they could be redesigned to require less time. Another path
would be to test the method on networks other than OSM and on attributes other than HM-164.
V. REFERENCES
[1] "Welcome Page." Center for Transportation Analysis. UT Battelle. Web. <http://cta.ornl.gov/cta/>.
[2] "About." OpenStreetMap. Web. <http://www.openstreetmap.org/about>.
[3] "GIS Dictionary." Esri: Support. Web. <http://support.esri.com/en/knowledgebase/GISDictionary/term/conflation>.

More Related Content

Viewers also liked

Tungum Presentation Annotated version
Tungum Presentation Annotated versionTungum Presentation Annotated version
Tungum Presentation Annotated versionSean Hammond
 
Media Agenda Setting and the rise of Islamophobia
Media Agenda Setting and the rise of IslamophobiaMedia Agenda Setting and the rise of Islamophobia
Media Agenda Setting and the rise of IslamophobiaAda Siddique
 
New Perspective on Marketing in the Service Economy ( Service Marketing)
New Perspective on Marketing in the Service Economy ( Service Marketing) New Perspective on Marketing in the Service Economy ( Service Marketing)
New Perspective on Marketing in the Service Economy ( Service Marketing) Muhammad Ali Khan
 
ныгыманов адлет+тоо рассвет +клиенты
ныгыманов адлет+тоо рассвет +клиентыныгыманов адлет+тоо рассвет +клиенты
ныгыманов адлет+тоо рассвет +клиентыАдлет Ныгыманов
 
Historia de la computación
Historia de la computaciónHistoria de la computación
Historia de la computaciónangela mendoza
 
ныгыманов адлет+каз даму +клиенты
ныгыманов адлет+каз даму +клиентыныгыманов адлет+каз даму +клиенты
ныгыманов адлет+каз даму +клиентыАдлет Ныгыманов
 
Die Sonne – alte Bekannte oder grosse Unbekannte?
Die Sonne – alte Bekannte oder grosse Unbekannte? Die Sonne – alte Bekannte oder grosse Unbekannte?
Die Sonne – alte Bekannte oder grosse Unbekannte? FLARECAST
 
Asturias. maría pilar paños alarcón y maría moyano collado
Asturias.  maría pilar paños alarcón y maría moyano colladoAsturias.  maría pilar paños alarcón y maría moyano collado
Asturias. maría pilar paños alarcón y maría moyano colladoMaría Pilar Paños Alarcón
 
La couronne solaire : du calme à la tempête
La couronne solaire : du calme à la tempêteLa couronne solaire : du calme à la tempête
La couronne solaire : du calme à la tempêteFLARECAST
 
Basic accounting terms
Basic accounting termsBasic accounting terms
Basic accounting termscommerce Pk
 

Viewers also liked (18)

Tungum Presentation Annotated version
Tungum Presentation Annotated versionTungum Presentation Annotated version
Tungum Presentation Annotated version
 
SATISH RESUME DUBAI
SATISH RESUME DUBAISATISH RESUME DUBAI
SATISH RESUME DUBAI
 
Media Agenda Setting and the rise of Islamophobia
Media Agenda Setting and the rise of IslamophobiaMedia Agenda Setting and the rise of Islamophobia
Media Agenda Setting and the rise of Islamophobia
 
New Perspective on Marketing in the Service Economy ( Service Marketing)
New Perspective on Marketing in the Service Economy ( Service Marketing) New Perspective on Marketing in the Service Economy ( Service Marketing)
New Perspective on Marketing in the Service Economy ( Service Marketing)
 
Services Marketing
Services MarketingServices Marketing
Services Marketing
 
ныгыманов адлет+тоо рассвет +клиенты
ныгыманов адлет+тоо рассвет +клиентыныгыманов адлет+тоо рассвет +клиенты
ныгыманов адлет+тоо рассвет +клиенты
 
Historia de la computación
Historia de la computaciónHistoria de la computación
Historia de la computación
 
Poema de mio cid
Poema de mio cidPoema de mio cid
Poema de mio cid
 
ныгыманов адлет+каз даму +клиенты
ныгыманов адлет+каз даму +клиентыныгыманов адлет+каз даму +клиенты
ныгыманов адлет+каз даму +клиенты
 
Die Sonne – alte Bekannte oder grosse Unbekannte?
Die Sonne – alte Bekannte oder grosse Unbekannte? Die Sonne – alte Bekannte oder grosse Unbekannte?
Die Sonne – alte Bekannte oder grosse Unbekannte?
 
Plan de acción tutorial pornografía (1) (1)
Plan de acción tutorial pornografía (1) (1)Plan de acción tutorial pornografía (1) (1)
Plan de acción tutorial pornografía (1) (1)
 
Proyecto de investigación.
Proyecto de investigación.Proyecto de investigación.
Proyecto de investigación.
 
Scan0031
Scan0031Scan0031
Scan0031
 
Deutsche Gazaab
Deutsche GazaabDeutsche Gazaab
Deutsche Gazaab
 
Asturias. maría pilar paños alarcón y maría moyano collado
Asturias.  maría pilar paños alarcón y maría moyano colladoAsturias.  maría pilar paños alarcón y maría moyano collado
Asturias. maría pilar paños alarcón y maría moyano collado
 
La couronne solaire : du calme à la tempête
La couronne solaire : du calme à la tempêteLa couronne solaire : du calme à la tempête
La couronne solaire : du calme à la tempête
 
Mi plan de acción tutorial.
Mi plan de acción tutorial.Mi plan de acción tutorial.
Mi plan de acción tutorial.
 
Basic accounting terms
Basic accounting termsBasic accounting terms
Basic accounting terms
 

Similar to Research Report - Merging Interstate Transportation Networks for routing Hazardous Materials

Concept of node usage probability from complex networks and its applications ...
Concept of node usage probability from complex networks and its applications ...Concept of node usage probability from complex networks and its applications ...
Concept of node usage probability from complex networks and its applications ...redpel dot com
 
Iisrt komathi krishna (networks)
Iisrt komathi krishna (networks)Iisrt komathi krishna (networks)
Iisrt komathi krishna (networks)IISRT
 
Trajectory improves data delivery in urban vehicular networks
Trajectory improves data delivery in urban vehicular networks Trajectory improves data delivery in urban vehicular networks
Trajectory improves data delivery in urban vehicular networks Papitha Velumani
 
Optimal Content Downloading in Vehicular Network with Density Measurement
Optimal Content Downloading in Vehicular Network with Density MeasurementOptimal Content Downloading in Vehicular Network with Density Measurement
Optimal Content Downloading in Vehicular Network with Density MeasurementZac Darcy
 
Optimal content downloading in vehicular network with density measurement
Optimal content downloading in vehicular network with density measurementOptimal content downloading in vehicular network with density measurement
Optimal content downloading in vehicular network with density measurementZac Darcy
 
The Design of a Simulation for the Modeling and Analysis of Public Transporta...
The Design of a Simulation for the Modeling and Analysis of Public Transporta...The Design of a Simulation for the Modeling and Analysis of Public Transporta...
The Design of a Simulation for the Modeling and Analysis of Public Transporta...CSCJournals
 
CoryCookFinalProject535
CoryCookFinalProject535CoryCookFinalProject535
CoryCookFinalProject535Cory Cook
 
Ncct Ieee Software Abstract Collection Volume 1 50+ Abst
Ncct   Ieee Software Abstract Collection Volume 1   50+ AbstNcct   Ieee Software Abstract Collection Volume 1   50+ Abst
Ncct Ieee Software Abstract Collection Volume 1 50+ Abstncct
 
Vehicle to Vehicle Communication of Content Downloader in Mobile
Vehicle to Vehicle Communication of Content Downloader in MobileVehicle to Vehicle Communication of Content Downloader in Mobile
Vehicle to Vehicle Communication of Content Downloader in Mobileijbuiiir1
 
International Refereed Journal of Engineering and Science (IRJES)
International Refereed Journal of Engineering and Science (IRJES)International Refereed Journal of Engineering and Science (IRJES)
International Refereed Journal of Engineering and Science (IRJES)irjes
 
F233842
F233842F233842
F233842irjes
 
Back-Bone Assisted HOP Greedy Routing for VANET
Back-Bone Assisted HOP Greedy Routing for VANETBack-Bone Assisted HOP Greedy Routing for VANET
Back-Bone Assisted HOP Greedy Routing for VANETijsrd.com
 
Network analysis in gis , part 4 transportation networks
Network analysis in gis , part 4 transportation networksNetwork analysis in gis , part 4 transportation networks
Network analysis in gis , part 4 transportation networksDepartment of Applied Geology
 
Performance Evaluation of Efficient Data Dissemination Approach For QoS Enha...
 Performance Evaluation of Efficient Data Dissemination Approach For QoS Enha... Performance Evaluation of Efficient Data Dissemination Approach For QoS Enha...
Performance Evaluation of Efficient Data Dissemination Approach For QoS Enha...IJCSIS Research Publications
 
X-trace a pervasive network tracing framework
X-trace a pervasive network tracing frameworkX-trace a pervasive network tracing framework
X-trace a pervasive network tracing frameworkssuser804d54
 
Dynamic adaptation balman
Dynamic adaptation balmanDynamic adaptation balman
Dynamic adaptation balmanbalmanme
 

Similar to Research Report - Merging Interstate Transportation Networks for routing Hazardous Materials (20)

Mercury
MercuryMercury
Mercury
 
Concept of node usage probability from complex networks and its applications ...
Concept of node usage probability from complex networks and its applications ...Concept of node usage probability from complex networks and its applications ...
Concept of node usage probability from complex networks and its applications ...
 
Iisrt komathi krishna (networks)
Iisrt komathi krishna (networks)Iisrt komathi krishna (networks)
Iisrt komathi krishna (networks)
 
Trajectory improves data delivery in urban vehicular networks
Trajectory improves data delivery in urban vehicular networks Trajectory improves data delivery in urban vehicular networks
Trajectory improves data delivery in urban vehicular networks
 
Optimal Content Downloading in Vehicular Network with Density Measurement
Optimal Content Downloading in Vehicular Network with Density MeasurementOptimal Content Downloading in Vehicular Network with Density Measurement
Optimal Content Downloading in Vehicular Network with Density Measurement
 
ECCS 2010
ECCS 2010ECCS 2010
ECCS 2010
 
Computer network solution
Computer network solutionComputer network solution
Computer network solution
 
Optimal content downloading in vehicular network with density measurement
Optimal content downloading in vehicular network with density measurementOptimal content downloading in vehicular network with density measurement
Optimal content downloading in vehicular network with density measurement
 
Data mining
Data miningData mining
Data mining
 
The Design of a Simulation for the Modeling and Analysis of Public Transporta...
The Design of a Simulation for the Modeling and Analysis of Public Transporta...The Design of a Simulation for the Modeling and Analysis of Public Transporta...
The Design of a Simulation for the Modeling and Analysis of Public Transporta...
 
CoryCookFinalProject535
CoryCookFinalProject535CoryCookFinalProject535
CoryCookFinalProject535
 
Ncct Ieee Software Abstract Collection Volume 1 50+ Abst
Ncct   Ieee Software Abstract Collection Volume 1   50+ AbstNcct   Ieee Software Abstract Collection Volume 1   50+ Abst
Ncct Ieee Software Abstract Collection Volume 1 50+ Abst
 
Vehicle to Vehicle Communication of Content Downloader in Mobile
Vehicle to Vehicle Communication of Content Downloader in MobileVehicle to Vehicle Communication of Content Downloader in Mobile
Vehicle to Vehicle Communication of Content Downloader in Mobile
 
International Refereed Journal of Engineering and Science (IRJES)
International Refereed Journal of Engineering and Science (IRJES)International Refereed Journal of Engineering and Science (IRJES)
International Refereed Journal of Engineering and Science (IRJES)
 
F233842
F233842F233842
F233842
 
Back-Bone Assisted HOP Greedy Routing for VANET
Back-Bone Assisted HOP Greedy Routing for VANETBack-Bone Assisted HOP Greedy Routing for VANET
Back-Bone Assisted HOP Greedy Routing for VANET
 
Network analysis in gis , part 4 transportation networks
Network analysis in gis , part 4 transportation networksNetwork analysis in gis , part 4 transportation networks
Network analysis in gis , part 4 transportation networks
 
Performance Evaluation of Efficient Data Dissemination Approach For QoS Enha...
 Performance Evaluation of Efficient Data Dissemination Approach For QoS Enha... Performance Evaluation of Efficient Data Dissemination Approach For QoS Enha...
Performance Evaluation of Efficient Data Dissemination Approach For QoS Enha...
 
X-trace a pervasive network tracing framework
X-trace a pervasive network tracing frameworkX-trace a pervasive network tracing framework
X-trace a pervasive network tracing framework
 
Dynamic adaptation balman
Dynamic adaptation balmanDynamic adaptation balman
Dynamic adaptation balman
 

Research Report - Merging Interstate Transportation Networks for routing Hazardous Materials

  • 1. 1 Merging Interstate Transportation Networks for routing Hazardous Materials I. ABSTRACT Andrew Emerson (Furman University, Greenville, SC 29613) Ingrid Busch, PhD (Oak Ridge National Laboratory, Oak Ridge, TN 37830) The Oak Ridge National Laboratory’s Center for Transportation Analysis (CTA) uses intermodal transportation network models to analyze and organize methods for the routing of materials and people. For example, one of the CTA’s ongoing projects is to develop plans and maps for safe transportation of hazardous materials. Current transportation networks contain rich, detailed information such as the designation of roads for hazardous materials. However, the intermodal network that the CTA is currently using is outdated and inaccurate. To improve the network, we planned to merge the current CTA highway network with a newer, open-source network called OpenStreetMap (OSM). OSM contains much more precise and accurate geospatial data and is updated daily. OSM, like other crowd-sourced developed networks, contains detailed geospatial data. However, it lacks many of the attributes that CTA generally relies on for use in its projects. First, we compared the OSM links and nodes to the links and nodes currently in CTA database. The nomenclature OSM uses to classify road types is similar, but not identical, to the classifications that CTA uses. Thus, finding the appropriate highways to update was the first task. Second, we grouped the OSM data into individual datasets which we refer to as “SuperLinks.” Each SuperLink contains links and nodes for various sections of each interstate. By using these larger datasets, we could more easily find the corresponding CTA link containing the attributes to be captured. We identified specific starting nodes within the CTA network and employed geographical techniques to merge the two networks. 
By doing this, we have built an improved CTA network and developed a reusable merging method leading to a much more accurate and updated transportation model.
  • 2. 2 II. INTRODUCTION The CTA utilizes transportation data to complete various projects ranging from fuel efficiency to the routing of hazardous materials (HM-164). These projects produce research that goes towards the “efficient, safe and free movement of people and goods in our Nation's transportation systems.”1 One of the major transportation networks that the CTA uses contains multimodal routes: highway, railway, waterway, and airway. In projects relating to HM-164 routing, researchers rely on road designations contained in the CTA database. However, the data in the CTA network is inaccurate and outdated. For instance, several of the interstates don’t quite follow mapped satellite images. Additionally, since the CTA network was created, many roads and interstates have been extended or rerouted. An improved network would allow more precise and efficient routing. Also, a method to update the network would allow for a straightforward, reusable process in obtaining new data. To improve the network and provide up-to-date data, our project aimed to merge the current CTA network with a newer, open-source network called OpenStreetMap (OSM). OSM is “built by a community of mappers that contribute and maintain data about roads, trails, cafés, railway stations, and much more, all over the world.”2 Our project was primarily concerned with interstates. To visualize results, we mapped the link/node data on Google Earth. In the process of investigating methods to merge the networks, we considered a couple different strategies. One immediate issue that we faced is that OSM classifies its data in different ways than the current CTA network. To accommodate this, we had to manipulate the downloaded data in the OSM network to even begin our project. In addition to data manipulation, we dealt with other concepts, such as conflation. 
In this paper, I will discuss conflation, data manipulation, and network merging, along with the methods we decided to use to merge the OSM and CTA transportation networks. III. METHODS A. Conflation When investigating methods to combine the two networks, one existing concept that we found is called “conflation”. Conflation is broadly defined as, “a set of procedures that aligns the features of two geographic data layers and then transfers the attributes of one to the other.”3 This essentially describes the general idea of our project, but there was not a specific set of procedures that immediately fit our needs. Many current conflation software tools are complex and would be more time consuming to use in our case. While we did not use conflation software in our project, we followed the same general principles that these software packages follow, and we developed a conflation technique specific to our networks. B. Converting OSM Interstate Names 1 "Welcome Page." Center for Transportation Analysis. UT Battelle. Web. <http://cta.ornl.gov/cta/>. 2 "About." OpenStreetMap. Web. <http://www.openstreetmap.org/about>. 3 "GIS Dictionary." Esri: Support. Web. <http://support.esri.com/en/knowledgebase/GISDictionary/term/conflation>.
As stated previously, OSM organizes its data somewhat differently than CTA does. Our first step was to scan through the OSM data to determine what modifications we would need to make in order to compare it to CTA data. One area where the data did not compare well was the interstate name. In the CTA network, all interstates are named in the same format: “I” followed by the identifying number. We wanted to keep this uniform naming convention. Because OSM is open-source, the users who enter information may name roads inconsistently. For instance, an interstate could be named “I-10,” “I 10,” or “I10,” with all three referring to the same road. We therefore had to convert all of the interstate names in OSM to the format that CTA uses, which would be “I10.”

Using Java and the Eclipse integrated development environment, we wrote a program that accessed the database containing all of the OSM interstates. To connect our Java programs to the CTA and OSM databases, we used Java Database Connectivity (JDBC), the standard Java API for connecting to and querying databases. The first program stored all of the interstates and then parsed each individual link name. To parse correctly, we treated any character that was not an “I” or a digit as a delimiter. When done parsing, the program saved the new name and wrote it back to the database as that link’s name. While this was an efficient solution for most links, a few links with unusual names in the OSM network remained as exceptions. For example, some interstates technically have two names, such as I40/I75, whereas the CTA database associates only one name with a given interstate. We had to go into the database and fix these names manually.

C. Creating “SuperLink” Datasets
The second step was to organize the OSM data in a way that would make it easier to compare to the CTA data. Because the networks themselves are very large, we chose to compare larger datasets rather than individual links. To do this, we created what we refer to as “SuperLinks,” which are large segments of an individual interstate. SuperLinks start and end according to a few guidelines. A SuperLink starts where a starting node is not attached to another interstate, where a starting node is attached to an interstate with a different name, where a starting node is attached to a merge between two links, or where an interstate splits into two or more links (each of which starts its own SuperLink). A SuperLink ends at the end of an interstate, where two or more links merge together, or where two or more links split apart. With these guidelines, several SuperLinks make up a single interstate, and the SuperLinks more closely resemble the links in the current CTA datasets.

Again using Java within Eclipse, we wrote a program to generate SuperLinks for each existing interstate. The program loops through each interstate and stores all of its links. For each link, we stored the wayID (link identification), partID (a link can have multiple parts), aNode (start node), bNode (end node), distance (in miles), and name. Next, the program finds every start point on the interstate, as defined by the guidelines in the previous paragraph. For each of these start points, the program creates a SuperLink and inserts it into the SuperLink table in the OSM database. Then, for each SuperLink, the program begins at the starting link in the database and finds each succeeding link in order, storing the new endpoint and distance as it goes. Once the program finds the end of one SuperLink, it proceeds to the next one. When the program finishes one interstate, it moves to the next, until every interstate is composed of SuperLinks. The result is that each interstate link in the OSM network is now organized into a SuperLink, making it easily comparable to the CTA network.

D. Generating SuperLink Geometry Unions
The biggest problem that we faced at this point was that none of the SuperLinks had a geometric aspect, meaning they could not be mapped. The SuperLinks were just collections of links, not geometric entities themselves. Each link in the OSM network has an attribute labeled “coordinates,” which is a Line String⁴ that represents the geometry of the link. In the CTA network, this aspect is stored in the “geometryItem” attribute column. To give each SuperLink its own geometry, we decided to use SQL Server’s built-in “STUnion” function, which combines the Line Strings of multiple geometric objects. However, this function only works on two objects (in our case, links) at a time. To account for this, we wrote another Java program to handle each SuperLink and combine its links in the correct succession. The program selects each SuperLink and stores each individual link’s information: wayID, partID, aNode, and bNode. Then, the program finds the first link within the SuperLink and the link that follows it. Next, the program executes the STUnion function within the OSM database for the two links, and their “coordinates” are combined. We repeated this for each link within every SuperLink until no more links remained for a given interstate. This gave us a geometrical representation of each SuperLink as an individual entity that we could map on Google Earth and compare more easily to existing CTA links.

E. Setting HM-164 Designations
With the SuperLinks in the right format to find the corresponding CTA links, our next goal was to find the right places to begin comparing the links. Since our project concerns hazardous material (HM-164) designated routes, we decided that it would be best to find locations in the CTA network where HM-164 routes intersect non-HM-164 routes. We then planned to take the non-HM-164 CTA route and find the corresponding OSM SuperLink. Once we had the corresponding SuperLink, we set its HM-164 attribute to “false” in the database. After designating all of the corresponding OSM routes as non-HM-164, we set the remaining OSM SuperLinks as HM-164.

To handle each of these OSM routes, we used a Java program to loop through the intersection points. To produce the intersection points in the CTA network, we ran a database query to find where an HM-164 link meets a non-HM-164 link. After finding these points, the program stored the latitude and longitude of each one. Next, we created a stored procedure to find the closest OSM SuperLinks within a specific radius of the CTA node. For each returned SuperLink, the program found the distances from its aNode and bNode to the CTA node. If neither the aNode nor the bNode of the SuperLink fell within the specified radius, that SuperLink needed to be split before it could be properly assigned an HM-164 designation. The program split such a SuperLink into two, starting the new SuperLink at the individual link closest to the CTA intersection point.

⁴ A Line String is a sequence of points representing a linear object.
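The split test described above can be sketched in Java as follows. This is a minimal illustration, not the program used in the project: the class and method names (`SuperLinkSplitSketch`, `Link`, `needsSplit`, `splitIndex`) are hypothetical, the great-circle (haversine) formula stands in for whatever distance measure the stored procedure used, and "closest link" is assumed to mean the link whose aNode is nearest the CTA intersection point.

```java
import java.util.Arrays;
import java.util.List;

class SuperLinkSplitSketch {

    // Mean Earth radius in miles (the paper stores link distances in miles).
    static final double EARTH_RADIUS_MILES = 3958.8;

    // Hypothetical minimal view of one OSM link: coordinates of its aNode and bNode.
    static final class Link {
        final long wayId;
        final double aLat, aLon, bLat, bLon;

        Link(long wayId, double aLat, double aLon, double bLat, double bLon) {
            this.wayId = wayId;
            this.aLat = aLat;
            this.aLon = aLon;
            this.bLat = bLat;
            this.bLon = bLon;
        }
    }

    // Haversine distance in miles between two points given in decimal degrees.
    static double distanceMiles(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_MILES * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // True when neither end of the SuperLink lies within radiusMiles of the CTA point.
    // The SuperLink is assumed to be an ordered, non-empty list of links.
    static boolean needsSplit(List<Link> superLink, double ctaLat, double ctaLon,
                              double radiusMiles) {
        Link first = superLink.get(0);
        Link last = superLink.get(superLink.size() - 1);
        double toStart = distanceMiles(ctaLat, ctaLon, first.aLat, first.aLon);
        double toEnd = distanceMiles(ctaLat, ctaLon, last.bLat, last.bLon);
        return toStart > radiusMiles && toEnd > radiusMiles;
    }

    // Index of the link whose aNode is closest to the CTA point (assumed split rule);
    // the new SuperLink would start at this link.
    static int splitIndex(List<Link> superLink, double ctaLat, double ctaLon) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i < superLink.size(); i++) {
            Link link = superLink.get(i);
            double d = distanceMiles(ctaLat, ctaLon, link.aLat, link.aLon);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up SuperLink of three links running north along one meridian.
        List<Link> superLink = Arrays.asList(
                new Link(1L, 35.0, -84.0, 35.1, -84.0),
                new Link(2L, 35.1, -84.0, 35.2, -84.0),
                new Link(3L, 35.2, -84.0, 35.3, -84.0));
        double ctaLat = 35.2, ctaLon = -84.0; // made-up CTA intersection point
        System.out.println("needs split: " + needsSplit(superLink, ctaLat, ctaLon, 0.5));
        System.out.println("split at link index: " + splitIndex(superLink, ctaLat, ctaLon));
    }
}
```

Keeping the radius as a parameter reflects that the paper mentions "a specific radius" without stating its value; the 0.5 miles used in `main` is purely illustrative.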
Once each appropriate SuperLink was split, we could assign HM-164 designations. To do this, we first ran a Java program that took the non-HM-164 CTA link at each intersection and found its geospatial bearing. The bearing allowed us to identify the corresponding OSM SuperLinks and assign them attributes. Once we had the CTA link bearing, we found the bearing of each of the closest OSM SuperLinks, based on its direction. Given that the SuperLinks and CTA links do not match up exactly, we allowed a tolerance of 15 degrees for the difference in bearings. When we found the appropriate SuperLinks, we gave them all non-HM-164 designations (“false” in the database), matching the CTA links. Every remaining SuperLink then received an HM-164 designation (“true” in the database).

IV. CONCLUSION
While merging the two networks took many steps, the process led to an accurate, updated version of the CTA network. We manipulated the OSM data so that it could be compared to the corresponding CTA data, starting with OSM nomenclature. Then, we created larger datasets of links and nodes called “SuperLinks,” for which we generated geometrical attributes. After finding HM-164/non-HM-164 intersections in the CTA database, we split the SuperLinks appropriately. At each split, we used bearings to determine which SuperLinks corresponded to the non-HM-164 CTA link. Last, once we assigned the correct HM-164 route designations, the merge was complete. Both directions of travel were accounted for, and we achieved the same result that commercial conflation software would have provided.

However, even with the more accurate OSM geospatial data and preserved HM-164 designations, there is room for more research. One possible path would be to reduce the time complexity of the steps we took. While the steps worked, they could be designed in a way that requires less time.
Another path for future research would be to test the method on networks other than OSM and on attributes other than HM-164.

V. REFERENCES
"Welcome Page." Center for Transportation Analysis. UT Battelle. Web. <http://cta.ornl.gov/cta/>.
"About." OpenStreetMap. Web. <http://www.openstreetmap.org/about>.
"GIS Dictionary." Esri: Support. Web. <http://support.esri.com/en/knowledgebase/GISDictionary/term/conflation>.