At exactEarth, we have many formats for the same AIS data, and we had a proliferation of ad-hoc converters that developers had written with varying quality and performance. By modelling format conversion as a graph-traversal problem, I was able to create a library of battle-tested code that dynamically produces the converters needed, making format conversion simple for developers.
2. Overview
• At exactEarth, we deal with data in a lot of different formats.
• We had problems with a proliferation of converters.
• I’m going to discuss a pattern we developed for dealing with situations like this that turned out well.
6. Automatic Identification System (AIS)
• Designed during the 1990s
• Adopted as a standard in 2002
• Very High Frequency (VHF) radio transmissions
• 27 different types of messages transmitted
7. Message Types 1, 2, 3:
• Maritime Mobile Service Identity (MMSI)
• Location
• Speed over ground
• Course over ground
• Heading
• Rate of Turn
Message Type 5:
• Maritime Mobile Service Identity (MMSI)
• Name
• IMO Number
• Callsign
• Dimensions of the ship
• Destination and ETA
8. Effective January 1, 2005, AIS transceivers are required aboard:
• All ships of 300 gross tonnage and upwards engaged on international voyages
• All cargo ships of 500 gross tonnage and upwards not engaged on international voyages
• All passenger ships, irrespective of size.
AIS transceivers must be on at all times (with some limited exceptions)
10. Many ways to store AIS messages
• NMEA v3, v4
• GNM v3.1
• Internal Binary formats (with several versions)
• “Adapted” formats (several variations):
• CSV
• XML
• JSON
• KML
• OTH-Gold
• Many third-party and “one-off” formats
11. Many ways to store AIS messages
• NMEA v3, v4
• GNM v3.1
• Internal Binary formats (with several versions)
• “Adapted” formats (several variations):
• CSV
• XML
• JSON
• KML
• OTH-Gold
• Many third-party and “one-off” formats
Many representations of the same data
12. Conversions between formats
In order to ingest data from third parties, and to satisfy customer demands for data in a particular format, we need to be able to convert between all the formats.
13. Lossy vs. Lossless Conversions
Some conversions are lossless:
• For example, both NMEAv4 and GNM v3.1 capture all the same data.
But some are lossy, meaning that data is lost in the conversion:
• For example, NMEAv4 to KML
• KML doesn’t have all of the fields that AIS-specific formats do
14. Lossless Conversion: GNM and NMEAv4
GNM:
$PGHP,1,2011,9,8,18,9,6,300,,104,,1,00*20
!AIVDM,1,1,,,16KDMo@P00:1?IpDF6i=L?v<0<27,0*00
17. Lossless Conversion: GNM and NMEAv4
GNM:
$PGHP,1,2011,9,8,18,9,6,300,,104,,1,00*20
!AIVDM,1,1,,,16KDMo@P00:1?IpDF6i=L?v<0<27,0*00
NMEAv4:
s:104,c:1315505346300*0E!AIVDM,1,1,,,16KDMo@P00:1?IpDF6i=L?v<0<27,0*00
The other fields are either format syntax, checksums, or some trivial additional fields
18. Lossy Conversion NMEAv4 -> KML
Message Type 1:
• MMSI (identifier)
• Timestamp
• Longitude/Latitude
• Heading
• Navigation Status
• Rate of Turn
• Speed Over Ground
• Position Accuracy
• Course over Ground
• …
19. Lossy Conversion NMEAv4 -> KML
Message Type 1:
• MMSI (identifier)
• Timestamp
• Longitude/Latitude
• Heading
• Navigation Status
• Rate of Turn
• Speed Over Ground
• Position Accuracy
• Course over Ground
• …
KML:
<Placemark>
<name>431300061</name>
<TimeStamp><when>2011-09-08T18:09:06Z</when></TimeStamp>
<Point><coordinates>140.08116666666666,35.55616666666667</coordinates></Point>
<Style><IconStyle>
<Icon>
<href>http://maps.google.com/mapfiles/kml/shapes/track.png</href>
<w>64</w><h>64</h>
</Icon><color>ffff0000</color>
<heading>344.0</heading>
</IconStyle>
</Style>
</Placemark>
20. Problem: Proliferation of Converters
• Code Duplication
• Bug-prone, not performant
• Testing + optimization efforts were strained by so many implementations
• Not flexible
• If a component consumes GNM today, it was hard to add the ability to consume NMEA
• Inadvertent use of lossy conversions
21. Step 1: One format to rule them all
We created a new format: EEA
Built for AIS, faithfully reflects the spec.
Extension fields for format-specific metadata
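As a rough illustration only (the real EEA spec is not reproduced here), an EEA-style record with format-specific extension fields might look something like this in Python; all class and field names below are hypothetical:

from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of an EEA-style record: one class that can hold the AIS
# message fields, plus extension fields for format-specific metadata.
@dataclass
class GNMMetadata:
    # Illustrative GNM-specific metadata; the actual fields are not listed here.
    source: Optional[str] = None
    received_at_ms: Optional[int] = None

@dataclass
class EEAMessage:
    mmsi: int
    message_type: int
    longitude: Optional[float] = None
    latitude: Optional[float] = None
    # ... remaining AIS fields omitted ...
    gnm_metadata: Optional[GNMMetadata] = None  # format-specific extension field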
22. Side benefit: Multi-type fields
Some fields in the AIS spec are multi-typed.
Example: Speed over Ground (10 bits, 0-1023)
From the spec:
“Speed over ground in 1/10 knot steps (0-102.2 knots)
1023 = not available, 1022 = 102.2 knots or higher”
Developers were often performing mathematical operations on the fields (!)
In EEA, we made the type of this field:
Either[double, NOT_AVAILABLE, SPEED_102_POINT_2_KNOTS_OR_HIGHER]
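To make the Either idea concrete, here is a minimal Python sketch (an assumption about how such a field could be decoded, not the actual EEA implementation) that forces callers to handle the two reserved values before doing any math:

from enum import Enum
from typing import Union

class SpeedSentinel(Enum):
    NOT_AVAILABLE = "not available"                       # raw value 1023
    SPEED_102_POINT_2_KNOTS_OR_HIGHER = ">= 102.2 knots"  # raw value 1022

def decode_speed_over_ground(raw: int) -> Union[float, SpeedSentinel]:
    # Map the 10-bit raw field (0-1023) to knots, or to an explicit sentinel.
    if not 0 <= raw <= 1023:
        raise ValueError(f"speed over ground out of range: {raw}")
    if raw == 1023:
        return SpeedSentinel.NOT_AVAILABLE
    if raw == 1022:
        return SpeedSentinel.SPEED_102_POINT_2_KNOTS_OR_HIGHER
    return raw / 10.0  # 1/10-knot steps

# Callers must now filter out the sentinels before averaging, instead of
# silently treating 1023 as 102.3 knots.
speeds = [decode_speed_over_ground(v) for v in (0, 57, 1022, 1023)]
numeric = [s for s in speeds if isinstance(s, float)]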
35. Generating the converter
Now that we have a graph, to make a converter just compose the functions on the edges of the shortest path:
NMEA_v4_payload = merge(serialize(nop(deserialize(tokenize(GNM_input)))))
Function composition in Python:
https://mathieularose.com/function-composition-in-python/#solution
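A minimal sketch of what get_converter() could look like, assuming a NetworkX graph named conversions whose edges carry a function attribute (as in the add_edge example later in the deck); this is illustrative, not the exact library code:

import functools
import networkx as nx

conversions = nx.DiGraph()  # assumed to be populated with add_edge(..., function=...) calls

def get_converter(source, target):
    # Find the shortest chain of parse levels between the two formats.
    path = nx.shortest_path(conversions, source, target)
    # Collect the conversion function stored on each edge along that path.
    steps = [conversions.edges[a, b]["function"] for a, b in zip(path, path[1:])]
    # Compose them left to right: the output of each step feeds the next.
    return lambda data: functools.reduce(lambda acc, fn: fn(acc), steps, data)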
38. Example usage
NM4_to_GNM = get_converter(("NMEA", "4", "PAYLOAD"), ("GNM", "3.1", "PAYLOAD"))
with open("my_nmea_v4_file.nm4", 'rb') as fin:
    with open("my_converted_file.gnm", 'wb') as fout:
        fout.write(NM4_to_GNM(fin))
39. Prevention of lossy conversions
Create 2 different conversion graphs:
1. Only lossless conversions: “FORWARD_FORMAT_CONVERSIONS”
2. Add lossy conversions: “ALL_FORMAT_CONVERSIONS”
Use the lossless graph by default; make users explicitly ask to use lossy conversions.
If the user asks for a lossy conversion without being explicit, there will be no path in the “FORWARD_FORMAT_CONVERSIONS” graph. The library can then check for a path in “ALL_FORMAT_CONVERSIONS” and give them a nice error message:
“No lossless path from NMEAv4 to KML. If you want to perform a lossy conversion, you must explicitly allow lossy conversions.”
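A sketch of how the lossless-by-default check might be wired up, extending the get_converter() sketch above; the allow_lossy flag and helper name are assumptions, not the library's actual API:

import functools
import networkx as nx

# The two graphs from this slide; assumed to be built elsewhere with add_edge().
FORWARD_FORMAT_CONVERSIONS = nx.DiGraph()
ALL_FORMAT_CONVERSIONS = nx.DiGraph()

def compose_edge_functions(graph, path):
    steps = [graph.edges[a, b]["function"] for a, b in zip(path, path[1:])]
    return lambda data: functools.reduce(lambda acc, fn: fn(acc), steps, data)

def get_converter(source, target, allow_lossy=False):
    graph = ALL_FORMAT_CONVERSIONS if allow_lossy else FORWARD_FORMAT_CONVERSIONS
    try:
        path = nx.shortest_path(graph, source, target)
    except nx.NetworkXNoPath:
        # No lossless path: tell the user a lossy one exists, if it does.
        if not allow_lossy and nx.has_path(ALL_FORMAT_CONVERSIONS, source, target):
            raise ValueError(
                f"No lossless path from {source} to {target}. If you want to perform "
                "a lossy conversion, you must explicitly allow lossy conversions."
            )
        raise
    return compose_edge_functions(graph, path)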
41. Extra parameters
1. Mark edges as having required parameters:
conversions.add_edge(
    ("DOF", "3", "PAYLOAD"),
    ("DOF", "3", "UNPARSED"),
    function=tokenize,
    required_params=set(["timestamp"])
)
2. Allow the user to supply arbitrary keyword arguments to get_converter():
get_converter(
    ("DOF", "3", "PAYLOAD"),
    ("DOF", "4", "PAYLOAD"),
    timestamp=get_datetime_for_id(id)
)
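One plausible way (an assumption, not the actual implementation) for get_converter() to validate and bind those keyword arguments is to pre-bind each edge's required parameters with functools.partial before composing:

import functools
import networkx as nx

conversions = nx.DiGraph()  # assumed populated with add_edge(..., function=..., required_params=...)

def get_converter(source, target, **params):
    path = nx.shortest_path(conversions, source, target)
    steps = []
    for a, b in zip(path, path[1:]):
        edge = conversions.edges[a, b]
        fn = edge["function"]
        required = edge.get("required_params", set())
        missing = required - params.keys()
        if missing:
            raise ValueError(f"conversion {a} -> {b} requires parameters: {sorted(missing)}")
        if required:
            # Pre-bind only the parameters this edge declared as required.
            fn = functools.partial(fn, **{k: params[k] for k in required})
        steps.append(fn)
    return lambda data: functools.reduce(lambda acc, fn: fn(acc), steps, data)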
43. Benefits
• Centralizes the conversion code
• Fewer bugs, more performant
• Simplifies the code + less duplication
• Don’t need to know all of the input formats a priori
• Dynamic generation of converters
• Reduces chance of accidental lossy conversions
44. Summary
• We had a problem with multiple formats and converters between them
• By modelling it as a graph problem, it was easy to dynamically generate converters
• This allowed for greater flexibility and greater safety
• When you have a web of conversion steps, you can use graph-traversal libraries to generate the shortest path to get the answers you want.
Why?:
Environmental: reef protection, bilge water dumping, oil spills, but most importantly illegal fishing…
Logistical: Port authorities, logistics companies, scheduling
Security: surveillance, smuggling, piracy
As a ship captain, how do you prevent collisions with other vessels?
People tend to jump immediately to SONAR/RADAR, but there are a few major problems with that:
The equipment is expensive
The equipment requires a lot of power
These systems are actually quite difficult to read. They require some skill to operate.
The most common method was simply to visually observe the other ships and try to estimate their speed, course, heading, and acceleration.
Then you would do the same for your vessel and figure out the calculus to determine if you are going to collide or not.
Obviously, this is also tricky, and fails in situations like:
Night time
Stormy weather
When you are going around a tight curve in a waterway, and can’t see what’s coming at you
Ships don’t stop on a dime. In fact, some of these vessels take in the neighbourhood of 20 minutes to stop. So sometimes, you will have two ships that know they are going to collide well in advance of the collision, but they can’t turn or stop fast enough to do anything about it.
So the problem of ship collisions is what triggered the creation of AIS.
All vessels transmit a “Here I am! Please don’t hit me” message to the other vessels in the area.
Here are some of the fields that are transmitted.
In the Type 1,2,3 messages, we have position information. These are transmitted every few seconds while the vessel is moving.
In other message types, we have more static information that doesn’t change very often, like registration and destination and ETA.
The first thing you do when you have a lot of formats: Create a new format!
We created a new internal format that we called EEA. Not only could it hold all of the AIS message fields, but it also had “extension fields” where we would shove all the fields that might be specific to a format. For example, GNM has some metadata fields that are specific to GNM. We put those into a GNMMetadata field within the EEA spec. So now we have a format that can capture all of the complexity of all the AIS formats.
It was written to be faithful to the AIS spec, which means that it handles the full complexity of the AIS spec.
A side benefit of redesigning our format (and our in-memory representation) with EEA is that we got to fix some of the issues developers were accidentally creating.
AIS is a complicated spec. For example, look at the definition for speed over ground. The field is 10 bits, giving 1024 possible values, and while most of those values can be interpreted as doubles, there are two reserved values that cannot. There is 1023 = Not Available, for when the ship doesn’t know how fast it is going. There is also the 1022 value, which means 102.2 knots or greater.
The problem with all the ad hoc parsers was that their developers would often get the parsing of these fields wrong. They would often just parse the field as a double, not realizing that these special values existed, and then perform mathematical operations on it. So they would do things like average the speed-over-ground values, and you would end up with massive values when a large number of vessels were reporting “Not available”.
With EEA, we fixed that, changing the type of the field to be either a double, or one of two special values. This prevents mathematical operations being applied on the field, and forces the developer to stop and think about how they want to actually handle the math.
The next step was to define a common interface of functions we apply when parsing all formats. We eventually settled on this interface.
You start with a “PAYLOAD” which is a collection of bytes representing messages, which might be the contents of a file for example.
You then call tokenize() which finds the boundaries of the messages within the payload and splits the payload on those boundaries. You still haven’t parsed the messages, so you don’t know what they say yet. You only know the bytes that make up each message. We called this “UNPARSED” in this diagram.
You can then call deserialize, which actually parses the bytes of the message and gives you an in-memory representation of the message. Most commonly, this was the new EEA format.
Then, in the reverse direction, we take individual messages and call serialize() on them to return them to the UNPARSED tokens we had before. And finally we call merge(), which writes the messages, one after another, into a payload.
So these four functions are pretty much universal to format parsing. They also form a graph, perhaps a sort of state-machine where the data parse levels are the nodes, and the functions are the edges.
We went and implemented these four functions for all of the data formats.
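For one format, the interface described above might look roughly like this (a hypothetical skeleton; the real per-format implementations are not shown in the deck):

from typing import Iterable, List

# Hypothetical skeleton of the four functions for a single format (e.g. GNM).
# The parse levels are PAYLOAD -> UNPARSED -> PARSED (EEA), and back again.

def tokenize(payload: bytes) -> List[bytes]:
    """Find message boundaries in a payload and split it into per-message bytes."""
    ...

def deserialize(unparsed: bytes) -> "EEAMessage":
    """Parse one message's bytes into the in-memory EEA representation."""
    ...

def serialize(message: "EEAMessage") -> bytes:
    """Turn an in-memory EEA message back into this format's bytes."""
    ...

def merge(unparsed: Iterable[bytes]) -> bytes:
    """Write the messages, one after another, into a single payload."""
    ...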
And when you put all the conversion graphs beside one another, you notice something.
The PARSED node for most of the formats is EEA.
It’s the same format. Really, it’s all the same node. Conceptually, you could add edges between them with a NO-OP function.
So now if you wanted to convert between GNM and NM4, you can just follow the edges from GNM PAYLOAD to NMEA4 PAYLOAD
And you suddenly have the conversion steps.
In order to represent the graph in code, we need a graphing library.
I came across NetworkX and have had no complaints.
It allows you to create nodes and edges in a graph, and gives you an easy way to run algorithms like shortest path over the graph.
Here’s what it looks like in code:
We have our conversions object, which is just a networkX graph object.
Then on each line, we add an edge between each of our parse levels. On each edge, we also supply the function to perform.
Notice in the last line, that this is an example where we jump between formats that are already parsed into the EEA in-memory representation. Therefore the function is a simple NO-OP.
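A rough reconstruction of the kind of code being described (node names and per-format function names are assumptions based on the slides, not the actual source):

import networkx as nx

# Placeholder per-format functions; in the real library these are the
# tokenize/deserialize/serialize/merge implementations for each format.
def gnm_tokenize(payload): ...
def gnm_deserialize(tokens): ...
def gnm_serialize(messages): ...
def gnm_merge(tokens): ...

conversions = nx.DiGraph()

# Parse levels for one format, with the conversion function on each edge.
conversions.add_edge(("GNM", "3.1", "PAYLOAD"), ("GNM", "3.1", "UNPARSED"), function=gnm_tokenize)
conversions.add_edge(("GNM", "3.1", "UNPARSED"), ("GNM", "3.1", "PARSED"), function=gnm_deserialize)
conversions.add_edge(("GNM", "3.1", "PARSED"), ("GNM", "3.1", "UNPARSED"), function=gnm_serialize)
conversions.add_edge(("GNM", "3.1", "UNPARSED"), ("GNM", "3.1", "PAYLOAD"), function=gnm_merge)

# ... the same four edges are added for NMEA v4 and the other formats ...

# Both formats' PARSED levels hold the same EEA representation, so the jump
# between them is a NO-OP edge.
conversions.add_edge(("GNM", "3.1", "PARSED"), ("NMEA", "4", "PARSED"), function=lambda messages: messages)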