This is the presentation material for the paper "Towards Light-weight and Real-time Line Segment Detection".
Written by Geonmo Gu*, Byungsoo Ko*, SeoungHyun Go, Sung-Hyun Lee, Jingeun Lee, Minchul Shin (* Authors contributed equally.)
@NAVER/LINE Vision
- arXiv: https://arxiv.org/abs/2106.00186
- GitHub: https://github.com/navervision/mlsd
3. Introduction
1. Line Segment Detection?
Line Segment Detection
Purpose: Detect line segments in the given images
Line Segment Detector (LSD)
4. Introduction
1. Line Segment Detection?
Applications of LSD
• Simultaneous Localization and Mapping (SLAM) [1]
• Relative Pose Estimation [2]
[1] Monocular-vision based SLAM using Line Segments, ICRA07
[2] Line-Based Relative Pose Estimation, CVPR11
9-13. Introduction
2. Real-time inference is limited
b) Existing methods include a multi-module line prediction process
Shared diagram elements: Input Image, Backbone, Feature Maps, Line Segments
Multi-module Processing
• Top-Down (TD) Strategy: Attraction Field Maps, Squeeze Module, Compute Line Proposal, Search for Candidates, Verify Candidates
• Bottom-Up (BU) Strategy: Junction Heatmap, Line Sampler, Line Proposal, LoI Pooling, Line Verification
• Tri-Point (TP) Strategy: Line Map, Center Map, Displacement Map, Mixture of Conv. (x2), Point Filter Module, Line Segmentation, Line Generation
Single Module Processing
• Ours: Center / Displacement Map, Line Generation
14. Introduction
2. Real-time inference is limited
Inference speed comparison
The inference speed of the backbone and of the prediction process on GPU is significantly improved
16. Introduction
3. Mobile-LSD
Motivation
Existing LSDs are limited in real-time inference, especially on mobile devices.
• Exploit heavy backbone networks
• Include a multi-module line prediction process
In this paper
We design an efficient LSD for resource-constrained environments: Mobile LSD (M-LSD)
• Minimize the backbone network and adopt a single-module line prediction process
• Present novel training schemes: Segments of Line segment (SoL) augmentation and geometric learning schemes
20-25. Proposed Method
1. Light-weight Backbone
M-LSD-tiny
Feature Extractor
• Blocks 1 to 11, followed by blocks 12 to 16
• Block type A (12, 14): 1x1 Conv, 1x1 Conv, Upscale, C, Skip connection
• Block type B (13, 15): 3x3 Conv, 3x3 Conv, +
• Block type C (16): 3x3 Conv with dilated rate=5, 1x1 Conv, 3x3 Conv
• Upscale
Final Feature Maps (H/2 x W/2 x 16)
• Segmentation Maps (H/2 x W/2 x 2): Junction map x1, Line map x1
• SoL Maps (H/2 x W/2 x 7) and TP Maps (H/2 x W/2 x 7), each: Displacement map x4, Center map x1, Length map x1, Degree map x1
Line Generation
Line Segments
26. Proposed Method
1. Light-weight Backbone
M-LSD
Feature Extractor
• Blocks 1 to 14, followed by blocks 15 to 23
• Block type A (15, 17, 19, 21): 1x1 Conv, 1x1 Conv, Upscale†, C, Skip connection
• Block type B (16, 18, 20, 22): 3x3 Conv, 3x3 Conv, +
• Block type C (23): 3x3 Conv with dilated rate=5, 1x1 Conv, 3x3 Conv
Final Feature Maps (H/2 x W/2 x 16)
• Segmentation Maps (H/2 x W/2 x 2): Junction map x1, Line map x1
• SoL Maps (H/2 x W/2 x 7) and TP Maps (H/2 x W/2 x 7), each: Displacement map x4, Center map x1, Length map x1, Degree map x1
Line Generation
Line Segments
† denotes that block 15 skips the upscale operation.
27. Proposed Method
2. Line Segment Representation
Tri-Point (TP) representation
Diagram labels: start point $l_s$, center point $l_c$, end point $l_e$, displacement vectors $d_s$ and $d_e$
TP Maps (H/2 x W/2 x 7): Displacement map x4, Center map x1, Length map x1, Degree map x1
Notation: $(x_\alpha, y_\alpha)$ denotes the coordinates of a point $\alpha$ (e.g. $(x_{l_s}, y_{l_s})$ for the start point $l_s$); $d_s(x_{l_c}, y_{l_c})$ and $d_e(x_{l_c}, y_{l_c})$ indicate the 2D displacements from the center point $l_c$ to the corresponding start $l_s$ and end $l_e$ points.
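The notation above can be illustrated with a minimal sketch, assuming the endpoints are recovered as $l_s = l_c + d_s(l_c)$ and $l_e = l_c + d_e(l_c)$. The function name `decode_tp` and the sample coordinates are hypothetical and are not taken from the slides or the released code.

```python
import numpy as np

def decode_tp(center, d_s, d_e):
    """Recover start/end points from a center point and two displacement vectors.

    Assumes l_s = l_c + d_s(l_c) and l_e = l_c + d_e(l_c), all in (x, y) order.
    """
    center = np.asarray(center, dtype=float)
    start = center + np.asarray(d_s, dtype=float)  # l_s
    end = center + np.asarray(d_e, dtype=float)    # l_e
    return start, end

# Hypothetical values: a center at (10, 20) with symmetric displacements.
start, end = decode_tp(center=(10.0, 20.0), d_s=(-4.0, -3.0), d_e=(4.0, 3.0))
```

At inference time the same decoding would be applied to every location selected from the center map; how those locations are selected is not covered by this slide.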
28. Proposed Method
2. Line Segment Representation
Tri-Point (TP) representation
Diagram labels: $l_s$, $d_s$, $d_e$, $l_c$, $l_e$
TP Maps (H/2 x W/2 x 7): Displacement map x4, Center map x1, Length map x1, Degree map x1
• For the center loss, we use a pos / neg separated binary classification loss.
• For the displacement loss, we use a smooth L1 loss for regression learning.
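The two loss types named above can be sketched in NumPy. This is an illustration under stated assumptions, not the authors' implementation: "pos / neg separated" is read here as averaging the binary cross-entropy over positive and negative pixels separately and summing the two terms without weights, and the smooth L1 transition point `beta=1.0` is an assumed default.

```python
import numpy as np

def separated_bce(prob, target, eps=1e-7):
    """Binary cross-entropy averaged separately over positive and negative pixels.

    Assumption: the two averages are summed with equal weight.
    """
    prob = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    pos = target > 0.5
    neg = ~pos
    pos_loss = -np.log(prob[pos]).mean() if pos.any() else 0.0
    neg_loss = -np.log(1.0 - prob[neg]).mean() if neg.any() else 0.0
    return pos_loss + neg_loss

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1: quadratic for |error| < beta, linear otherwise."""
    diff = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    loss = np.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)
    return loss.mean()
```

Averaging the positive and negative terms separately keeps the few center pixels from being swamped by the much larger number of background pixels.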
29. Proposed Method
2. Line Segment Representation
Tri-Point (TP) representation
Diagram labels: $l_s$, $d_s$, $d_e$, $l_c$, $l_e$
TP Maps (H/2 x W/2 x 7): Displacement map x4, Center map x1, Length map x1, Degree map x1
TP representation can be insufficient in cases where
• a line segment is too long to manage within the receptive field size, or
• the center points of two distinct line segments are too close to each other.
30. Proposed Method
3. SoL Augmentation
Segments of Line segment (SoL) augmentation
SoL Maps (H/2 x W/2 x 7): Displacement map x4, Center map x1, Length map x1, Degree map x1
Diagram labels: a line segment from $l_s$ to $l_e$ with points $l_0$, $l_1$, $l_2$
SoL augmentation directly increases the number and the size of line segments.
31. Proposed Method
3. SoL Augmentation
Segments of Line segment (SoL) augmentation
SoL Maps (H/2 x W/2 x 7): Displacement map x4, Center map x1, Length map x1, Degree map x1
Diagram labels: a line segment from $l_s$ to $l_e$ with points $l_0$, $l_1$, $l_2$
1. We compute $k$ internally dividing points and separate the line segment into subparts with overlapping portions.
2. Each subpart is trained as if it were a typical line segment.
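The two steps above can be sketched as follows. The slide does not state how $k$ is chosen or how much the subparts overlap, so this sketch assumes equally spaced dividing points and subparts that each span two consecutive intervals; with $k = 3$ that gives the subparts $(l_s, l_1)$, $(l_0, l_2)$, $(l_1, l_e)$. The function name `sol_subparts` is hypothetical.

```python
import numpy as np

def sol_subparts(l_s, l_e, k):
    """Split the segment (l_s, l_e) into overlapping subparts.

    Assumptions: the k internally dividing points are equally spaced, and each
    subpart spans two consecutive intervals, i.e. (p_i, p_{i+2}).
    """
    l_s = np.asarray(l_s, dtype=float)
    l_e = np.asarray(l_e, dtype=float)
    # p_0 = l_s, p_1..p_k = dividing points, p_{k+1} = l_e
    t = np.linspace(0.0, 1.0, k + 2)[:, None]
    pts = l_s + t * (l_e - l_s)
    return [(pts[i], pts[i + 2]) for i in range(k)]

# Hypothetical example: a horizontal segment of length 8 with k = 3.
parts = sol_subparts((0.0, 0.0), (8.0, 0.0), k=3)
```

Each returned pair can then be converted to its own center and displacement targets exactly like an ordinary line segment.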
32-34. Proposed Method
4. Learning with Geometric Information
Matching loss
SoL Maps (H/2 x W/2 x 7) and TP Maps (H/2 x W/2 x 7), each: Displacement map x4, Center map x1, Length map x1, Degree map x1
Diagram labels: $l_s$, $l_e$ (GT endpoints), $\hat{l}_s$, $\hat{l}_e$ (predicted endpoints), $C(\hat{l})$, $\mathcal{L}_{match}$
1. Take the endpoints of each prediction and measure the Euclidean distance $d(\cdot)$ to the endpoints of the GT.
2. These distances are used to match predicted line segments with GT line segments when the distances are under a threshold $\gamma$.
3. Compute the matching loss, which aims to minimize the geometric distance of the matched line segments.
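The three steps can be sketched in NumPy. The exact loss formula is not present in this transcript, so the following is an assumed form rather than the authors' equation: predictions are paired with GT segments by index, a pair counts as matched when both endpoint distances are below $\gamma$, and the loss sums the L1 endpoint errors plus the L1 error between a predicted center (taken here as the meaning of $C(\hat{l})$) and the GT midpoint. The default `gamma=5.0` is a placeholder.

```python
import numpy as np

def matching_loss(pred_center, pred, gt, gamma=5.0):
    """Assumed matching loss over index-paired predicted and GT segments.

    pred_center: (N, 2) predicted center points.
    pred, gt:    (N, 2, 2) arrays of (start, end) points.
    """
    pred_center = np.asarray(pred_center, dtype=float)
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Step 1: Euclidean distance between corresponding endpoints, shape (N, 2).
    d = np.linalg.norm(pred - gt, axis=-1)
    # Step 2: a pair is matched when both endpoint distances are under gamma.
    matched = (d < gamma).all(axis=1)
    if not matched.any():
        return 0.0
    # Step 3: L1 endpoint error plus L1 error of predicted center vs GT midpoint.
    endpoint_term = np.abs(pred[matched] - gt[matched]).sum(axis=(1, 2))
    gt_center = gt[matched].mean(axis=1)
    center_term = np.abs(pred_center[matched] - gt_center).sum(axis=1)
    return float((endpoint_term + center_term).mean())

# Hypothetical data: the first pair matches, the second is too far away.
loss = matching_loss(
    pred_center=[[5.0, 0.0], [5.0, 0.0]],
    pred=[[[0.0, 0.0], [10.0, 0.0]], [[0.0, 0.0], [10.0, 0.0]]],
    gt=[[[1.0, 0.0], [10.0, 1.0]], [[20.0, 0.0], [30.0, 0.0]]],
)
```

Only matched pairs contribute, so predictions that are far from their GT segment do not enter this term.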
35-36. Proposed Method
4. Learning with Geometric Information
Junction and Line segmentation
Segmentation Maps (H/2 x W/2 x 2): Junction map x1, Line map x1
• Center points and displacement vectors are highly related to pixel-wise junctions and line segments in the segmentation maps.
• For the junction and line losses, we use a pos / neg separated binary classification loss.
37-38. Proposed Method
4. Learning with Geometric Information
Length and Degree regression
SoL Maps (H/2 x W/2 x 7) and TP Maps (H/2 x W/2 x 7), each: Displacement map x4, Center map x1, Length map x1, Degree map x1
• As displacement vectors can be derived from the length and degree of line segments, length and degree can serve as additional geometric cues.
• For the length and degree losses, we use a smooth L1 loss for regression.
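The relation between displacement vectors and length and degree can be made concrete with a small sketch. It assumes the center is the midpoint of the segment and that the degree is the angle of the segment measured from the x-axis; neither convention is stated on the slides, and both function names are hypothetical.

```python
import math

def displacements_from_length_degree(length, degree):
    """Derive (d_s, d_e) from length and angle, assuming the center is the midpoint."""
    theta = math.radians(degree)
    half = 0.5 * length
    d_e = (half * math.cos(theta), half * math.sin(theta))
    d_s = (-d_e[0], -d_e[1])
    return d_s, d_e

def length_degree_from_displacements(d_s, d_e):
    """Inverse mapping: length and angle (in degrees) of the segment from d_s to d_e."""
    dx, dy = d_e[0] - d_s[0], d_e[1] - d_s[1]
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))

# Hypothetical round trip for a vertical segment of length 10.
d_s, d_e = displacements_from_length_degree(10.0, 90.0)
length, degree = length_degree_from_displacements(d_s, d_e)
```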
39. Proposed Method
5. Final Loss Functions
Loss for each map
Final Loss Function
• SoL Maps (H/2 x W/2 x 7) and TP Maps (H/2 x W/2 x 7), each: Displacement map x4, Center map x1, Length map x1, Degree map x1
• Segmentation Maps (H/2 x W/2 x 2): Junction map x1, Line map x1
43. Experiments
2. Ablation Study and Interpretability
Baseline and Augmentation
Performance
Input augmentations
• Horizontal / vertical flips
• Shearing
• Rotation
• Scaling
44. Experiments
2. Ablation Study and Interpretability
Matching Loss
Saliency map
Performance
Saliency maps generated from TP center map
45. Experiments
2. Ablation Study and Interpretability
Line & Junction segmentation
Saliency map
Performance
Saliency maps generated from each feature map
46. Experiments
2. Ablation Study and Interpretability
Length & Degree regression
Saliency map
Performance
Saliency maps generated from each feature map
47. Experiments
2. Ablation Study and Interpretability
SoL augmentation
Saliency map
Performance
Saliency maps generated from TP center map
50. Experiments
5. Deployment on Mobile Devices
Inference speed and memory usage on mobile devices
We use an iPhone (A14 Bionic chipset) and an Android phone (Snapdragon 865 chipset); FP denotes floating point.
51. Experiments
6. Applications
Real-time box detection on a mobile device
(a) Input image (b) Line detection (c) Box candidates (d) Box detection
Real-time LSD can potentially be extended further to book scanners, wireframe-to-image translation, SLAM, and pose estimation.
52. Experiments
6. Applications
Real-time box detection on a mobile device
Real-time LSD can potentially be extended further to book scanners, wireframe-to-image translation, SLAM, and pose estimation.
54. Conclusion
In this paper
We design an efficient LSD for resource-constrained environments: Mobile LSD (M-LSD)
• Minimize the backbone network and adopt a single-module line prediction process
• Present novel training schemes: Segments of Line segment (SoL) augmentation and geometric learning schemes
55. Conclusion
We achieve
• Compared to TP-LSD-Lite, M-LSD-tiny achieves competitive performance with 2.5% of the model size and a 130.5% increase in inference speed on GPU.
• Our model runs at 56.8 and 48.6 FPS on Android and iPhone, respectively, making it the first real-time method available on mobile devices.
58. Supplementary
- Metrics
Precision and Recall of Line Heat Maps ($F^H$)
1. Given a vectorized representation (line), a confidence heat map is generated.
2. It is compared with the ground-truth heat map: bipartite matching, which treats each pixel independently as a graph node, is run to match the two heat maps.
3. Then, the precision-recall curve is computed according to the matching and the confidence of each pixel.
59. Supplementary
- Metrics
Structural Average Precision (sAP)
• Defined on vectorized wireframes rather than on a heat map.
• Recall is the proportion of the correctly detected line segments (up to a cutoff score) to all the ground truth line segments.
• Precision is the proportion of the correctly detected line segments above that cutoff to all the detected line segments.
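The recall and precision definitions above can be sketched for a single cutoff. The rule that decides whether a detection is "correct" is not given on this slide, so the sketch takes a precomputed boolean flag per detection as input; the function name is hypothetical.

```python
def precision_recall_at_cutoff(scores, correct, num_gt, cutoff):
    """Precision and recall over detections whose score is at least `cutoff`.

    scores:  confidence score per detected line segment.
    correct: True/False per detection (correctness criterion assumed given).
    num_gt:  total number of ground truth line segments.
    """
    kept = [c for s, c in zip(scores, correct) if s >= cutoff]
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / num_gt if num_gt else 0.0
    return precision, recall
```

Sweeping the cutoff over all detection scores yields the precision-recall curve from which an average precision can be computed.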
60. Supplementary
- Metrics
Line Matching Average Precision (LAP)
• Line Matching Score (LMS)
• Score_theta: the differences in angle and position
• Score_l: the matching degree in length
• LMS = Score_theta × Score_l
• Using LMS to determine true positives (a detected line segment is considered a true positive if LMS > 0.5), we can calculate the LAP on the entire test set.
• LAP is defined as the area under the precision-recall curve.
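A minimal sketch of the last two bullets, assuming the LMS of each detection has already been computed (the formulas for Score_theta and Score_l are not given here) and that each detection is matched to at most one ground truth segment. The area under the curve is approximated by a step-wise sum over recall increments, which is one common choice and not necessarily the one used by the authors.

```python
import numpy as np

def average_precision(confidences, lms, num_gt, lms_threshold=0.5):
    """Area under the precision-recall curve, with true positives defined by LMS.

    confidences: confidence per detected line segment (used for ranking).
    lms:         precomputed Line Matching Score per detection.
    num_gt:      total number of ground truth line segments.
    """
    order = np.argsort(-np.asarray(confidences, dtype=float))
    is_tp = np.asarray(lms, dtype=float)[order] > lms_threshold
    tp = np.cumsum(is_tp)
    fp = np.cumsum(~is_tp)
    recall = tp / num_gt
    precision = tp / (tp + fp)
    # Step-wise integration: precision weighted by each increase in recall.
    prev_recall = np.concatenate(([0.0], recall[:-1]))
    return float(np.sum((recall - prev_recall) * precision))

# Hypothetical detections: the second one fails the LMS > 0.5 test.
ap = average_precision([0.9, 0.8, 0.7], [0.8, 0.2, 0.6], num_gt=2)
```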