update
This commit is contained in: parent b3905a76a6, commit 6c9fec214c
main.tex
@ -47,7 +47,7 @@
\maketitle

\begin{abstract}
Lane detection is a critical and challenging task in autonomous driving, particularly in real-world scenarios where traffic lanes are often slender, lengthy, and partially obscured by other vehicles, complicating detection efforts. Existing anchor-based methods typically rely on prior straight-line anchors to extract features and refine lane location and shape. Although they achieve high performance, manually setting prior anchors is cumbersome, and ensuring adequate coverage across diverse datasets often requires a large number of dense anchors. Additionally, Non-Maximum Suppression (NMS) is used to suppress redundant predictions, which complicates real-world deployment and may fail in dense scenarios. In this study, we introduce PolarRCNN, an NMS-free anchor-based method for lane detection. By incorporating both local and global polar coordinate systems, PolarRCNN enables flexible anchor proposals and significantly reduces the number of anchors required without compromising performance. Additionally, we introduce a heuristic GNN-based NMS-free head that supports an end-to-end paradigm, making the model more deployment-friendly and enhancing performance in dense scenarios. Our method achieves competitive results on five popular lane detection benchmarks—TuSimple, CULane, LLAMAS, CurveLanes, and DL-Rail—while maintaining a lightweight design and straightforward structure. Our source code is available at \href{https://github.com/ShqWW/PolarRCNN}{\textit{https://github.com/ShqWW/PolarRCNN}}.
\end{abstract}

\begin{IEEEkeywords}
Lane detection, NMS-free, Graph neural network, Polar coordinate system.
@ -132,7 +132,7 @@ Regarding the first issue, \cite{} introduced learned anchors, where the anchor

Regarding the second issue, nearly all anchor-based methods (including those mentioned above) require direct or indirect Non-Maximum Suppression (NMS) post-processing to eliminate redundant predictions. Although redundant predictions must be eliminated, NMS remains a suboptimal solution. On the one hand, NMS is not deployment-friendly because it involves defining and calculating distances (e.g., Intersection over Union) between lane pairs, which is more challenging than for bounding boxes in general object detection due to the complexity of lane geometry. On the other hand, NMS fails in some dense scenarios where lane ground truths are closer together than in sparse scenarios. A larger distance threshold may result in false negatives, as some true positive predictions might be eliminated by mistake (as shown in Fig. \ref{nms setting} (a) and (b)). Conversely, a smaller distance threshold may not eliminate redundant predictions effectively and can leave false positives (as shown in Fig. \ref{nms setting} (c) and (d)). Achieving an optimal trade-off in all scenarios by manually setting the distance threshold is challenging. The root cause of this problem is that the distance definition in NMS considers only geometric parameters while ignoring the semantic context in the image. Thus, when two predictions are “close” to each other, it is nearly impossible to determine whether one of them is redundant.

To address the two issues outlined above, we propose PolarRCNN, a novel anchor-based method for lane detection. For the first issue, we introduce local and global heads based on the polar coordinate system to create anchors with more accurate locations and to reduce the number of proposed anchors in sparse scenarios, as illustrated in Fig. \ref{anchor setting} (c). Compared with the previous state-of-the-art work \cite{}, which uses 192 anchors, PolarRCNN employs only 20 anchors to cover potential lane ground truths. For the second issue, we revise FastNMS into Graph-based FastNMS and introduce a new heuristic graph neural network block (Polar GNN block) integrated into the non-maximum suppression (NMS) head. The Polar GNN block offers a more interpretable structure than traditional NMS, achieving nearly equivalent performance in sparse scenarios and superior performance in dense scenarios. We conducted experiments on five major benchmarks: TuSimple \cite{}, CULane \cite{}, LLAMAS \cite{}, CurveLanes \cite{}, and DL-Rail \cite{}. Our proposed method demonstrates competitive performance compared to state-of-the-art methods.
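As a rough illustration of how a polar anchor parameterization can work (our own sketch under assumed conventions; the paper's exact local/global polar definitions may differ), a straight lane anchor described by a radius $r$ and the angle $\theta$ of its normal can be converted into per-row x-coordinates:

```python
import math

def polar_anchor_to_points(r, theta, ys):
    """Convert a straight-line anchor given in polar form (radius r, angle
    theta of the line's normal) into x-coordinates sampled at the given
    y-rows, using the line equation x*cos(theta) + y*sin(theta) = r.
    The anchor is assumed non-vertical here, i.e. cos(theta) != 0."""
    c, s = math.cos(theta), math.sin(theta)
    return [(r - y * s) / c for y in ys]

ys = [10.0, 20.0, 30.0]
xs = polar_anchor_to_points(100.0, math.pi / 6, ys)
# Every sampled point lies back on the same polar line.
for x, y in zip(xs, ys):
    assert abs(x * math.cos(math.pi / 6) + y * math.sin(math.pi / 6) - 100.0) < 1e-9
```

Two scalars per anchor make proposals easy to regress and refine, which is one reason a small, flexible anchor set can replace a large hand-set one.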

Our main contributions are summarized as follows:
@ -407,7 +407,7 @@ For simplicity, FastNMS only satisfies conditions (1) and (2), which may lead

It is straightforward to show that when all elements of $\boldsymbol{M}$ are set to 1 (regardless of geometric priors), Graph-based FastNMS is equivalent to FastNMS. Building upon our newly proposed Graph-based FastNMS, we design the structure of the one-to-one classification head to mirror the principles of Graph-based FastNMS.
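The equivalence above can be sketched in a few lines (a minimal, hypothetical implementation of FastNMS-style suppression with a geometric-prior mask $\boldsymbol{M}$; the pairwise distance values and threshold are placeholders, not the paper's actual lane IoU):

```python
def graph_fast_nms(scores, dist, M, d_tau):
    """Graph-based FastNMS sketch.

    scores: confidences, assumed already sorted in descending order.
    dist[i][j]: pairwise overlap-like measure (higher = more redundant).
    M[i][j]: binary geometric-prior mask; with M all ones this reduces
    to plain FastNMS. Prediction j is suppressed if ANY higher-scoring
    prediction i overlaps it more than d_tau and the mask allows it
    (suppressed predictions may still suppress others, as in FastNMS).
    """
    keep = []
    for j in range(len(scores)):
        suppressed = any(dist[i][j] * M[i][j] > d_tau for i in range(j))
        if not suppressed:
            keep.append(j)
    return keep

scores = [0.9, 0.8, 0.7]
dist = [[1.0, 0.6, 0.1],
        [0.6, 1.0, 0.2],
        [0.1, 0.2, 1.0]]
ones = [[1] * 3 for _ in range(3)]
# With M = all ones this is plain FastNMS: prediction 1 overlaps 0 too much.
assert graph_fast_nms(scores, dist, ones, 0.5) == [0, 2]
# A zeroed mask entry (geometric prior: "not neighbors") keeps prediction 1.
M = [[1, 0, 1], [0, 1, 1], [1, 1, 1]]
assert graph_fast_nms(scores, dist, M, 0.5) == [0, 1, 2]
```

The mask restricts which pairs may suppress each other, which is exactly the degree of freedom the one-to-one head exploits.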

According to the analysis of the shortcomings of traditional NMS post-processing shown in Fig. \ref{nms setting}, the fundamental issue arises from the definition of the distance between predictions. Traditional NMS relies on geometric properties to define distances between predictions, which often neglects the contextual semantics. For example, in some scenarios two predicted lanes with a small geometric distance should not be suppressed, such as in the case of double lines or fork lines. Although setting a threshold $d_{\tau}$ can mitigate this problem, it is challenging to strike a balance between precision and recall.

To address this, we replace the explicit definition of the distance function with an implicit graph neural network. Additionally, the anchor coordinates are replaced with the anchor features ${F}_{i}^{roi}$. According to information bottleneck theory \cite{}, ${F}_{i}^{roi}$, which contains the location and classification information, is sufficient for modelling the explicit geometric distance with a neural network. Beyond the geometric information, the features ${F}_{i}^{roi}$ contain the contextual information of an anchor, which provides additional cues for establishing implicit distances between two anchors. The implicit distance is expressed as follows:

@ -513,14 +513,6 @@ only one sample with confidence larger than $C_{o2m}$ is chosen as the candidate
\end{aligned}
\end{equation}
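The implicit-distance idea can be sketched as follows (a toy, hypothetical stand-in: a one-layer scoring network over a pair of ROI feature vectors; the real Polar GNN block's architecture and learned weights are not reproduced here):

```python
import math
import random

def implicit_distance(f_i, f_j, w, b):
    """Hypothetical implicit 'distance' between two anchors, computed from
    their ROI features rather than from geometry: a linear layer over the
    concatenated pair [f_i; f_j], squashed to (0, 1) by a sigmoid.
    (w, b would be learned in the real model; fixed here for illustration.)"""
    edge = f_i + f_j  # list concatenation models [f_i; f_j]
    z = sum(wk * xk for wk, xk in zip(w, edge)) + b
    return 1.0 / (1.0 + math.exp(-z))  # near 1 ~ "redundant", near 0 ~ "distinct"

random.seed(0)
dim = 4
f1 = [random.uniform(-1, 1) for _ in range(dim)]
f2 = [random.uniform(-1, 1) for _ in range(dim)]
w = [random.uniform(-1, 1) for _ in range(2 * dim)]
d = implicit_distance(f1, f2, w, 0.0)
assert 0.0 < d < 1.0  # a valid suppression probability
```

Because the inputs are features rather than coordinates, the score can separate genuinely distinct lanes (e.g., double lines) that happen to be geometrically close.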
We directly use the GLaneIoU loss, $\mathcal{L}_{GLaneIoU}$, to regress the offsets of the x-coordinates (with $g=1$), and the Smooth-L1 loss, denoted as $\mathcal{L}_{end}$, for the regression of the end points (namely the y-coordinates of the start point and the end point). To encourage the model to learn global features, we propose the auxiliary loss (auxloss) illustrated in Fig. \ref{auxloss}:
\begin{equation}
\begin{aligned}
@ -534,70 +526,20 @@ The anchors and ground truth are divided into several segments. Each anchor segment

\subsection{Loss function}

The overall loss function of PolarRCNN is given as follows:
\begin{equation}
\begin{aligned}
\mathcal{L}_{overall} &=\mathcal{L} _{lph}^{cls}+w_{lph}^{reg}\mathcal{L} _{lph}^{reg}\\&+w_{o2m}^{cls}\mathcal{L} _{o2m}^{cls}+w_{o2o}^{cls}\mathcal{L} _{o2o}^{cls}+w_{rank}\mathcal{L} _{rank}\\&+w_{IoU}\mathcal{L} _{IoU}+w_{end}\mathcal{L} _{end}+w_{aux}\mathcal{L} _{aux}
\end{aligned}
\end{equation}
The first line of the loss function represents the loss for the local polar head, which includes both classification and regression components. The second line pertains to the losses of the two classification heads (O2M and O2O), while the third line represents the loss of the regression head within the triplet head. Each term is weighted by a factor to balance the contributions of each component to the gradient. The entire training process is end-to-end.
\begin{table*}[htbp]
\centering
\caption{Dataset \& preprocessing}
\begin{adjustbox}{width=\linewidth}
\begin{tabular}{l|l|ccccc}
\toprule
\multicolumn{2}{c|}{\textbf{Dataset}} & CULane & TuSimple & LLAMAS & DL-Rail & CurveLanes \\
\midrule
\multirow{7}*{Dataset Description}
& Train &88,880/$55,698^{*}$&3,268 &58,269&5,435&100,000\\
@ -634,6 +576,48 @@ The first line in the loss function represents the loss for the local polar head
\end{table*}
\begin{figure}[t]
\centering
\includegraphics[width=\linewidth]{thsis_figure/auxloss.png} % replace with your image filename
\caption{Auxiliary loss for segment parameter regression.}
\label{auxloss}
\end{figure}

\section{Experiment}

\subsection{Dataset and Evaluation Metric}
We conducted experiments on four widely used lane detection benchmarks and one rail detection dataset: CULane, TuSimple, LLAMAS, CurveLanes, and DL-Rail. Among these datasets, CULane and CurveLanes are particularly challenging: CULane covers various scenarios but has sparse lane distributions, whereas CurveLanes includes a large number of curved and dense lane types, such as forked and double lanes. The DL-Rail dataset, focused on rail detection across different scenarios, was chosen to evaluate our model's performance beyond traditional lane detection. The details of the five datasets are shown in Tab. \ref{dataset_info}.

We use the F1-score to evaluate our model on the CULane, LLAMAS, DL-Rail, and CurveLanes datasets, maintaining consistency with previous work. The F1-score is defined as follows:
\begin{equation}
\begin{aligned}
F1&=\frac{2\times Precision\times Recall}{Precision+Recall}\\
Precision&=\frac{TP}{TP+FP}\\
Recall&=\frac{TP}{TP+FN}
\end{aligned}
\end{equation}
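As a quick sanity check, the three formulas above can be computed directly from match counts (plain Python; the TP/FP/FN values are illustrative only):

```python
def f1_score(tp, fp, fn):
    """Compute precision, recall, and F1 from raw match counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative counts: 80 true positives, 20 false positives, 10 false negatives.
p, r, f1 = f1_score(80, 20, 10)
assert abs(p - 0.8) < 1e-9
assert abs(r - 80 / 90) < 1e-9
assert abs(f1 - 2 * p * r / (p + r)) < 1e-9
```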
In our experiments, we use different IoU thresholds to calculate the F1-score for different datasets: F1@50 and F1@75 for CULane \cite{}, F1@50 for LLAMAS \cite{} and CurveLanes \cite{}, and F1@50, F1@75, and mF1 for DL-Rail \cite{}. The mF1 is defined as:
\begin{equation}
mF1=\left( F1@50+F1@55+\cdots+F1@95 \right) /10
\end{equation}
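Under this definition, mF1 is simply the arithmetic mean of the ten per-threshold F1 scores (the values below are illustrative, not results):

```python
def mean_f1(f1_by_threshold):
    """mF1: average of F1 at IoU thresholds 0.50, 0.55, ..., 0.95."""
    assert len(f1_by_threshold) == 10
    return sum(f1_by_threshold) / 10

# Hypothetical F1 values from F1@50 down to F1@95 (stricter IoU -> lower F1).
f1s = [0.80, 0.78, 0.75, 0.71, 0.66, 0.60, 0.52, 0.42, 0.30, 0.15]
assert abs(mean_f1(f1s) - sum(f1s) / 10) < 1e-12
```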

For TuSimple, the evaluation is formulated as follows:
\begin{equation}
Accuracy=\frac{\sum{C_{clip}}}{\sum{S_{clip}}}
\end{equation}
where $C_{clip}$ and $S_{clip}$ represent the number of correct points (predicted points within 20 pixels of the ground truth) and the number of ground truth points, respectively. If the accuracy exceeds 85\%, the prediction is considered correct. TuSimple also reports the False Positive rate ($FP=1-Precision$) and the False Negative rate ($FN=1-Recall$).
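A simplified single-lane sketch of this criterion (the benchmark itself aggregates $C_{clip}$ and $S_{clip}$ over all clips; the coordinates below are made up):

```python
def tusimple_accuracy(pred_xs, gt_xs, pixel_thresh=20):
    """TuSimple-style point accuracy sketch: a predicted point is 'correct'
    when it lies within pixel_thresh pixels of its ground-truth point.
    pred_xs / gt_xs: per-row x-coordinates of one predicted / gt lane."""
    correct = sum(1 for p, g in zip(pred_xs, gt_xs) if abs(p - g) <= pixel_thresh)
    return correct / len(gt_xs)

gt = [100, 110, 120, 130]
pred = [105, 108, 150, 131]  # third point is off by 30 px
acc = tusimple_accuracy(pred, gt)
assert abs(acc - 0.75) < 1e-9
assert acc < 0.85  # below the 85% bar, so this lane would count as a miss
```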

\subsection{Implementation Details}
All input images are cropped and resized to $800\times320$. Similar to \cite{}, we apply random affine transformations and random horizontal flips. For the optimization process, we use the AdamW \cite{} optimizer with a learning rate warm-up and a cosine decay strategy \cite{}. The initial learning rate is set to 0.006. The numbers of sampled points and regression points for each lane anchor are set to 36 and 72, respectively. Other parameters, such as batch size and loss weights for each dataset, are detailed in Table \ref{dataset_info}. Since some test/validation sets of the five datasets are not accessible, the test/validation sets used are also listed in Table \ref{dataset_info}. All experiments are conducted on a single NVIDIA A100-40G GPU. To keep our model simple, we use only CNN-based backbones, namely ResNet \cite{} and DLA34 \cite{}.
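A warm-up plus cosine-decay schedule of the kind described above can be sketched as follows (only the 0.006 initial rate comes from the text; the warm-up length, total steps, and zero decay floor are assumptions):

```python
import math

def lr_at_step(step, total_steps, warmup_steps, base_lr=0.006):
    """Warm-up + cosine decay sketch: linear ramp to base_lr during
    warm-up, then a half-cosine decay towards zero."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))

total, warm = 1000, 100
assert lr_at_step(warm, total, warm) == 0.006            # peak right after warm-up
assert lr_at_step(total, total, warm) < 1e-12            # decayed to ~0 at the end
assert lr_at_step(0, total, warm) < lr_at_step(50, total, warm)  # ramping up
```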


\begin{table*}[htbp]
\centering
\caption{CULane results compared with other methods}
@ -687,7 +671,7 @@ The first line in the loss function represents the loss for the local polar head
PolarRCNN &DLA34 &\textbf{81.49}&\textbf{64.97}&\textbf{94.44}&80.36&\textbf{76.79}&83.68&\textbf{56.52}&90.85&\textbf{80.09}&1133&76.32\\
\bottomrule
\end{tabular}
\end{adjustbox}
\label{culane result}
\end{table*}

@ -695,9 +679,6 @@ The first line in the loss function represents the loss for the local polar head

\begin{table}[h]
\centering
\caption{TuSimple results compared with other methods}
@ -719,10 +700,11 @@ The first line in the loss function represents the loss for the local polar head
\bottomrule
\end{tabular}
\end{adjustbox}
\label{tusimple result}
\end{table}

\begin{table}[h]
\centering
\caption{LLAMAS test results compared with other methods}
\begin{adjustbox}{width=\linewidth}
@ -746,7 +728,8 @@ The first line in the loss function represents the loss for the local polar head
\bottomrule
\end{tabular}
\end{adjustbox}
\label{llamas result}
\end{table}

\begin{table}[h]
\centering
@ -768,12 +751,13 @@ The first line in the loss function represents the loss for the local polar head
\bottomrule
\end{tabular}
\end{adjustbox}
\label{dlrail result}
\end{table}

\begin{table}[h]
\centering
\caption{CurveLanes validation results compared with other methods}
\begin{adjustbox}{width=\linewidth}
\begin{tabular}{lrcccc}
\toprule
@ -795,7 +779,67 @@ The first line in the loss function represents the loss for the local polar head
\hline
\end{tabular}
\end{adjustbox}
\label{curvelanes result}
\end{table}

\subsection{Comparison with state-of-the-art results}
The comparison results of our proposed model with other methods are shown in Tables \ref{culane result}, \ref{tusimple result}, \ref{llamas result}, \ref{dlrail result}, and \ref{curvelanes result}. We present results for two versions of our model: the NMS-based version, denoted as $PolarRCNN_{o2m}$, and the NMS-free version, denoted as $PolarRCNN$. The NMS-based version utilizes predictions obtained from the O2M head followed by NMS post-processing, while the NMS-free version derives predictions directly from the O2O classification head without NMS.

To ensure a fair comparison, we also include results for CLRerNet \cite{} on the CULane and CurveLanes datasets, as we use a similar training strategy and data split. As the comparison results illustrate, our model demonstrates competitive performance across all five datasets. Specifically, on the CULane, TuSimple, LLAMAS, and DL-Rail datasets (sparse scenarios), our model outperforms other anchor-based methods. Additionally, the performance of the NMS-free version is nearly identical to that of the NMS-based version, highlighting the effectiveness of the O2O head in eliminating redundant predictions. On the CurveLanes dataset, the NMS-free version achieves superior F1-measure and recall compared to both NMS-based and segment\&grid-based methods.

We also compare the number of anchors and processing speed with other methods. Figure \ref{anchor_num_method} illustrates the number of anchors used by several anchor-based methods on CULane. Our proposed model utilizes the fewest anchors (20) while achieving the highest F1-score on CULane, remaining competitive with state-of-the-art methods such as CLRerNet, which uses 192 anchors and a cross-layer refinement strategy. Conversely, Sparse Laneformer, which also uses 20 anchors, does not achieve optimal performance. It is important to note that our model features a simpler structure without additional refinement, indicating that the design of flexible anchors is crucial for performance in sparse scenarios. Furthermore, due to its simple structure and fewer anchors, our model exhibits lower latency than most methods, as shown in Figure \ref{speed_method}. The combination of fast processing speed and a straightforward architecture makes our model highly deployable.

\subsection{Ablation Study and Visualization}

\begin{figure}[t]
\centering
\includegraphics[width=\linewidth]{thsis_figure/anchor_num_method.png}
\caption{Anchor number and F1-score of different methods on CULane.}
\label{anchor_num_method}
\end{figure}

\begin{figure}[t]
\centering
\includegraphics[width=\linewidth]{thsis_figure/speed_method.png}
\caption{Inference speed and F1-score of different methods on CULane.}
\label{speed_method}
\end{figure}


\begin{figure*}[htbp]
\centering
\def\subwidth{0.325\textwidth}
\def\imgwidth{\linewidth}

\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth]{thsis_figure/anchor_num/anchor_num_testing_p.png}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth]{thsis_figure/anchor_num/anchor_num_testing_r.png}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth]{thsis_figure/anchor_num/anchor_num_testing.png}
\end{subfigure}
\caption{Precision, recall, and F1-score under different anchor numbers on CULane.}
\label{fig:anchor_num_testing}
\end{figure*}

@ -829,7 +873,7 @@ The first line in the loss function represents the loss for the local polar head

\begin{table}[h]
\centering
\caption{NMS vs NMS-free on CurveLanes}
\begin{adjustbox}{width=\linewidth}
\begin{tabular}{l|l|ccc}
\toprule
main2.tex
@ -43,7 +43,7 @@
\maketitle

\begin{abstract}
Lane detection is a critical and challenging task in autonomous driving, particularly in real-world scenarios where traffic lanes are often slender, lengthy, and partially obscured by other vehicles, complicating detection efforts. Existing anchor-based methods typically rely on prior straight-line anchors to extract features and refine lane location and shape. Although they achieve high performance, manually setting prior anchors is cumbersome, and ensuring sufficient anchor coverage across diverse datasets requires a large number of dense anchors. Furthermore, NMS post-processing must be applied to suppress redundant predictions. In this study, we introduce PolarRCNN, a two-stage NMS-free anchor-based method for lane detection. By introducing a local polar head, anchor proposals become dynamic, and the number of anchors decreases greatly without sacrificing performance. Moreover, a GNN-based NMS-free head is proposed to give the model an end-to-end form, which is deployment-friendly. Our model yields competitive results on five popular lane detection benchmarks (TuSimple, CULane, LLAMAS, CurveLanes, and DL-Rail) while maintaining a lightweight size and a simple structure.
\end{abstract}
\begin{IEEEkeywords}
Lane detection.
@ -99,7 +99,7 @@ In order to address the issue we mentioned above better than the previous work,

We conducted experiments on five mainstream benchmarks: TuSimple \cite{}, CULane \cite{}, LLAMAS \cite{}, CurveLanes \cite{}, and DL-Rail \cite{}. Our proposed method achieves competitive performance with state-of-the-art methods.

Our main contributions are summarized as follows:
@ -321,7 +321,7 @@ In the testing stage, anchors with the top-$k_{l}$ confidence are chosen as
\begin{adjustbox}{width=\linewidth}
\begin{tabular}{l|l|ccccc}
\toprule
\multicolumn{2}{c|}{\textbf{Dataset}} & CULane & TuSimple & LLAMAS & DL-Rail & CurveLanes \\
\midrule
\multirow{7}*{Data info}
& Train &3,268 &88,880&58,269&5,435&100,000\\
@ -493,7 +493,7 @@ In the testing stage, anchors with the top-$k_{l}$ confidence are chosen as

\begin{table}[h]
\centering
\caption{CurveLanes validation results compared with other methods}
\begin{adjustbox}{width=\linewidth}
\begin{tabular}{lrcccc}
\toprule
@ -549,7 +549,7 @@ In the testing stage, anchors with the top-$k_{l}$ confidence are chosen as

\begin{table}[h]
\centering
\caption{NMS vs NMS-free on CurveLanes}
\begin{adjustbox}{width=\linewidth}
\begin{tabular}{l|l|ccc}
\toprule