update
commit 23f0a7850b (parent 24982236e5)
12 main.tex
@@ -127,7 +127,7 @@ Regarding the second issue, nearly all anchor-based methods \cite{laneatt}\cite{
To address the above two issues, we propose Polar R-CNN, a novel anchor-based method for lane detection. For the first issue, we introduce \textit{Local Polar Module} based on the polar coordinate system to create anchors with more accurate locations, thereby reducing the number of proposed anchors in sparse scenarios, as illustrated in Fig. \ref{anchor setting}(c). In contrast to \textit{State-Of-The-Art} (SOTA) methods \cite{clrnet}\cite{clrernet}, which utilize 192 anchors, Polar R-CNN employs only 20 anchors to effectively cover potential lane ground truths. For the second issue, we have incorporated a triplet head with a new heuristic \textit{Graph Neural Network} (GNN) \cite{gnn} block. The GNN block offers an interpretable structure, achieving nearly equivalent performance in sparse scenarios and superior performance in dense scenarios. We conducted experiments on five major benchmarks: \textit{TuSimple} \cite{tusimple}, \textit{CULane} \cite{scnn}, \textit{LLAMAS} \cite{llamas}, \textit{CurveLanes} \cite{curvelanes}, and \textit{DL-Rail} \cite{dalnet}. Our proposed method demonstrates competitive performance compared to SOTA approaches. Our main contributions are summarized as follows:
\begin{itemize}
\item We design a strategy to simplify the anchor parameters by using local and global polar coordinate systems and apply it to the two-stage lane detection framework. Compared to other anchor-based methods, this strategy significantly reduces the number of proposed anchors while achieving better performance.
-\item We propose a novel triplet detection head with GNN block to implement a NMS-free paradigm. The block is inspired by Fast NMS, providing enhanced interpretability. Our model supports end-to-end training and testing while still allowing for traditional NMS post-processing as an option for a NMS version of our model.
+\item We propose a novel triplet detection head with a GNN block to implement a NMS-free paradigm. The block is inspired by Fast NMS, providing enhanced interpretability. Our model supports end-to-end training and testing while still allowing for traditional NMS post-processing as an option for a NMS version of our model.
\item By integrating the polar coordinate systems and the NMS-free paradigm, we present the Polar R-CNN model for fast and efficient lane detection. We also conduct extensive experiments on five benchmark datasets to demonstrate that our model achieves high performance with fewer anchors and an NMS-free paradigm. %Additionally, our model features a straightforward structure, lacking cascade refinement or attention strategies, making it simpler to deploy.
\end{itemize}
%
@@ -198,7 +198,7 @@ The local polar system is designed to predict lane anchors adaptable to both spa

%This one-to-many approach is essential for ensuring comprehensive anchor proposals, especially since some local features around certain poles may be lost due to damage or occlusion of the lane curve.
\par
-In the local polar coordinate system, the parameters of each lane anchor are determined based on the location of its corresponding local pole. However, in practical terms, once a lane anchor is generated, its definitive position becomes immutable and independent of its original local pole. To simplify the representation of lane anchors in the second stage of Polar-RCNN, a global polar system has been designed, featuring a singular and unified pole that serves as a reference point for the entire image. The location of this global pole is manually set, and in this case, it is positioned near the static \textit{vanishing point} observed across the entire lane image dataset \cite{vanishing}. This approach ensures a consistent and unified polar coordinate for expressing lane anchors within the global context of the image, facilitating accurate regression to the ground truth lane instances.
+In the local polar coordinate system, the parameters of each lane anchor are determined based on the location of its corresponding local pole. However, in practical terms, once a lane anchor is generated, its definitive position becomes immutable and independent of its original local pole. To simplify the representation of lane anchors in the second stage of Polar R-CNN, a global polar system has been designed, featuring a singular and unified pole that serves as a reference point for the entire image. The location of this global pole is manually set, and in this case, it is positioned near the static \textit{vanishing point} observed across the entire lane image dataset \cite{vanishing}. This approach ensures a consistent and unified polar coordinate for expressing lane anchors within the global context of the image, facilitating accurate regression to the ground truth lane instances.
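One way to picture this re-expression, offered as a sketch under assumptions rather than as the exact formulation used in this work: treat an anchor as a straight line in normal form with angle $\theta$ and radius $r^{l}$ about a local pole $\mathbf{c}^{l}$, i.e., the set of points $\mathbf{p}$ with $\mathbf{n}^{\top}(\mathbf{p}-\mathbf{c}^{l})=r^{l}$, where $\mathbf{n}=(\cos\theta,\sin\theta)^{\top}$. The symbols $\mathbf{c}^{l}$, $\mathbf{c}^{g}$, $r^{l}$, and $\mathbf{n}$ are notation introduced only for this illustration. Under that assumption, switching to a global pole $\mathbf{c}^{g}$ leaves the line and its angle unchanged and only shifts the radius:
\begin{align*}
\mathbf{n}^{\top}(\mathbf{p}-\mathbf{c}^{g}) &= \mathbf{n}^{\top}(\mathbf{p}-\mathbf{c}^{l})+\mathbf{n}^{\top}(\mathbf{c}^{l}-\mathbf{c}^{g}) \\
\Rightarrow\quad r^{g} &= r^{l}+\mathbf{n}^{\top}(\mathbf{c}^{l}-\mathbf{c}^{g}).
\end{align*}
For example, with the hypothetical values $\theta=0$, $r^{l}=5$, $\mathbf{c}^{l}=(100,50)^{\top}$, and $\mathbf{c}^{g}=(80,50)^{\top}$, we get $r^{g}=5+20=25$: the same vertical line $x=105$, now measured from the global pole.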
\begin{figure}[t]
\centering
@@ -346,9 +346,9 @@ F1&=\frac{2\times Pre\times Rec}{Pre\,\,+\,\,Rec},
\end{align}
where $TP$, $FP$, and $FN$ represent the true positives, false positives, and false negatives over the entire dataset, respectively. In our experiments, we use different IoU thresholds to calculate the F1-score for different datasets: $F1@50$ and $F1@75$ for CULane \cite{clrnet}, $F1@50$ for LLAMAS \cite{clrnet} and CurveLanes \cite{CondLaneNet}, and $F1@50$, $F1@75$, and $mF1$ for DL-Rail \cite{dalnet}. The $mF1$ is defined as:
\begin{align}
-mF1=\left( F1@50+F1@55+\ldots+F1@95 \right) /10,
+mF1=\left( F1@50+F1@55+\cdots+F1@95 \right) /10,
\end{align}
-where $F1@50, F1@55, \ldots, F1@95$ are F1 metrics when IoU thresholds are $0.5, 0.55, \ldots, 0.95$, respectively.
+where $F1@50, F1@55, \cdots, F1@95$ are F1 metrics when IoU thresholds are $0.5, 0.55, \cdots, 0.95$, respectively.
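As a purely illustrative check of this averaging, suppose the ten thresholds gave the hypothetical scores below (these numbers are invented for the example and are not results from any dataset):
\begin{align*}
mF1 &= \left( 0.95+0.94+0.92+0.90+0.87+0.83+0.77+0.68+0.52+0.22 \right) /10 \\
&= 7.60/10 = 0.760.
\end{align*}
Because stricter IoU thresholds typically lower the F1 score, $mF1$ is usually well below $F1@50$ (here $0.760$ versus $0.95$).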
For TuSimple, the evaluation is formulated as follows:
\begin{align}
Accuracy=\frac{\sum{C_{clip}}}{\sum{S_{clip}}},
@@ -827,7 +827,7 @@ We draw inspiration from Fast NMS \cite{yolact} for the design of the O2O classi
\caption{Fast NMS with Geometric Prior.}
\begin{algorithmic}[1] % the 1 makes every line show its number
\REQUIRE ~~\\ % input parameters of the algorithm (Input)
-The index of all anchors, $1, 2, \ldots, i, \ldots, K$;\\
+The index of all anchors, $1, 2, \cdots, i, \cdots, K$;\\
The positive corresponding anchors, $\left\{ \theta _i,r_{i}^{g} \right\} |_{i=1}^{K}$;\\
The confidence emanating from the O2M classification subhead, $s_i^g$;\\
The regressions emanating from the O2M regression subhead, denoted as $\left\{ Lane_i \right\} |_{i=1}^{K}$\\
@@ -1232,7 +1232,7 @@ Given the ground truth label generated by the label assignment strategy for each
\section{Supplementary Implementation Details and Visualization Results}
Some important implementation details for each dataset are shown in Table \ref{dataset_info}. It includes the dataset information used for our experiments and visualizations, the data-processing parameters, and the hyperparameters of Polar R-CNN.
-Fig. \ref{vis_sparse} illustrates the visualization outcomes in sparse scenarios spanning four datasets. The top row depicts the ground truth, while the middle row shows the proposed lane anchors and the bottom row exhibits the predictions generated by Polar-RCNN with NMS-free paradigm. In the top and bottom row, different colors aim to distinguish different lane instances, which do not correspond across the images. From images of the middle row, we can see that LPH of Polar R-CNN effectively proposes anchors that are clustered around the ground truth, providing a robust prior for GPH to achieve the final lane predictions. Moreover, the number of anchors has significantly decreased compared to previous works, making our method faster than other anchor-based methods in theory.
+Fig. \ref{vis_sparse} illustrates the visualization outcomes in sparse scenarios spanning four datasets. The top row depicts the ground truth, while the middle row shows the proposed lane anchors and the bottom row exhibits the predictions generated by Polar R-CNN with NMS-free paradigm. In the top and bottom row, different colors aim to distinguish different lane instances, which do not correspond across the images. From images of the middle row, we can see that LPH of Polar R-CNN effectively proposes anchors that are clustered around the ground truth, providing a robust prior for GPH to achieve the final lane predictions. Moreover, the number of anchors has significantly decreased compared to previous works, making our method faster than other anchor-based methods in theory.
Fig. \ref{vis_dense} shows the visualization outcomes in dense scenarios. The first column displays the ground truth, while the second and third columns show the detection results of the NMS paradigm with a large threshold (\textit{i.e.}, the default NMS@50, 50 pixels) and a small threshold (\textit{i.e.}, the optimal NMS@15, 15 pixels), respectively. The final column shows the detection results of the NMS-free paradigm. We observe that NMS@50 mistakenly removes some predictions, leading to false negatives, while NMS@15 fails to eliminate some redundant predictions, leading to false positives. This underscores the difficulty of trading off between large and small NMS thresholds. The visualization demonstrates that geometric distance becomes less effective in dense scenarios. Only the proposed data-driven O2O classification subhead can address this issue, by capturing semantic distance beyond geometric distance. As shown in the last column of Fig. \ref{vis_dense}, the O2O classification subhead successfully eliminates redundant predictions while preserving dense predictions, despite their minimal geometric distances.
\label{vis_appendix}