\documentclass[lettersize,journal]{IEEEtran}
\usepackage{amsmath,amsfonts}
\usepackage{algorithmic}
\usepackage{algorithm}
\usepackage{array}
% \usepackage[caption=false,font=normalsize,labelfont=sf,textfont=sf]{subfig}
\usepackage{textcomp}
\usepackage{stfloats}
\usepackage{url}
\usepackage{verbatim}
\usepackage{graphicx}
\usepackage{cite}
\usepackage{subcaption}
\usepackage{multirow}
\usepackage[T1]{fontenc}
\usepackage{adjustbox}
\usepackage{amssymb}
\usepackage{booktabs}
\usepackage{tikz}
\usepackage{tabularx}
\usepackage{mathrsfs}
\usepackage{etoolbox}
% Command to disable citation rendering
\newcommand{\disablecitations}{%
\renewcommand{\cite}[1]{}%
}
% Command to restore citation rendering
\newcommand{\enablecitations}{%
\let\cite\oldcite%
}
% Save the original \cite command
\let\oldcite\cite
\usepackage[colorlinks,bookmarksopen,bookmarksnumbered, linkcolor=red]{hyperref}
\definecolor{darkgreen}{RGB}{17,159,27}
% \aboverulesep=0pt \belowrulesep=0pt
\hyphenation{op-tical net-works semi-conduc-tor IEEE-Xpolare}
% updated with editorial comments 8/9/2021
% \renewcommand{\includegraphics}[2][]{} % redefine \includegraphics as a no-op
\begin{document}
\title{Appendix and Supplementary Materials}
\markboth{Appendix and Supplementary Materials}%
{Appendix and Supplementary Materials}
\maketitle
\begin{appendices}
\setcounter{table}{0} % reset counters so appendix tables are numbered from A1
\setcounter{figure}{0}
\setcounter{section}{0}
\setcounter{equation}{0}
\renewcommand{\thetable}{A\arabic{table}}
\renewcommand{\thefigure}{A\arabic{figure}}
\renewcommand{\thesection}{A\arabic{section}}
\renewcommand{\theequation}{A\arabic{equation}}
\addcontentsline{toc}{section}{Appendix} % add the appendix heading to the table of contents if needed
\section{Details about the Coordinate Systems}
In this section, we detail the coordinate systems employed in our model and the transformations between them.
For convenience, we adopt a Cartesian coordinate system instead of the image coordinate system, wherein the y-axis is oriented from bottom to top and the x-axis from left to right. The coordinates of the local poles $\left\{\boldsymbol{c}^l_i\right\}$, the global pole $\boldsymbol{c}^g$, and the sampled points $\{(x_{1,j}^s,y_{1,j}^s),(x_{2,j}^s,y_{2,j}^s),\cdots,(x_{N,j}^s,y_{N,j}^s)\}_{j=1}^{K}$ of anchors are all expressed in this coordinate system by default. We now furnish the derivation of the transformations between the different coordinate systems, with the crucial symbols elucidated in Fig. \ref{elu_proof}. These geometric transformations follow from analytic geometry in Euclidean space. The derivation of the transformation from the local to the global polar coordinate system is as follows:
\begin{align}
r_{j}^{g}&=\left\| \overrightarrow{c^gh_{j}^{g}} \right\| =\left\| \overrightarrow{h_{j}^{a}h_{j}^{l}} \right\| \notag\\
&=\left\| \overrightarrow{c_{j}^{l}h_{j}^{l}}-\overrightarrow{c_{j}^{l}h_{j}^{a}} \right\| =\left\| \overrightarrow{c_{j}^{l}h_{j}^{l}} \right\| -\left\| \overrightarrow{c_{j}^{l}h_{j}^{a}} \right\| \notag\\
&=\left\| \overrightarrow{c_{j}^{l}h_{j}^{l}} \right\| - \frac{\overrightarrow{c_{j}^{l}h_{j}^{a}}}{\left\| \overrightarrow{c_{j}^{l}h_{j}^{a}} \right\|}\cdot \overrightarrow{c_{j}^{l}h_{j}^{a}} =\left\| \overrightarrow{c_{j}^{l}h_{j}^{l}} \right\| +\frac{\overrightarrow{c_{j}^{l}h_{j}^{a}}}{\left\| \overrightarrow{c_{j}^{l}h_{j}^{a}} \right\|}\cdot \overrightarrow{c^gc_{j}^{l}} \notag\\
&=r_{j}^{l}+\left[ \cos \theta _j;\sin \theta _j \right] ^T\left( \boldsymbol{c}_{j}^{l}-\boldsymbol{c}^g \right),
\label{proof_l2g}
\end{align}
where $h_j^l$, $h_j^g$, and $h_j^a$ denote the feet of the respective perpendiculars in Fig. \ref{elu_proof}.
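The two transformations above can be checked numerically. The following minimal Python sketch (the numbers are arbitrary) represents a lane anchor as the line $\{\boldsymbol{x} : [\cos\theta_j;\sin\theta_j]^T\boldsymbol{x}=d\}$, so the polar radius with respect to any pole $\boldsymbol{c}$ is $d-[\cos\theta_j;\sin\theta_j]^T\boldsymbol{c}$; the helper names are hypothetical:

```python
import math

def local_to_global(theta, r_l, c_l, c_g):
    """Local-to-global radius: r_g = r_l + [cos t; sin t]^T (c_l - c_g)."""
    return r_l + math.cos(theta) * (c_l[0] - c_g[0]) + math.sin(theta) * (c_l[1] - c_g[1])

def sample_x(theta, r_g, c_g, y):
    """x-coordinate of the anchor point at height y (sampling equation)."""
    offset = r_g + math.cos(theta) * c_g[0] + math.sin(theta) * c_g[1]
    return -y * math.tan(theta) + offset / math.cos(theta)

# A lane anchor is the line {x : n . x = d} with normal n = (cos t, sin t);
# the polar radius w.r.t. any pole c is then d - n . c.
theta, d = 0.7, 5.0
c_l, c_g = (1.0, 2.0), (0.0, 0.0)
n = (math.cos(theta), math.sin(theta))
r_l = d - (n[0] * c_l[0] + n[1] * c_l[1])
r_g = local_to_global(theta, r_l, c_l, c_g)
# r_g must equal the radius computed directly from the global pole ...
assert abs(r_g - (d - (n[0] * c_g[0] + n[1] * c_g[1]))) < 1e-9
# ... and every sampled point must lie on the anchor line n . p = d.
x = sample_x(theta, r_g, c_g, y=3.0)
assert abs(n[0] * x + n[1] * 3.0 - d) < 1e-9
```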
Analogously, the derivation of the sampled points along a lane anchor is as follows:
\begin{align}
&\overrightarrow{c^gp_{i,j}^{s}}\cdot \overrightarrow{c^gh_{j}^{g}}=\overrightarrow{c^gh_{j}^{g}}\cdot \overrightarrow{c^gh_{j}^{g}} \notag\\
\Rightarrow &\overrightarrow{c^gp_{i,j}^{s}}\cdot \overrightarrow{c^gh_{j}^{g}}=\left\| \overrightarrow{c^gh_{j}^{g}} \right\| \left\| \overrightarrow{c^gh_{j}^{g}} \right\| \notag\\
\Rightarrow &\frac{\overrightarrow{c^gh_{j}^{g}}}{\left\| \overrightarrow{c^gh_{j}^{g}} \right\|}\cdot \overrightarrow{c^gp_{i,j}^{s}}=\left\| \overrightarrow{c^gh_{j}^{g}} \right\| \notag\\
\Rightarrow &\left[ \cos \theta _j;\sin \theta _j \right] ^T\left( \boldsymbol{p}_{i,j}^{s}-\boldsymbol{c}^g \right) =r_{j}^{g}\notag\\
\Rightarrow &x_{i,j}^{s}\cos \theta _j+y_{i,j}^{s}\sin \theta _j=r_{j}^{g}+\left[ \cos \theta _j;\sin \theta _j \right] ^T\boldsymbol{c}^g \notag\\
\Rightarrow &x_{i,j}^{s}=-y_{i,j}^{s}\tan \theta _j+\frac{r_{j}^{g}+\left[ \cos \theta _j;\sin \theta _j \right] ^T\boldsymbol{c}^g}{\cos \theta _j},
\label{proof_sample}
\end{align}
where $p_{i,j}^{s}$ denotes the $i$-th sampled point of the $j$-th lane anchor, whose coordinates are $\boldsymbol{p}_{i,j}^{s}\equiv(x_{i,j}^s, y_{i,j}^s)$.
\label{appendix_coord}
\begin{figure}[t]
\centering
\includegraphics[width=\linewidth]{thesis_figure/elu_proof.png}
\caption{The symbols employed in the derivation of coordinate transformations across different coordinate systems.}
\label{elu_proof}
\end{figure}
\section{The Design Principles of the One-to-One Classification Head}
Two fundamental prerequisites of the NMS-free framework are the label assignment strategy and the head structure. Regarding label assignment, previous works use one-to-many assignments, which cause the detection head to produce redundant predictions for each ground truth and thus necessitate NMS post-processing.
Thus, some works \cite{detr}\cite{learnNMS} proposed one-to-one label assignment, e.g., via the Hungarian algorithm. This forces the model to predict a single positive sample for each lane. However, directly applying one-to-one label assignment harms the learning of the model, and structures such as MLPs and CNNs struggle to assimilate the ``one-to-one'' characteristic, resulting in decreased performance compared to one-to-many label assignment with NMS post-processing \cite{yolov10}\cite{o2o}. Consider a trivial example: let $\boldsymbol{F}^{roi}_{i}$ denote the RoI features extracted from the $i$-th anchor, and suppose the model is trained with one-to-one label assignment. Assume that the $i$-th and $j$-th anchors are both close to the ground truth and overlap with each other. The corresponding RoI features are then similar, which can be expressed as follows:
\begin{align}
\boldsymbol{F}_{i}^{roi}\approx \boldsymbol{F}_{j}^{roi}.
\end{align}
Suppose that $\boldsymbol{F}^{roi}_{i}$ is assigned as a positive sample while $\boldsymbol{F}^{roi}_{j}$ is assigned as a negative one; the ideal outcome should then be:
\begin{align}
f_{cls}\left( \boldsymbol{F}_{i}^{roi} \right) &\rightarrow 1, \notag\\
f_{cls}\left( \boldsymbol{F}_{j}^{roi} \right) &\rightarrow 0,
\label{sharp fun}
\end{align}
where $f_{cls}$ represents a classification head with an ordinary structure such as an MLP or a CNN. Eq. (\ref{sharp fun}) implies that $f_{cls}$ needs to be ``sharp'' enough to differentiate between two similar features; in other words, its output must change rapidly over small input distances. This ``sharp'' behavior is hard to learn with MLPs or CNNs alone. Consequently, additional heuristic structures such as those in \cite{o3d}\cite{relationnet} need to be developed. We instead draw inspiration from Fast NMS \cite{yolact} for the design of the O2O classification subhead. Fast NMS is an iteration-free post-processing algorithm derived from traditional NMS.
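The ``sharpness'' requirement above can be made concrete with a Lipschitz argument. For a single sigmoid unit $f(\boldsymbol{x})=\sigma(\boldsymbol{w}^T\boldsymbol{x}+b)$, the output gap is bounded by $\|\boldsymbol{w}\|\|\boldsymbol{x}_i-\boldsymbol{x}_j\|/4$, so nearly identical RoI features cannot be mapped to $0$ and $1$ unless the weights are extremely large. A minimal numerical illustration (feature dimension and weight scale chosen arbitrarily):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
dim = 16
w = [random.gauss(0, 1) for _ in range(dim)]        # a moderate weight vector
x_i = [random.gauss(0, 1) for _ in range(dim)]      # RoI feature of anchor i
x_j = [a + 1e-3 * random.gauss(0, 1) for a in x_i]  # nearly identical feature of anchor j

f = lambda x: sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
gap = abs(f(x_i) - f(x_j))

# Lipschitz bound: |f(x_i) - f(x_j)| <= ||w|| * ||x_i - x_j|| / 4,
# since the sigmoid's slope never exceeds 1/4.
w_norm = math.sqrt(sum(wi * wi for wi in w))
diff_norm = math.sqrt(sum((a - c) ** 2 for a, c in zip(x_i, x_j)))
bound = w_norm * diff_norm / 4.0
assert gap <= bound + 1e-12
assert bound < 0.05  # far below the gap of ~1 demanded by the 0/1 targets
```

With moderate weights, the achievable output gap is orders of magnitude smaller than the gap of one required by the one-to-one targets, which is why a plain head struggles.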
Furthermore, we incorporate a sort-free strategy along with geometric priors into Fast NMS, with the specifics delineated in Algorithm \ref{Graph Fast NMS}.
\begin{algorithm}[t]
\caption{Fast NMS with Geometric Prior.}
\begin{algorithmic}[1] % number every line
\REQUIRE ~~\\ % Input
The indices of all anchors, $1, 2, \cdots, i, \cdots, K$;\\
The global polar parameters of the corresponding anchors, $\left\{ \theta _i,r_{i}^{g} \right\} |_{i=1}^{K}$;\\
The confidence scores from the O2M classification subhead, $s_i^g$;\\
The regressions from the O2M regression subhead, denoted as $\left\{ Lane_i \right\} |_{i=1}^{K}$;\\
The predetermined thresholds $\tau^\theta$, $\lambda^g$, $\tau^d$ and $\tau_{o2m}$.
\ENSURE ~~\\ % Output
\STATE Calculate the confidence-prior adjacency matrix $\boldsymbol{A}^{C}\in\mathbb{R}^{K\times K}$, defined as follows:
\begin{align}
A_{ij}^{C}=\begin{cases}
	1,\, \mathrm{if}\,\, s_i^g>s_j^g\,\,\mathrm{or}\,\,\left( s_i^g=s_j^g\,\,\mathrm{and}\,\,i>j \right);\\
	0,\, \mathrm{otherwise}.\\
\end{cases}
\label{confidential matrix}
\end{align}
\STATE Calculate the geometric-prior adjacency matrix $\boldsymbol{A}^{G}\in\mathbb{R}^{K\times K}$, defined as follows:
\begin{align}
A_{ij}^{G}=\begin{cases}
	1,\, \mathrm{if}\,\, \left| \theta _i-\theta _j \right|<\tau^{\theta}\,\,\mathrm{and}\,\,\left| r_{i}^{g}-r_{j}^{g} \right|<\lambda^g;\\
	0,\, \mathrm{otherwise}.\\
\end{cases}
\label{geometric prior matrix}
\end{align}
\STATE Calculate the inverse distance matrix $\boldsymbol{D} \in \mathbb{R} ^{K \times K}$, whose element $D_{ij}$ is defined as follows:
\begin{align}
D_{ij}=d^{-1}\left( Lane_i,Lane_j \right),
\label{al_1-3}
\end{align}
where $d\left(\cdot, \cdot \right)$ is a predefined function that quantifies the distance between two lane predictions, such as IoU.
\STATE Define the adjacency matrix $\boldsymbol{A} = \boldsymbol{A}^{C} \odot \boldsymbol{A}^{G}$; the final confidence $\tilde{s}_j^g$ is calculated as follows:
\begin{align}
\tilde{s}_{j}^{g}=\begin{cases}
	1,\, \mathrm{if}\,\, \mathrm{Max}\left(\boldsymbol{D}(:,j)|\boldsymbol{A}(:,j)=1\right)<\left( \tau ^d \right) ^{-1};\\
	0,\, \mathrm{otherwise},\\
\end{cases}
\label{al_1-4}
\end{align}
where $j=1,2,\cdots,K$ and $\mathrm{Max}(\cdot|\boldsymbol{A}(:,j)=1)$ is the max operator over the $j$-th column of $\boldsymbol{D}$, restricted to the rows $i$ with $A_{ij}=1$.
\STATE Get the final selection set:
\begin{align}
\varOmega_{nms}^{pos}=\left\{ i|\tilde{s}_{i}^{g}=1 \right\} \cap \left\{i|s_{i}^{g}>\tau_{o2m} \right\}.
\label{al_1-5}
\end{align}
\RETURN The final selection result $\varOmega_{nms}^{pos}$.
\end{algorithmic}
\label{Graph Fast NMS}
\end{algorithm}
The new algorithm has a distinctly different form from its predecessor \cite{yolact}. We introduce a geometric-prior adjacency matrix $\boldsymbol{A}^G$, which restricts suppression to geometrically adjacent anchors. It is straightforward to show that, when all elements of $\boldsymbol{A}^{G}$ are set to $1$ (\textit{i.e.}, the geometric priors are disregarded), Algorithm \ref{Graph Fast NMS} is equivalent to Fast NMS. Building upon the newly proposed sort-free Fast NMS with geometric prior, we design the structure of the one-to-one classification head. The principal limitations of NMS lie in two steps, namely the purely geometric definition of distance (\textit{i.e.}, Eq. (\ref{al_1-3})) and the fixed threshold employed to eliminate redundant predictions (\textit{i.e.}, Eq. (\ref{al_1-4})). For instance, in scenarios involving double lines, the geometric distance between the two lane instances is minimal even though their semantic divergence is pronounced.
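Algorithm \ref{Graph Fast NMS} can be sketched in a few lines. In the sketch below the distance $d(\cdot,\cdot)$ is simplified to the absolute radius difference between anchors (the paper uses a lane IoU), and the threshold values are illustrative only:

```python
import numpy as np

def fast_nms_geometric(theta, r, scores, tau_theta, lam_g, tau_d, tau_o2m):
    """Sort-free Fast NMS with a geometric prior (sketch of Algorithm 1).
    d(.,.) is simplified to |r_i - r_j|; in practice a lane IoU is used."""
    K = len(scores)
    s = np.asarray(scores)
    idx = np.arange(K)
    # Confidence-prior adjacency: anchor i may suppress anchor j if it has a
    # higher score (ties broken by index).
    A_C = (s[:, None] > s[None, :]) | ((s[:, None] == s[None, :]) & (idx[:, None] > idx[None, :]))
    # Geometric-prior adjacency: only geometrically nearby anchors interact.
    A_G = (np.abs(theta[:, None] - theta[None, :]) < tau_theta) & \
          (np.abs(r[:, None] - r[None, :]) < lam_g)
    A = A_C & A_G
    # Inverse distance matrix (diagonal divisions are masked out by A).
    with np.errstate(divide="ignore"):
        D = 1.0 / np.abs(r[:, None] - r[None, :])
    # Keep anchor j iff no admissible suppressor is closer than tau_d.
    keep = np.ones(K, dtype=bool)
    for j in range(K):
        suppressors = D[:, j][A[:, j]]
        if suppressors.size and suppressors.max() >= 1.0 / tau_d:
            keep[j] = False
    return set(np.nonzero(keep & (s > tau_o2m))[0])

theta = np.array([0.50, 0.51, 1.20])
r = np.array([10.0, 12.0, 100.0])
scores = [0.9, 0.8, 0.7]
# Anchors 0 and 1 are near-duplicates, so anchor 1 is suppressed by anchor 0;
# anchor 2 is geometrically distant and survives.
assert fast_nms_geometric(theta, r, scores,
                          tau_theta=0.1, lam_g=20.0, tau_d=5.0, tau_o2m=0.5) == {0, 2}
```

Setting `A_G` to all ones in this sketch recovers plain Fast NMS, mirroring the equivalence stated above.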
Consequently, we replace the aforementioned two steps with trainable neural networks, allowing them to alleviate the limitations of Fast NMS in a data-driven fashion. The neural network blocks that replace Eq. (\ref{al_1-3}) are as follows:
\begin{align}
\widehat{\boldsymbol{F}}_{i}^{roi}&\gets \mathrm{ReLU}\left( \boldsymbol{W}_{roi}\boldsymbol{F}_{i}^{roi}+\boldsymbol{b}_{roi} \right), i=1,\cdots,K,\label{edge_layer_1}\\
\boldsymbol{F}_{ij}^{edge}&\gets \boldsymbol{W}_{in}\widehat{\boldsymbol{F}}_{j}^{roi}-\boldsymbol{W}_{out}\widehat{\boldsymbol{F}}_{i}^{roi},\label{edge_layer_2}\\
\boldsymbol{D}_{ij}^{edge}&\gets \mathrm{MLP}_{edge}\left(\boldsymbol{F}_{ij}^{edge}+\boldsymbol{W}_s\left( \boldsymbol{x}_{j}-\boldsymbol{x}_{i} \right) +\boldsymbol{b}_s \right).\label{edge_layer_3}
\end{align}
In Eq. (\ref{edge_layer_3}), the inverse distance $\boldsymbol{D}_{ij}^{edge}\in\mathbb{R}^{d_n}$ transcends its scalar form, encapsulating the semantic distance between predictions. We use element-wise max pooling over this tensor as the replacement for the max operation applied to scalars. Furthermore, the predetermined $\left( \tau ^d \right) ^{-1}$ is no longer utilized as the distance threshold. Instead, we define a neural network as an implicit decision plane to formulate the final score $\tilde{s}_{j}^{g}$:
\begin{align}
\boldsymbol{D}_j^{roi}&\gets\mathrm{MPool}_{col}\left(\boldsymbol{D}^{edge}(:,j,:)|\boldsymbol{A}(:,j)=1\right), \label{maxpooling}\\
\tilde{s}_{j}^{g}&\gets \mathrm{MLP}_{roi}\left( \boldsymbol{D}_{j}^{roi} \right), j=1,\cdots,K, \label{node_layer}
\end{align}
which serves as the replacement for Eq. (\ref{al_1-4}). The score $\tilde{s}_{j}^{g}$ output by the neural network transitions from a binary value to a continuous soft score ranging from 0 to 1. We introduce a new threshold $\tau_{o2o}$ within the updated criterion of Eq.
(\ref{al_1-5}):
\begin{align}
\varOmega_{nms-free}^{pos}=\left\{i|\tilde{s}_{i}^{g}>\tau_{o2o} \right\} \cap \left\{ i|s_{i}^{g}>\tau_{o2m} \right\}.
\end{align}
This criterion is also referred to as the \textit{dual confidence selection} in the main text.
\label{NMS_appendix}
\begin{table*}[t]
\centering
\caption{Dataset information and hyperparameters for the five datasets. For the CULane dataset, $*$ denotes the actual number of training samples used to train the model. Labels for some validation/test sets are missing, so different splits (\textit{i.e.}, validation or test set) are selected for different datasets.}
\begin{adjustbox}{width=\linewidth}
\begin{tabular}{l|l|ccccc}
\toprule
\multicolumn{2}{c|}{\textbf{Dataset}} & CULane & TUSimple & LLAMAS & DL-Rail & CurveLanes \\
\midrule
\multirow{7}*{Dataset Description} & Train &88,880/$55,698^{*}$&3,268 &58,269&5,435&100,000\\
 & Validation &9,675 &358 &20,844&- &20,000 \\
 & Test &34,680&2,782 &20,929&1,569&- \\
 & Resolution &$1640\times590$&$1280\times720$&$1276\times717$&$1920\times1080$&$2560\times1440$, etc.\\
 & Lane &$\leqslant4$&$\leqslant5$&$\leqslant4$&$=2$&$\leqslant10$\\
 & Environment &urban and highway & highway&highway&railway&urban and highway\\
 & Distribution &sparse&sparse&sparse&sparse&sparse and dense\\
\midrule
\multirow{2}*{Dataset Split} & Evaluation &Test&Test&Test&Test&Val\\
 & Visualization &Test&Test&Val&Test&Val\\
\midrule
\multirow{1}*{Data Preprocess} & Crop Height &270&160&300&560&640, etc.\\
\midrule
\multirow{5}*{Training Hyperparameter} & Epoch Number &32&70&20&90&32\\
 & Batch Size &40&24&32&40&40\\
 & Warm up iterations &800&200&800&400&800\\
 & $w_{aux}$ &0.2&0 &0.2&0.2&0.2\\
 & $w_{rank}$ &0.7&0.7&0.1&0.7&0 \\
\midrule
\multirow{5}*{Evaluation Hyperparameter} & $H^{l}\times W^{l}$ &$4\times10$&$4\times10$&$4\times10$&$4\times10$&$6\times13$\\
 & $K$ &20&20&20&12&50\\
 & $d_n$ &5&8&10&5&5\\
 & $\tau_{o2m}$ &0.48&0.40&0.40&0.40&0.45\\
 & $\tau_{o2o}$ &0.46&0.46&0.46&0.46&0.44\\
\bottomrule
\end{tabular}
\end{adjustbox}
\label{dataset_info}
\end{table*}
\begin{figure}[t]
\centering
\includegraphics[width=\linewidth]{thesis_figure/GLaneIoU.png}
\caption{Illustration of the GLaneIoU redefined in our work.}
\label{glaneiou}
\end{figure}
\section{Details of Intersection over Union between Lane Instances}
To ensure that the IoU between lane instances aligns with the conventions of general object detection methods \cite{iouloss}\cite{giouloss}, we redefine the IoU of lane pairs. As depicted in Fig. \ref{glaneiou}, the newly defined IoU for lane pairs, which we refer to as GLaneIoU, is elaborated as follows:
\begin{align}
\Delta x_{i,p}^{d}&=x_{i+1,p}^{d}-x_{i-1,p}^{d},\,\, \Delta y_{i,p}^{d}=y_{i+1,p}^{d}-y_{i-1,p}^{d}, \\
w_{i,p}&=\frac{\sqrt{\left( \Delta x_{i,p}^{d} \right) ^2+\left( \Delta y_{i,p}^{d} \right) ^2}}{\Delta y_{i,p}^{d}}w^b,\\
b_{i,p}^{l}&=x_{i,p}^{d}-w_{i,p},\,\, b_{i,p}^{r}=x_{i,p}^{d}+w_{i,p},
\end{align}
where $w^{b}$ is the base semi-width parameter and $w_{i,p}$ is the actual semi-width of the $p$-th lane instance. The sets $\left\{ b_{i,p}^{l} \right\} _{i=1}^{N}$ and $\left\{ b_{i,p}^{r} \right\} _{i=1}^{N}$ signify the left and right boundaries of the $p$-th lane instance. Subsequently, we define the intersection and union distances between lane instances:
\begin{align}
d_{i,pq}^{\mathcal{O}}&=\max \left( \min \left( b_{i,p}^{r}, b_{i,q}^{r} \right) -\max \left( b_{i,p}^{l}, b_{i,q}^{l} \right) , 0 \right),\\
d_{i,pq}^{\xi}&=\max \left( \max \left( b_{i,p}^{l}, b_{i,q}^{l} \right) -\min \left( b_{i,p}^{r}, b_{i,q}^{r} \right) , 0 \right),\\
d_{i,pq}^{\mathcal{U}}&=\max \left( b_{i,p}^{r}, b_{i,q}^{r} \right) -\min \left( b_{i,p}^{l}, b_{i,q}^{l} \right).
\end{align}
The quantities $\left\{d_{i,pq}^{\mathcal{O}}\right\}_{i=1}^{N}$, $\left\{d_{i,pq}^{\xi}\right\}_{i=1}^{N}$, and $\left\{d_{i,pq}^{\mathcal{U}}\right\}_{i=1}^{N}$ denote the overlap distance, gap distance, and union distance, respectively.
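These boundary and distance definitions, combined into the overall GLaneIoU ratio given below, can be sketched as follows. The sketch assumes two lanes sampled at a common, equally spaced $y$-grid and, for simplicity, near-vertical lanes so that the semi-width reduces to $w^b$; the function name is hypothetical:

```python
import numpy as np

def glane_iou(x_p, x_q, w_b=1.0, g=0.0):
    """GLaneIoU between two lanes sampled on the same y grid (simplified:
    near-vertical lanes, so the semi-width is just the base value w_b)."""
    bl_p, br_p = x_p - w_b, x_p + w_b          # left/right boundaries of lane p
    bl_q, br_q = x_q - w_b, x_q + w_b          # left/right boundaries of lane q
    d_over = np.maximum(np.minimum(br_p, br_q) - np.maximum(bl_p, bl_q), 0.0)
    d_gap = np.maximum(np.maximum(bl_p, bl_q) - np.minimum(br_p, br_q), 0.0)
    d_union = np.maximum(br_p, br_q) - np.minimum(bl_p, bl_q)
    return (d_over.sum() - g * d_gap.sum()) / d_union.sum()

x = np.array([3.0, 3.5, 4.2, 5.0])
assert abs(glane_iou(x, x, g=0.0) - 1.0) < 1e-9   # identical lanes -> 1
assert glane_iou(x, x + 10.0, g=0.0) == 0.0       # disjoint lanes -> 0 when g = 0
assert glane_iou(x, x + 10.0, g=1.0) < 0.0        # g = 1 penalizes the gap
```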
These definitions closely resemble those in \cite{clrnet} and \cite{adnet}, with slight modifications to ensure non-negative values; the formulation aims to maintain consistency with the IoU definitions used for bounding boxes. Thus, the overall GLaneIoU between the $p$-th and $q$-th lane instances is expressed as:
\begin{align}
GIoU\left( p,q \right)=\frac{\sum\nolimits_{i=j}^k{d_{i,pq}^{\mathcal{O}}}}{\sum\nolimits_{i=j}^k{d_{i,pq}^{\mathcal{U}}}}-g\frac{\sum\nolimits_{i=j}^k{d_{i,pq}^{\xi}}}{\sum\nolimits_{i=j}^k{d_{i,pq}^{\mathcal{U}}}},
\end{align}
where $j$ and $k$ are the indices of the start point and the end point, respectively. It is evident that when $g=0$, the $GIoU$ for lane pairs is analogous to the IoU for bounding boxes, with a value range of $\left[0, 1 \right]$; when $g=1$, it is analogous to the GIoU for bounding boxes \cite{giouloss}, with a value range of $\left(-1, 1 \right]$.
\label{giou_appendix}
\begin{figure}[t]
\centering
\includegraphics[width=\linewidth]{thesis_figure/detection_head_assign.png}
\caption{Label assignment and loss function for the triplet head.}
\label{head_assign}
\end{figure}
\begin{figure*}[t]
\centering
\def\pagewidth{0.49\textwidth}
\def\subwidth{0.47\linewidth}
\def\imgwidth{\linewidth}
\def\imgheight{0.5625\linewidth}
\def\dashheight{0.8\linewidth}
\begin{subfigure}{\pagewidth}
\rotatebox{90}{\small{GT}}
\begin{minipage}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/culane/1_gt.jpg}
\end{minipage}
\begin{minipage}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/culane/2_gt.jpg}
\end{minipage}
\end{subfigure}
\begin{subfigure}{\pagewidth}
\begin{minipage}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/tusimple/1_gt.jpg}
\end{minipage}
\begin{minipage}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/tusimple/2_gt.jpg}
\end{minipage}
\end{subfigure} \vspace{0.5em} \begin{subfigure}{\pagewidth} \raisebox{-1.5em}{\rotatebox{90}{\small{Anchors}}} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/culane/1_anchor.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/culane/2_anchor.jpg} \end{minipage} \end{subfigure} \begin{subfigure}{\pagewidth} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/tusimple/1_anchor.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/tusimple/2_anchor.jpg} \end{minipage} \end{subfigure} \vspace{0.5em} \begin{subfigure}{\pagewidth} \raisebox{-2em}{\rotatebox{90}{\small{Predictions}}} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/culane/1_pred.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/culane/2_pred.jpg} \end{minipage} \caption{CULane} \end{subfigure} \begin{subfigure}{\pagewidth} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/tusimple/1_pred.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/tusimple/2_pred.jpg} \end{minipage} \caption{TuSimple} \end{subfigure} \vspace{0.5em} % \begin{tikzpicture} % \draw[dashed, pattern=on 8pt off 2pt, color=gray, line width=1pt] (-\textwidth/2,0) -- (\textwidth/2.,0); % \end{tikzpicture} % \vspace{0.05em} \begin{subfigure}{\pagewidth} \rotatebox{90}{\small{GT}} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/llamas/1_gt.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, 
height=\imgheight]{thesis_figure/view_dataset/llamas/2_gt.jpg} \end{minipage} \end{subfigure} \begin{subfigure}{\pagewidth} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/dlrail/1_gt.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/dlrail/2_gt.jpg} \end{minipage} \end{subfigure} \vspace{0.5em} \begin{subfigure}{\pagewidth} \raisebox{-1.5em}{\rotatebox{90}{\small{Anchors}}} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/llamas/1_anchor.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/llamas/2_anchor.jpg} \end{minipage} \end{subfigure} \begin{subfigure}{\pagewidth} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/dlrail/1_anchor.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/dlrail/2_anchor.jpg} \end{minipage} \end{subfigure} \vspace{0.5em} \begin{subfigure}{\pagewidth} \raisebox{-2em}{\rotatebox{90}{\small{Predictions}}} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/llamas/1_pred.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/llamas/2_pred.jpg} \end{minipage} \caption{LLAMAS} \end{subfigure} \begin{subfigure}{\pagewidth} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/dlrail/1_pred.jpg} \end{minipage} \begin{minipage}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_dataset/dlrail/2_pred.jpg} \end{minipage} \caption{DL-Rail} \end{subfigure} \vspace{0.5em} \caption{Visualization of detection outcomes in sparse scenarios of 
four datasets.} \label{vis_sparse} \end{figure*} \begin{figure*}[t] \centering \def\subwidth{0.24\textwidth} \def\imgwidth{\linewidth} \def\imgheight{0.5625\linewidth} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/redun_gt.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/redun_pred50.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/redun_pred15.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/redun_NMSfree.jpg} \end{subfigure} \vspace{0.5em} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/redun2_gt.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/redun2_pred50.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/redun2_pred15.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/redun2_NMSfree.jpg} \end{subfigure} \vspace{0.5em} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/less_gt.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/less_pred50.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/less_pred15.jpg} \end{subfigure} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/less_NMSfree.jpg} \end{subfigure} \vspace{0.5em} \begin{subfigure}{\subwidth} \includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/less2_gt.jpg} 
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/less2_pred50.jpg}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/less2_pred15.jpg}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/less2_NMSfree.jpg}
\end{subfigure}
\vspace{0.5em}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/all_gt.jpg}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/all_pred50.jpg}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/all_pred15.jpg}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/all_NMSfree.jpg}
\end{subfigure}
\vspace{0.5em}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/all2_gt.jpg}
\caption{GT}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/all2_pred50.jpg}
\caption{NMS@50}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/all2_pred15.jpg}
\caption{NMS@15}
\end{subfigure}
\begin{subfigure}{\subwidth}
\includegraphics[width=\imgwidth, height=\imgheight]{thesis_figure/view_nms/all2_NMSfree.jpg}
\caption{NMSFree}
\end{subfigure}
\vspace{0.5em}
\caption{Visualization of the detection outcomes in sparse and dense scenarios on the CurveLanes dataset.}
\label{vis_dense}
\end{figure*}
\section{Details about the Label Assignment and Loss Function}
Details about the cost metrics and label assignments for the triplet head are furnished here.
A dual label assignment strategy \cite{date} is employed for the triplet head, as illustrated in Fig. \ref{head_assign}. Specifically, we implement one-to-many label assignment for both the O2M classification subhead and the O2M regression subhead; this part closely aligns with previous work \cite{clrernet}. To endow our model with the NMS-free paradigm, we additionally incorporate the O2O classification subhead and apply a one-to-one label assignment to it. The cost metrics for the one-to-one and one-to-many label assignments are articulated as follows:
\begin{align}
\mathcal{C} _{p,q}^{o2o}&=\tilde{s}_{p}^{g}\times \left( GIoU\left( p,q \right) \right) ^{\beta} \label{o2o_cost},\\
\mathcal{C} _{p,q}^{o2m}&=s_{p}^{g}\times \left( GIoU\left( p,q \right) \right) ^{\beta}, \label{o2m_cost}
\end{align}
where $\mathcal{C} _{p,q}^{o2o}$ and $\mathcal{C} _{p,q}^{o2m}$ denote the cost metrics between the $p$-th prediction and the $q$-th ground truth, and $g$ in $GIoU$ is set to $0$ so that it remains non-negative. These metrics imply that both the confidence score and the geometric distance contribute to the cost. Suppose that there exist $K$ predictions and $G$ ground truths. Let $\pi$ denote a one-to-one label assignment strategy, with $\pi(q)$ indicating that the $\pi(q)$-th prediction is assigned to the $q$-th ground truth. Additionally, $\mathscr{S}_{K, G}$ denotes the set of all possible one-to-one assignment strategies for $K$ predictions and $G$ ground truths. It is straightforward to show that the total number of one-to-one assignment strategies $\left| \mathscr{S} _{K,G} \right|$ is $\frac{K!}{\left( K-G \right)!}$. The final optimal assignment $\hat{\pi}$ is determined as follows:
\begin{align}
\hat{\pi}=\underset{\pi \in \mathscr{S}_{K,G}}{\mathrm{arg\,max}}\sum_{q=1}^G{\mathcal{C} _{\pi \left( q \right) ,q}^{o2o}}.
\end{align}
This assignment problem can be solved by the Hungarian algorithm \cite{detr}.
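The optimal assignment above can be sketched by brute-force enumeration over $\mathscr{S}_{K,G}$, which also verifies the counting formula $\frac{K!}{(K-G)!}$; the cost values below are made up for illustration, and in practice the same argmax is computed efficiently with the Hungarian algorithm:

```python
import itertools
from math import factorial

# Illustrative cost matrix C[p][q] for K = 4 predictions and G = 2 ground
# truths (values are invented for this sketch).
C = [[0.9, 0.1],
     [0.2, 0.8],
     [0.7, 0.6],
     [0.1, 0.1]]
K, G = len(C), len(C[0])

# Enumerate all one-to-one strategies pi, where pi[q] is the prediction
# assigned to ground truth q. There are K!/(K-G)! of them.
strategies = list(itertools.permutations(range(K), G))
assert len(strategies) == factorial(K) // factorial(K - G)  # 4!/2! = 12

# The optimal assignment maximizes the total cost; the Hungarian algorithm
# (e.g., scipy.optimize.linear_sum_assignment) finds it without enumeration.
pi_hat = max(strategies, key=lambda pi: sum(C[pi[q]][q] for q in range(G)))
assert pi_hat == (0, 1)  # prediction 0 -> GT 0, prediction 1 -> GT 1
```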
Finally, $G$ predictions are assigned as positive samples and $K-G$ predictions as negative samples. For the one-to-many label assignment, we simply use SimOTA \cite{yolox}, in line with previous works \cite{clrernet}. Omitting the detailed process of SimOTA, we only introduce its inputs, namely the cost matrix $\boldsymbol{M}^C\in \mathbb{R}^{G\times K}$ and the IoU matrix $\boldsymbol{M}^{IoU}\in \mathbb{R}^{G\times K}$, whose elements are defined as $M^C_{qp}=\mathcal{C} _{p,q}^{o2m}$ and $M^{IoU}_{qp}= GIoU\left( p,q \right)$ (with $g=0$), respectively. The number of assigned predictions for each ground truth is variable but does not exceed an upper bound $k_{dynamic}$, which is set to $4$ in our experiments. Finally, there are $K_{pos}$ positive samples and $K-K_{pos}$ negative samples, where $K_{pos}$ ranges from $0$ to $Gk_{dynamic}$. Given the ground truth label generated by the label assignment strategy for each prediction, we can construct the loss function for the training phase. As illustrated in Fig. \ref{head_assign}, $\mathcal{L}_{cls}^{o2o}$ and $\mathcal{L}_{rank}$ are designated for the O2O classification subhead, whereas $\mathcal{L}_{cls}^{o2m}$ is for the O2M classification subhead. Meanwhile, $\mathcal{L}_{GIoU}$ (with $g=1$), $\mathcal{L}_{end}$ and $\mathcal{L}_{aux}$ are designated for the O2M regression subhead. The gradient from the O2O classification subhead to the RoI pooling layer is stopped to preserve the quality of feature learning. $\left( \hat{\theta}_{i,\cdot}^{seg},\hat{r}_{i,\cdot}^{seg} \right)$ is ignored during evaluation.
\label{assign_appendix}
\section{Supplementary Implementation Details and Visualization Results}
Important implementation details for each dataset are shown in Table \ref{dataset_info}, including the dataset information employed to conduct the experiments and visualizations, the data preprocessing parameters, and the hyperparameters of Polar R-CNN. Fig.
\ref{vis_sparse} illustrates the visualization outcomes in sparse scenarios spanning four datasets. The top row depicts the ground truth, the middle row shows the proposed lane anchors, and the bottom row exhibits the predictions generated by Polar R-CNN under the NMS-free paradigm. In the top and bottom rows, different colors distinguish different lane instances and do not correspond across images. The middle row shows that the LPH of Polar R-CNN effectively proposes anchors clustered around the ground truth, providing a robust prior for the GPH to produce the final lane predictions. Moreover, the number of anchors is significantly decreased compared to previous works, making our method faster than other anchor-based methods in theory. Fig. \ref{vis_dense} shows the visualization outcomes in dense scenarios. The first column displays the ground truth, while the second and third columns present the detection results of the NMS paradigm with a large threshold (\textit{i.e.}, the default NMS@50, 50 pixels) and a small threshold (\textit{i.e.}, the optimal NMS@15, 15 pixels), respectively. The final column shows the detection results of the NMS-free paradigm. We observe that NMS@50 mistakenly removes some predictions, leading to false negatives, while NMS@15 fails to eliminate some redundant predictions, leading to false positives. This underscores the difficult trade-off between large and small NMS thresholds, and the visualization clearly demonstrates that geometric distance becomes less effective in dense scenarios. Only the proposed data-driven O2O classification subhead can address this issue by capturing semantic distance beyond geometric distance. As shown in the last column of Fig. \ref{vis_dense}, the O2O classification subhead successfully eliminates redundant predictions while preserving dense predictions, despite their minimal geometric distances.
\label{vis_appendix} \bibliographystyle{IEEEtran} \bibliography{reference} %\newpage \end{appendices} \end{document}