Autonomous Design Report
D746 - Squadra Corse Driverless PoliTO
Abstract—This document describes the team's approach to developing a full software architecture capable of autonomous driving for participation in the Formula Student Driverless competition. The perception task is addressed with a LiDAR and two cameras that reliably provide the position and color of the cones defining the track boundaries. This information, together with an estimate of the vehicle velocity, is fed to a GraphSLAM algorithm that builds a global map of the circuit and estimates the current pose. A local planning algorithm that tracks the middle line is used while the complete map is not yet known. Once the full race track is known, a global optimizer finds the minimum-curvature trajectory and computes a velocity profile from a GGV diagram, exploiting the full performance of the vehicle. To follow the target path and velocity, a lateral MPC and a PI speed-tracking controller are employed. Additional low-level controllers ensure appropriate torque allocation. The software is developed using MATLAB/Simulink and ROS2. The entire pipeline was tested both in a custom simulation environment and in on-track tests.

I. INTRODUCTION

Founded in 2021 by students of Politecnico di Torino, the Squadra Corse Driverless PoliTO team has developed its 2nd autonomous driving prototype, a CFRP single seater with four electric in-wheel motors delivering up to 80 kW of power from a 4 kWh battery. Research published by other Formula Student teams, such as AMZ [1] and KA-RaceIng [2], provided valuable insights that inspired this work.

II. TARGET SETTING

Upon reviewing the results of the previous Formula Student season, the primary objective of the upcoming season is to successfully complete all the dynamic events. The focus lies in the design of a system that exhibits enhanced robustness, considering uncertainties and addressing edge cases effectively. The following key targets have been identified to improve the overall performance and reliability of the vehicle:

• Perception: Double the detection range and augment the field of view (FoV) with respect to last year's solution, which was limited to a maximum range of 10 m and a FoV of 80°.
• Planning: Develop a planning algorithm with deterministic behavior and negligible sensitivity to outliers.
• Control: Deploy a low-level control strategy composed of Traction Control (TC) and Torque Vectoring Control (TVC) to achieve better stability and handling both on straights and in corners.
• SLAM: A new architecture for localization and mapping was essential to open up to global optimization strategies.
• Simulation: A simulator was needed for the purpose of testing and reducing the integration time of new algorithms.

III. SYSTEM OVERVIEW

Figure 1 shows an overview of the autonomous system. It is composed of two main nodes, the Electronic Control Unit (ECU) and the Autonomous Computation Unit (ACU), connected by a CAN interface. The ECU is a dSPACE MicroAutoBox III, powered by an ARM Cortex-A processor. It operates in real time and executes C code generated from MATLAB/Simulink. The ACU is a mini-ITX x86 computer equipped with a 13th-generation Intel Core i7 CPU and an Nvidia GPU, exploiting multi-threading and CUDA to process the cameras' video streams. The system uses the ROS2 framework (Humble release), with most of the codebase written in C++. Each ROS node runs in its own Docker container, which is built nightly as part of the Continuous Integration pipeline. This workflow offers several advantages, easing the deployment of new code and reducing development time. The ACU collects and processes raw data from a LiDAR and two cameras, extracting the position and color of the cones in the environment. This information, along with the estimated states of the vehicle, is fed to the SLAM module. The output is a global map of the environment, used by the path planner to compute a target trajectory. High-level controllers transform the trajectory into steering and torque setpoints. The low-level controllers deployed on the ECU perform additional corrections to the setpoints, ensuring optimal performance and delivery to the actuators.
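As an illustration of how each pipeline stage maps to a containerized ROS2 node, the following minimal C++ (rclcpp) sketch subscribes to the LiDAR stream and publishes detected cones; node, topic and message types are assumptions made for the example and do not reflect the team's actual interfaces.

// Minimal sketch of one pipeline stage as a ROS2 (Humble) C++ node.
// Node, topic and message choices are illustrative assumptions.
#include <rclcpp/rclcpp.hpp>
#include <sensor_msgs/msg/point_cloud2.hpp>
#include <visualization_msgs/msg/marker_array.hpp>

class ConeDetectorNode : public rclcpp::Node {
public:
  ConeDetectorNode() : Node("cone_detector") {
    pub_ = create_publisher<visualization_msgs::msg::MarkerArray>("/perception/cones", 10);
    sub_ = create_subscription<sensor_msgs::msg::PointCloud2>(
        "/lidar/points", rclcpp::SensorDataQoS(),
        [this](sensor_msgs::msg::PointCloud2::ConstSharedPtr msg) {
          (void)msg;  // filtering, clustering and cone extraction would go here
          visualization_msgs::msg::MarkerArray cones;
          pub_->publish(cones);  // forward detected cones to the SLAM node
        });
  }
private:
  rclcpp::Subscription<sensor_msgs::msg::PointCloud2>::SharedPtr sub_;
  rclcpp::Publisher<visualization_msgs::msg::MarkerArray>::SharedPtr pub_;
};

int main(int argc, char** argv) {
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<ConeDetectorNode>());
  rclcpp::shutdown();
  return 0;
}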
IV. PERCEPTION

The objective of the perception pipeline is to recognise cones in the environment and obtain an estimate of their position and color. The SCD23 is equipped with one Ouster OS-1 64-channel LiDAR operating at 10 Hz and two Alvium 1800 U-507 RGB cameras placed on top of the vehicle's main hoop. Both sensors are slightly tilted; the cameras are positioned in a forward-facing configuration at different angles, achieving an overall FoV of 120°.
Fig. 1. Autonomous system overview
A. LiDAR filtering and clustering

Processing of the raw LiDAR data starts with filtering out irrelevant sections of the point cloud, such as points belonging to the vehicle. Furthermore, points exceeding specific distance thresholds are removed, while ground points are isolated and excluded using the algorithm proposed in [3]. These steps reduce the point-cloud size by 80%, making real-time extraction of cone centroids possible with a spatial-density cluster reconstruction algorithm. Since cone dimensions and distances are known, a model predicting the maximum number of points in a cluster is used to reject spurious cones. PointPillarsNet [4] was also implemented to add redundancy and improve the false-positive detection ratio of the LiDAR processing pipeline. Most of the training data was extracted from the previous season's recordings and manually labelled. This approach led to a slight decrease in the number of ghost cones, but it increased power consumption and slowed the average processing time per point cloud from 30 ms to 50 ms. For these reasons, the algorithm will not be included in this year's pipeline. Since the labelling process is time-consuming, implementing a self-supervised labelling approach such as contrastive learning [5] would drastically improve dataset quality and thus the overall accuracy.
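The pre-filtering and clustering stage can be sketched with the Point Cloud Library as follows; the range gate, the cluster thresholds and the expected-points heuristic are illustrative assumptions rather than the tuned values used on the vehicle.

// Sketch of LiDAR pre-filtering and Euclidean clustering using PCL.
// Thresholds and the expected-points model are illustrative assumptions.
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/filters/passthrough.h>
#include <pcl/segmentation/extract_clusters.h>
#include <pcl/search/kdtree.h>
#include <cmath>
#include <vector>

std::vector<pcl::PointIndices> clusterCones(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud) {
  // Remove points beyond the useful forward range (assumed 25 m here).
  pcl::PassThrough<pcl::PointXYZ> pass;
  pass.setInputCloud(cloud);
  pass.setFilterFieldName("x");
  pass.setFilterLimits(0.5f, 25.0f);
  pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
  pass.filter(*filtered);

  // Density-based grouping of the remaining (non-ground) points.
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
  tree->setInputCloud(filtered);
  std::vector<pcl::PointIndices> clusters;
  pcl::EuclideanClusterExtraction<pcl::PointXYZ> ec;
  ec.setClusterTolerance(0.30);   // assumed gap tolerance, cones are ~0.23 m wide
  ec.setMinClusterSize(3);
  ec.setMaxClusterSize(300);
  ec.setSearchMethod(tree);
  ec.setInputCloud(filtered);
  ec.extract(clusters);
  return clusters;
}

// Heuristic upper bound on the number of returns from a cone of height h and
// width w at distance d, given vertical/horizontal angular resolution [rad].
int expectedPoints(double d, double h, double w, double res_v, double res_h) {
  return static_cast<int>(std::ceil(h / (2.0 * d * std::tan(res_v / 2.0))) *
                          std::ceil(w / (2.0 * d * std::tan(res_h / 2.0))));
}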
B. Cones detection

The camera setup is needed to extend the perception range, since color extraction from the LiDAR's intensity channel is only reliable within an 8 m range. Object detection is performed on the entire image instead of on projected cluster regions, because it provides a more reliable output even though it increases the computational burden. The result of detection is a batch of 2D bounding boxes with color information. This task is accomplished with a YOLOv4-tiny network [6], which has a simpler architecture than the baseline YOLO models, leading to a 10-times-faster inference time with a negligible loss of accuracy.

C. Sensor Fusion

The sensor fusion algorithm merges information from the cameras and the LiDAR to obtain complete information about the positions of the cones on track. A self-developed point-to-point association algorithm is used to associate LiDAR centroids with the centers of the bounding boxes obtained from the object detector. It exploits the roto-translation matrix between the sensors' frames and the intrinsic parameters of the cameras to perform the 3D-to-2D projection needed for data association. To improve reliability, the projected points are filtered by a dynamic distance metric based on the vertical deviation of the image point from the horizon. To add redundancy to the method, an Iterative Closest Point (ICP) algorithm was also implemented, yet it did not yield any improvement over the existing sensor fusion pipeline, so it was discarded for the final release.
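A possible sketch of the association step, assuming a pinhole camera model and a simple nearest-bounding-box rule with a pixel gate (the gate value and the interfaces are assumptions), is the following:

// Sketch of the LiDAR-to-camera association: project each cone centroid with
// a pinhole model and assign it to the closest bounding-box center.
// Calibration matrix names and the pixel gate are illustrative assumptions.
#include <Eigen/Dense>
#include <vector>
#include <limits>

struct Box { Eigen::Vector2d center; int color; };

// K: 3x3 camera intrinsics, T_cam_lidar: 4x4 extrinsic roto-translation.
int associate(const Eigen::Vector3d& centroid_lidar,
              const Eigen::Matrix3d& K,
              const Eigen::Matrix4d& T_cam_lidar,
              const std::vector<Box>& boxes,
              double max_pixel_dist = 40.0) {
  // Transform the centroid into the camera frame.
  Eigen::Vector4d p_l(centroid_lidar.x(), centroid_lidar.y(), centroid_lidar.z(), 1.0);
  Eigen::Vector3d p_c = (T_cam_lidar * p_l).head<3>();
  if (p_c.z() <= 0.0) return -1;              // behind the camera
  Eigen::Vector3d uv_h = K * p_c;             // pinhole projection
  Eigen::Vector2d uv = uv_h.head<2>() / uv_h.z();

  // Nearest bounding-box center within a pixel gate.
  int best = -1;
  double best_d = std::numeric_limits<double>::max();
  for (size_t i = 0; i < boxes.size(); ++i) {
    double d = (boxes[i].center - uv).norm();
    if (d < best_d && d < max_pixel_dist) { best_d = d; best = static_cast<int>(i); }
  }
  return best;   // index of the matched box (its color labels the cone), or -1
}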
V. SLAM

The term SLAM refers to the set of algorithms designed to simultaneously localize the robot (i.e., the car) on the racetrack and create a map of the environment, characterized by fixed features (i.e., the cones) called landmarks.

A. GraphSLAM

Graph-based SLAM is the approach chosen to formulate the problem. It is characterised by the construction of a graph in which the nodes represent the poses of the robot and the landmarks, while the edges represent the constraints between two nodes. The goal is to identify a node configuration that optimally satisfies all the constraints.

The construction of the graph, called the front-end, is performed as follows: for each odometry acquisition u_{t,t-1}, a new node corresponding to the pose x^p_t is added to the graph, and an edge corresponding to u_{t,t-1} is inserted between x^p_{t-1} and x^p_t. For every landmark measurement z_{i,j}, a data association algorithm is executed. If a correspondence is detected between the measurement and an existing node, a constraint corresponding to the measurement z_{i,j} is established between the two nodes. If no correspondence is found, a new node corresponding to a new landmark x^l_i is added to the graph.
The optimization of the graph, called the back-end, is formulated similarly to [7] as follows:

x^* = \arg\min_x \sum_k e_k^T \Omega_k e_k    (1)

where the sum runs over all the existing edges. The error term e_k is expressed as:

e_k = e_{i,j}^n = r_{i,j}^n - h^n(\hat{x}_i, \hat{x}_j)    (2)

where r_{i,j}^n is the measurement of x_j as seen from x_i made by the sensor, while h^n(\hat{x}_i, \hat{x}_j) is the expectation of the same measurement from the graph. The superscript n ∈ {l, p} indicates either the measurement of a landmark, r_{i,j}^l = z_{i,j}, or of a pose, r_{i,j}^p = u_{i,j}. Finally, \Omega_k = \Omega_{i,j} is the information matrix of the measurement r_{i,j}^n, which has been tuned accordingly.

The graph structure and the back-end have been implemented using the g2o library [7] in C++, which offers high computational efficiency and performance. The graph is updated at every acquisition, while more attention has been dedicated to the optimization phase, since the complexity of the algorithm increases linearly with the number of edges. The optimal trade-off to guarantee real-time performance was found by running the optimization every 10 acquisitions. Using the tools provided by the g2o library, only the last N poses are optimized. With these settings, the execution of the back-end task takes a maximum of 5 ms.

The same algorithm can also be used for localization-only purposes if needed, either by freezing the optimization of all the nodes corresponding to landmarks or by providing an a-priori known map of the environment to the graph.
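A minimal sketch of how such a pose-landmark graph can be assembled and optimized with g2o is shown below; the 2D vertex and edge types and the solver configuration are one possible choice and may differ from the team's implementation (the exact solver setup also depends on the g2o version).

// Minimal sketch of front-end graph construction and back-end optimization
// with g2o, assuming the 2D SLAM types (VertexSE2, VertexPointXY, EdgeSE2,
// EdgeSE2PointXY). Solver setup varies slightly between g2o versions.
#include <g2o/core/sparse_optimizer.h>
#include <g2o/core/block_solver.h>
#include <g2o/core/optimization_algorithm_levenberg.h>
#include <g2o/solvers/eigen/linear_solver_eigen.h>
#include <g2o/types/slam2d/types_slam2d.h>
#include <memory>

int main() {
  using BlockSolver = g2o::BlockSolverX;
  auto linear = std::make_unique<g2o::LinearSolverEigen<BlockSolver::PoseMatrixType>>();
  auto* algo = new g2o::OptimizationAlgorithmLevenberg(
      std::make_unique<BlockSolver>(std::move(linear)));
  g2o::SparseOptimizer graph;
  graph.setAlgorithm(algo);

  // Two pose nodes connected by an odometry edge u_{t,t-1}.
  auto* x0 = new g2o::VertexSE2(); x0->setId(0); x0->setEstimate(g2o::SE2(0, 0, 0)); x0->setFixed(true);
  auto* x1 = new g2o::VertexSE2(); x1->setId(1); x1->setEstimate(g2o::SE2(1.0, 0, 0));
  graph.addVertex(x0); graph.addVertex(x1);
  auto* odom = new g2o::EdgeSE2();
  odom->setVertex(0, x0); odom->setVertex(1, x1);
  odom->setMeasurement(g2o::SE2(1.0, 0.0, 0.0));
  odom->setInformation(Eigen::Matrix3d::Identity());
  graph.addEdge(odom);

  // One landmark node and an observation edge z_{i,j}.
  auto* lm = new g2o::VertexPointXY(); lm->setId(2); lm->setEstimate(Eigen::Vector2d(2.0, 1.0));
  graph.addVertex(lm);
  auto* obs = new g2o::EdgeSE2PointXY();
  obs->setVertex(0, x1); obs->setVertex(1, lm);
  obs->setMeasurement(Eigen::Vector2d(1.0, 1.0));   // landmark seen from the robot frame
  obs->setInformation(Eigen::Matrix2d::Identity());
  graph.addEdge(obs);

  graph.initializeOptimization();
  graph.optimize(10);                                // back-end pass
  return 0;
}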
B. Data Association

Data association plays a crucial role in the SLAM problem, since the optimization of the pose heavily relies on recognizing multiple observations of the same landmark. Since every cone has the same appearance, using a feature extraction algorithm to identify repeated landmarks is not a feasible option.

The remaining option is to choose between Bayesian and non-Bayesian approaches for data association. Among the Bayesian methods, Multiple Hypothesis Tracking (MHT) is a commonly used example. However, implementing MHT can be challenging and it is computationally expensive to run. On the other hand, the k-nearest neighbors (kNN) approach is more straightforward, but it can pose problems if the landmarks are not well separated and it is susceptible to noise.

The best results have been obtained using a Joint Compatibility Branch and Bound (JCBB) algorithm, which recursively assigns each new data point to one of the points already present. Compatibility pruning is achieved by considering the Mahalanobis distance both from the new point to the assigned point (individual compatibility) and between all assigned points as a whole (joint compatibility). A modified version of the JCBB algorithm [8] has been chosen and implemented, reducing the complexity of the joint compatibility test from O(n²) to O(1).
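As an illustration, the individual compatibility gate at the core of JCBB can be written as a chi-square test on the Mahalanobis distance of the innovation; the joint test and the branch-and-bound search of [8] are omitted, and the threshold below is an assumed 95% gate.

// Individual-compatibility gate used inside JCBB: a candidate pairing between
// a new observation z and an existing landmark (predicted measurement z_hat
// with innovation covariance S) is kept only if the Mahalanobis distance
// passes a chi-square test. The 2-DoF threshold is an assumption.
#include <Eigen/Dense>

bool individuallyCompatible(const Eigen::Vector2d& z,
                            const Eigen::Vector2d& z_hat,
                            const Eigen::Matrix2d& S,
                            double chi2_threshold = 5.991 /* 95%, 2 DoF */) {
  const Eigen::Vector2d nu = z - z_hat;           // innovation
  const double d2 = nu.dot(S.inverse() * nu);     // squared Mahalanobis distance
  return d2 < chi2_threshold;
}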
VI. TRAJECTORY PLANNING

The goal of the trajectory planning module is to identify the optimal route within the track boundaries that satisfies the vehicle dynamics constraints. Leveraging the SLAM-derived positions and colors of the landmarks, a multi-step process has been employed. The first step involves the discretization of the spatial domain using Delaunay triangulation. In this way, the middle line is easily defined by the center points of the edges connecting vertices of different colors. Then, starting from the position of the car, an offline-generated tree of possible paths is loaded. Finally, a cost function is associated with each branch of the tree, and the branch that best tracks the middle line is selected as the optimal trajectory. This process involves exhaustive exploration of the trajectory space while maintaining real-time computational feasibility and deterministic behavior, effectively addressing the issues related to randomness. It significantly reduces the computational burden, which was a problem in last year's solution based on an online rapidly-exploring random tree (RRT).

Once SLAM detects the loop closure, the essential information for accurately defining the track boundaries is available, and optimization techniques can subsequently be applied to generate the desired racing line. The implementation is based on the minimization of the curvature described in [9]. In order to achieve optimal performance while ensuring stability, a speed profile is generated utilizing the properties of the global optimal line and employing tire grip modelling based on GGV diagrams.
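A simplified sketch of the speed-profile generation is given below: a pointwise lateral-acceleration cap followed by forward and backward passes limited by the remaining longitudinal acceleration, here approximated with a friction ellipse instead of the full GGV data; all limits are illustrative assumptions.

// Simplified speed-profile generation over a fixed path: lateral-acceleration
// cap per point, then forward/backward passes constrained by the remaining
// longitudinal acceleration (friction-ellipse approximation of the GGV).
#include <algorithm>
#include <cmath>
#include <vector>

std::vector<double> speedProfile(const std::vector<double>& curvature,   // kappa_i [1/m]
                                 const std::vector<double>& ds,          // n-1 arc-length steps [m]
                                 double ay_max = 15.0,                   // assumed lateral limit
                                 double ax_acc = 8.0,                    // assumed drive limit
                                 double ax_brk = 14.0,                   // assumed braking limit
                                 double v_max = 25.0) {
  const size_t n = curvature.size();
  std::vector<double> v(n, v_max);
  if (n < 2) return v;

  // 1) Pointwise curvature limit: v <= sqrt(ay_max / |kappa|).
  for (size_t i = 0; i < n; ++i) {
    const double k = std::abs(curvature[i]);
    if (k > 1e-6) v[i] = std::min(v_max, std::sqrt(ay_max / k));
  }

  auto ax_available = [&](double vi, double ki, double ax_lim) {
    // Longitudinal acceleration left over on the friction ellipse.
    const double usage = std::min(1.0, (vi * vi * std::abs(ki)) / ay_max);
    return ax_lim * std::sqrt(1.0 - usage * usage);
  };

  // 2) Forward pass: limit acceleration out of each point.
  for (size_t i = 0; i + 1 < n; ++i) {
    const double ax = ax_available(v[i], curvature[i], ax_acc);
    v[i + 1] = std::min(v[i + 1], std::sqrt(v[i] * v[i] + 2.0 * ax * ds[i]));
  }
  // 3) Backward pass: limit braking into each point.
  for (size_t i = n - 1; i > 0; --i) {
    const double ax = ax_available(v[i], curvature[i], ax_brk);
    v[i - 1] = std::min(v[i - 1], std::sqrt(v[i] * v[i] + 2.0 * ax * ds[i - 1]));
  }
  return v;
}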
VII. MOTION CONTROL

In order to track the reference trajectory as fast as possible, a motion controller consisting of a high-level and a low-level controller has been developed. At high level, a Model Predictive Controller (MPC) governs the lateral vehicle dynamics, while a PI controller is used for speed tracking. At low level, a traction control (TC) and a torque vectoring control (TVC) have been implemented to increase stability.

A. Lateral MPC

A dynamic bicycle model has been adopted in the model-based controller. The equations of motion describing only the lateral dynamics are given by:

\begin{bmatrix} \dot{y} \\ \dot{v}_y \\ \dot{\psi} \\ \dot{r} \end{bmatrix} =
\begin{bmatrix} v_x \sin\psi + v_y \cos\psi \\ \frac{1}{m}\left(F_{y,f}\cos\delta + F_{y,r} - m v_x r\right) \\ r \\ \frac{1}{I_z}\left(l_f F_{y,f}\cos\delta - l_r F_{y,r}\right) \end{bmatrix}    (3)

where m and I_z are the mass and yaw inertia of the vehicle, and l_f and l_r the distances from the center of gravity to the front and rear wheels. The system can be linearized around straight-line driving, obtaining the continuous-time linear model:

\dot{x} = A x + b u    (4)

where x = [y, v_y, \psi, r]^T is the state vector and u = \delta is the control input.
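For reference, one possible form of the linearized matrices, obtained under the additional assumption of a linear tire model with front and rear cornering stiffnesses C_f and C_r (values not reported here) and constant longitudinal speed v_x, is:

A = \begin{bmatrix}
0 & 1 & v_x & 0 \\
0 & -\dfrac{C_f + C_r}{m v_x} & 0 & \dfrac{C_r l_r - C_f l_f}{m v_x} - v_x \\
0 & 0 & 0 & 1 \\
0 & \dfrac{C_r l_r - C_f l_f}{I_z v_x} & 0 & -\dfrac{C_f l_f^2 + C_r l_r^2}{I_z v_x}
\end{bmatrix},
\qquad
b = \begin{bmatrix} 0 \\ \dfrac{C_f}{m} \\ 0 \\ \dfrac{C_f l_f}{I_z} \end{bmatrix}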
After the discretization of the continuous-time model, a linear time-varying MPC (LTV-MPC) problem can be formulated, following the methods presented in [10]:

\min_{\Delta u_{1:N},\, x_{1:N+1}} \; \sum_{k=1}^{N} \left( \| x_k - x_k^{ref} \|_Q^2 + \| \Delta u_k \|_R^2 \right) + \| x_{N+1} - x_{N+1}^{ref} \|_P^2
\text{s.t.} \; x_{k+1} = A x_k + b u_k,    k = 1, \dots, N
      u_{k+1} = u_k + \Delta u_k,    k = 1, \dots, N
      x_1 = \hat{x}, \quad u_1 = \hat{u}
      D x_k + e u_k + f \le 0,    k = 1, \dots, N+1
      \underline{x} \le x_k \le \overline{x},    k = 1, \dots, N+1
      \underline{u} \le u_k \le \overline{u},    k = 1, \dots, N+1
      \underline{\Delta u} \le \Delta u_k \le \overline{\Delta u},    k = 1, \dots, N    (5)

The problem formulation (5) naturally fits the quadratic programming (QP) problem structure and can therefore be solved efficiently by state-of-the-art QP solvers. The OSQP solver [11] has been adopted, since it demonstrated sufficiently low computational time during simulations.

B. Low Level Controllers

The objective of the low-level controllers is to maintain the stability of the vehicle, achieved by implementing torque vectoring and traction control. The TVC generates the reference yaw rate profile r_{ref} using the steady-state relationship between yaw rate and steering angle. A PI controller then takes the error e_r = r_{ref} - r as input to generate a moment M_z, allocated to each wheel as:

T_{iL} = \left( k_{iL} F_x - \frac{2 M_z}{t} \right) R_l    (6)

T_{iR} = \left( k_{iR} F_x + \frac{2 M_z}{t} \right) R_r    (7)

k_{ij} = \frac{F_{z,ij}}{F_{z,tot}}    (8)

where F_x is the longitudinal force generated by the speed tracking control, t is the track width of the vehicle, R the wheel radius, and k_{iL}, k_{iR} are weights based on the load transfer.
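A direct transcription of this allocation into code could look as follows; the interface and units are illustrative.

// Sketch of the TVC torque allocation of Eqs. (6)-(8): the speed-tracking force
// F_x is split per wheel according to the vertical-load weights, and the yaw
// moment M_z shifts torque between the left and right side.
#include <array>

struct WheelTorques { double fl, fr, rl, rr; };   // [Nm]

WheelTorques allocateTorque(double Fx,                      // speed-tracking force request [N]
                            double Mz,                      // yaw moment from the PI on e_r [Nm]
                            double track,                   // track width t [m]
                            double R,                       // wheel radius [m]
                            const std::array<double, 4>& Fz /* fl, fr, rl, rr [N] */) {
  const double Fz_tot = Fz[0] + Fz[1] + Fz[2] + Fz[3];
  const double dF = 2.0 * Mz / track;                       // yaw-moment force term 2*Mz/t
  WheelTorques T;
  T.fl = ((Fz[0] / Fz_tot) * Fx - dF) * R;                  // Eq. (6), front left
  T.rl = ((Fz[2] / Fz_tot) * Fx - dF) * R;                  // Eq. (6), rear left
  T.fr = ((Fz[1] / Fz_tot) * Fx + dF) * R;                  // Eq. (7), front right
  T.rr = ((Fz[3] / Fz_tot) * Fx + dF) * R;                  // Eq. (7), rear right
  return T;
}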
The torque request generated by the TVC does not account for tire slip, which is addressed downstream by the TC if the requested longitudinal force F_x is positive. Conversely, if the requested force is negative, the torque output is directed to an ABS control.

The aim of the proposed TC is to keep the slip of each wheel, denoted as \lambda_i, below a reference slip threshold \lambda_{ref}. The implemented TC considers the error e_\lambda = \lambda - \lambda_{ref} and activates the control if e_\lambda > 0. This solution relies heavily on having a good estimate \hat{\lambda} of the slip. In conditions of low velocity and high torque request, the estimation of the longitudinal slip ratio can be critical, affecting the entire functionality of the motion controller. For this reason, an additional activation logic, based on the acceleration of the wheel and independent of the slip estimation, has been introduced. Under the steady-state assumption (\dot{\lambda} = 0) it is possible to derive a relationship between the wheel angular acceleration \dot{\omega} and the wheel hub acceleration:

\dot{\omega}_{k,std} = \frac{a_{x,k}}{R(1 - \lambda)}    (9)

where a_{x,k} is the kinematic acceleration. The second activation condition for the TC is \dot{\omega} > \dot{\omega}_{k,std}. These two logics are fused together to guarantee correct longitudinal force delivery in the complete range of speed and torque operation.

The TC is deactivated when e_\lambda < 0 and the torque command from the TVC is below the output value of the PI, since the torque command from the TC is assumed to be the greatest possible that keeps the slip close to \lambda_{ref}. The corrected torque is generated using a bumpless PI control logic with a dynamic saturation T_{TC} \in [0, T_{TVC}] and an anti-windup mechanism.

The ABS works in a dual manner; the only difference is that regenerative braking is used up to an electric braking torque threshold, beyond which the hydraulic brake system of the vehicle starts to operate.
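The activation and saturation logic described above can be sketched as follows; the PI gains and the simplified anti-windup and bumpless handling are placeholders, not the calibrated implementation.

// Sketch of the TC activation/deactivation logic: the control intervenes when
// the slip error is positive or when the wheel accelerates faster than the
// steady-state value of Eq. (9), and its output is saturated to [0, T_TVC].
#include <algorithm>

struct TractionControl {
  double kp = 50.0, ki = 200.0;        // assumed PI gains
  double integ = 0.0;

  double step(double T_tvc,            // torque requested by the TVC [Nm]
              double lambda_hat,       // estimated slip ratio [-]
              double lambda_ref,       // slip threshold [-]
              double omega_dot,        // measured wheel acceleration [rad/s^2]
              double ax_kin,           // kinematic hub acceleration [m/s^2]
              double R, double dt) {
    const double e_lambda = lambda_hat - lambda_ref;
    const double omega_dot_std = ax_kin / (R * (1.0 - lambda_hat));   // Eq. (9)
    const bool active = (e_lambda > 0.0) || (omega_dot > omega_dot_std);
    if (!active) { integ = 0.0; return T_tvc; }        // TC not intervening

    // PI on the slip error, with dynamic saturation T_TC in [0, T_TVC]
    // and a simple conditional anti-windup.
    double T_tc = T_tvc - (kp * e_lambda + ki * integ);
    if (T_tc > 0.0 && T_tc < T_tvc) integ += e_lambda * dt;
    return std::clamp(T_tc, 0.0, T_tvc);
  }
};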
VIII. STATE ESTIMATION

State estimation is a fundamental step to ensure accurate odometry data delivery to the designed controllers. The full state to be estimated is:

x = [x, y, \psi, v_x, v_y, r]^T = [p, v, r]^T    (10)

Among these, the critical quantity to be estimated is the vehicle velocity v, since it affects the accuracy of SLAM, which takes as input the odometry measurements u_t = [v_x, v_y, r]^T \cdot T_s and provides an estimate of the pose p.

With the aim of adding redundancy, a custom optical ground speed sensor capable of directly measuring the chassis longitudinal and lateral velocities has been developed. Velocity is measured by computing the displacement between two consecutive frames captured by a fast CMOS image sensor. During development, two alternatives were considered: extracting several feature points from the image and tracking their movement, or tracking all image pixels. Preliminary tests using the FAST algorithm [12] showed poor tracking robustness and large sensitivity to light conditions; therefore, development was shifted to the second approach. Frames are first transformed to the frequency domain to parallelize the workload on CUDA cores, and then several kernels are applied to enhance features and reduce sensor noise, achieving an overall output frequency of 2 kHz. Image stabilization was not employed, since higher shutter speeds provided the same results without additional hardware or processing. At the time of writing, validation against a commercial sensor is still in progress but has shown excellent results at medium-low speeds.

Additionally, a state estimation algorithm to retrieve the vehicle speed and fuse the data from each sensor has been developed.
The state estimation is performed in two stages: estimation of the longitudinal velocity v_x and of the side slip angle \beta.

A. Longitudinal velocity estimation

Four different estimates of the CoG velocity v_{i,j}, evaluated from wheel encoders and IMU data, are fused through a fuzzy logic by computing a weighted mean:

\hat{v}_x = \frac{\sum_{i,j} v_{i,j} \cdot k_{i,j} + (\hat{v}_{x,t-1} + a_x \cdot T_s) \cdot k_l}{\sum_{i,j} k_{i,j} + k_l}    (11)

where k_{i,j} and k_l are normalization coefficients depending on the driving condition (normal driving, braking, strong braking, acceleration, strong acceleration, and cornering) and T_s is the sampling time. A good estimate of v_x is possible since high-slip conditions are avoided beforehand by the TC.
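A sketch of the fusion of Eq. (11) is reported below; the weights would be provided by the fuzzy logic according to the detected driving condition and are left as inputs here.

// Sketch of the weighted-mean fusion of Eq. (11): four wheel-based velocity
// estimates plus the IMU-propagated previous estimate.
#include <array>

double fuseLongitudinalVelocity(const std::array<double, 4>& v_wheel,   // v_{i,j} [m/s]
                                const std::array<double, 4>& k_wheel,   // k_{i,j} from the fuzzy logic
                                double v_prev, double ax, double Ts,    // previous estimate, IMU accel
                                double k_l) {                           // weight of the propagated term
  double num = (v_prev + ax * Ts) * k_l;
  double den = k_l;
  for (int i = 0; i < 4; ++i) {
    num += v_wheel[i] * k_wheel[i];
    den += k_wheel[i];
  }
  return num / den;   // \hat{v}_x
}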
B. Side slip angle estimation

The side slip angle estimation algorithm is based on the fusion of two different methods [13]. The first is a kinematic approach and the second is a continuous-discrete Extended Kalman Filter (EKF) including the dynamic model of the vehicle. The kinematic equation is written as follows:

\dot{\beta}_{kin,t} = \frac{a_y}{v_x} - \frac{a_x \hat{\beta}_{comb,t-1}}{v_x} - r    (12)

The high-frequency component is then filtered out. The dynamic approach instead uses a continuous-discrete EKF whose state variables are x(t) = [\beta_{dyn}, r]^T:

\dot{x}(t) = f(x(t)) + w_x(t) = \begin{bmatrix} \frac{1}{m v_x}\left(F_{y,f} + F_{y,r}\right) - r \\ \frac{1}{I_z}\left(F_{y,f} l_f - F_{y,r} l_r\right) \end{bmatrix} + w_x(t)    (13)

where the lateral forces are modeled with the Pacejka magic formula:

F_{y,i} = D_i \sin\left(C_i \arctan\left(B_i (1 - E_i)\alpha_i + E_i \arctan(B_i \alpha_i)\right)\right)    (14)

where i \in \{f, r\}. The discrete-time measurement equation is:

y(t_k) = \begin{bmatrix} r \\ F_{y,f}(\alpha_f, \beta) \\ F_{y,r}(\alpha_r, \beta) \end{bmatrix} + w_y(t_k)    (15)

The continuous-discrete EKF differs from the discrete-discrete EKF only in the prediction of the state and of the covariance P, which are obtained as follows:

\hat{x}_{n|n-1} = \hat{x}_{n-1|n-1} + f(\hat{x}_{n-1|n-1}) \cdot T_s    (16)

P_{n|n-1} = P_{n-1|n-1} + (F P_{n-1|n-1} + P_{n-1|n-1} F^T + Q) \cdot T_s    (17)

where F = \partial f(x(t)) / \partial x |_{x = \hat{x}_{n-1|n-1}} and Q is the process noise covariance matrix. The combined side slip angle is finally computed as:

\beta_{comb,t} = (\dot{\beta}_{kin,t} \cdot T_s + \beta_{comb,t-1}) \cdot k_{kin} + \beta_{dyn,t} \cdot k_{dyn}    (18)

where k_{kin} and k_{dyn} are normalization coefficients, with k_{dyn} \in [0.7, 1] a function of the norm of the acceleration vector.
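The prediction step of Eqs. (16)-(17) and the fusion of Eq. (18) can be sketched with Eigen as follows; the complementary choice k_kin = 1 - k_dyn is an assumption of the sketch, and f and its Jacobian are passed in rather than reproduced here.

// Sketch of the continuous-discrete EKF prediction of Eqs. (16)-(17) and the
// kinematic/dynamic fusion of Eq. (18).
#include <Eigen/Dense>
#include <functional>

struct SideslipPrediction {
  Eigen::Vector2d x;   // [beta_dyn, r]
  Eigen::Matrix2d P;   // state covariance
};

SideslipPrediction predict(const Eigen::Vector2d& x_prev,
                           const Eigen::Matrix2d& P_prev,
                           const std::function<Eigen::Vector2d(const Eigen::Vector2d&)>& f,
                           const Eigen::Matrix2d& F,   // df/dx evaluated at x_prev
                           const Eigen::Matrix2d& Q,   // process noise covariance
                           double Ts) {
  SideslipPrediction out;
  out.x = x_prev + f(x_prev) * Ts;                                   // Eq. (16)
  out.P = P_prev + (F * P_prev + P_prev * F.transpose() + Q) * Ts;   // Eq. (17)
  return out;
}

double fuseSideslip(double beta_dot_kin, double beta_comb_prev,
                    double beta_dyn, double k_dyn, double Ts) {
  const double k_kin = 1.0 - k_dyn;   // assumed complementary weighting
  return (beta_dot_kin * Ts + beta_comb_prev) * k_kin + beta_dyn * k_dyn;   // Eq. (18)
}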
IX. SIMULATION

Efficient testing of the software pipeline is needed to reduce extensive, time-consuming and labor-intensive on-track tests. The development of a simulator able to closely match the dynamics of the real vehicle and replicate the same interfaces is therefore a fundamental objective for evaluating the software pipeline intended for direct deployment on the real vehicle. A custom simulator has been developed using the ROS2 Humble framework. To accurately represent vehicle dynamics, it incorporates a dynamic bicycle model with non-linear tire force laws.

Due to the inherent challenges of replicating authentic sensors and environmental conditions, the perception pipeline was evaluated using real data acquisitions, as they provide an accurate representation of the complex and dynamic real-world scenarios encountered during vehicle operation. Within the simulation environment, direct cone observations are virtualized, incorporating noise on both the position and the color estimates; these simulated observations are fed into the remaining components of the software pipeline.

Furthermore, an additional virtual environment has been established using MATLAB/Simulink to evaluate the performance and efficacy of the low-level control algorithms. This environment incorporates a highly detailed vehicle model implemented with the VI-grade [14] software.

REFERENCES

[1] J. K. et al., "AMZ driverless: The full autonomous racing system," 2019.
[2] A. A. et al., "The software stack that won the Formula Student Driverless competition," 2022.
[3] M. Himmelsbach, F. v. Hundelshausen, and H.-J. Wuensche, "Fast segmentation of 3D point clouds for ground vehicles," in 2010 IEEE Intelligent Vehicles Symposium, 2010, pp. 560–565.
[4] A. H. Lang, S. Vora, H. Caesar, L. Zhou, J. Yang, and O. Beijbom, "PointPillars: Fast encoders for object detection from point clouds," in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12689–12697.
[5] S. Xie, J. Gu, D. Guo, C. Qi, L. J. Guibas, and O. Litany, "PointContrast: Unsupervised pre-training for 3D point cloud understanding," arXiv:2007.10985, 2020.
[6] Z. Jiang, L. Zhao, S. Li, and Y. Jia, "Real-time object detection method based on improved YOLOv4-tiny," 2020.
[7] G. Grisetti, R. Kümmerle, C. Stachniss, and W. Burgard, "A tutorial on graph-based SLAM," IEEE Intelligent Transportation Systems Magazine, vol. 2, no. 4, pp. 31–43, 2010.
[8] X. Shen, E. Frazzoli, D. Rus, and M. H. Ang, "Fast joint compatibility branch and bound for feature cloud matching," in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 1757–1764.
[9] A. Heilmeier, A. Wischnewski, L. Hermansdorfer, J. Betz, M. Lienkamp, and B. Lohmann, "Minimum curvature trajectory planning and control for an autonomous race car," Vehicle System Dynamics, vol. 58, no. 10, pp. 1497–1527, 2020.
[10] F. Borrelli, A. Bemporad, and M. Morari, Predictive Control for Linear and Hybrid Systems. Cambridge University Press, 2017.
[11] B. Stellato, G. Banjac, P. Goulart, A. Bemporad, and S. Boyd, "OSQP: An operator splitting solver for quadratic programs," Mathematical Programming Computation, vol. 12, no. 4, pp. 637–672, 2020.
[12] E. Rosten and T. Drummond, "Fusing points and lines for high performance tracking," in Tenth IEEE International Conference on Computer Vision (ICCV'05), vol. 2, 2005, pp. 1508–1515.
[13] E. Villano, B. Lenzo, and A. S., "Cross-combined UKF for vehicle sideslip angle estimation with a modified Dugoff tire model: design and experimental results," Meccanica, vol. 56, no. 11, pp. 2653–2668.
[14] VI-grade, "VI-grade Car Real Time," https://www.vi-grade.com/.