Publications
UniCal: Unified Neural Sensor Calibration | European Conference on Computer Vision (ECCV), 2024 |
Ze Yang, George G. Chen, Haowei Zhang, Kevin Ta, Ioan Andrei Bârsan, Daniel Murphy, Sivabalan Manivasagam, and Raquel Urtasun |
[paper][blog post][bibtex][arxiv] |
Self-driving vehicles (SDVs) require accurate calibration of LiDARs and cameras to fuse sensor data reliably for autonomy. Traditional calibration methods typically leverage fiducials captured in a controlled and structured scene and compute correspondences to optimize over. These approaches are costly and require substantial infrastructure and operations, making them challenging to scale across vehicle fleets. In this work, we propose UniCal, a unified framework for effortlessly calibrating SDVs equipped with multiple LiDARs and cameras. Our approach is built upon a differentiable scene representation capable of rendering multi-view geometrically and photometrically consistent sensor observations. We jointly learn the sensor calibration and the underlying scene representation through differentiable volume rendering, utilizing outdoor sensor data without the need for specific calibration fiducials. This "drive-and-calibrate" approach significantly reduces costs and operational overhead compared to existing calibration systems, enabling efficient calibration for large SDV fleets at scale. To ensure geometric consistency across observations from different sensors, we introduce a novel surface alignment loss that combines feature-based registration with neural rendering, as well as a coarse-to-fine sampling approach to optimize regions of interest for sensor alignment. Comprehensive evaluations on multiple datasets demonstrate that UniCal matches or outperforms the accuracy of existing calibration approaches while being more efficient, highlighting its value for scalable calibration.
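To give a flavor of the joint-optimization idea, here is a minimal, hypothetical PyTorch sketch: a learnable 6-DoF extrinsic and a tiny MLP "scene" are optimized together against a photometric consistency loss between scene-predicted intensities and camera pixels sampled at projected lidar points. The intrinsics, network, and point-splat projection are invented placeholders, not UniCal's actual volume-rendering architecture.

```python
import torch
import torch.nn.functional as F

# Hypothetical pinhole intrinsics and synthetic stand-in data.
H, W, fx, fy, cx, cy = 64, 64, 60.0, 60.0, 32.0, 32.0
image = torch.rand(1, 1, H, W)           # one camera frame (grayscale)
points = torch.rand(500, 3) * 4 - 2      # lidar returns, lidar frame
points[:, 2] += 5.0                      # keep points in front of the camera

scene = torch.nn.Sequential(             # tiny MLP standing in for the
    torch.nn.Linear(3, 64), torch.nn.ReLU(),  # differentiable scene model
    torch.nn.Linear(64, 1), torch.nn.Sigmoid())
rvec = (0.01 * torch.randn(3)).requires_grad_()  # learnable rotation (axis-angle)
tvec = torch.zeros(3, requires_grad=True)        # learnable translation
opt = torch.optim.Adam([{'params': scene.parameters(), 'lr': 1e-3},
                        {'params': [rvec, tvec], 'lr': 1e-2}])

def rotation(r):
    """Rodrigues' formula: axis-angle vector -> 3x3 rotation matrix."""
    theta = r.norm() + 1e-8
    k = r / theta
    zero = torch.zeros((), dtype=r.dtype)
    K = torch.stack([torch.stack([zero, -k[2], k[1]]),
                     torch.stack([k[2], zero, -k[0]]),
                     torch.stack([-k[1], k[0], zero])])
    return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * K @ K

for step in range(200):
    R = rotation(rvec)
    p_cam = points @ R.T + tvec                   # lidar -> camera frame
    z = p_cam[:, 2].clamp(min=1e-3)
    u = (fx * p_cam[:, 0] / z + cx) / W * 2 - 1   # pixel coords -> [-1, 1]
    v = (fy * p_cam[:, 1] / z + cy) / H * 2 - 1
    grid = torch.stack([u, v], dim=-1).view(1, -1, 1, 2)
    observed = F.grid_sample(image, grid, align_corners=False).view(-1)
    rendered = scene(points).view(-1)             # scene-predicted intensity
    loss = F.l1_loss(rendered, observed)          # photometric consistency
    opt.zero_grad(); loss.backward(); opt.step()
```

Because `grid_sample` is differentiable with respect to the sampling grid, the photometric loss propagates gradients into the extrinsic parameters as well as the scene, which is the core mechanism that lets calibration be learned without fiducials.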
MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty | European Conference on Computer Vision (ECCV), 2024 |
Tim Brödermann, David Bruggemann, Christos Sakaridis, Kevin Ta, Odysseas Liagouris, Jason Corkill, and Luc Van Gool |
[paper][dataset][code][bibtex][arxiv] |
Achieving level-5 driving automation in autonomous vehicles necessitates a robust semantic visual perception system capable of parsing data from different sensors across diverse conditions. However, existing semantic perception datasets often lack important non-camera modalities typically used in autonomous vehicles, or they do not exploit such modalities to aid and improve semantic annotations in challenging conditions. To address this, we introduce MUSES, the MUlti-SEnsor Semantic perception dataset for driving in adverse conditions under increased uncertainty. MUSES includes synchronized multimodal recordings with 2D panoptic annotations for 2500 images captured under diverse weather and illumination. The dataset integrates a frame camera, a lidar, a radar, an event camera, and an IMU/GNSS sensor. Our new two-stage panoptic annotation protocol captures both class-level and instance-level uncertainty in the ground truth and enables the novel task of uncertainty-aware panoptic segmentation we introduce, along with standard semantic and panoptic segmentation. MUSES proves both effective for training and challenging for evaluating models under diverse visual conditions, and it opens new avenues for research in multimodal and uncertainty-aware dense semantic perception.
UncLe-SLAM: Uncertainty Learning for Dense Neural SLAM | ICCV 2nd Workshop on Uncertainty Quantification for Computer Vision (ICCVW), 2023 |
Kevin Ta*, Erik Sandström*, Luc Van Gool, and Martin R. Oswald |
[paper][code][bibtex][arxiv] |
We present an uncertainty learning framework for dense neural simultaneous localization and mapping (SLAM). Estimating pixel-wise uncertainties for the depth input of dense SLAM methods allows re-weighting the tracking and mapping losses towards image regions that contain more reliable information for SLAM. To this end, we propose an online framework for sensor uncertainty estimation that can be trained in a self-supervised manner from only 2D input data. We further discuss the advantages of uncertainty learning for the case of multi-sensor input. Extensive analysis, experimentation, and ablations show that our proposed modeling paradigm improves both mapping and tracking accuracy and often performs better than alternatives that require ground-truth depth or 3D. Our experiments show that we achieve 38% and 27% lower absolute trajectory tracking error (ATE) on the 7-Scenes and TUM-RGBD datasets, respectively. On the popular Replica dataset, across two types of depth sensors, we report an 11% F1-score improvement for RGBD SLAM compared to recent state-of-the-art neural implicit approaches.
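A minimal sketch of the re-weighting idea, assuming a Laplace noise model in the spirit of self-supervised aleatoric uncertainty: a small network predicts a per-pixel scale b from the 2D depth input, and the depth residual is weighted by 1/b with a log b term so the network cannot trivially inflate all uncertainties. The network shape and inputs below are illustrative, not the paper's architecture.

```python
import torch

# Hypothetical per-pixel uncertainty head operating on the sensor depth map.
uncertainty_head = torch.nn.Sequential(
    torch.nn.Conv2d(1, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(16, 1, 3, padding=1), torch.nn.Softplus())

def weighted_depth_loss(depth_obs, depth_render):
    """Laplace negative log-likelihood per pixel: |residual| / b + log b."""
    b = uncertainty_head(depth_obs) + 1e-3       # predicted per-pixel scale
    residual = (depth_render - depth_obs).abs()
    return (residual / b + b.log()).mean()

# Toy usage: a rendered depth map compared against a noisy sensor depth map.
# In a real SLAM loop, depth_render would also carry gradients from the map
# and pose, so the same loss re-weights both tracking and mapping.
obs = torch.rand(1, 1, 32, 32) + 1.0
render = obs + 0.05 * torch.randn_like(obs)
loss = weighted_depth_loss(obs, render)
loss.backward()
```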
L2E: Lasers to Events for 6-DoF Extrinsic Calibration of Lidars and Event Cameras | International Conference on Robotics and Automation (ICRA), 2023 |
Kevin Ta, David Bruggemann, Tim Brödermann, Christos Sakaridis, and Luc Van Gool |
[paper][code][bibtex][arxiv] |
As neuromorphic technology is maturing, its application to robotics and autonomous vehicle systems has become an area of active research. In particular, event cameras have emerged as a compelling alternative to frame-based cameras in low-power and latency-demanding applications. To enable event cameras to operate alongside staple sensors like lidar in perception tasks, we propose a direct, temporally-decoupled extrinsic calibration method between event cameras and lidars. The high dynamic range, high temporal resolution, and low-latency operation of event cameras are exploited to directly register lidar laser returns, allowing information-based correlation methods to optimize for the 6-DoF extrinsic calibration between the two sensors. This paper presents the first direct calibration method between event cameras and lidars, removing dependencies on frame-based camera intermediaries and/or highly accurate hand measurements.
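An illustrative sketch of information-based extrinsic calibration: project lidar returns into the event camera image under a candidate 6-DoF transform and score the candidate by the mutual information between lidar reflectance and the event-count image at the projected pixels. The intrinsics, data, and derivative-free optimizer below are placeholders, not the paper's actual setup.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

# Hypothetical event-camera intrinsics and synthetic stand-in data.
H, W, fx, fy, cx, cy = 120, 160, 100.0, 100.0, 80.0, 60.0
event_img = np.random.rand(H, W)          # accumulated event counts
pts = np.random.rand(2000, 3) * 4 - 2     # lidar points, lidar frame
pts[:, 2] += 6.0
refl = np.random.rand(2000)               # per-point lidar reflectance

def mutual_information(a, b, bins=32):
    """MI between two sample vectors via a joint 2D histogram."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    nz = pxy > 0
    return (pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum()

def neg_mi(params):
    """Negative MI for a candidate extrinsic (rotvec + translation)."""
    R = Rotation.from_rotvec(params[:3]).as_matrix()
    p = pts @ R.T + params[3:]
    z = np.clip(p[:, 2], 1e-3, None)
    u = (fx * p[:, 0] / z + cx).astype(int)
    v = (fy * p[:, 1] / z + cy).astype(int)
    ok = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    if ok.sum() < 100:
        return 0.0                        # penalize transforms leaving the image
    return -mutual_information(refl[ok], event_img[v[ok], u[ok]])

result = minimize(neg_mi, np.zeros(6), method='Nelder-Mead')
```

On real data, correctly aligned reflectance and event intensity are statistically dependent, so the true extrinsic maximizes the mutual information; on the random arrays above, the code merely demonstrates the optimization machinery.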
ATTENTIV: Instrumented Peripheral Catheter for the Detection of Catheter Dislodgement in IV Infiltration | International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2022 |
Jessica Y. Bo, Kevin Ta, Rio Nishida, Gordon Yeh, Vivian W. L. Tsang, Megan Bolton, Manon Ranger, and Konrad Walus |
[paper][code][bibtex] |
Intravenous (IV) infiltration is a common problem associated with IV infusion therapy in clinical practice. A multitude of factors can cause the leakage of IV fluids into the surrounding tissues, resulting in symptoms ranging from temporary swelling to permanent tissue damage. Severe infiltration outcomes can be avoided or minimized if the patient's care provider is alerted of the infiltration at its earliest onset. However, there is a lack of real-time, continuous infiltration monitoring solutions, especially those suited for clinical use for critically ill patients. Our design of the sensor-integrated ATTENTIV catheter allows direct detection of catheter dislodgement, a root cause of IV infiltration. We verify two detection methods: blood-tissue differentiation with a support vector machine and signal peak identification with a thresholding algorithm. We present promising preliminary testing results on biological and phantom models that utilize bioimpedance as the sensing modality. Clinical relevance: The sensor-embedded ATTENTIV catheter demonstrates potential to automate IV infiltration detection in lieu of traditional infusion catheters and manual detection methods.
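A simplified sketch of the thresholding idea: blood and tissue have different bioimpedance, so dislodgement shows up as the smoothed impedance trace crossing a calibrated threshold. The signal values, threshold, window length, and direction of the change (a drop is assumed here) are invented for illustration and are not the paper's calibrated parameters.

```python
import numpy as np

def detect_dislodgement(impedance, threshold, window=5):
    """Return sample indices where windowed mean impedance falls below threshold."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(impedance, kernel, mode='same')  # suppress noise
    return np.flatnonzero(smoothed < threshold)

# Synthetic trace: stable in-vessel impedance, then a drop after dislodgement.
signal = np.concatenate([np.full(100, 800.0), np.full(50, 300.0)])
signal += np.random.randn(signal.size) * 10
alerts = detect_dislodgement(signal, threshold=500.0)
print("first alert at sample", alerts[0] if alerts.size else None)
```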
Offline and Real-Time Implementation of a Terrain Classification Pipeline for Pushrim-Activated Power-Assisted Wheelchairs | International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2021 |
Mahsa Khalili, Kevin Ta, H. F. Machiel Van der Loos, and Jaimie F. Borisoff |
[paper][code][bibtex] |
Pushrim-activated power-assisted wheelchairs (PAPAWs) are assistive technologies that provide propulsion assist to wheelchair users and enable access to various indoor and outdoor terrains. Therefore, it is beneficial to use PAPAW controllers that adapt to different terrain conditions. To achieve this objective, terrain classification techniques can be used as an integral part of the control architecture. Previously, the feasibility of using learning-based terrain classification models was investigated for offline applications. In this paper, we examine the effects of three model parameters (i.e., feature characteristics, terrain types, and the length of data segments) on offline and real-time classification accuracy. Our findings revealed that Random Forest classifiers are computationally efficient and can be used effectively for real-time terrain classification. These classifiers achieve the highest accuracy when used with a combination of time- and frequency-domain features. Additionally, we found that increasing the number of data points used for terrain estimation improves prediction accuracy. Finally, our results showed that classification accuracy can be improved by grouping terrains with similar characteristics under one umbrella category. These findings can contribute to the development of real-time adaptive controllers that enhance PAPAW usability on different terrains.
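A minimal sketch of this kind of pipeline: extract time-domain statistics and coarse frequency-band energies from fixed-length IMU segments and feed them to a Random Forest. The feature set, segment length, and synthetic data are illustrative stand-ins, not the paper's exact configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def segment_features(seg):
    """Time-domain stats plus coarse FFT band energies for one IMU segment."""
    spectrum = np.abs(np.fft.rfft(seg))
    bands = [band.sum() for band in np.array_split(spectrum, 4)]
    return np.array([seg.mean(), seg.std(), np.ptp(seg), *bands])

# Synthetic stand-in data: 200 gyro segments of 128 samples, 3 terrain labels.
rng = np.random.default_rng(0)
segments = rng.standard_normal((200, 128))
labels = rng.integers(0, 3, size=200)

X = np.stack([segment_features(s) for s in segments])
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```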
Offline and Real-Time Implementation of a Personalized Wheelchair User Intention Detection Pipeline: A Case Study | International Conference on Robot and Human Interactive Communication (ROMAN), 2021 |
Mahsa Khalili, Kevin Ta, H. F. Machiel Van der Loos, and Jaimie F. Borisoff |
[paper][code][bibtex] |
Pushrim-activated power-assisted wheels (PAPAWs) are assistive technologies that provide on-demand assistance to wheelchair users. PAPAWs operate based on a collaborative control scheme and require an accurate interpretation of the user's intent to provide effective propulsion assistance. This paper investigates a user-specific intention estimation framework for wheelchair users. We used Gaussian mixture models (GMMs) to identify implicit intentions from user-pushrim interactions (i.e., input torque to the pushrims). Six clusters emerged, associated with different phases of a stroke pattern and with the intended direction of motion. GMM predictions were used as 'ground truth' labels for further intention estimation analysis. Next, Random Forest (RF) classifiers were trained to predict user intentions. The best-performing classifier had an overall prediction accuracy of 94.7%. Finally, a Bayesian filtering (BF) algorithm was used to extract sequential dependencies of the user-pushrim measurements. The BF algorithm improved sequences of intention predictions for some wheelchair maneuvers compared to the GMM and RF predictions. The proposed intention estimation pipeline is computationally efficient and was successfully tested for real-time prediction of a wheelchair user's intentions. This framework provides the foundation for the development of user-specific, adaptive PAPAW controllers.
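A compact sketch of the three-stage pipeline described above: GMM clustering of pushrim-torque windows produces pseudo-labels, a Random Forest learns to predict them, and a simple Bayesian filter smooths the per-window class probabilities over time. The data shapes and the sticky transition prior are invented for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
torque = rng.standard_normal((500, 4))     # stand-in torque features per window

# Stage 1: unsupervised clusters serve as pseudo "ground truth" intentions.
gmm = GaussianMixture(n_components=6, random_state=0).fit(torque)
pseudo = gmm.predict(torque)
# Stage 2: a Random Forest learns to predict the cluster labels.
rf = RandomForestClassifier(random_state=0).fit(torque, pseudo)

# Stage 3: Bayesian filtering over the RF's per-window class probabilities,
# with a sticky transition prior that favors keeping the current intention.
n = len(rf.classes_)
T = np.full((n, n), 0.12 / (n - 1)) + np.eye(n) * (0.88 - 0.12 / (n - 1))
belief = np.full(n, 1 / n)
for probs in rf.predict_proba(torque[:50]):
    belief = (T.T @ belief) * probs        # predict step, then measurement update
    belief /= belief.sum()
```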
Development of A Learning-Based Terrain Classification Framework for Pushrim-Activated Power-Assisted Wheelchairs | International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2020 |
Mahsa Khalili, Keenan T. McConkey, Kevin Ta, Lyndia C. Wu, H. F. Machiel Van der Loos, and Jaimie F. Borisoff |
[paper][code][bibtex] |
Pushrim-activated power-assisted wheels (PAPAWs) are assistive technologies that provide on-demand torque assistance to wheelchair users. Although the available power can reduce the physical load of wheelchair propulsion, it may also cause maneuverability and controllability issues. Commercially available PAPAW controllers are insensitive to environmental changes, leading to inefficient and/or unsafe wheelchair movements. In this regard, adaptive velocity/torque control strategies could be employed to improve safety and stability. To investigate this objective, we propose a context-aware sensory framework to recognize terrain conditions. In this paper, we present a learning-based terrain classification framework for PAPAWs. Study participants performed various maneuvers consisting of common daily-life wheelchair propulsion routines on different indoor and outdoor terrains. Relevant features from wheelchair frame-mounted gyroscope and accelerometer measurements were extracted and used to train and test the proposed classifiers. Our findings revealed that a one-stage multi-label classification framework achieves higher accuracy than a two-stage classification pipeline with indoor-outdoor classification in the first stage. We also found that, on average, outdoor terrains can be classified with higher accuracy (90%) than indoor terrains (65%). This framework can be used for real-time terrain classification applications and can provide the required information for an adaptive velocity/torque controller design.
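A sketch contrasting the two designs compared in the paper: a one-stage classifier over all terrain labels versus a two-stage pipeline that first predicts indoor vs. outdoor and then dispatches to a per-group classifier. The features, label grouping, and models are synthetic placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 8))                  # stand-in IMU feature vectors
terrain = rng.integers(0, 4, size=300)             # e.g. labels 0-1 indoor, 2-3 outdoor
is_outdoor = terrain >= 2

# One-stage: predict the terrain label directly.
one_stage = RandomForestClassifier(random_state=0).fit(X, terrain)

# Two-stage: an indoor/outdoor gate, then a specialist classifier per group.
gate = RandomForestClassifier(random_state=0).fit(X, is_outdoor)
indoor = RandomForestClassifier(random_state=0).fit(X[~is_outdoor], terrain[~is_outdoor])
outdoor = RandomForestClassifier(random_state=0).fit(X[is_outdoor], terrain[is_outdoor])

def two_stage_predict(x):
    x = x.reshape(1, -1)
    return (outdoor if gate.predict(x)[0] else indoor).predict(x)[0]
```

The paper's finding favors the one-stage design: errors made by the gate in the two-stage pipeline cannot be recovered by the downstream specialists.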