Ado.net Entity Data Model Visual Studio 2019

Transformer-Based Model for Monocular Visual Odometry: A Video Understanding Approach

Abstract: Estimating the camera’s pose given images from a single camera is a traditional task in mobile robots and autonomous vehicles. This problem is called monocular visual odometry and often ...

Microsoft

LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation

CLIP is one of the most important multimodal foundational models today. What powers CLIP’s capabilities? The rich supervision signals provided by natural language, the carrier of human knowledge, ...

IEEE

SHMamba: Structured Hyperbolic State Space Model for Audio-Visual Question Answering

Abstract: The Audio-Visual Question Answering (AVQA) task holds significant potential for applications. Compared to traditional unimodal approaches, the multi-modal input of AVQA makes feature ...

GitHub

Integrating Feed-Forward Visual Model with IMU, GNSS for High-Functionality SLAM

MASt3R-Fusion is a SLAM system that tightly integrates feed-forward pointmap regression with multi-sensor data (e.g., IMU, GNSS), drawing inspiration from MASt3R-SLAM. It is designed for practical, ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results