Yao et al., 2024 - Google Patents

QE-BEV: Query evolution for bird's eye view object detection in varied contexts

Document ID
10017867238122866815
Author
Yao J
Lai Y
Kou H
Wu T
Liu R
Publication year
2024
Publication venue
Proceedings of the 32nd ACM International Conference on Multimedia

Snippet

3D object detection plays a pivotal role in autonomous driving and robotics, demanding precise interpretation of Bird's Eye View (BEV) images. The dynamic nature of real-world environments necessitates the use of dynamic query mechanisms in 3D object detection to …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6201 Matching; Proximity measures
    • G06K9/6202 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20 Image acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00624 Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement

Similar Documents

Publication Publication Date Title
Song et al. Robustness-aware 3d object detection in autonomous driving: A review and outlook
Yao et al. QE-BEV: Query evolution for bird's eye view object detection in varied contexts
Cai et al. Objectfusion: Multi-modal 3d object detection with object-centric fusion
Huang et al. Tig-bev: Multi-view bev 3d object detection via target inner-geometry learning
Islam et al. MVS‐SLAM: Enhanced multiview geometry for improved semantic RGBD SLAM in dynamic environment
CN109683699A (en) The method, device and mobile terminal of augmented reality are realized based on deep learning
CN110986969A (en) Map fusion method and device, equipment and storage medium
Li et al. Unimode: Unified monocular 3d object detection
He et al. Alss-yolo: An adaptive lightweight channel split and shuffling network for tir wildlife detection in uav imagery
Wang et al. Mv2dfusion: Leveraging modality-specific object semantics for multi-modal 3d detection
Zhang et al. Lapose: Laplacian mixture shape modeling for rgb-based category-level object pose estimation
Dong et al. Visual detection algorithm for enhanced environmental perception of unmanned surface vehicles in complex marine environments
Song et al. SCE-SLAM: a real-time semantic RGBD SLAM system in dynamic scenes based on spatial coordinate error
Kim et al. Predict to detect: Prediction-guided 3d object detection using sequential images
Zhang et al. Dsnet: double strand robotic grasp detection network based on cross attention
Xiang et al. Fusionvit: Hierarchical 3d object detection via lidar-camera vision transformer fusion
Chen et al. Graph-detr4d: Spatio-temporal graph modeling for multi-view 3d object detection
Fu et al. Monomm: a multi-scale mamba-enhanced network for real-time monocular 3D object detection
CN105303554A (en) Image feature point 3D reconstruction method and device
Sun et al. GGC-SLAM: A VSLAM system based on predicted static probability of feature points in dynamic environments
Liu et al. YOLO-BEV: Generating Bird's-Eye view in the same way as 2D object detection
Hou et al. PolarBEVU: Multi-Camera 3D Object Detection in Polar Bird’s-Eye View via Unprojection
Tao et al. 3d semantic VSLAM of indoor environment based on mask scoring RCNN
Li et al. Stereo neural vernier caliper
CN117455972A (en) UAV ground target positioning method based on monocular depth estimation