Yao et al., 2024 - Google Patents
QE-BEV: Query evolution for bird's eye view object detection in varied contexts
- Document ID
- 10017867238122866815
- Author
- Yao J
- Lai Y
- Kou H
- Wu T
- Liu R
- Publication year
- 2024
- Publication venue
- Proceedings of the 32nd ACM International Conference on Multimedia
Snippet
3D object detection plays a pivotal role in autonomous driving and robotics, demanding precise interpretation of Bird's Eye View (BEV) images. The dynamic nature of real-world environments necessitates the use of dynamic query mechanisms in 3D object detection to …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Song et al. | Robustness-aware 3d object detection in autonomous driving: A review and outlook | |
| Yao et al. | QE-BEV: Query evolution for bird's eye view object detection in varied contexts | |
| Cai et al. | Objectfusion: Multi-modal 3d object detection with object-centric fusion | |
| Huang et al. | Tig-bev: Multi-view bev 3d object detection via target inner-geometry learning | |
| Islam et al. | MVS‐SLAM: Enhanced multiview geometry for improved semantic RGBD SLAM in dynamic environment | |
| CN109683699A (en) | The method, device and mobile terminal of augmented reality are realized based on deep learning | |
| CN110986969A (en) | Map fusion method and device, equipment and storage medium | |
| Li et al. | Unimode: Unified monocular 3d object detection | |
| He et al. | Alss-yolo: An adaptive lightweight channel split and shuffling network for tir wildlife detection in uav imagery | |
| Wang et al. | Mv2dfusion: Leveraging modality-specific object semantics for multi-modal 3d detection | |
| Zhang et al. | Lapose: Laplacian mixture shape modeling for rgb-based category-level object pose estimation | |
| Dong et al. | Visual detection algorithm for enhanced environmental perception of unmanned surface vehicles in complex marine environments | |
| Song et al. | SCE-SLAM: a real-time semantic RGBD SLAM system in dynamic scenes based on spatial coordinate error | |
| Kim et al. | Predict to detect: Prediction-guided 3d object detection using sequential images | |
| Zhang et al. | Dsnet: double strand robotic grasp detection network based on cross attention | |
| Xiang et al. | Fusionvit: Hierarchical 3d object detection via lidar-camera vision transformer fusion | |
| Chen et al. | Graph-detr4d: Spatio-temporal graph modeling for multi-view 3d object detection | |
| Fu et al. | Monomm: a multi-scale mamba-enhanced network for real-time monocular 3D object detection | |
| CN105303554A (en) | Image feature point 3D reconstruction method and device | |
| Sun et al. | GGC-SLAM: A VSLAM system based on predicted static probability of feature points in dynamic environments | |
| Liu et al. | YOLO-BEV: Generating Bird's-Eye view in the same way as 2D object detection | |
| Hou et al. | PolarBEVU: Multi-Camera 3D Object Detection in Polar Bird’s-Eye View via Unprojection | |
| Tao et al. | 3d semantic VSLAM of indoor environment based on mask scoring RCNN | |
| Li et al. | Stereo neural vernier caliper | |
| CN117455972A (en) | UAV ground target positioning method based on monocular depth estimation |