CN118521985B - A road environment perception method, device, system, and storage medium - Google Patents
- Publication number
- CN118521985B (application number CN202410744236.XA)
- Authority
- CN
- China
- Prior art keywords
- road environment
- image
- road
- fusion
- visible light
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/588—Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/54—Extraction of image or video features relating to texture
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
 
- 
        - G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
 
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a road environment perception method, device, system, and storage medium. The method comprises the following steps: S100, acquiring road environment images; S200, preprocessing the road environment images; S300, performing multi-source data registration and fusion on the preprocessed road environment images; and S400, performing segmentation, recognition, and perception on the registered and fused road images, thereby recognizing the road environment in high-altitude alpine regions. With this technical scheme, the environment outside the vehicle is monitored by combining thermal imaging images, visible light images, and point cloud data, so that the surroundings of the vehicle can be monitored in real time, early warnings can be issued promptly, recognition accuracy in severe road environments can be improved, and driving safety is significantly improved.
    Description
Technical Field
      The invention belongs to the technical field of image processing, and particularly relates to a road environment perception method, device, system, and storage medium.
    Background
      In high-altitude alpine regions, driving safety faces great challenges from severe weather and from special conditions such as pedestrians, non-motor vehicles, livestock, and wild animals intruding onto the road. In these complex environments, it is difficult to identify road conditions accurately by computer vision techniques alone.
      Currently, intelligent vehicles are commonly equipped with lidars, millimeter-wave radars, and industrial cameras for recognizing the road environment. Fusing radar and camera data can capture image-based environment information faithfully and intuitively, but in extreme weather such as dust or fog, radar and camera equipment cannot clearly sense all road condition information for the current road, which is dangerous under the complex road conditions of high-altitude alpine regions.
    Disclosure of Invention
      The technical problem to be solved by the invention is to provide a road environment perception method, device, system, and storage medium in which the environment outside the vehicle is monitored by combining thermal imaging images, visible light images, and point cloud data, so that the surroundings of the vehicle can be monitored in real time, early warnings can be issued promptly, recognition accuracy in severe road environments can be improved, and driving safety is significantly improved.
      In order to achieve the above purpose, the present invention adopts the following technical scheme:
       The invention provides a road environment perception method that recognizes and perceives the road environment based on multiple sensor devices. The method performs multi-source data fusion on information acquired by sensors such as a visible light camera, a thermal imaging camera, a millimeter-wave radar, and a lidar, and transmits the information to a vehicle-mounted edge computer by narrowband and wired transmission. It recognizes and perceives the road environment by combining image fusion technology with a deep learning algorithm, performs path planning according to this information, and transmits road conditions to a road cloud through 5G signals and Internet of Vehicles technology, so that the road centralized control center can learn of road conditions in time. If dangerous road conditions occur, the driver is notified promptly to take corresponding measures, ensuring safe driving. The method comprises the following steps:
       S100, acquiring road environment images, wherein the road environment images are road environment information acquired by a visible light camera, a thermal imaging camera and radar equipment; 
       S200, preprocessing the road environment image; 
       S300, performing multi-source data registration and fusion on the preprocessed road environment image, wherein the registration and fusion comprises: 
       performing multi-source data registration on the visible light image captured by the visible light camera and the infrared image captured by the thermal imaging camera by means of radar point cloud information; 
       calculating the structure and texture information of the registered multi-source images through a texture-information-based visible light and infrared adversarial fusion model, taking the structure and texture information as the base layer of the fused image, obtaining the fused image by color space conversion, and mapping the processed point cloud information into the fused image; 
       S400, performing segmentation, recognition, and perception on the registered and fused road images, thereby recognizing the road environment in high-altitude alpine regions. 
      Preferably, preprocessing the road environment image comprises constructing the relation between the gray value of the image and the temperature according to the infrared image data acquired by the thermal imaging camera.
      Preferably, preprocessing the road environment image comprises denoising the point cloud data acquired by the radar equipment.
      Preferably, preprocessing the road environment image comprises sharpening (clarity enhancement of) the road environment image data acquired by the visible light camera.
      Preferably, S400 includes:
       extracting the reflection coefficient, surface texture, and temperature distribution of the road surface through an improved Swin Transformer network from the image data after multi-source registration and fusion; 
       recognizing the road environment in high-altitude alpine regions according to the reflection coefficient, surface texture, and temperature distribution of the road surface. 
      The invention also provides a road environment perception device, which comprises:
       an acquisition module, used for acquiring road environment images, wherein the road environment images are road environment information acquired by a visible light camera, a thermal imaging camera, and radar equipment; 
       a preprocessing module, used for preprocessing the road environment image; 
       a fusion module, used for performing multi-source data registration and fusion on the preprocessed road environment image; 
       a recognition module, used for performing segmentation, recognition, and perception on the registered and fused road images, thereby recognizing the road environment in high-altitude alpine regions. 
      The invention also provides a road environment perception system comprising a memory and a processor, the memory storing a computer program for execution by the processor, wherein the computer program, when executed by the processor, performs the road environment perception method.
      The invention also provides a storage medium storing a computer program which, when run, performs the road environment perception method.
      The method realizes road environment perception in high-altitude alpine regions based on the fusion of thermal imaging, visible light images, and radar point clouds, ensuring driving safety. It adopts an image fusion method based on texture features, which ensures the same fusion effect under different illumination conditions and improves fusion accuracy. It uses an improved Transformer network to improve the efficiency and accuracy of road environment perception, and warns the driver promptly when a dangerous road surface condition is found. When the vehicle travels in a high-altitude alpine region, the recognized road condition information is fed back through the cloud to the road centralized control center, which helps keep road condition information complete.
    Drawings
      In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
      FIG. 1 is a flow chart of a road environment awareness method according to an embodiment of the present invention;
       FIG. 2 is a schematic layout of a vehicle unit; 
       FIG. 3 is a diagram of a pixel-image-camera-world coordinate system conversion relationship; 
       FIG. 4 is a schematic diagram of a radar apparatus and multi-source camera joint calibration; 
       FIG. 5 is a schematic diagram of a multi-source image data fusion algorithm; 
       FIG. 6 is a schematic diagram of a road environment recognition algorithm of the fused image. 
    Detailed Description
      The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
      In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
      Example 1:
       The embodiment of the invention provides a road environment perception method in which a visible light camera, an infrared thermal imaging camera, and radar equipment are calibrated, registered, and fused, and a deep learning algorithm is used to recognize and monitor the road environment state based on multi-source data fusion, so that the road environment state can be perceived in real time. The method selects and applies multi-source cameras (a visible light camera and an infrared thermal imaging camera), radar equipment (a lidar and a millimeter-wave radar), and an edge computer for data processing. Depending on the needs of each scenario, it uses several communication modes, such as wired transmission, narrowband transmission, and 5G transmission, combined with the Internet of Vehicles, to keep data communication stable and data transmission timely and unobstructed. While the edge computer registers and fuses the next frame of multi-source data, a deep learning algorithm classifies and recognizes the fused road environment image from the previous frame. 
      The method proceeds as follows. First, the data acquisition equipment is arranged, road image data is acquired, and the information from each sensor is preprocessed separately. Second, the multi-source data is registered and calibrated according to the relations among the pixel, image, camera, and radar coordinate systems, the visible light image and the thermal imaging image are fused using a texture information fusion algorithm, and the point cloud information is integrated. Finally, the fused image information is processed by a deep learning algorithm to complete the road recognition task. The road environment recognition result is sent to the vehicle's central control screen by narrowband transmission to remind the driver to pay attention to the road environment, and is sent to the cloud by 5G transmission and fed back by the cloud to the road centralized control center, so that road conditions in high-altitude alpine regions are known. As shown in fig. 1, the specific steps are as follows:
       S000, acquisition equipment layout 
      The equipment layout is shown in fig. 2. Millimeter-wave radars are installed around the vehicle, a visible light camera is arranged at the front end of the vehicle, and a lidar, an infrared thermal imaging camera, and a GNSS antenna for transmitting cloud information are arranged on top of the vehicle.
      S100, arranging the equipment as shown in fig. 2 and acquiring the original road environment image information collected by each sensor, which comprises the road environment information collected by the thermal imaging camera, the radar equipment, and the visible light camera.
      S200, preprocessing image data information collected by each sensor, wherein the preprocessing comprises the following steps of:
       S201, preprocessing the infrared image data collected by the thermal imaging camera, which specifically comprises the following: 
      1) Image gray scale conversion
      Because a grayscale image carries less information and is faster to compute, which eases subsequent fusion and recognition, the image is converted to grayscale. The gray conversion formula is as follows:
      GS=0.299*R+0.587*G+0.114*B
       Wherein R, G, and B refer to the red, green, and blue color channels, respectively. 
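The gray conversion above can be illustrated with a minimal sketch. The function names and the nested-list image layout are assumptions made for this example only; the weights are those of the formula in the text.

```python
def to_gray(r, g, b):
    """Weighted gray value of one RGB pixel, using the weights from the formula above."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def image_to_gray(image):
    """Convert a nested list of (R, G, B) tuples into a nested list of gray values.

    The nested-list layout is an assumption for illustration.
    """
    return [[to_gray(r, g, b) for (r, g, b) in row] for row in image]


# Hypothetical 1x3 test image: pure red, pure green, pure blue.
sample = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
gray = image_to_gray(sample)
```

Because the three weights sum to 1, a white pixel (255, 255, 255) maps to a gray value of 255.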
      2) Establishing relation between gray value and temperature of image
      Experiments show that the gray value and the temperature value are related by a first-order (linear) function, so the relation between image gray value and temperature can be written from the gray values and temperature values of any two points. The two points most easily obtained from a thermal image are the maximum and minimum temperatures of the road environment together with their corresponding gray values, so the conversion relation between image gray value and temperature is as follows:
       Wherein T is the temperature value of any point in the image, T_max and T_min are the highest and lowest temperatures of the road environment captured by the thermal imaging camera, and GS is the gray value of that point. 
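As a hedged illustration of the first-order relation described above, the sketch below interpolates linearly between two calibration points. It is not the patent's own formula: the parameters `gs_min` and `gs_max` (the gray values that correspond to T_min and T_max) are assumptions, and they default to 0 and 255 only for convenience.

```python
def gray_to_temperature(gs, t_min, t_max, gs_min=0.0, gs_max=255.0):
    """Map a gray value to a temperature by linear interpolation.

    Assumes gs_min corresponds to t_min and gs_max to t_max; these two
    calibration gray values are hypothetical parameters of this sketch.
    """
    if gs_max == gs_min:
        raise ValueError("gs_max and gs_min must differ")
    slope = (t_max - t_min) / (gs_max - gs_min)
    return t_min + slope * (gs - gs_min)


# Hypothetical scene spanning -20 to 30 degrees Celsius.
mid_temperature = gray_to_temperature(127.5, t_min=-20.0, t_max=30.0)
```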
      S202, preprocessing point cloud data collected by radar equipment
      The point cloud information is denoised using a median filtering method to improve the clarity and quality of the point cloud data.
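A median filter can be sketched as follows. The text does not say how the filter is applied to the point cloud, so this example assumes, for illustration only, that it runs over an ordered sequence of range readings along one scan line; the function name and window size are hypothetical.

```python
from statistics import median


def median_filter_ranges(ranges, window=3):
    """Replace each range reading with the median of its neighborhood.

    The window is truncated at the ends of the sequence.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(ranges)
    filtered = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        filtered.append(median(ranges[lo:hi]))
    return filtered


# Hypothetical scan line with one outlier (55.0 m) among readings near 10 m.
denoised = median_filter_ranges([10.0, 10.1, 55.0, 10.2, 10.3])
```

The isolated outlier is suppressed because it never becomes the median of its window, while neighboring readings are left close to their original values.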
      S203, preprocessing the road environment image data acquired by the visible light camera
      Considering the weather in high-altitude alpine regions, where heavy fog and similar weather are frequent in spring and winter and dust weather occurs readily in summer and autumn, pictures taken by the camera may be blurred. The image is therefore sharpened using the Sobel operator, expressed as follows:
       Wherein S_x1(x, y) and S_y1(x, y) are the new point values obtained by convolving the original image point and its 8 surrounding points with the corresponding Sobel operator, and g_x1 and g_y1 are the edge detection convolution kernels in the horizontal and vertical directions. 
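The sketch below shows how the horizontal and vertical Sobel responses can be computed on a small grayscale image. It uses the standard 3x3 Sobel kernels, which is an assumption about g_x1 and g_y1. How the patent combines these responses with the original image to sharpen it is not stated in the text, so the example stops at the gradient magnitude.

```python
import math

# Standard 3x3 Sobel kernels (assumed to correspond to g_x1 and g_y1).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_at(image, x, y):
    """Return the horizontal and vertical Sobel responses at interior pixel (x, y).

    The kernels are applied as a correlation (no kernel flip), which only
    changes the sign of the response, not its magnitude.
    """
    sx = 0.0
    sy = 0.0
    for j in range(3):
        for i in range(3):
            value = image[y + j - 1][x + i - 1]
            sx += SOBEL_X[j][i] * value
            sy += SOBEL_Y[j][i] * value
    return sx, sy


def sobel_magnitude(image):
    """Gradient magnitude for every interior pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            sx, sy = sobel_at(image, x, y)
            out[y][x] = math.hypot(sx, sy)
    return out


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
edge_image = [[0, 0, 10, 10] for _ in range(4)]
magnitude = sobel_magnitude(edge_image)
```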
      S300, calibrating and registering multi-source sensor
      To fuse multi-source image data, the conversion relations among the image coordinate systems are determined first, and the multi-source sensors are jointly calibrated according to these relations. Data registration based on point cloud information is then performed to eliminate image differences caused by factors such as imaging conditions, viewing angle, and illumination, so that the same object in different images can be matched accurately. This provides the basis for subsequent image fusion. The specific operations are as follows:
       S301, conversion relations among the pixel, image, camera, and world coordinate systems in the calibration of the visible light camera and the thermal imaging camera 
      As shown in fig. 3, the pixel coordinate system is defined as u-O-v, with point O as its origin and one pixel as its unit. Taking the center point O_p(u_0, v_0) of the pixel coordinate system as the origin, the x-axis is established parallel to the u-axis and the y-axis parallel to the v-axis; the resulting x-O_p-y coordinate system is the image coordinate system. The mapping relationship between the two coordinate systems is expressed in matrix form:
       As shown in fig. 3, a three-dimensional camera coordinate system is established with the optical center O_C as the origin and the designated plane as the camera coordinate plane. Let a point P have coordinates (X_C, Y_C, Z_C)^T in the camera coordinate system and let its projection on the image coordinate system be p(x, y). The conversion relationship between P and p is expressed as: 
       Wherein f is the focal length of the camera. From the conversion relationship between P and p, the conversion relationship between the image coordinate system and the camera coordinate system is obtained as follows: 
       To describe the pose relationship between a camera and an object in a three-dimensional environment, a reference coordinate system with axes X_w, Y_w, Z_w may be selected as the world coordinate system. The rotation from the world coordinate system to the camera coordinate system is expressed by an orthonormal matrix R, and the translation is expressed by a translation vector t. The rotation matrix R and translation vector t together constitute the extrinsic matrix between the camera and the world coordinate system. The conversion relation between the world coordinate system and the camera coordinate system is as follows: 
       Where x_c, y_c, z_c are the coordinates in the camera coordinate system and x_w, y_w, z_w are the coordinates in the world coordinate system. 
      Combining the mapping between the image and camera coordinate systems with the conversion between the camera and world coordinate systems gives the correspondence between the world coordinate system and the image pixel coordinate system, as follows:
       wherein K is the intrinsic matrix of the camera. 
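The chain of conversions in S301 (world to camera, camera to image, image to pixel) can be sketched with the standard pinhole camera model. This is an illustration under stated assumptions, not the patent's exact equations: `fx` and `fy` (focal length expressed in pixels) and the function name are hypothetical, and lens distortion is ignored.

```python
def world_to_pixel(point_w, R, t, fx, fy, u0, v0):
    """Project a world point to pixel coordinates with a pinhole model.

    R is a 3x3 rotation matrix (list of rows), t a translation vector,
    (u0, v0) the principal point, and fx, fy the focal lengths in pixels.
    """
    # World -> camera: P_c = R * P_w + t
    xc = sum(R[0][i] * point_w[i] for i in range(3)) + t[0]
    yc = sum(R[1][i] * point_w[i] for i in range(3)) + t[1]
    zc = sum(R[2][i] * point_w[i] for i in range(3)) + t[2]
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Camera -> image -> pixel: perspective division, then scale and shift.
    u = fx * xc / zc + u0
    v = fy * yc / zc + v0
    return u, v


# Hypothetical camera aligned with the world frame.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = world_to_pixel((1.0, 0.5, 2.0), identity, [0.0, 0.0, 0.0],
                       fx=800.0, fy=800.0, u0=320.0, v0=240.0)
```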
      S302, joint calibration of radar equipment and multi-source camera
      As shown in fig. 4, joint calibration solves the matching problem between multiple coordinate systems; matching the point cloud coordinate system to the image coordinate system amounts to solving for the coefficients of the rigid transformation between them. First, the relation between the point cloud coordinate system and the multi-source camera coordinate system is determined by the formula P_C = R·P_L + t, where R is the rotation matrix and t is the translation vector. Points in the camera coordinate system are then projected onto the image coordinate system through the relation P_I = A·P_C, where P_I = [u, v, 1]^T is the homogeneous coordinate of a pixel in the image coordinate system, u and v are the image coordinates in pixels, and A is the parameter matrix of the camera. Combining these two relations gives the conversion between the point cloud coordinate system and the image coordinate system:
      P_I = A[R, t]P_L 
       Through this transformation, radar information can be mapped into the image coordinate system acquired by the multi-source cameras. 
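The relation P_I = A[R, t]P_L can be sketched as below. The example assumes that A is a 3x3 intrinsic matrix whose last row is [0, 0, 1] and that the result is normalized by its third component to obtain [u, v, 1]^T; the function name, the image-bounds filter, and the returned depth are additions for illustration.

```python
def project_lidar_points(points_l, A, R, t, width, height):
    """Project lidar points into an image; return (u, v, depth) for visible points."""
    # Build the 3x4 extrinsic block [R, t], then the 3x4 projection A[R, t].
    rt = [list(R[i]) + [t[i]] for i in range(3)]
    proj = [[sum(A[r][k] * rt[k][c] for k in range(3)) for c in range(4)]
            for r in range(3)]
    hits = []
    for (x, y, z) in points_l:
        homogeneous = [x, y, z, 1.0]
        su, sv, s = [sum(proj[r][c] * homogeneous[c] for c in range(4))
                     for r in range(3)]
        if s <= 0:
            continue  # point lies behind the camera
        u, v = su / s, sv / s
        if 0 <= u < width and 0 <= v < height:
            hits.append((u, v, s))
    return hits


# Hypothetical calibration: lidar frame coincides with the camera frame.
A = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
cloud = [(1.0, 0.5, 2.0), (0.0, 0.0, -1.0), (100.0, 0.0, 1.0)]
visible = project_lidar_points(cloud, A, R, t, width=1280, height=720)
```

Of the three hypothetical points, only the first survives: the second is behind the camera and the third projects outside the image.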
      S303, multi-source data registration based on radar point cloud information
      Steps S301 and S302 give the correspondence between the point cloud coordinate system and the image pixel coordinate system. The visible light image captured by the visible light camera and the infrared image captured by the thermal imaging camera are then registered through the radar point cloud information. The conversion relation that places both images in the same coordinate system is as follows:
       Wherein P_VIS and P_NIR are the coordinates of the point cloud information on the visible light camera image and the thermal imaging camera image, C_p is the point cloud information acquired by the radar, IS_VIS and ES_VIS are the intrinsic and extrinsic parameters of the visible light camera, and IS_NIR and ES_NIR are the intrinsic and extrinsic parameters of the infrared thermal imaging camera. 
      The calculated P_VIS and P_NIR are spatially transformed using the conversion relation between the world coordinate system and the image coordinate system, and the transformed coordinates are further refined using the MASC algorithm, completing multi-source image data registration based on radar point cloud information.
      S304, multi-source image data fusion
      The embodiment of the invention provides a texture-information-based visible light and infrared adversarial fusion model, as shown in fig. 5. The model aims to solve the low visibility and image blurring caused by snow, fog, dust, and similar weather when a vehicle travels in a high-altitude alpine region, so that a computer can better understand and recognize the road environment. Although conventional multi-source image fusion algorithms can improve the visual effect of an image, most of them do not consider the requirements of computer recognition tasks. The adversarial fusion model of this embodiment therefore retains the discriminator, so that the image is judged by the computer, and improves the generator: a texture filtering fusion algorithm replaces a purely learned iterative algorithm, so that the fusion target is explicit and the fused image is closer to a visible light image. The model thus quickly "fools" the computer and produces a visible light picture that the computer accepts. For vehicles driving in high-altitude alpine regions, high efficiency and high stability of fusion always come first, which is the reason for optimizing the algorithm and a major point of innovation of the invention. 
The texture filtering fusion algorithm calculates the structure and texture information of the multi-source images through a texture filter, obtains a fused image by combining color space conversion, and maps the point cloud information into the fused image. Through repeated correction, fusion, and judgment, it finally generates multi-source image information that integrates visible light, infrared thermal imaging, and point cloud information, which effectively reduces image blurring and improves the computer's road environment perception capability in the special environment of high-altitude alpine regions.
      The image fusion algorithm provided by the invention comprises seven steps: structure separation and extraction, texture extraction, texture filtering, color space conversion, image fusion, point cloud fusion, and fusion judgment.
      (1) Structural separation and extraction
      Both the visible light image and the thermal imaging image are composed of a main structure and texture: I_a = T_a + S_a, a ∈ {vis, nir}, where I_a is the original image in channel a, T_a is the texture image, and S_a is the structure image. A texture information extraction algorithm is used to extract the texture of the image, so that the main structure of the image can be separated from its texture part.
      (2) Texture extraction
      For a pixel p, the color range is divided into K intervals. A circular region is set with pixel p as its center and r as its radius, and a straight line at angle θ to the horizontal divides the region into two sub-regions of equal size. The texture information of the two sub-regions is represented by g_{θ,r}(p, k) and h_{θ,r}(p, k). Both are vector data containing color histogram information for that direction, so the histogram difference in any direction can be obtained:
       Where k is the index of the color interval. From this, a texture edge detector mPb can be derived, defined as: 
      Where Ω = {r_1, r_2, ..., r_m} represents a set of radii. Different sets of radii give different filtering effects, and mPb is used to make the extraction of texture edges more efficient.
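As a loose illustration of comparing the two half-region histograms g and h, the sketch below uses the chi-squared histogram distance that is commonly paired with half-disc boundary detectors. Whether the patent uses this exact distance, and how it combines responses over angles θ and radii in Ω, is not recoverable from the text, so both functions and their aggregation rule are assumptions.

```python
def histogram_difference(g, h):
    """Chi-squared distance between two K-bin histograms (assumed distance measure)."""
    total = 0.0
    for gk, hk in zip(g, h):
        denom = gk + hk
        if denom > 0:
            total += (gk - hk) ** 2 / denom
    return 0.5 * total


def edge_strength(half_disc_histograms):
    """Combine differences over orientations and radii.

    half_disc_histograms maps each radius to a list of (g, h) pairs, one pair
    per orientation. Taking the maximum over orientations and the mean over
    radii is an illustrative choice, not the patent's definition of mPb.
    """
    per_radius = [max(histogram_difference(g, h) for g, h in pairs)
                  for pairs in half_disc_histograms.values()]
    return sum(per_radius) / len(per_radius)


# Hypothetical normalized 2-bin histograms at two radii and two orientations.
histograms = {
    3: [([1.0, 0.0], [0.0, 1.0]), ([0.5, 0.5], [0.5, 0.5])],
    5: [([0.5, 0.5], [0.5, 0.5]), ([0.5, 0.5], [0.5, 0.5])],
}
strength = edge_strength(histograms)
```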
      (3) Texture filtering
      Texture filtering is applied to the separated image texture to remove noise and to optimize and improve the texture structure of the image. The filtering uses a filter G generated with a joint bilateral filtering algorithm (Joint Bilateral, JBL):
       Wherein S_p, G_p, and I_q denote the smoothed output at pixel p, the guidance image at pixel p, and the input image at pixel q, respectively; k is a weight whose value is obtained from the fusion results once the adversarial network is stable; and f(p, q) and g(G_p, G_q) are Gaussian functions used to compute the spatial distance and the color distance between pixels p and q. 
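A joint bilateral filter can be sketched on a one-dimensional signal as follows. This is a generic textbook form offered for illustration: the normalizer here is simply the sum of the weights, whereas the text describes the weight k as being derived from the adversarial network, and the parameters `radius`, `sigma_s`, and `sigma_r` are hypothetical.

```python
import math


def gaussian(distance, sigma):
    """Unnormalized Gaussian weight for a given distance."""
    return math.exp(-(distance * distance) / (2.0 * sigma * sigma))


def joint_bilateral_1d(signal, guide, radius=2, sigma_s=1.0, sigma_r=10.0):
    """Smooth `signal` using spatial weights f(p, q) and guide-based weights g(G_p, G_q)."""
    n = len(signal)
    out = []
    for p in range(n):
        acc = 0.0
        norm = 0.0
        for q in range(max(0, p - radius), min(n, p + radius + 1)):
            weight = gaussian(p - q, sigma_s) * gaussian(guide[p] - guide[q], sigma_r)
            acc += weight * signal[q]
            norm += weight
        out.append(acc / norm)  # norm >= 1 because q == p contributes weight 1
    return out


# Hypothetical step signal guided by itself: the edge should be preserved.
step = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
smoothed = joint_bilateral_1d(step, step, radius=2, sigma_s=1.0, sigma_r=1.0)
```

Because the guide-based weight collapses across the step, samples on one side of the edge contribute almost nothing to the other side, which is the edge-preserving behavior that motivates using a guidance image.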
      (4) Color space conversion
      Color space conversion converts an image from one color mode to another, and it plays a central role in image fusion: images can be processed and fused in different color representations, so their color information can be extracted and used more effectively. In the invention, the main structures of the visible light image and the thermal imaging image are converted into the same color system to prepare them for subsequent image fusion. The basic algorithm is as follows:
      $F_v = V_{rgb} + \mathrm{JBL}(T_{vis}, s) + T_{nir}$
       where $V_{rgb}$ is the visible light image; $T_{vis}$ is the value obtained by converting the RGB channels to HSV channels and selecting the V channel as the base for fusion; $T_{nir}$ is the infrared texture structure; and $\mathrm{JBL}(T_{vis}, s)$ is the data obtained after joint bilateral filtering (JBL).
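As a rough illustration of fusing in the V channel, the sketch below reads the formula as "visible V channel plus filtered visible texture plus infrared texture", clips the sum to [0, 1], and recombines it with the unchanged visible H and S channels, as the next step describes. That reading, the flat list-of-pixels layout, and the name `fuse_value_channel` are assumptions, not details confirmed by the source.

```python
import colorsys

def fuse_value_channel(rgb_pixels, jbl_texture, nir_texture):
    """Fuse textures into the V channel and convert back to RGB.

    rgb_pixels: list of (r, g, b) tuples with components in [0, 1].
    jbl_texture, nir_texture: per-pixel texture values (same length).
    """
    fused = []
    for (r, g, b), t_jbl, t_nir in zip(rgb_pixels, jbl_texture, nir_texture):
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        # Assumed reading of F_v: base V plus both texture terms, clipped.
        f_v = min(1.0, max(0.0, v + t_jbl + t_nir))
        # Hue and saturation come from the visible image unchanged.
        fused.append(colorsys.hsv_to_rgb(h, s, f_v))
    return fused
```

Keeping H and S from the visible image means the fused result retains natural colors while brightness carries the added thermal texture.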
      (5) Image fusion based on texture information
      The calculated $F_v$ is combined with $H_{vis}$ and $S_{vis}$ and converted into an RGB image, which is the result of image fusion. $H_{vis}$ and $S_{vis}$ are the H and S channel values of the visible light image in HSV space. This improves the quality and visual effect of the image and adds temperature data to it, which addresses the image blur caused by low visibility from dust and fog in highland alpine regions, while the temperature data also provides information about the surrounding environment. The procedure is as follows:
       (6) Point cloud fusion 
      The point cloud information is integrated into the fused image to facilitate subsequent computer recognition. The position of an obstacle in the image and its distance from the vehicle can be determined from the point cloud information.
      (7) Fusion judgment
      The above algorithm generates a fused image $F_{fusion}$. The original image captured by the visible light camera and the fused image $F_{fusion}$ are input together into the visible light discriminator D for fusion judgment. This step is called the testing process. In it, the computer treats the visible light photo as true and the fused picture as false: if the fused picture is judged true, it is output; if it is judged false, the filter parameters are modified and image fusion is performed again. Through continual modification of the parameters of the discriminator and the filter, a fused picture gradually forms that has more optical texture and can "deceive" the computer. Once modification is complete, the filter parameters remain stable for the same road section under the same road conditions; that is, inputting a visible light image, a thermal imaging image, and point cloud data generates a "deceptive visible light image", which facilitates subsequent semantic segmentation and recognition by the computer. The discriminator D follows the following algorithm:
       where x is a real visible light picture, z is the noise input into the filter G, G(z) is the fused picture generated by the fusion algorithm after filtering by G, D(x) is the probability that the discriminator D judges the picture to be real visible light, and E(·) denotes the mathematical expectation. The discriminator D seeks to maximize log D(x) and log(1 − D(G(z))), while the filter G continually modifies the weight k to minimize log(1 − D(G(z))), so as to minimize M(D, G).
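The symbols defined here (x, z, G(z), D(x), the expectation, and the value M(D, G) that D maximizes and G minimizes) match the standard generative adversarial network objective. Assuming that is the intended formula, it can be written as:

```latex
\min_{G}\,\max_{D}\; M(D, G)
  = \mathbb{E}_{x}\!\left[\log D(x)\right]
  + \mathbb{E}_{z}\!\left[\log\!\left(1 - D(G(z))\right)\right]
```

The first term rewards D for accepting real visible light pictures, and the second rewards D for rejecting fused pictures; G lowers the second term by producing fused pictures that D accepts.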
      S400, road environment perception
      The road environment perception algorithm is the key, core step of the method of the embodiment of the invention: it directly determines how well the vehicle understands the road environment and how accurate its driving decisions are. The invention identifies the fused image using a Swin Transformer network with targeted improvements. Transformer networks outperform convolutional neural networks (CNNs) in recognition efficiency and accuracy, but their feature maps are low resolution and their complexity grows quadratically with image size, so the plain Transformer structure is not suitable as a backbone for dense vision tasks or for high-resolution images. These shortcomings are critical for road environment recognition. The algorithm flow is shown in fig. 6.
      (1) Identification process
      The improved Swin Transformer network can quickly extract key feature values from the fused image. It uses a patch partition strategy and a multi-stage structure to process large images effectively. It also integrates a window attention module and a shifted-window self-attention module, which work together to accurately capture the features of the image, including image features, point cloud features, and temperature features.
      The fused image is input into the Swin Transformer network. The Patch Partition layer of the Swin Transformer fusion module splits the picture into patches, and the Linear Embedding layer maps the patch data to features to extract key features.
      The key features output by each stage are processed by a ConvNext module. The ConvNext module first applies an n×n convolution to strengthen the representation of local information, and then expands the number of feature channels through a multi-layer perceptron to provide a richer feature representation.
      Finally, a target classification and recognition module is built from nonlinear activation units such as the ReLU function together with positional encoding. The regression prediction module converts the feature maps learned by the preceding layers into concrete target detection results, then classifies and outputs them, so that road obstacles are recognized effectively.
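The Patch Partition and Linear Embedding steps at the start of this pipeline can be sketched in a few lines. This is a minimal sketch on nested lists, assuming non-overlapping square patches and a plain linear projection; the function names, the weight matrix, and the three-channel toy image are hypothetical, and the attention, ConvNext, and detection stages are not shown.

```python
def patch_partition(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened square patches."""
    rows, cols = len(image), len(image[0])
    if rows % patch_size or cols % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    patches = []
    for top in range(0, rows, patch_size):
        for left in range(0, cols, patch_size):
            patch = []
            for y in range(top, top + patch_size):
                for x in range(left, left + patch_size):
                    patch.extend(image[y][x])  # append all channels of the pixel
            patches.append(patch)
    return patches

def linear_embedding(patches, weights, bias):
    """Project each flattened patch: out[j] = sum_i weights[j][i] * patch[i] + bias[j]."""
    return [
        [sum(w * v for w, v in zip(row, patch)) + b for row, b in zip(weights, bias)]
        for patch in patches
    ]

# Toy 4 x 4 image whose three channels stand in for fused image data.
image = [[[y, x, 1] for x in range(4)] for y in range(4)]
tokens = linear_embedding(patch_partition(image, 2), [[1] * 12], [0])
```

Each 2×2 patch becomes one token, so the 4×4 toy image yields four tokens; window attention then operates on these tokens instead of individual pixels, which is what keeps the cost manageable for large images.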
      (2) Loss function and model training
      Cascade R-CNN is adopted as the regression module. It comprises a classifier c(x) and a regressor f(x). Each element in the image is classified, and the classifier outputs a probability distribution vector giving the probability that each element belongs to each category. The classification loss is as follows:
       where $L_{cls}$ is the cross-entropy loss, N is the number of elements to be identified in the image, $x_i$ is the network image input, $y_i$ is the classification category, and $c(x_i)$ is the (M+1)-dimensional posterior probability (1 ≤ M ≤ N).
      When classifying elements in an image, an IoU threshold is used to decide which elements of an image block need to be classified and identified and which serve only as the background of the image block. An IoU above the threshold u indicates that the element in the image block needs to be classified. The computing function is as follows:
       where g denotes the ground-truth object and $g_y$ is the label of the ground-truth box.
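The threshold rule can be illustrated with a small IoU helper. The sketch assumes boxes are given as (x, y, w, h) with (x, y) the top-left corner, that label 0 means background, and that a box counts as foreground when its IoU is at least u; these conventions and the function names are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes, top-left origin."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def assign_label(box, gt_box, gt_label, u):
    """Return the ground-truth label if the box overlaps enough, else 0 (background)."""
    return gt_label if iou(box, gt_box) >= u else 0
```

For example, two 2×2 boxes offset horizontally by 1 overlap with an IoU of 1/3, so with u = 0.5 the candidate is treated as background, while a lower threshold such as u = 0.3 would label it as foreground.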
      The regressor f(x, b) regresses a candidate box b at a given position toward the ground-truth box g. With the coordinates of each box written as $(b_x, b_y, b_w, b_h)$, the regression loss is as follows:
       where $L_{loc}$ is the $L_2$ loss. Each stage is retrained after its IoU threshold is optimized, so that predictions move closer to the target; the IoU value is recorded and remains stable over a certain period. The optimized cascade loss is expressed as:
      $L(x^t, g) = L_{cls}(h_t(x^t), y^t) + \lambda [y^t \ge 1] L_{loc}(f_t(x^t, b^t), g)$
       where $b^t = f_t(x^{t-1}, b^{t-1})$; λ is a balance coefficient, set to 1; $y^t$ is the label at the t-th iteration; and g is the ground-truth box object at the t-th iteration. Training with this algorithm can markedly improve detection performance and training efficiency, making it suitable for in-vehicle recognition, where conditions are complex and changeable, interaction is real-time, and data volumes are large.
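The per-stage loss can be sketched directly from the formula: a cross-entropy term for every sample, plus a box regression term that applies only to foreground samples (label ≥ 1). The sketch assumes the L2 loss is taken on raw box coordinates and that class probabilities are already normalized; the function names are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Classification loss for one sample: -log of the true-class probability."""
    return -math.log(probs[label])

def l2_box_loss(pred_box, gt_box):
    """Squared L2 distance between predicted and ground-truth box coordinates."""
    return sum((p - g) ** 2 for p, g in zip(pred_box, gt_box))

def stage_loss(probs, label, pred_box, gt_box, lam=1.0):
    """L = L_cls + lam * [label >= 1] * L_loc for one sample at one stage."""
    loss = cross_entropy(probs, label)
    if label >= 1:  # background samples (label 0) carry no regression loss
        loss += lam * l2_box_loss(pred_box, gt_box)
    return loss
```

With λ = 1 as stated, a foreground sample contributes both terms, while a background sample contributes only the classification term, so box regression is never trained on boxes that contain no target.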
      The invention performs segmentation, recognition, and perception on the fused road image based on a deep learning algorithm to recognize the road environment in highland alpine regions. After registration and fusion of the multi-source image data, the image information contains the image, temperature, and position information of the road. From this road environment information, it can be perceived whether stones, pits, road intrusions, accumulated ice, accumulated snow, and the like are present on the road surface. The camera captures image information of the road surface, the radar acquires its distance and reflection information, and the thermal imaging camera detects its temperature distribution. Feature information such as the reflection coefficient, surface texture, and temperature distribution of the road surface is extracted from the fused image set. A classifier and a machine learning algorithm then classify and identify based on the extracted features, and whether a safety hazard exists on the road is finally output to the driver.
      S500, recognition result feedback
      The road environment recognition result obtained in step S400 is fed back to the driver, helping the driver understand the driving environment in real time and providing data support for subsequent path planning. The result is also transmitted to the cloud through the GNSS antenna, so that the road centralized control center can follow real-time road conditions in the highland alpine region, and a data set is formed to provide a data reference for subsequent vehicles.
      Example 2:
       The invention also provides a road environment perception device, which comprises:
       an acquisition module for acquiring road environment images, wherein the road environment images are road environment information acquired by a visible light camera, a thermal imaging camera, and radar equipment;
       a preprocessing module for preprocessing the road environment images;
       a fusion module for performing multi-data registration and fusion on the preprocessed road environment images;
       a recognition module for performing segmentation, recognition, and perception on the registered and fused road images, so as to recognize the road environment in highland alpine regions.
      Example 3:
       The embodiment of the invention also provides a road environment perception system, which comprises a memory and a processor. The memory stores a computer program run by the processor, and the computer program executes the road environment perception method when run by the processor.
      Example 4:
       The embodiment of the invention also provides a storage medium storing a computer program that executes the road environment perception method when run.
      The above embodiments merely illustrate preferred embodiments of the present invention and do not limit its scope. Various modifications and improvements made by those skilled in the art without departing from the spirit of the present invention all fall within the scope of protection defined by the appended claims.
    Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN202410744236.XA CN118521985B (en) | 2024-06-11 | 2024-06-11 | A road environment perception method, device, system, and storage medium | 
Publications (2)
| Publication Number | Publication Date | 
|---|---|
| CN118521985A (en) | 2024-08-20 | 
| CN118521985B (en) | 2025-03-14 | 
Family
ID=92277268
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title | 
|---|---|---|---|
| CN202410744236.XA Active CN118521985B (en) | 2024-06-11 | 2024-06-11 | A road environment perception method, device, system, and storage medium | 
Country Status (1)
| Country | Link | 
|---|---|
| CN (1) | CN118521985B (en) | 
Legal Events
| Date | Code | Title | Description | 
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||