CN120707369A

CN120707369A - Multipath video processing method and device suitable for engineering machinery

Info

Publication number: CN120707369A
Application number: CN202510838240.7A
Authority: CN
Inventors: 王斌; 刘建国; 李飞
Original assignee: Jiangsu Advanced Construction Machinery Innovation Center Ltd
Current assignee: Jiangsu Advanced Construction Machinery Innovation Center Ltd
Priority date: 2025-06-20
Filing date: 2025-06-20
Publication date: 2025-09-26

Abstract

The present invention discloses a multi-channel video processing method and device suitable for construction machinery, belonging to the field of image processing technology. The method comprises: acquiring multi-channel surround-view video frames of the construction machinery; assigning image processing tasks to the multi-channel video frames and synchronously processing the image processing tasks using a thread pool based on a UMat resource management method; and performing surround-view stitching and display on the multi-channel video frames after the processing tasks have been completed. By using the UMat resource management method and frame-difference detection to reduce hardware resource utilization, the present invention can be deployed on embedded platforms and is widely applicable to various construction machinery.

Description

Multipath video processing method and device suitable for engineering machinery

Technical Field

The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for processing multiple paths of video signals for engineering machinery.

Background

In the operation of engineering machinery such as excavators and loaders, the driver's blind spots are a prominent problem, and multi-channel video monitoring with obstacle alarms has become key to safe operation. However, as the number of video channels increases, traditional video processing methods show low resource utilization and poor multi-task coordination under the complex working conditions of engineering machinery, with their surging data volumes and real-time requirements.

Patent publication No. CN113535366A discloses a high-performance distributed combined multi-channel real-time video processing method in the technical field of video processing. The method comprises: A, constructing a video processing pipeline; B, starting multiple processes/threads, wherein each processing module runs in multi-process mode and each process starts multiple threads, the number of processes/threads started by each module being determined jointly by the requirements of the service scenario, the model performance, and the software and hardware resource limits; and C, constructing a shared queue between upstream and downstream modules and setting a data access strategy. The video decoding, preprocessing, model inference, and post-processing modules are all decoupled, the method is simple to use, and the specially designed data access mechanism between adjacent modules significantly improves multi-channel video processing efficiency. That invention solves the video processing problem, but it places high demands on hardware resources and is not suitable for the embedded platform of a mobile vehicle.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art by providing a multi-channel video processing method and device suitable for engineering machinery, which solve the technical problem that prior-art schemes place high demands on hardware resources and are not suitable for the embedded platform of a mobile vehicle.

In order to achieve the above purpose, the invention is realized by adopting the following technical scheme:

In a first aspect, the present invention provides a multi-channel video processing method suitable for engineering machinery, including:

acquiring multi-channel surround-view video frame images of the engineering machinery;

setting image processing tasks for the multi-channel video frame images, and synchronously processing the image processing tasks using a thread pool based on a UMat resource management method;

and performing surround-view stitching and display on the multi-channel video frame images for which the processing tasks have been completed.

Optionally, acquiring the multi-channel video frame images of the engineering machinery comprises allocating an independent thread to each video acquisition task, caching the acquired video data in a ring queue, and extracting the video frame images from the cached video data.

Optionally, the image processing task includes a de-distortion task, where the de-distortion task includes:

constructing a distortion model, wherein the distortion model comprises a radial distortion model and a tangential distortion model, which relate the distorted image coordinates to the undistorted image coordinates through the distance from the image point to the image center, the radial distortion coefficients, and the tangential distortion coefficients (the formula images of the two models are not reproduced in this text);

mapping the distorted coordinates to undistorted coordinates by an iterative method that runs until a preset maximum number of iterations is reached, the result of the iteration being the undistorted image coordinates.
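The formula images did not survive in this text. For orientation only, the standard Brown-Conrady lens model, which is the model documented for OpenCV camera calibration and which matches the quantities listed above, can be written as below. The symbols $(x, y)$, $(x_d, y_d)$, $r$, $k_1, k_2, k_3$, $p_1, p_2$ and the iteration index $n$ are assumptions chosen here; the patent's own notation and number of coefficients may differ.

```latex
% Assumed notation: (x, y) undistorted normalized coordinates,
% (x_d, y_d) distorted coordinates, r^2 = x^2 + y^2.
\[
\begin{aligned}
\text{radial:}\quad
  & x_{\mathrm{rad}} = x\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6), \qquad
    y_{\mathrm{rad}} = y\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \\
\text{tangential:}\quad
  & \Delta x = 2 p_1 x y + p_2\,(r^2 + 2x^2), \qquad
    \Delta y = p_1\,(r^2 + 2y^2) + 2 p_2 x y \\
\text{combined:}\quad
  & x_d = x_{\mathrm{rad}} + \Delta x, \qquad
    y_d = y_{\mathrm{rad}} + \Delta y \\
\text{iteration:}\quad
  & x^{(n+1)} = \frac{x_d - \Delta x\bigl(x^{(n)}, y^{(n)}\bigr)}
                     {1 + k_1 r_n^2 + k_2 r_n^4 + k_3 r_n^6}, \qquad
    y^{(n+1)} = \frac{y_d - \Delta y\bigl(x^{(n)}, y^{(n)}\bigr)}
                     {1 + k_1 r_n^2 + k_2 r_n^4 + k_3 r_n^6} \\
  & r_n^2 = \bigl(x^{(n)}\bigr)^2 + \bigl(y^{(n)}\bigr)^2, \qquad
    \bigl(x^{(0)}, y^{(0)}\bigr) = (x_d, y_d)
\end{aligned}
\]
```

The iteration is a fixed-point scheme: each pass re-evaluates the distortion terms at the current estimate and solves the combined model for $(x, y)$, stopping after the preset maximum number of iterations.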

Optionally, the image processing task includes a frame-difference obstacle detection task, and the frame-difference obstacle detection task includes:

converting the input video frame image to grayscale to generate a grayscale image;

calculating the difference image between the grayscale images of the previous frame and the current frame;

generating a binary image from the difference image, wherein each pixel value of the binary image is determined by comparing the pixel value at the same coordinates in the difference image with a constant threshold that distinguishes moving objects from background noise;

performing contour detection using OpenCV, wherein the contour detection function returns a list of detected contours;

acquiring a moving region from the contour list, wherein the moving region is taken from the originally input video frame image and is defined by its lower-left corner coordinates and its width and height;

scaling the moving region to obtain a target-size image, wherein the target size is the input width and height of the pre-trained target detection model;

performing target detection through the pre-trained target detection model to obtain the detection result.

(The formula images accompanying these steps are not reproduced in this text.)

Optionally, setting image processing tasks for the multi-channel video frame images and synchronously processing the image processing tasks using a thread pool based on a UMat resource management method includes:

creating thread pools that run in parallel for the image processing task of each channel of video frame images, wherein each thread pool acquires a UMat from the pool and returns it to the pool after use.

In a second aspect, the present invention provides a multi-channel video processing device suitable for engineering machinery, including:

an image acquisition module configured to acquire multi-channel surround-view video frame images of the engineering machinery;

a task processing module configured to set image processing tasks for the multi-channel video frame images and synchronously process the image processing tasks using a thread pool based on a UMat resource management method;

and a stitching display module configured to perform surround-view stitching and display on the multi-channel video frame images for which the processing tasks have been completed.

In a third aspect, the present invention provides an electronic device, including a processor and a storage medium;

the storage medium is used for storing instructions;

the processor is configured to operate according to the instructions to perform the steps of the method described above.

In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method.

In a fifth aspect, the invention provides a computer program product comprising computer programs/instructions which when executed by a processor implement the steps of the method described above.

Compared with the prior art, the invention has the beneficial effects that:

The multi-channel video processing method and device suitable for engineering machinery provided by the invention significantly improve the real-time performance of video processing through thread optimization, reduce the overhead of memory allocation and release and improve performance by managing UMat objects in an object pool, and reduce the computation required by YOLO, and thus resource usage, by pre-screening moving regions. In summary, by reducing hardware resource utilization, the invention can be deployed on embedded platforms and is widely applicable to various engineering machinery.

Drawings

Fig. 1 is a schematic flow chart of a multi-path video processing method suitable for engineering machinery according to an embodiment of the present invention;

Fig. 2 is a flowchart of an implementation of a multi-path video processing method suitable for an engineering machine according to an embodiment of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.

Embodiment one:

As shown in Fig. 1, the present invention provides a multi-channel video processing method suitable for engineering machinery, which includes the following steps:

S1, acquiring multi-channel surround-view video frame images of the engineering machinery.

Specifically, in this embodiment the multi-channel video frame images generally comprise four channels covering the front, rear, left, and right of the engineering machinery, and the subsequent surround-view stitching is performed on these four channels of video frame images.

Acquiring the multi-channel video frame images of the engineering machinery comprises allocating an independent thread to each video acquisition task, caching the acquired video data in a ring queue, and extracting the video frame images from the cached video data. This makes full use of multi-threaded parallelism, improves video acquisition efficiency, effectively avoids data loss, and ensures the integrity and continuity of the video data.
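The ring queue used for caching can be sketched as follows. This is a minimal illustration in standard C++ rather than the patent's implementation: the class name `RingQueue`, its methods, and the policy of overwriting the oldest entry when the queue is full are assumptions, since the text does not state an overflow policy. The capacity must be greater than zero and `T` must be default-constructible.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Fixed-capacity, thread-safe ring queue (hypothetical name).
// Intended use: one instance per camera channel, filled by that
// channel's capture thread and drained by the processing stage.
template <typename T>
class RingQueue {
public:
    explicit RingQueue(std::size_t capacity) : slots_(capacity) {}

    // Stores an item. Assumption: when the queue is full, the oldest
    // entry is overwritten so that the newest frames are kept.
    void push(T item) {
        std::lock_guard<std::mutex> lock(mutex_);
        slots_[(head_ + size_) % slots_.size()] = std::move(item);
        if (size_ < slots_.size()) {
            ++size_;
        } else {
            head_ = (head_ + 1) % slots_.size();  // oldest entry dropped
        }
    }

    // Removes the oldest item into `out`; returns false when empty.
    bool pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (size_ == 0) return false;
        out = std::move(slots_[head_]);
        head_ = (head_ + 1) % slots_.size();
        --size_;
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return size_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<T> slots_;
    std::size_t head_ = 0;  // index of the oldest stored item
    std::size_t size_ = 0;  // number of stored items
};
```

With four cameras, four such queues would each be written by one capture thread. A policy that blocks or rejects the push when the queue is full is an equally plausible reading of the text.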

S2, setting image processing tasks for the multi-channel video frame images, and synchronously processing the image processing tasks using a thread pool based on the UMat resource management method.

The image processing tasks may be set according to specific usage requirements; in this embodiment, they include a de-distortion task and a frame-difference obstacle detection task.

(1) The de-distortion task includes:

(1.1) constructing a distortion model, wherein the distortion model comprises a radial distortion model and a tangential distortion model (the formula images of the two models are not reproduced in this text);

(1.2) mapping the distorted coordinates to undistorted coordinates by an iterative method that runs until a preset maximum number of iterations is reached, the result of the iteration being the undistorted image coordinates.
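The iterative mapping in step (1.2) can be illustrated with a fixed-point iteration on a single point. This sketch assumes the standard Brown-Conrady model with three radial and two tangential coefficients on normalized coordinates; the names `DistCoeffs`, `Point2`, `distortPoint`, and `undistortPoint` are hypothetical, and the patent's own formulas may differ.

```cpp
#include <cassert>
#include <cmath>

struct DistCoeffs { double k1, k2, k3, p1, p2; };  // assumed coefficient set
struct Point2 { double x, y; };                    // normalized image coordinates

// Forward model: undistorted point -> distorted point.
Point2 distortPoint(Point2 u, const DistCoeffs& c) {
    const double r2 = u.x * u.x + u.y * u.y;
    const double radial = 1.0 + c.k1 * r2 + c.k2 * r2 * r2 + c.k3 * r2 * r2 * r2;
    const double dx = 2.0 * c.p1 * u.x * u.y + c.p2 * (r2 + 2.0 * u.x * u.x);
    const double dy = c.p1 * (r2 + 2.0 * u.y * u.y) + 2.0 * c.p2 * u.x * u.y;
    return {u.x * radial + dx, u.y * radial + dy};
}

// Inverse mapping by fixed-point iteration: start from the distorted
// point and repeatedly solve the forward model for the undistorted
// point, for exactly maxIterations passes.
Point2 undistortPoint(Point2 d, const DistCoeffs& c, int maxIterations) {
    Point2 u = d;  // initial guess
    for (int i = 0; i < maxIterations; ++i) {
        const double r2 = u.x * u.x + u.y * u.y;
        const double radial = 1.0 + c.k1 * r2 + c.k2 * r2 * r2 + c.k3 * r2 * r2 * r2;
        const double dx = 2.0 * c.p1 * u.x * u.y + c.p2 * (r2 + 2.0 * u.x * u.x);
        const double dy = c.p1 * (r2 + 2.0 * u.y * u.y) + 2.0 * c.p2 * u.x * u.y;
        u = {(d.x - dx) / radial, (d.y - dy) / radial};
    }
    return u;
}
```

For moderate distortion the iteration contracts quickly, so a small fixed iteration count is usually enough. Running this per pixel on the CPU is the cost that the text addresses by moving the computation to the GPU.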

To reduce the CPU load of the iterative method, UMat is conventionally introduced so that the time-consuming computation runs on the GPU. (UMat is the OpenCV matrix class that unifies CPU and GPU operations; it provides transparent CPU/GPU acceleration and automatically uses the GPU for computation.) Because the system mainly targets an embedded platform, multiple channels of images must be processed synchronously while GPU resources are limited. The multi-channel de-distortion tasks therefore allocate resources dynamically through thread pools, but this dynamic allocation causes resources to be released and created frequently, which crashes the system. In system tests on the RK3568 platform, frequent UMat release/creation caused GPU resource utilization to fluctuate strongly (from 80% to more than 100%), resulting in a system crash. A UMat object pool resource management method was therefore designed specifically for UMat.

Design of the UMat object pool resource management method:

A double-ended queue pool of UMatPtr (UMat pointer) objects, i.e. the UMat object pool, is created with the C++ deque (a double-ended queue that supports efficient insertion and deletion at both ends). On initialization, the UMat resource management method creates 10 objects to build the object pool; this count is a preset parameter, chosen by reading the RK3568 GPU load through a custom script invoked from the command line and using a load below 80% as the criterion. This pre-allocation mechanism avoids the resource waste caused by frequent release/creation. In addition, two functions are exposed for acquiring and releasing UMat objects in the pool, and both obtain exclusive access to the resource pool through QMutex (a mutual-exclusion lock) and QMutexLocker (a scoped helper that locks and unlocks a QMutex).

GetUMat (the UMat object acquisition function): obtains a UMat from the pool for the de-distortion computation on the raw images of the four cameras;

ReleaseUMat (the UMat object release function): returns the UMat to the pool after use.
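The pool design described above can be sketched in standard C++. To keep the example buildable without OpenCV or Qt, `Buffer` stands in for `cv::UMat`, and `std::mutex` with `std::lock_guard` stands in for `QMutex` with `QMutexLocker`; the names `BufferPool`, `acquire`, `release`, and `available` are hypothetical counterparts of the pool, `GetUMat`, and `ReleaseUMat`. Returning `nullptr` from an exhausted pool is an assumption, as the text does not describe that case.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Stand-in for cv::UMat so that the sketch builds without OpenCV.
struct Buffer {
    std::vector<unsigned char> data;
};
using BufferPtr = std::shared_ptr<Buffer>;

class BufferPool {
public:
    // Pre-allocates `count` buffers of `bytes` bytes each, mirroring
    // the pre-allocation of 10 UMat objects described in the text.
    BufferPool(std::size_t count, std::size_t bytes) {
        for (std::size_t i = 0; i < count; ++i) {
            BufferPtr b = std::make_shared<Buffer>();
            b->data.resize(bytes);
            pool_.push_back(std::move(b));
        }
    }

    // Counterpart of GetUMat. Assumption: returns nullptr when every
    // buffer is currently borrowed.
    BufferPtr acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pool_.empty()) return nullptr;
        BufferPtr b = std::move(pool_.front());
        pool_.pop_front();
        return b;
    }

    // Counterpart of ReleaseUMat: puts the buffer back for reuse.
    void release(BufferPtr b) {
        if (!b) return;
        std::lock_guard<std::mutex> lock(mutex_);
        pool_.push_back(std::move(b));
    }

    std::size_t available() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return pool_.size();
    }

private:
    mutable std::mutex mutex_;    // guards every access to pool_
    std::deque<BufferPtr> pool_;  // buffers that are currently free
};
```

Because buffers are created once and only move between the pool and its borrowers, no allocation or deallocation occurs on the per-frame path, which is the effect the pre-allocation mechanism is meant to achieve.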

Use of the UMat object pool resource management method in the de-distortion process:

On the RK3568 embedded platform, thread pools process the de-distortion tasks synchronously: multiple thread pool tasks are created to run in parallel, and each one acquires a UMat from the pool and returns it to the pool after use. The threads run concurrently, but a lock mechanism keeps each acquire and release atomic, which avoids concurrency conflicts. Because the GPU resources of the embedded platform are limited, more UMat GPU resources cannot be allocated, and repeated creation/release causes GPU usage to surge. The UMat reuse provided by the UMat resource management method lets GPU resources be reused; experimental tests show a GPU occupancy below 70%, and the CPU occupancy is reduced as well.
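How concurrent workers share a bounded pool under a lock can be sketched as follows. This is a simplified illustration, not the patent's code: plain `std::thread` workers replace the thread pool, integer slot ids replace UMat objects, and the names `SlotPool` and `runWorkers` are hypothetical. Blocking in `acquire` until a slot is returned is an assumption about the exhausted-pool case.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Pool of reusable slot ids; each id stands for one pre-allocated UMat.
class SlotPool {
public:
    explicit SlotPool(int count) {
        for (int i = 0; i < count; ++i) free_.push_back(i);
    }

    // Blocks until a slot is free (assumed behaviour).
    int acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        available_.wait(lock, [this] { return !free_.empty(); });
        const int id = free_.front();
        free_.pop_front();
        return id;
    }

    void release(int id) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            free_.push_back(id);
        }
        available_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable available_;
    std::deque<int> free_;
};

// Runs `workers` threads that each borrow and return a slot
// `jobsPerWorker` times. Returns the highest number of slots that
// were borrowed at the same moment.
int runWorkers(SlotPool& pool, int workers, int jobsPerWorker) {
    std::atomic<int> inUse{0};
    std::atomic<int> peak{0};
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&pool, &inUse, &peak, jobsPerWorker] {
            for (int j = 0; j < jobsPerWorker; ++j) {
                const int id = pool.acquire();
                const int now = ++inUse;
                int seen = peak.load();
                while (now > seen && !peak.compare_exchange_weak(seen, now)) {
                }
                // A real task would run de-distortion or detection
                // on the buffer identified by `id` here.
                --inUse;
                pool.release(id);
            }
        });
    }
    for (std::thread& t : threads) t.join();
    return peak.load();
}
```

With four workers and a pool of two slots, the returned peak never exceeds two: the pool size, not the number of threads, bounds how many GPU buffers are in use at once.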

(2) The frame-difference obstacle detection task includes:

(2.1) converting the input video frame image to grayscale to generate a grayscale image;

(2.2) calculating the difference image between the grayscale images of the previous frame and the current frame;

(2.3) generating a binary image from the difference image;

(2.4) performing contour detection with OpenCV;

(2.5) acquiring the moving region from the contour list;

(2.6) scaling the moving region to obtain a target-size image;

(2.7) performing target detection with a pre-trained target detection model.

(The formula images accompanying steps (2.2) to (2.7) are not reproduced in this text.)

Current mainstream target detection models, such as the YOLOv model, output bounding box coordinates and class probabilities. Comparing the previous frame with the current frame to pre-screen the moving region effectively reduces the amount of computation required by the YOLOv model and thereby reduces resource usage.
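The pre-screening idea can be sketched without OpenCV on plain grayscale buffers. This is an illustration under stated assumptions: the absolute difference, the foreground value 255, and the names `Gray`, `Rect`, `motionMask`, and `movingRegion` are chosen here, and the single bounding box over all changed pixels is a simplified stand-in for the contour detection and region extraction steps. `Rect` uses the top-left corner as its origin, following the usual row-major image convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct Gray {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;  // row-major, one byte per pixel
};

struct Rect {
    int x, y, w, h;  // top-left corner plus size
    bool empty;      // true when no pixel changed
};

// Difference image followed by a fixed threshold: a pixel becomes 255
// when its absolute change exceeds `threshold`, otherwise 0.
std::vector<std::uint8_t> motionMask(const Gray& prev, const Gray& cur, int threshold) {
    std::vector<std::uint8_t> mask(cur.pixels.size(), 0);
    for (std::size_t i = 0; i < cur.pixels.size(); ++i) {
        const int diff = std::abs(int(cur.pixels[i]) - int(prev.pixels[i]));
        mask[i] = diff > threshold ? 255 : 0;
    }
    return mask;
}

// Simplified stand-in for contour detection: the bounding box that
// encloses every foreground pixel of the mask.
Rect movingRegion(const std::vector<std::uint8_t>& mask, int width, int height) {
    int minX = width, minY = height, maxX = -1, maxY = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (mask[static_cast<std::size_t>(y) * width + x] == 0) continue;
            if (x < minX) minX = x;
            if (x > maxX) maxX = x;
            if (y < minY) minY = y;
            if (y > maxY) maxY = y;
        }
    }
    if (maxX < 0) return {0, 0, 0, 0, true};
    return {minX, minY, maxX - minX + 1, maxY - minY + 1, false};
}
```

When `movingRegion` reports an empty result, the detector does not need to run for that frame at all; otherwise only the cropped region is scaled and passed to the model, which is where the saving in computation comes from.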

(3) On the same RK3568 embedded platform, thread pools process the frame-difference detection tasks synchronously: multiple thread pool tasks are created to run in parallel, and each thread acquires a UMat from the pool (the UMat object pool) and returns it to the pool after use. The UMat reuse provided by the UMat object pool resource management method lets GPU resources be reused and reduces resource occupancy.

S3, performing surround-view stitching and display on the multi-channel video frame images after the processing tasks have been executed.

As shown in Fig. 2, this embodiment uses a de-distortion task and a frame-difference obstacle detection task. The de-distortion task yields de-distorted images of the multi-channel video frame images, and the frame-difference obstacle detection task yields the bounding box coordinates and class probabilities of obstacles in the multi-channel video frame images.

The multi-channel de-distorted images are stitched into a surround-view image based on the camera intrinsic parameters. The bounding box coordinates must first be transformed back to the original video frame image and are then annotated on the surround-view image based on the camera intrinsic parameters. The final display consists of the surround-view image, the bounding box coordinates, and the class probabilities.
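The transformation of a bounding box back to the original video frame can be illustrated as follows. The sketch assumes that the moving region was stretched to the model input size by a plain resize without letterbox padding, which the text does not specify; the names `Box`, `Region`, and `toOriginalFrame` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Box { double x, y, w, h; };  // box in pixels, origin at top-left
struct Region { int x, y, w, h; };  // moving region inside the original frame

// Maps a detection from model-input coordinates (modelW x modelH) back
// to original-frame coordinates: undo the resize, then add the offset
// of the moving region.
Box toOriginalFrame(const Box& det, const Region& roi, int modelW, int modelH) {
    const double sx = static_cast<double>(roi.w) / modelW;
    const double sy = static_cast<double>(roi.h) / modelH;
    return {roi.x + det.x * sx, roi.y + det.y * sy, det.w * sx, det.h * sy};
}
```

For example, a moving region at (100, 50) of size 320 x 160 scaled to a 640 x 640 model input gives scale factors 0.5 and 0.25, so a detection at (64, 320) of size 128 x 160 maps to (132, 130) with size 64 x 40 in the original frame.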

As a further extension, the alarm sector areas for obstacle coordinates can be drawn in advance and then shown or hidden according to the obstacle coordinates, which avoids the resource waste of repeated redrawing and improves the FPS.

Embodiment two:

An embodiment of the invention provides a multi-channel video processing device suitable for engineering machinery, which comprises:

an image acquisition module configured to acquire multi-channel surround-view video frame images of the engineering machinery;

a task processing module configured to set image processing tasks for the multi-channel video frame images and synchronously process the image processing tasks using a thread pool based on the UMat resource management method, which specifically comprises creating thread pools that run in parallel for the image processing task of each channel of video frame images, wherein each thread pool acquires a UMat from the pool and returns it to the pool after use;

a stitching display module configured to perform surround-view stitching and display on the multi-channel video frame images after the processing tasks have been executed.

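Embodiment three:

An embodiment of the invention provides an electronic device, which comprises a processor and a storage medium;

the storage medium is used for storing instructions;

the processor is configured to operate according to the instructions to perform the steps of the method described above.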

Embodiment four:

An embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described method.

Embodiment five:

Embodiments of the present invention provide a computer program product comprising a computer program/instruction which, when executed by a processor, implements the steps of the above method.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims

1. A multi-channel video processing method applicable to engineering machinery, characterized by comprising:

acquiring multi-channel surround-view video frame images of the engineering machinery;

setting image processing tasks for the multi-channel video frame images, and synchronously processing the image processing tasks using a thread pool based on a UMat resource management method;

performing surround-view stitching and display on the multi-channel video frame images for which the processing tasks have been completed.

2. The multi-channel video processing method suitable for engineering machinery according to claim 1, characterized in that acquiring the multi-channel surround-view video frame images of the engineering machinery comprises: allocating an independent thread to each video acquisition task, caching the acquired video data in a ring queue, and extracting video frame images from the cached video data.

3. The multi-channel video processing method for engineering machinery according to claim 1, wherein the image processing task includes a de-distortion task, and the de-distortion task includes:

constructing a distortion model, wherein the distortion model includes a radial distortion model and a tangential distortion model, which relate the distorted image coordinates to the undistorted image coordinates through the distance from the image point to the image center, the radial distortion coefficients, and the tangential distortion coefficients (the formula images of the two models are not reproduced in this text);

mapping the distorted coordinates to the undistorted coordinates by an iterative method that runs until the preset maximum number of iterations is reached, the result of the iteration being the undistorted image coordinates.

4. The multi-channel video processing method for engineering machinery according to claim 1, wherein the image processing task includes a frame-difference obstacle detection task, and the frame-difference obstacle detection task includes:

converting the input video frame image to grayscale to generate a grayscale image;

calculating the difference image between the grayscale images of the previous frame and the current frame;

generating a binary image from the difference image, wherein each pixel value of the binary image is determined by comparing the pixel value at the same coordinates in the difference image with a constant threshold used to distinguish moving targets from background noise;

performing contour detection using OpenCV, wherein the contour detection function returns a list of detected contours;

acquiring the moving region based on the contour list, wherein the moving region is taken from the originally input video frame image and is defined by its lower-left corner coordinates and its width and height;

scaling the moving region to obtain a target-size image, wherein the target size is the input width and height of the pre-trained target detection model;

performing target detection using the pre-trained target detection model to obtain the detection result.

(The formula images accompanying these steps are not reproduced in this text.)

5. The multi-channel video processing method for engineering machinery according to claim 1, wherein the step of setting image processing tasks for the multiple video frame images and synchronously processing the image processing tasks using a thread pool based on a UMat resource management method comprises:

creating thread pools that run in parallel for the image processing task of each channel of video frame images, wherein each thread pool acquires a UMat from the pool and returns it to the pool after use.

6. A multi-channel video processing device suitable for engineering machinery, comprising:

An image acquisition module is configured to acquire multi-channel video frame images of the surrounding view of the construction machinery;

A task processing module is configured to set image processing tasks for multiple channels of video frame images and synchronously process the image processing tasks using a thread pool based on a UMat resource management method;

The splicing and display module is configured to perform surround splicing and display on the multiple channels of video frame images that have completed the processing task.

7. The multi-channel video processing device for engineering machinery according to claim 6, wherein the step of setting image processing tasks for the multiple video frame images and synchronously processing the image processing tasks using a thread pool based on a UMat resource management method comprises:

8. An electronic device, comprising a processor and a storage medium;

The storage medium is used to store instructions;

The processor is configured to operate according to the instructions to execute the steps of the method according to any one of claims 1 to 5.

9. A computer-readable storage medium having a computer program stored thereon, wherein when the program is executed by a processor, the steps of the method according to any one of claims 1 to 5 are implemented.

10. A computer program product, comprising a computer program/instruction, wherein the computer program/instruction implements the steps of the method according to any one of claims 1 to 5 when executed by a processor.