Disclosure of Invention
The technical problem to be solved by the invention is to provide a rice yield prediction method and system based on multi-source time-series remote sensing data, which fuse multi-source remote sensing data with spatio-temporal data and dynamically analyze and filter the remote sensing features of rice at different phenological periods, thereby significantly improving the accuracy and reliability of rice yield prediction.
In order to solve the technical problems, the technical scheme of the invention is as follows:
In a first aspect, a method for predicting rice yield based on multi-source time-series remote sensing data is provided, the method comprising:
acquiring multispectral data and laser radar data, and geolocating the data accurately by satellite positioning, to obtain multi-source unmanned aerial vehicle remote sensing data;
performing camera interior orientation, band registration, orthoimage stitching and mosaicking, radiometric correction and remote sensing index calculation on the multispectral data to obtain processed multispectral data;
performing point cloud reconstruction and point cloud filtering on the laser radar data, and calculating the rice plant height, to obtain processed laser radar data;
registering the processed laser radar data with the processed multispectral data to complete preprocessing of the multi-source unmanned aerial vehicle remote sensing data;
acquiring time-series multi-source remote sensing data based on the phenological periods, and optimizing the time-series data to obtain optimized time-series data;
extracting sub-region-level element features from the optimized time-series data to construct a time-series element feature set;
obtaining measured rice yield data of the sub-regions according to the constructed time-series element feature set;
constructing a training data set from the preprocessed multi-source remote sensing data and the measured rice yield data of the sub-regions;
training a random forest regression model on the training data set, and predicting rice yield from new multi-source time-series remote sensing data using the trained model.
Further, performing camera interior orientation, band registration, orthoimage stitching and mosaicking, radiometric correction and remote sensing index calculation on the multispectral data to obtain processed multispectral data comprises:
geometrically correcting the images using the camera calibration parameters to obtain corrected images and determine the interior orientation parameters;
registering the images of the different bands with a precise registration algorithm according to the interior orientation parameters of the camera to obtain band-registered multiband images;
performing feature matching on the overlapping regions of the band-registered multiband images, and applying an automatic stitching algorithm to generate an orthoimage with unified geographic coordinates;
performing radiometric correction on the orthoimage to obtain a radiometrically corrected orthoimage;
calculating the normalized vegetation index from the radiometrically corrected orthoimage by $NDVI = \frac{NIR - RED}{NIR + RED}$ to obtain an NDVI image, where NIR is the near-infrared reflectance, RED is the red-band reflectance, and NDVI is the normalized vegetation index.
Further, performing point cloud reconstruction and point cloud filtering on the laser radar data, and calculating the rice plant height, to obtain processed laser radar data comprises:
scanning the paddy field area in flight to obtain original point cloud data;
stitching and integrating the original point cloud data and eliminating redundant points in the overlapping areas to obtain reconstructed point cloud data;
performing height threshold filtering and density threshold filtering on the reconstructed point cloud data to obtain filtered point cloud data;
calculating an actual vegetation height image from the filtered point cloud data by CHM = DSM - DEM to obtain processed laser radar data, where DSM is the combined height of the terrain and the objects on it, DEM is the bare terrain height, and CHM is the actual vegetation height.
Further, registering the processed laser radar data with the multispectral data to complete preprocessing of the multi-source unmanned aerial vehicle remote sensing data comprises:
re-projecting the laser radar data and the multispectral data to the same coordinate system, and analyzing the re-projection of both to obtain a re-projection analysis result;
resampling the laser radar data to the resolution of the multispectral data according to the re-projection analysis result to obtain resampled data, and vector-clipping the resampled data to obtain image data covering the same geographic range;
aligning the image data with actual geographic reference data, and correcting the image by adjusting its position and angle so that the residual error of the corrected image is kept within ±2 cm, thereby completing preprocessing of the multi-source unmanned aerial vehicle remote sensing data.
Further, acquiring time-series multi-source remote sensing data based on the phenological periods, and optimizing the time-series data to obtain optimized time-series data, comprises:
determining analysis time nodes, including the tillering stage, the jointing stage and the heading stage, according to the phenological stages of rice growth;
formulating a corresponding unmanned aerial vehicle flight plan according to the analysis time nodes, acquiring remote sensing data, and constructing a complete time-series data set;
fusing the NDVI and CHM images of the time-series data set to obtain a fused time-series data set;
setting an upper limit and a lower limit of the NDVI according to the fused time-series data set, and applying adaptive threshold filtering to the NDVI between these limits, where NDVI is the normalized vegetation index of the multispectral image, $NDVI(x, y, t)$ is the NDVI value at position $(x, y)$ in phenological period $t$, $T_{NDVI,\min}(t)$ and $T_{NDVI,\max}(t)$ are the minimum and maximum NDVI values in phenological period $t$, respectively, and $NDVI_f(x, y, t)$ is the NDVI value after adaptive threshold filtering;
setting an upper limit and a lower limit of the CHM according to the fused time-series data set, and applying adaptive threshold filtering to the canopy height between these limits, where CHM is the canopy height from the laser radar data, $CHM(x, y, t)$ is the CHM value at position $(x, y)$ in phenological period $t$, $T_{CHM,\min}(t)$ and $T_{CHM,\max}(t)$ bound the adaptive threshold range of the CHM in phenological period $t$, and $CHM_f(x, y, t)$ is the CHM value after adaptive threshold filtering;
optimizing the time-series data through the adaptive threshold filtering of the NDVI and of the CHM to obtain the optimized time-series data.
Further, extracting sub-region-level element features from the optimized time-series data to construct a time-series element feature set, the element features including, for example, vegetation indexes and terrain heights, comprises:
dividing the research area into a plurality of sub-regions according to the optimized time-series data, each sub-region containing a plurality of pixels, and labeling and classifying all pixels in each sub-region to complete the sub-region division;
according to the sub-region division, calculating the mean of the adaptively filtered normalized vegetation index by $\overline{NDVI}_p(t) = \frac{1}{n}\sum_{i=1}^{n} NDVI_f(x_i, y_i, t)$ and the mean of the adaptively filtered canopy height by $\overline{CHM}_p(t) = \frac{1}{n}\sum_{i=1}^{n} CHM_f(x_i, y_i, t)$, where $\overline{NDVI}_p(t)$ and $\overline{CHM}_p(t)$ are the mean normalized vegetation index and the mean canopy height of the sub-region in phenological period $t$, respectively, $(x_i, y_i)$ ranges over the valid pixels of the sub-region, $n$ is the number of valid pixels in the sub-region, and $p$ denotes the sub-region;
constructing the time-series feature set from the mean normalized vegetation index and the mean canopy height of the sub-regions in each phenological period.
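The per-sub-region averaging can be sketched as follows. This is a minimal illustration only: it assumes that pixels rejected by the adaptive threshold filtering are stored as NaN and that the sub-region division is available as an integer label raster; the function name `subregion_means` and the sample arrays are hypothetical.

```python
import numpy as np

def subregion_means(filtered, labels):
    """Mean of the valid (non-NaN) pixels of `filtered` for each sub-region label."""
    means = {}
    for p in np.unique(labels):
        values = filtered[labels == p]
        values = values[~np.isnan(values)]          # keep the n valid pixels only
        means[int(p)] = float(values.mean()) if values.size else float("nan")
    return means

# Hypothetical filtered NDVI image for one phenological period (NaN = rejected pixel).
ndvi_f = np.array([[0.4, 0.6, np.nan],
                   [0.5, 0.7, 0.9]])
# Hypothetical sub-region label raster with two sub-regions, 1 and 2.
labels = np.array([[1, 1, 2],
                   [1, 2, 2]])

ndvi_mean = subregion_means(ndvi_f, labels)
```

The same function can be applied to the filtered CHM image of each phenological period, giving one NDVI mean and one CHM mean per sub-region and period.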
Further, obtaining the measured rice yield data of the sub-regions according to the constructed time-series element feature set comprises:
combining the time-series element feature sets of all sub-regions to form a multi-temporal feature set of the whole research area;
according to the multi-temporal feature set, matching the element features of each sub-region to the yield data of that sub-region by geographic position and time stamp, to obtain the measured rice yield data of the sub-regions.
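The matching of sub-region features to measured yield can be sketched as follows. This is an illustrative sketch under stated assumptions: sub-regions are identified by a hypothetical integer ID that stands in for the geographic position, the phenological stage name stands in for the time stamp, and all feature and yield values are invented for the example.

```python
STAGES = ("tillering", "jointing", "heading")

# Hypothetical per-sub-region features: stage -> (mean NDVI, mean CHM in metres).
features = {
    101: {"tillering": (0.45, 0.35), "jointing": (0.70, 0.62), "heading": (0.82, 0.95)},
    102: {"tillering": (0.41, 0.33), "jointing": (0.66, 0.58), "heading": (0.78, 0.90)},
    103: {"tillering": (0.39, 0.30), "jointing": (0.61, 0.55), "heading": (0.74, 0.88)},
}
# Hypothetical measured yield per sub-region; sub-region 103 has no measurement.
measured_yield = {101: 8.4, 102: 7.9}

def build_training_rows(features, measured_yield, stages=STAGES):
    """Pair each sub-region's time-ordered features with its measured yield."""
    ids, X, y = [], [], []
    for pid in sorted(features):
        if pid not in measured_yield:
            continue                      # skip sub-regions without a yield record
        row = []
        for stage in stages:
            ndvi_mean, chm_mean = features[pid][stage]
            row.extend([ndvi_mean, chm_mean])
        ids.append(pid)
        X.append(row)
        y.append(measured_yield[pid])
    return ids, X, y

ids, X, y = build_training_rows(features, measured_yield)
```

Each row of `X` then holds six values (two features for each of three stages), in the same order for every sub-region, which is the layout a regression model expects.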
In a second aspect, a system for predicting rice yield based on multi-source time-series remote sensing data is provided, comprising:
an acquisition module, configured to acquire multispectral data and laser radar data and geolocate the data accurately by satellite positioning to obtain multi-source unmanned aerial vehicle remote sensing data; to perform camera interior orientation, band registration, orthoimage stitching and mosaicking, radiometric correction and remote sensing index calculation on the multispectral data to obtain processed multispectral data; and to perform point cloud reconstruction and point cloud filtering on the laser radar data and calculate the rice plant height to obtain processed laser radar data;
a processing module, configured to acquire time-series multi-source remote sensing data based on the phenological periods and optimize the time-series data to obtain optimized time-series data; to extract sub-region-level element features from the optimized time-series data to construct a time-series element feature set, the element features including vegetation indexes and terrain heights; to obtain measured rice yield data of the sub-regions according to the constructed time-series element feature set; to construct a training data set from the preprocessed multi-source remote sensing data and the measured rice yield data of the sub-regions; and to train a random forest regression model on the training data set and predict rice yield from new multi-source time-series remote sensing data using the trained model.
In a third aspect, a computing device includes:
one or more processors;
And a storage means for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method.
In a fourth aspect, a computer readable storage medium has a program stored therein, which when executed by a processor, implements the method.
The scheme of the invention at least comprises the following beneficial effects:
Data are acquired from an unmanned aerial vehicle platform, which can monitor crops finely at low altitude, suits small-scale farmland, and allows flight missions to be scheduled flexibly, improving the timeliness and operability of data acquisition. The scheme fuses unmanned aerial vehicle multispectral image data with laser radar data: the multispectral data capture the spectral characteristics of the rice and reflect crop growth status, while the laser radar data provide accurate three-dimensional structural information, so that crop phenotype information is acquired comprehensively and the reliability and accuracy of yield prediction are improved. Spatio-temporal data fusion, in particular of multi-source remote sensing data from the key phenological periods, allows changes during rice growth to be monitored dynamically. Combining the spatio-temporal features yields a high-precision yield prediction model with enhanced robustness and an improved ability to predict yield fluctuations. The phenology-based adaptive threshold filtering dynamically adjusts the thresholds applied to the remote sensing data according to the characteristics of each growth period, which ensures the accuracy and validity of the data at each growth stage and makes the yield prediction more accurate. The scheme can also be applied to large-scale agricultural production and adapted to various agricultural scenarios, providing a more flexible and reliable rice yield prediction service.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in Fig. 1, an embodiment of the present invention provides a method for predicting rice yield based on multi-source time-series remote sensing data, the method comprising the following steps:
Step 11, acquiring multispectral data and laser radar data, and geolocating the data accurately by satellite positioning, to obtain multi-source unmanned aerial vehicle remote sensing data;
Step 12, performing camera interior orientation, band registration, orthoimage stitching and mosaicking, radiometric correction and remote sensing index calculation on the multispectral data to obtain processed multispectral data;
Step 13, performing point cloud reconstruction and point cloud filtering on the laser radar data, and calculating the rice plant height, to obtain processed laser radar data;
Step 14, registering the processed laser radar data with the processed multispectral data to complete preprocessing of the multi-source unmanned aerial vehicle remote sensing data;
Step 15, acquiring time-series multi-source remote sensing data based on the phenological periods, and optimizing the time-series data to obtain optimized time-series data;
Step 16, extracting sub-region-level element features from the optimized time-series data to construct a time-series element feature set;
Step 17, obtaining measured rice yield data of the sub-regions according to the constructed time-series element feature set;
Step 18, constructing a training data set from the preprocessed multi-source remote sensing data and the measured rice yield data of the sub-regions;
Step 19, training a random forest regression model on the training data set, and predicting rice yield from new multi-source time-series remote sensing data using the trained model.
In the embodiment of the invention, multispectral data and laser radar data are acquired and geolocated accurately by satellite positioning, so that more accurate and comprehensive farmland information is obtained, the growth status and environmental factors of the rice can be assessed, and the accuracy of yield prediction is improved. The multispectral data and laser radar data undergo a series of processing steps, such as camera interior orientation, band registration, orthoimage stitching and mosaicking, and radiometric correction, which remove noise and errors from the data and improve data quality. Registering the processed laser radar data with the multispectral data completes the preprocessing of the multi-source unmanned aerial vehicle remote sensing data, so that the strengths of the different data sources can be used together to provide richer and more comprehensive information. Acquiring time-series multi-source remote sensing data based on the phenological periods, and optimizing those data, captures the dynamic changes during rice growth. Extracting sub-region-level element features, such as vegetation indexes and terrain heights, from the optimized time-series data improves the accuracy and reliability of yield prediction. A training data set is constructed from the preprocessed multi-source remote sensing data and the measured rice yield data of the sub-regions, and a random forest regression model is trained on it, combining the advantages of multi-source data and machine learning.
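The training and prediction step can be sketched as follows. This is a minimal illustration using scikit-learn's `RandomForestRegressor`; the feature layout (mean NDVI and mean CHM per sub-region at three phenological stages, six columns in total), the synthetic data, and the hyperparameters are all assumptions made for the example and are not values from the embodiment.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical training set: one row per sub-region, six columns
# (mean NDVI and mean CHM at the tillering, jointing and heading stages).
n_subregions = 60
ndvi = rng.uniform(0.3, 0.9, size=(n_subregions, 3))
chm = rng.uniform(0.2, 1.1, size=(n_subregions, 3))      # metres
X_train = np.hstack([ndvi, chm])

# Synthetic stand-in for measured yield, for illustration only.
y_train = 4.0 + 3.0 * ndvi[:, 2] + 1.5 * chm[:, 2] + rng.normal(0, 0.1, n_subregions)

# Train the random forest regression model (hyperparameters are assumptions).
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Predict yield for new sub-regions described by the same six features.
X_new = np.hstack([rng.uniform(0.3, 0.9, (5, 3)), rng.uniform(0.2, 1.1, (5, 3))])
y_pred = model.predict(X_new)
```

In practice the rows of `X_train` would come from the time-series element feature set and `y_train` from the measured sub-region yields, and a held-out subset would normally be kept to evaluate the model.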
In a preferred embodiment of the present invention, the step 11 includes:
determining the data acquisition area and target, and selecting an unmanned aerial vehicle and remote sensing equipment, wherein the unmanned aerial vehicle carries a multispectral camera and a laser radar;
controlling the unmanned aerial vehicle to fly along the planned flight route, acquiring multispectral data with the multispectral camera, and acquiring elevation information of the terrain and ground objects with the laser radar to generate point cloud data;
geolocating the acquired multispectral data and laser radar data accurately using the satellite positioning system, and organizing and matching them according to the geographic position information to obtain the multi-source unmanned aerial vehicle remote sensing data.
In the embodiment of the invention, defining the data acquisition area and target allows the unmanned aerial vehicle and remote sensing equipment to be deployed in a targeted manner. The unmanned aerial vehicle is flexible and can easily cover complex terrain and areas that are difficult to reach. The multispectral camera captures spectral information in different bands and reflects characteristics of the ground objects such as vegetation health and soil composition, while the laser radar accurately acquires elevation information of the terrain and ground objects and generates high-resolution point cloud data. The satellite positioning system geolocates the acquired data accurately; the high-precision positioning service improves both the accuracy of the flight lines and the accuracy of data alignment, so that the spatial data in the scanned area are mapped accurately. Accurate geolocation allows the multi-source data to be integrated and compared within the same geographic frame, which improves the usability of the data and the accuracy of the analysis. The multispectral data and laser radar data are organized and matched according to the geographic position information to form a multi-source unmanned aerial vehicle remote sensing data set, so that the strengths of the different data sources can be used together.
In a specific embodiment of the invention, the data acquisition area and target are determined, and a DJI Matrice RTK unmanned aerial vehicle is selected, carrying a Changguang Yuchen multispectral camera and a DJI Zenmuse L1 laser radar. The multispectral camera captures multiple bands covering the visible, red-edge and near-infrared ranges; the acquired bands are 450nm@35nm, 555nm@27nm, 660nm@22nm, 720nm@10nm, 750nm@10nm and 840nm@30nm. Together these bands capture the spectral characteristics of the crop comprehensively, and the red-edge and near-infrared bands provide information that is sensitive to vegetation condition. For multispectral acquisition, the flight height of the unmanned aerial vehicle is set to 50 m, the forward overlap to 80% and the side overlap to 70%. The laser radar uses high-precision LiDAR technology and can capture accurate three-dimensional point cloud data over a large rice area. For laser radar acquisition, the flight height is set to 50 m and the laser side overlap to 60%, which ensures that the visible-light side overlap of the mapping camera is greater than 65%, and the visible-light forward overlap is set to 80%. The positioning service uses Qianxun custom network RTK to provide centimeter-level positioning accuracy.
In a preferred embodiment of the present invention, the step 12 includes:
geometrically correcting the images using the camera calibration parameters to obtain corrected images and determine the interior orientation parameters;
registering the images of the different bands with a precise registration algorithm according to the interior orientation parameters of the camera to obtain band-registered multiband images;
performing feature matching on the overlapping regions of the band-registered multiband images, and applying an automatic stitching algorithm to generate an orthoimage with unified geographic coordinates;
performing radiometric correction on the orthoimage to obtain a radiometrically corrected orthoimage;
calculating the normalized vegetation index from the radiometrically corrected orthoimage by $NDVI = \frac{NIR - RED}{NIR + RED}$ to obtain an NDVI image, where NIR is the near-infrared reflectance, RED is the red-band reflectance, and NDVI is the normalized vegetation index.
In the embodiment of the invention, geometric correction with the camera calibration parameters removes image distortion and ensures that the geographic position of each pixel is accurate. Precise band registration ensures that the information for the same geographic position is consistent across bands and removes small offsets and distortions between bands, so that the spectral characteristics of the ground objects can be analyzed accurately and the remote sensing data are easier to interpret. Automatic stitching performs feature matching on the overlapping areas to generate an orthoimage covering the whole research area, which improves working efficiency and ensures that the image is continuous and seamless. Radiometric correction removes the influence of external factors on the radiance values of the image, so that the spectral information reflects the reflectance characteristics of the surface objects more faithfully. Calculating a remote sensing index, such as the normalized vegetation index, allows vegetation growth and coverage to be assessed quantitatively.
In one embodiment of the invention, the Changguang Yuchen multispectral preprocessing software is used to perform interior orientation correction on the acquired images: camera calibration parameters such as focal length, principal point offset and lens distortion parameters are used to correct the images geometrically and remove image distortion. The same software is used to register precisely the images of the different bands captured by the multispectral camera, aligning the spatial positions of the bands and removing any small offsets and distortions between them. The band-registered multiband images are stitched and mosaicked with professional unmanned aerial vehicle mapping software to generate an orthoimage covering the whole research area; the stitching is automatic and performs feature matching on the overlapping areas. The Changguang Yuchen multispectral preprocessing software is then used to correct the images radiometrically using the calibration data of the on-site reflectance panel. Finally, the remote sensing index is calculated by the formula to obtain the preprocessed remote sensing index image.
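The remote sensing index calculation can be sketched as follows, using the standard definition NDVI = (NIR - RED) / (NIR + RED). This is a minimal illustration: the function name `compute_ndvi`, the handling of zero denominators (set to NaN) and the sample reflectance values are assumptions made for the example.

```python
import numpy as np

def compute_ndvi(nir, red):
    """Return NDVI = (NIR - RED) / (NIR + RED); NaN where NIR + RED is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    ndvi = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels stay NaN.
    np.divide(nir - red, denom, out=ndvi, where=denom != 0)
    return ndvi

# Hypothetical radiometrically corrected reflectance values for a 2 x 2 patch.
nir = np.array([[0.50, 0.40],
                [0.00, 0.30]])
red = np.array([[0.10, 0.40],
                [0.00, 0.10]])

ndvi = compute_ndvi(nir, red)
```

Dense green canopy gives values approaching 1, while bare soil or water gives values near or below 0.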
In a preferred embodiment of the present invention, the step 13 includes:
scanning the paddy field area in flight to obtain original point cloud data;
stitching and integrating the original point cloud data and eliminating redundant points in the overlapping areas to obtain reconstructed point cloud data;
performing height threshold filtering and density threshold filtering on the reconstructed point cloud data to obtain filtered point cloud data;
calculating an actual vegetation height image from the filtered point cloud data by CHM = DSM - DEM to obtain processed laser radar data, where DSM is the combined height of the terrain and the objects on it, DEM is the bare terrain height, and CHM is the actual vegetation height.
In the embodiment of the invention, flight scanning with the laser radar captures three-dimensional information of the paddy field area with high precision and generates the original point cloud data. Stitching and integrating the point cloud data and eliminating redundant points in the overlapping areas yields the reconstructed point cloud data and improves the consistency and accuracy of the data. Height threshold filtering and density threshold filtering remove outliers and redundant points, cleaning the point cloud and further improving its accuracy and consistency. The actual vegetation height image can then be calculated accurately from the filtered point cloud data.
In a specific embodiment of the invention, DJI Terra software is used to stitch and integrate the original point cloud data acquired by the DJI Zenmuse L1 laser radar, and ENVI software is used to filter the standard point cloud data file to remove outliers and redundant points. The height threshold is set according to the flight height of the unmanned aerial vehicle, and points outside a range of ±2 m are filtered out. The density threshold $H$ is set by a formula in which $N_0$ is the total number of points in the point cloud, $N(A_i, r)$ is the number of neighboring points within a sphere of radius $r$ centered on point $A_i$ and represents the neighbor density of the $i$-th point, and $te$ is the threshold percentage that sets the density screening criterion. The filtered point cloud data are processed with ENVI software to obtain the combined height of the terrain and plants and the terrain height. Combining the two allows the rice plant height to be extracted precisely: the actual vegetation height is obtained from the combined terrain-and-plant height and the terrain height by the formula CHM = DSM - DEM, giving the preprocessed actual vegetation height image.
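The height calculation CHM = DSM - DEM can be sketched on rasterized surfaces as follows. This is a minimal illustration: it assumes that the DSM and DEM have already been rasterized to the same grid, and the clamping of small negative differences to zero is an assumption added for the example, not a step stated in the embodiment.

```python
import numpy as np

def canopy_height_model(dsm, dem):
    """Return CHM = DSM - DEM on co-registered rasters of equal shape."""
    dsm = np.asarray(dsm, dtype=float)
    dem = np.asarray(dem, dtype=float)
    if dsm.shape != dem.shape:
        raise ValueError("DSM and DEM must share the same grid")
    chm = dsm - dem
    # Assumption: negative heights are treated as interpolation noise and set to 0.
    return np.where(chm < 0, 0.0, chm)

# Hypothetical elevations in metres for a 2 x 2 patch of paddy field.
dsm = np.array([[12.8, 12.5],
                [12.1, 11.9]])
dem = np.array([[12.0, 12.0],
                [12.0, 12.0]])

chm = canopy_height_model(dsm, dem)
```

Here the last cell, where the surface lies slightly below the terrain model, is reported as zero height rather than as a negative plant height.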
In a preferred embodiment of the present invention, the step 14 includes:
re-projecting the laser radar data and the multispectral data to the same coordinate system, and analyzing the re-projection of both to obtain a re-projection analysis result;
resampling the laser radar data to the resolution of the multispectral data according to the re-projection analysis result to obtain resampled data, and vector-clipping the resampled data to obtain image data covering the same geographic range;
aligning the image data with actual geographic reference data, and correcting the image by adjusting its position and angle so that the residual error of the corrected image is kept within ±2 cm, thereby completing preprocessing of the multi-source unmanned aerial vehicle remote sensing data.
In the embodiment of the invention, all data sets are re-projected to the same coordinate system, which ensures spatial consistency between the different data sources and facilitates subsequent data fusion and analysis. Resampling the laser radar data to the resolution of the multispectral data improves comparability: with matched resolutions, information from the different data sources can be compared and analyzed at the same scale, which improves the usability and accuracy of the data. Vector clipping of the resampled data ensures that all data cover the same geographic range. Aligning the images with actual geographic reference data and correcting them by adjusting their position and angle significantly improves the geolocation accuracy of the remote sensing data; with the residual error of the corrected image kept within ±2 cm, the laser radar data and multispectral data can be fused effectively, and the preprocessed data share a unified coordinate system, resolution and geographic range.
In a specific embodiment of the invention, WGS84 is chosen as the target coordinate system. The original laser radar data and multispectral data are opened in geographic information processing software, the target coordinate system is set to WGS84, and the two data sets are re-projected separately. The data before and after re-projection are compared to determine their correspondence in geographic space, any errors and deformation introduced by the re-projection are analyzed, and the analysis result is recorded. The resolution of the multispectral data is then checked and taken as the target resolution for the laser radar data. In the geographic information processing software, the laser radar data are resampled with the target resolution set to that of the multispectral data, and the result is checked to confirm that the resolutions agree and that data quality has not been affected. A vector clipping frame is created according to the extent of the multispectral data or the actual requirement, and the resampled laser radar data and the multispectral data are clipped with this frame so that both cover the same geographic range. Ground control points are surveyed in the actual geographic area with GPS equipment, the clipped image data and the ground control point data are loaded, and the position and angle of the images are adjusted so that the residual error of the corrected image is kept within ±2 cm, giving the corrected laser radar data and multispectral data.
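The resampling and clipping operations can be sketched with plain arrays as follows. This is a simplified illustration rather than the GIS workflow of the embodiment: it assumes north-up rasters that already share a coordinate system and a common top-left corner, uses nearest-neighbor resampling, and replaces the vector clipping frame with a rectangular index window; the function name and all values are hypothetical.

```python
import numpy as np

def resample_nearest(src, src_res, dst_res, dst_shape):
    """Nearest-neighbor resampling onto a grid with the same top-left corner."""
    # Centre of each target cell, expressed in source-cell units.
    rows = ((np.arange(dst_shape[0]) + 0.5) * dst_res / src_res).astype(int)
    cols = ((np.arange(dst_shape[1]) + 0.5) * dst_res / src_res).astype(int)
    rows = np.clip(rows, 0, src.shape[0] - 1)
    cols = np.clip(cols, 0, src.shape[1] - 1)
    return src[np.ix_(rows, cols)]

# Hypothetical CHM raster at 0.10 m resolution (4 x 4 cells).
chm_10cm = np.arange(16, dtype=float).reshape(4, 4)

# Resample to an assumed multispectral resolution of 0.05 m (8 x 8 cells).
chm_5cm = resample_nearest(chm_10cm, src_res=0.10, dst_res=0.05, dst_shape=(8, 8))

# Clip both data sets to the same window (rows 2..5, columns 2..5 of the target grid).
clipped = chm_5cm[2:6, 2:6]
```

In a real workflow the re-projection, resampling and clipping would normally be done by the geographic information processing software, which also handles the georeferencing metadata that this sketch omits.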
In a preferred embodiment of the present invention, the step 15 includes:
acquiring analysis time nodes including a tillering stage, a jointing stage and a heading stage according to the physical stage of rice growth;
According to the analysis time node, a corresponding unmanned aerial vehicle flight plan is formulated, remote sensing data acquisition is carried out, and a complete time sequence data set is constructed;
According to the time-series dataset, fusing the NDVI and CHM images to obtain a time-series data group;
Setting an upper limit and a lower limit of NDVI according to the time-series data group, and filtering NDVI against these limits to realize adaptive threshold filtering of NDVI, where NDVI is the normalized vegetation index of the multispectral image, $NDVI(x, y, t)$ is the NDVI value at position $(x, y)$ in phenological period $t$, $T_{NDVI,\min}(t)$ and $T_{NDVI,\max}(t)$ are respectively the minimum and maximum values of NDVI in phenological period $t$, and $NDVI_f(x, y, t)$ is the NDVI value after adaptive threshold filtering;
Setting an upper limit and a lower limit of CHM according to the time-series data group, and filtering CHM against these limits to realize adaptive threshold filtering of canopy height, where CHM is the canopy height of the laser radar data, $CHM(x, y, t)$ is the CHM value at position $(x, y)$ in phenological period $t$, the lower and upper limits define the adaptive threshold range of CHM in phenological period $t$, and $CHM_f(x, y, t)$ is the CHM value after adaptive threshold filtering;
And optimizing the time sequence data through adaptive threshold filtering of NDVI and adaptive threshold filtering of CHM to obtain optimized time sequence data.
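The threshold filtering of one band at one growth stage can be sketched as below. This is a minimal illustration, not the filter as claimed: it assumes that values outside the stage-specific limits are marked as invalid (NaN), which is one plausible reading, and the numeric limits are hypothetical because the description gives none. The same function would apply to a CHM grid with CHM limits.

```python
import math

def threshold_filter(grid, lower, upper):
    """Keep values inside [lower, upper]; mark the rest as NaN.

    Treating out-of-range pixels as invalid is an assumption made
    for this sketch."""
    return [[v if lower <= v <= upper else math.nan for v in row]
            for row in grid]

# Hypothetical per-stage NDVI limits (lower, upper).
ndvi_limits = {
    "tillering": (0.20, 0.80),
    "jointing": (0.30, 0.90),
    "heading": (0.40, 0.95),
}

# A tiny 2 x 2 NDVI grid for the tillering stage (made-up values).
ndvi_tillering = [[0.15, 0.45],
                  [0.60, 0.92]]

lower, upper = ndvi_limits["tillering"]
ndvi_filtered = threshold_filter(ndvi_tillering, lower, upper)
```

Here the values 0.15 and 0.92 fall outside the tillering limits and are masked, while 0.45 and 0.60 pass through unchanged.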
In the embodiment of the invention, adaptive threshold filtering effectively removes abnormal values caused by environmental factors, which improves the accuracy of the data and makes the optimized time-series data more stable and reliable. The high-quality dataset enables agricultural managers to make decisions more quickly and facilitates accurate monitoring of the key phenological periods in the rice growth process, so that precision agricultural management is realized, resource utilization efficiency is improved, and the environmental burden is reduced.
In a specific embodiment of the invention, phenological information for the rice in the research area is collected, and the tillering stage, the jointing stage and the heading stage are selected as the key time nodes for analysis. A flight plan is made according to the phenological periods, and the unmanned aerial vehicle acquires multispectral images and laser radar data at each key stage so as to construct a complete time-series dataset. The normalized vegetation index of the multispectral images and the canopy height of the laser radar data are merged to obtain a time-series data group containing six bands in total. An upper limit and a lower limit of the normalized vegetation index are set according to the time-series data group to realize adaptive threshold filtering of the normalized vegetation index, and an upper limit and a lower limit of the canopy height are set in the same way to realize adaptive threshold filtering of the canopy height. The time-series data are optimized by these two filters to obtain the optimized time-series data.
In a preferred embodiment of the present invention, the step 16 includes:
Dividing a research area into a plurality of subareas according to the optimized time sequence data, wherein each subarea comprises a plurality of pixel points, and marking and classifying all the pixel points in each subarea to realize subarea division;
according to the division of subareas, calculating the average value of the normalized vegetation index after adaptive threshold filtering by $\overline{NDVI}_p(t) = \frac{1}{n}\sum_{(x, y) \in p} NDVI_f(x, y, t)$, and calculating the average value of the canopy height after adaptive threshold filtering by $\overline{CHM}_p(t) = \frac{1}{n}\sum_{(x, y) \in p} CHM_f(x, y, t)$, wherein $\overline{NDVI}_p(t)$ and $\overline{CHM}_p(t)$ respectively represent the mean normalized vegetation index and the mean canopy height of the subarea in phenological period $t$, $n$ is the number of effective pixels in the subarea, and $p$ represents the subarea;
and constructing an element feature set of the time series according to the mean normalized vegetation index and the mean canopy height of each subarea in each phenological period.
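The per-subarea averaging can be sketched as follows. The sketch assumes that filtered-out pixels are stored as NaN and that each pixel carries a subarea label in a grid of the same shape; both are illustrative choices, and `subregion_means` is a hypothetical helper name.

```python
import math
from collections import defaultdict

def subregion_means(values, labels):
    """Average the effective (non-NaN) pixels of each labelled subarea.

    A subarea with no effective pixel is left out of the result."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, label_row in zip(values, labels):
        for value, region in zip(value_row, label_row):
            if not math.isnan(value):
                sums[region] += value
                counts[region] += 1
    return {region: sums[region] / counts[region] for region in counts}

# Filtered NDVI for one stage (made-up values) and the subarea labels.
ndvi_filtered = [[math.nan, 0.4],
                 [0.6, 0.8]]
labels = [["A", "A"],
          ["B", "B"]]

ndvi_means = subregion_means(ndvi_filtered, labels)
```

Subarea "A" has one effective pixel, so its mean is that pixel's value; subarea "B" averages two pixels. Repeating the call for each stage and for the canopy height grid yields the values that make up the time-series feature set.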
In the embodiment of the invention, through sub-region division, vegetation growth and canopy structure change can be analyzed on finer spatial scale, which is beneficial to improving research precision and resolution. Calculating the average value for each sub-region can reduce redundant information in the data. The average value is used as the characteristic representation of the subarea, so that the overall condition of the subarea can be reflected more accurately, and the influence of individual abnormal pixel points on the result is reduced. The constructed time sequence feature set can clearly show the change condition of each subarea in different climatic periods, and is convenient for time sequence analysis and comparison.
In a specific embodiment of the present invention, based on the optimized time-series data, the whole research area is divided into a plurality of sub-areas according to the actual field test plan and the geographic boundaries. Each sub-area should include a plurality of pixels that are contiguous in geographic space, so that the pixel data in each sub-area effectively represent the rice growth condition of that specific area. GIS software or a programming script is used to divide the area and mark the pixels automatically, and each pixel is assigned a specific sub-area identifier. After the sub-area division is completed, the number of effective pixels in each sub-area is counted, and the average normalized vegetation index and the average canopy height of the pixels that passed adaptive threshold filtering are calculated for each phenological period to construct the time-series feature set.
In a preferred embodiment of the present invention, the step 17 includes:
Combining the time sequence element feature sets of all the subareas to form a multi-time sequence feature set of the whole research area;
and according to the multi-time sequence feature set, matching the element features of each subarea to the yield data of that subarea through the corresponding geographic position and time stamp, so as to obtain the measured yield data of the rice in the subarea.
In the embodiment of the invention, combining the time-series element feature sets of all the subareas reflects the vegetation growth and canopy structure features of the whole research area more comprehensively and avoids the one-sidedness of local data. Associating the key features with the measured yield data ensures the accuracy of the data analysis. The accurate association between subarea element features and yield data provides strong support for precision agricultural management, and the multi-time sequence feature set together with its associated yield data provides a valuable data resource for research in fields such as agricultural science and ecology.
In a specific embodiment of the present invention, a data structure in a programming language is used to store the time-series element feature set of each sub-area. All sub-areas are traversed and their feature sets are added one by one to a total multi-time sequence feature set. Each sub-area corresponds to one feature subset, and each feature subset includes the mean normalized vegetation index and the mean canopy height of the sub-area in all key phenological periods. After the multi-time sequence feature set is built, the actual yield data of each sub-area are found by matching the geographic position information of the sub-area against the field test records. The measured yield data of each sub-area are then associated with the corresponding feature subset by creating, in the programming environment, a data structure containing the feature subset and the corresponding target variable, so that the feature data and the yield data of each sub-area are in one-to-one correspondence. Finally, the measured yield data of all the sub-areas are collated to form a complete target variable set.
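One way to hold the feature subsets and attach the measured yield is sketched below. It assumes that each sub-area already carries an identifier shared with the field test records, so the geographic and time-stamp matching described above is reduced to a key lookup; the identifiers, the numeric values and the function name `attach_yield` are all hypothetical.

```python
def attach_yield(feature_sets, yield_records):
    """Pair each sub-area's feature subset with its measured yield.

    Returns the paired records and the identifiers that had no
    matching yield record."""
    paired, unmatched = {}, []
    for region_id, features in feature_sets.items():
        if region_id in yield_records:
            paired[region_id] = {"features": features,
                                 "yield": yield_records[region_id]}
        else:
            unmatched.append(region_id)
    return paired, unmatched

# Hypothetical feature subsets: mean NDVI and mean CHM per stage.
feature_sets = {
    "P01": {"tillering": {"ndvi": 0.45, "chm": 0.32},
            "jointing": {"ndvi": 0.68, "chm": 0.61},
            "heading": {"ndvi": 0.80, "chm": 0.93}},
    "P02": {"tillering": {"ndvi": 0.48, "chm": 0.35},
            "jointing": {"ndvi": 0.70, "chm": 0.66},
            "heading": {"ndvi": 0.82, "chm": 0.95}},
    "P03": {"tillering": {"ndvi": 0.41, "chm": 0.30},
            "jointing": {"ndvi": 0.63, "chm": 0.58},
            "heading": {"ndvi": 0.77, "chm": 0.90}},
}
# Hypothetical field records (yield in arbitrary units).
yield_records = {"P01": 6.9, "P02": 7.4}

paired, unmatched = attach_yield(feature_sets, yield_records)
```

Reporting the unmatched identifiers instead of dropping them silently makes it easier to check that the correspondence is truly one-to-one.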
In a preferred embodiment of the present invention, the step 18 includes:
Extracting features related to rice yield according to the preprocessed multi-source remote sensing data and the measured yield data of the rice in the subarea to obtain extracted features, wherein the features comprise the average normalized vegetation index and the average canopy height in different phenological periods;
And integrating the extracted features with the measured yield data of the corresponding subareas to form a training data set containing input features and target variables, wherein the input features are the remote sensing features formed by the average normalized vegetation index and the average canopy height of each phenological period, and the target variable is the measured rice yield of the corresponding subarea.
In the embodiment of the invention, by extracting the characteristics directly related to the rice yield and integrating the measured yield data, a more accurate prediction model can be constructed, the accurate yield prediction is beneficial to decision makers to optimize the allocation and utilization of agricultural resources, and the extracted characteristics and the integrated data set provide valuable data resources for the research in the fields of agricultural science, ecology and the like. The prediction model constructed based on the training data set can provide scientific basis for agricultural decision. Accurate yield predictions help assess risk and uncertainty in agricultural production.
In a specific embodiment of the invention, the preprocessed multi-source remote sensing data, which include the vegetation index and the canopy height in different phenological periods, are loaded from a storage medium. The phenological periods, such as the tillering stage, the jointing stage and the heading stage, are determined according to the growth period of the rice and the acquisition time of the remote sensing data, and the average normalized vegetation index and the average canopy height are obtained from the remote sensing data of each phenological period. The measured rice yield data of each sub-area are obtained, and the remote sensing feature data of each sub-area are matched with the measured rice yield data of the same season. The extracted remote sensing features are used as input features and the corresponding measured yield data are used as target variables to create a training data set, which is a two-dimensional array or table in which each row represents the data of one sub-area or field and each column represents one feature or variable.
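The two-dimensional layout described above, one row per sub-area and one column per feature, can be sketched as follows. The column naming scheme, the record layout and the sample values are assumptions made for illustration only.

```python
STAGES = ("tillering", "jointing", "heading")
FEATURES = ("ndvi", "chm")

def build_training_table(records):
    """Flatten per-stage features into one row per sub-area.

    Returns (column names, feature rows, target values), with the
    rows ordered by sub-area identifier."""
    columns = [f"{feat}_{stage}" for stage in STAGES for feat in FEATURES]
    rows, targets = [], []
    for region_id in sorted(records):
        record = records[region_id]
        rows.append([record["features"][stage][feat]
                     for stage in STAGES for feat in FEATURES])
        targets.append(record["yield"])
    return columns, rows, targets

# Hypothetical paired records (yield in arbitrary units).
records = {
    "P02": {"features": {"tillering": {"ndvi": 0.48, "chm": 0.35},
                         "jointing": {"ndvi": 0.70, "chm": 0.66},
                         "heading": {"ndvi": 0.82, "chm": 0.95}},
            "yield": 7.4},
    "P01": {"features": {"tillering": {"ndvi": 0.45, "chm": 0.32},
                         "jointing": {"ndvi": 0.68, "chm": 0.61},
                         "heading": {"ndvi": 0.80, "chm": 0.93}},
            "yield": 6.9},
}

columns, X, y = build_training_table(records)
```

With three stages and two features per stage, each row has six input columns, and `y` holds the measured yield in the same row order.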
In a preferred embodiment of the present invention, the step 19 includes:
Constructing a random forest regression model $RF_{Yield} = f(InputSet_p, Yield_p)$, wherein $InputSet_p$ is a feature set formed by the normalized vegetation index and canopy height features at different time points, $Yield_p$ is the actual yield data of the corresponding plot, and $RF_{Yield}$ is the final prediction model; the random forest model parameters are as follows: the number of decision trees is 100, the maximum depth is 10 layers, a randomly selected part of the features is considered at each split, and the bootstrap method randomly samples from the training dataset to generate a plurality of subsets, each of which is used to train a different decision tree.
According to the training data set, inputting the time-series normalized vegetation index and canopy height features of each plot into the model with the yield as the output variable, generating a plurality of subsets by the bootstrap method, training a decision tree on each subset, and integrating the prediction results of all the trees by voting or averaging to generate the final prediction model.
And obtaining a yield predicted value through the input time sequence characteristic data by utilizing the trained random forest model.
In the embodiment of the invention, the random forest can generally obtain higher prediction precision than a single model by integrating the prediction results of a plurality of decision trees. Each tree in the random forest is trained based on a randomly sampled subset of data, helping to reduce the risk of model overfitting the training data. Randomly selecting features at each split also increases the diversity of the model, and random forests can effectively process datasets containing a large number of features. The features are randomly selected at each split, and the model can automatically perform feature selection so as to identify the features most important for prediction. Random forests can calculate the extent of contribution of each feature to the prediction results, thereby helping to understand which features are most critical to yield prediction, helping to guide future data collection work, and optimizing agricultural management strategies.
In a specific embodiment of the invention, a random forest regression model is constructed, its parameters are determined, and the model is initialized with the selected parameters. The bootstrap method is applied to sample randomly from the training data set and generate a plurality of subsets, each of which is used to train an independent decision tree. For each subset, the time-series normalized vegetation index and canopy height features are used as input and the actual yield of the corresponding plot is used as output. During training, each tree builds its own model independently of the other trees, and node splitting is carried out according to the set maximum depth and the number of features selected. After all decision trees are trained, they are integrated to form the random forest, and each tree predicts independently when new input data arrive. To predict, the time-series normalized vegetation index and canopy height feature data of the plot whose yield is to be predicted are prepared and input into the trained random forest regression model, and the yield prediction value of the plot is output by integrating the predictions of the decision trees.
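The training and prediction procedure can be illustrated with a small, dependency-free sketch of bootstrap-aggregated regression trees. It follows the stated parameters (100 trees, maximum depth 10, a random feature subset at each split, averaging of tree outputs), but the details are assumptions: the split criterion is the sum of squared errors, the subset size is the integer square root of the feature count, and the training data are synthetic. A production implementation would more likely use an established library.

```python
import random

def sse(values):
    """Sum of squared errors around the mean of the values."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, depth, max_depth, n_features, rng):
    """Grow one regression tree; a leaf is the mean of its targets."""
    if depth >= max_depth or len(set(y)) == 1:
        return sum(y) / len(y)
    best = None
    for j in rng.sample(range(len(X[0])), n_features):
        # Dropping the largest value keeps both sides non-empty.
        for threshold in sorted({row[j] for row in X})[:-1]:
            left = [t for row, t in zip(X, y) if row[j] <= threshold]
            right = [t for row, t in zip(X, y) if row[j] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, threshold)
    if best is None:  # every sampled feature was constant
        return sum(y) / len(y)
    _, j, threshold = best
    lo = [i for i, row in enumerate(X) if row[j] <= threshold]
    hi = [i for i, row in enumerate(X) if row[j] > threshold]
    return (j, threshold,
            build_tree([X[i] for i in lo], [y[i] for i in lo],
                       depth + 1, max_depth, n_features, rng),
            build_tree([X[i] for i in hi], [y[i] for i in hi],
                       depth + 1, max_depth, n_features, rng))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        j, threshold, left, right = node
        node = left if row[j] <= threshold else right
    return node

def fit_forest(X, y, n_trees=100, max_depth=10, seed=0):
    """Train each tree on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n_features = max(1, int(len(X[0]) ** 0.5))  # assumed subset size
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 0, max_depth, n_features, rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

# Synthetic data: 12 plots, 6 features (NDVI and CHM at three stages).
X = [[0.40 + 0.02 * i, 0.30 + 0.01 * i, 0.55 + 0.02 * i,
      0.60 + 0.015 * i, 0.70 + 0.01 * i, 0.85 + 0.01 * i]
     for i in range(12)]
y = [6.0 + 0.2 * i for i in range(12)]

forest = fit_forest(X, y)
low_pred = predict_forest(forest, X[0])
high_pred = predict_forest(forest, X[-1])
```

Because every leaf is a mean of training targets, each prediction stays within the range of the observed yields, which is one reason averaging is the usual aggregation for regression.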
As shown in fig. 2, an embodiment of the present invention further provides a rice yield prediction system 20 based on multi-source time-series remote sensing data, including:
The acquisition module 21 is used for acquiring multispectral data and laser radar data, performing accurate geographic positioning of the data by utilizing satellite positioning to acquire multisource unmanned aerial vehicle remote sensing data, performing in-camera orientation, wave band registration, orthographic image stitching, mosaic, radiation correction and remote sensing index calculation on the multispectral data to acquire processed multispectral data, performing point cloud reconstruction and point cloud filtering on the laser radar data, and calculating the height of rice plants to acquire processed laser radar data, and registering the processed laser radar data with the multispectral data to realize preprocessing of the multisource unmanned aerial vehicle remote sensing data;
The processing module 22 is configured to: obtain multi-source remote sensing data of a time series based on the phenological period, and optimize the time-series data to obtain optimized time-series data; extract element features at the sub-region level according to the optimized time-series data to construct an element feature set of the time series, where the element features include, for example, a vegetation index and a terrain height; obtain the measured yield data of the rice in the sub-region according to the constructed element feature set of the time series; construct a training dataset according to the preprocessed multi-source remote sensing data and the measured yield data of the rice in the sub-region; train a random forest regression model according to the training dataset; and predict rice yield for new multi-source time-series remote sensing data using the trained model.
It should be noted that the system corresponds to the above method, and all implementations in the above method embodiment are applicable to this system embodiment and achieve the same technical effects.
Embodiments of the invention also provide a computing device comprising a processor, a memory storing a computer program which, when executed by the processor, performs a method as described above. All the implementation manners in the method embodiment are applicable to the embodiment, and the same technical effect can be achieved.
Embodiments of the present invention also provide a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform a method as described above. All the implementation manners in the method embodiment are applicable to the embodiment, and the same technical effect can be achieved.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.