CN112381083A - Saliency perception image clipping method based on potential region pair - Google Patents
- Publication number
- CN112381083A (application number CN202010538411.1A)
- Authority
- CN
- China
- Prior art keywords
- saliency
- network
- roi
- map
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a saliency-aware image cropping method based on potential region pairs, which generates attractive crops by constructing a deep-learning-based cropping framework. The framework includes a multi-scale CNN feature extractor, a saliency-aware deformable position-sensitive ROI (ROD) alignment operator, a Siamese (twin) fully-connected network, and a hybrid loss function. The method makes full use of the saliency map: saliency information is used to eliminate poor candidate crops and prevent the model from overfitting, and it is integrated into the pooling operator to help build a saliency-aware receptive field that encodes content preference. The invention reveals the intrinsic mechanism of the cropping process as well as the internal relationship between potential region pairs. It not only achieves better aesthetic cropping results, but also adds a negligible computational burden.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence, and relates to a saliency-aware image cropping method based on potential region pairs.
Background
Image cropping, which aims to find the crop of an image with the best aesthetic quality, is an important technique widely used in image post-processing, visual recommendation, and image selection. When a large number of images need to be cropped, manual cropping becomes a laborious task. Thus, automatic image cropping has recently attracted increasing attention from both the research community and industry.
Early cropping methods explicitly designed various hand-crafted features based on photographic knowledge (e.g., the rule of thirds and center composition). With the development of deep learning, many researchers have devoted themselves to developing cropping methods in a data-driven manner, and the release of several benchmark datasets for comparison has greatly facilitated related research.
However, obtaining the best candidate crop remains extremely difficult, mainly for the following three reasons. 1) The potential of image saliency information is not fully exploited. Previous saliency-based cropping methods focus on preserving the most important content in the best crop, but ignore the case in which the bounding rectangle of the saliency region lies near the boundary of the source image, so that the saliency region and the best crop overlap only partially. Moreover, saliency information is used only for generating candidate crops and is not reused in subsequent cropping modules. 2) Potential region pairs (the region of interest (ROI) and the region of discard (ROD)) and their internal relationships are not well modeled. Pairwise cropping methods typically form image pairs from the source image explicitly and feed them into an automatic cropping model, but their performance is often limited because the selection of such pairs depends heavily on detail and is uncertain. 3) Traditional metrics for evaluating cropping methods are unreliable and inaccurate; in some cases, intersection over union (IoU) and boundary displacement error (BDE) are insufficient to reliably evaluate the performance of a cropping method.
Disclosure of Invention
The invention aims to provide a saliency-aware image cropping method based on potential region pairs, so as to overcome the defects of the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
A saliency-aware image cropping method based on potential region pairs comprises the following steps:
Step 1) generating grid-anchor-based candidate crops according to the criteria and procedures of professional photography.
Step 2) describing the features of the source image with a multi-scale, lightweight feature extraction network, and then cropping the extracted features using deformable ROI pooling and deformable ROD pooling.
Step 3) training a Siamese aesthetic evaluation network and predicting the aesthetic scores of the candidate crops by minimizing a hybrid loss function.
Further, saliency-based candidate crops are generated: an initial crop is first created from the saliency region, and candidate crops are then generated in a grid-anchor manner.
Further, the algorithm for creating the initial crop is as follows:
inputting: the size of the image (I) is wide (W) x high (H), and the magnification is lambdalargeReduction ratio lambdasmallArea function area (·), two rectangles Re1And Re2The closest distance between the outlines of (a) and (b) Clo _ Dis (Re)1, Re2)。
Output: the initial crop S_init_crop.
where s1 ∈ (0,1] and d1 ∈ [0,1] are the thresholds for case (b) and case (a), respectively.
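Since the full pseudocode of this algorithm is not reproduced here, the following is only a minimal Python sketch of how such an initial-crop rule might look, assuming the three cases illustrated in Fig. 2; the function name make_initial_crop, the default threshold values, and the exact enlargement/shrinking behaviour for cases (a) and (b) are illustrative assumptions rather than the patented algorithm.

```python
# Hypothetical sketch of the initial-crop rule; the branching below is an
# assumption based on the three cases of Fig. 2, not the exact patented logic.

def make_initial_crop(sal_box, W, H, s1=0.1, d1=0.1,
                      lambda_large=1.2, lambda_small=0.8):
    """sal_box = (x0, y0, x1, y1): saliency bounding box of an image I of size W x H."""
    x0, y0, x1, y1 = sal_box
    sal_area = (x1 - x0) * (y1 - y0)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    # Case (a): the saliency box lies near the image boundary.
    dist_to_border = min(x0, y0, W - x1, H - y1) / min(W, H)
    if dist_to_border < d1:
        # Assumption: shrink the box around its centre using lambda_small.
        w, h = (x1 - x0) * lambda_small, (y1 - y0) * lambda_small
        return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    # Case (b): the salient region covers only a small portion of the image.
    if sal_area / (W * H) < s1:
        # Assumption: enlarge the box around its centre using lambda_large,
        # clamped to the image extent.
        w, h = (x1 - x0) * lambda_large, (y1 - y0) * lambda_large
        return (max(0, cx - w / 2), max(0, cy - h / 2),
                min(W, cx + w / 2), min(H, cy + h / 2))
    # Case (c): the saliency box is used directly as the initial crop.
    return sal_box
```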
Further, the method for generating candidate crops in the grid-anchor manner is shown in Fig. 2:
where the input image of size W×H is divided into M×N bins, and m1, m2, n1, n2 respectively denote the numbers of bins from the initial crop to the boundaries of the source image, which together determine the total number of candidate crops. A constraint is then imposed: a qualified crop must cover at least a certain proportion of the input image, so as to exclude candidates of unsuitable size:
area(S_crop) ≥ ρ · area(I)   (1)
where area(·) is the area function, and S_crop and S_sal denote the crop region and the saliency bounding-box region, respectively. The aesthetic quality of the crops is further improved by constraining the aspect ratio of the image:
where α1 and α2 are set to 0.5 and 2, respectively.
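A minimal sketch of grid-anchor candidate generation under the two constraints above is given below; the bin counts M and N, the value of ρ, and the enumeration scheme (growing the initial crop one bin at a time towards the image boundaries) are illustrative assumptions, while ρ, α1, and α2 play the roles defined in the text.

```python
# Sketch of grid-anchor candidate generation under the area-ratio and
# aspect-ratio constraints described above. Parameter values are assumptions.

def generate_candidates(init_crop, W, H, M=12, N=16,
                        rho=0.4, alpha1=0.5, alpha2=2.0):
    """init_crop = (x0, y0, x1, y1); the W x H image is split into M x N bins."""
    bin_w, bin_h = W / N, H / M
    x0, y0, x1, y1 = init_crop
    candidates = []
    # Move each corner of the initial crop outward towards the image
    # boundary, one grid bin at a time.
    for dx0 in range(int(x0 // bin_w) + 1):
        for dy0 in range(int(y0 // bin_h) + 1):
            for dx1 in range(int((W - x1) // bin_w) + 1):
                for dy1 in range(int((H - y1) // bin_h) + 1):
                    cx0, cy0 = x0 - dx0 * bin_w, y0 - dy0 * bin_h
                    cx1, cy1 = x1 + dx1 * bin_w, y1 + dy1 * bin_h
                    w, h = cx1 - cx0, cy1 - cy0
                    # Constraint (1): the crop must cover enough of the image.
                    if w * h < rho * W * H:
                        continue
                    # Aspect-ratio constraint: alpha1 <= w/h <= alpha2.
                    if not (alpha1 <= w / h <= alpha2):
                        continue
                    candidates.append((cx0, cy0, cx1, cy1))
    return candidates

# Example: candidates for a 640 x 480 image with a centred initial crop.
crops = generate_candidates((160, 120, 480, 360), W=640, H=480)
```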
Further, a multi-scale, lightweight feature extraction network is used to describe the features of the source image, as shown in module 1 of Fig. 1. Through this network, a source image is converted into an information-rich feature map that represents both the global context and the local context. The feature extraction network consists of two modules: a base network and a Feature Aggregation Module (FAM).
Further, by considering spatial relationships, the multi-scale features can effectively suppress local interference and enhance the discriminative power of the features.
Further, the base network may be any effective convolutional neural network (CNN) model that captures image features while preserving a sufficient receptive field. The n-th and (n-1)-th layers are the last two layers of the base network, and skip connections between them provide a degree of global context information.
Further, the FAM aims to compensate for the loss of global and multi-scale context during feature extraction. The FAM is executed as follows:

Step 1: apply average pooling at different scales to generate several feature maps, each followed by a 3×3 convolutional layer.

Step 2: upsample the low-resolution feature maps directly by bilinear interpolation to the same size as the original feature map of the n-th layer.

Step 3: finally, concatenate the upsampled feature maps from the different sub-branches into the final output feature map.
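The following is a small PyTorch sketch of such a feature aggregation module, assuming a MobileNetV2-style input feature map; the channel sizes and pooling scales are illustrative assumptions, but the three steps (multi-scale average pooling with 3×3 convolutions, bilinear upsampling, concatenation) follow the description above.

```python
# Sketch of the Feature Aggregation Module (FAM) described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FAM(nn.Module):
    def __init__(self, in_channels=1280, branch_channels=256, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # Step 1: average pooling at several scales, each followed by a 3x3 conv.
        self.convs = nn.ModuleList(
            [nn.Conv2d(in_channels, branch_channels, kernel_size=3, padding=1)
             for _ in scales])

    def forward(self, x):
        h, w = x.shape[2:]
        branches = [x]
        for scale, conv in zip(self.scales, self.convs):
            pooled = F.adaptive_avg_pool2d(x, output_size=scale)  # step 1
            feat = conv(pooled)
            # Step 2: bilinear upsampling back to the size of the n-th layer map.
            feat = F.interpolate(feat, size=(h, w), mode='bilinear',
                                 align_corners=False)
            branches.append(feat)
        # Step 3: concatenate the upsampled maps into the final output map.
        return torch.cat(branches, dim=1)

# Example: a MobileNetV2-style feature map of shape (1, 1280, 16, 16).
out = FAM()(torch.randn(1, 1280, 16, 16))
```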
Further, the cropped regions are attended to using two saliency-guided alignment operators, namely saliency-aware deformable position-sensitive ROI align and ROD align. The saliency information is combined with deformable PS ROI (PS ROD) pooling and a lightweight head design to take full advantage of the feature representation.
Further, saliency-aware deformable PS ROI (ROD) pooling is defined as:
where f'(i, j) and f(i, j) are the output pooled feature map and the original feature map, respectively; (x_lf, y_lf) is the top-left corner of the ROI (ROD); n is the number of pixels in a bin; (Δx, Δy) is the fractional offset learned from a fully connected (fc) layer; S is the saliency map; and S_i,j(x, y) takes the value 0 or 1.
Further, as shown in Fig. 3, C is set to 8 to reduce the computation of the subsequent subnetwork, and k is fixed to 3 in accordance with the 3×3 grid composition pattern. Bilinear interpolation is used to compute the exact values in ROI (ROD) align, which resolves the rounding and misalignment issues of saliency-aware deformable PS ROI (ROD) pooling; the resulting operator is named saliency-aware deformable PS ROI (ROD) align.
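Because the pooling equation itself is not reproduced in this text, the following is only a simplified sketch of what saliency-aware position-sensitive pooling over a single region could look like; the handling of the learned offsets and the integer bin windows are assumptions, and a production version would use bilinear sampling as described above rather than integer indexing.

```python
# Simplified sketch of saliency-aware deformable PS ROI (ROD) pooling for one
# region, with one position-sensitive channel per bin for readability.
import torch

def saliency_ps_roi_pool(feat, sal, roi, offsets, k=3):
    """feat: (k*k, H, W) position-sensitive score maps; sal: (H, W) in {0, 1};
    roi: (x_lf, y_lf, w, h) with (x_lf, y_lf) the top-left corner;
    offsets: (k, k, 2) learned fractional offsets (dx, dy) per bin."""
    x_lf, y_lf, w, h = roi
    H, W = sal.shape
    out = torch.zeros(k, k)
    bin_w, bin_h = w / k, h / k
    for i in range(k):
        for j in range(k):
            dx, dy = offsets[i, j]
            # Pixel window of bin (i, j), shifted by the learned offset and
            # clamped to the image extent.
            x0 = max(0, min(W - 1, int(x_lf + j * bin_w + dx)))
            y0 = max(0, min(H - 1, int(y_lf + i * bin_h + dy)))
            x1 = min(W, int(x0 + bin_w) + 1)
            y1 = min(H, int(y0 + bin_h) + 1)
            patch = feat[i * k + j, y0:y1, x0:x1]
            mask = sal[y0:y1, x0:x1]          # S_i,j(x, y) in {0, 1}
            n = patch.numel()
            # Average of the saliency-masked responses over the n pixels in the bin.
            out[i, j] = (patch * mask).sum() / max(n, 1)
    return out

# Example: 3x3 position-sensitive maps over a 32x32 feature/saliency grid.
pooled = saliency_ps_roi_pool(torch.randn(9, 32, 32),
                              (torch.rand(32, 32) > 0.5).float(),
                              roi=(4, 4, 18, 18),
                              offsets=torch.zeros(3, 3, 2))
```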
Further, as shown in block 2 of Fig. 1, F represents the entire feature map generated by the feature extraction network, and F_ROI and F_ROD are the feature maps of the ROI and ROD, respectively. In branch 1, saliency-aware deformable PS ROI align at 8×8 resolution converts F_ROI into an aligned ROI feature map. In branch 2, the ROD is first reconstructed according to mode 4, and four separable saliency-aware component ROD aligns are performed to generate the corresponding feature maps, each followed by a 1×1 convolutional layer to reduce the channel size; all four feature maps are then concatenated into an aligned ROD feature map. On the one hand, F_ROI and the aligned ROD feature map are concatenated and fed into two fully connected layers for the final MOS prediction. On the other hand, a copy of this concatenated feature, denoted ROI_D_P4, is fed together with the aligned ROD feature map ROD_P4 into the Siamese evaluation network.
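A rough PyTorch sketch of this two-branch head is given below, with ordinary adaptive pooling standing in for the saliency-aware deformable PS ROI/ROD align operators; the channel sizes, the width of the MOS head, and the argument names roi_feat and rod_parts are illustrative assumptions.

```python
# Sketch of the two-branch head of block 2 (Fig. 1), with adaptive pooling as a
# stand-in for the saliency-aware deformable PS ROI/ROD align operators.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CroppingHead(nn.Module):
    def __init__(self, channels=8, k=3):
        super().__init__()
        # Branch 2: one 1x1 conv per separable ROD component to shrink channels.
        self.rod_reduce = nn.ModuleList(
            [nn.Conv2d(channels, channels // 4, 1) for _ in range(4)])
        # Two fully connected layers for the final MOS prediction.
        in_dim = channels * 8 * 8 + 4 * (channels // 4) * k * k
        self.fc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                nn.Linear(256, 1))

    def forward(self, roi_feat, rod_parts):
        # Branch 1: 8x8 aligned ROI feature map (stand-in for PS ROI align).
        roi_aligned = F.adaptive_avg_pool2d(roi_feat, 8)
        # Branch 2: four separable ROD components, aligned, reduced, concatenated.
        rod_aligned = torch.cat(
            [conv(F.adaptive_avg_pool2d(p, 3))
             for conv, p in zip(self.rod_reduce, rod_parts)], dim=1)
        fused = torch.cat([roi_aligned.flatten(1), rod_aligned.flatten(1)], dim=1)
        return self.fc(fused)   # predicted MOS of the candidate crop

head = CroppingHead()
mos = head(torch.randn(1, 8, 16, 16), [torch.randn(1, 8, 16, 16)] * 4)
```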
Further, the Siamese network is shown in block 3 of Fig. 1. It consists of two identical fully-connected networks that share weights when extracting features from ROI_D_P4 and ROD_P4. The Siamese network takes the aligned feature maps as input and outputs the predicted aesthetic scores. ROI_D_P4 and ROD_P4 denote the input feature maps of the ROI and ROD, respectively, and their predicted scores are denoted Φ(ROI_D_P4) and Φ(ROD_P4). The Siamese aesthetic evaluation network is trained under the following constraint:
Here, area(·) denotes the area function and γ is an area ratio, empirically set to 2/3. After the Siamese network, the ranking loss is defined for each potential pair as follows:
l_rank(ROI_D_P4, ROD_P4) = max{0, Φ(ROD_P4) - Φ(ROI_D_P4)}   (7)
Let e_ij = g_ij - p_ij, where g_ij and p_ij are the mean opinion score (MOS) and the predicted aesthetic score of the j-th crop of image i, respectively. To enhance robustness to outliers, the Huber loss is defined as follows:
the final overall loss function is:
where the balance parameter is empirically set to 1. If a saliency map is unavailable, all values of the saliency map are set to 0.
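The following sketch shows one way the Siamese scoring and the hybrid loss could be assembled, using the ranking loss of Eq. (7) together with a Huber (smooth L1) regression loss against the MOS; the network width and the exact form in which the two terms are combined are assumptions, since the overall loss equation is not reproduced here, but the balance weight of 1 follows the text.

```python
# Sketch of the Siamese evaluation branches (shared weights) and a hybrid loss
# combining the ranking term of Eq. (7) with a Huber regression term.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseScorer(nn.Module):
    """Two identical fully connected branches that share weights."""
    def __init__(self, in_dim=584):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 1))

    def forward(self, roi_d_p4, rod_p4):
        # Returns Phi(ROI_D_P4) and Phi(ROD_P4).
        return self.phi(roi_d_p4), self.phi(rod_p4)

def hybrid_loss(score_roi, score_rod, pred_mos, gt_mos, balance=1.0):
    # Ranking loss (Eq. (7)): the ROI-based feature should not score lower
    # than the ROD-based one.
    l_rank = F.relu(score_rod - score_roi).mean()
    # Huber (smooth L1) loss on e_ij = g_ij - p_ij, robust to outliers.
    l_huber = F.smooth_l1_loss(pred_mos, gt_mos)
    return l_huber + balance * l_rank
```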
Compared with the prior art, the invention has the following beneficial technical effects:
the saliency perception image clipping method based on the potential region pair fully utilizes the saliency map, considers saliency information to eliminate poor candidate clipping maps, prevents the problem of excessive fitting of a model, and integrates the saliency information into a pooling operator to help build a saliency perception sense field capable of coding content preference.
The saliency-aware image cropping method based on potential region pairs reveals the intrinsic mechanism of the cropping process and the internal relationship between potential region pairs. Specifically, four different ROD modes and various combinations of ROIs and RODs are designed for different cases, and the relative ranking order of ROIs and RODs is then learned via a ranking loss.
The saliency-aware image cropping method based on potential region pairs constructs a deep-learning-based cropping framework to generate attractive crops. The framework includes a multi-scale CNN feature extractor, a saliency-aware deformable position-sensitive ROI (ROD) alignment operator, a Siamese fully-connected network, and a hybrid loss function.
The saliency-aware image cropping method based on potential region pairs outperforms other methods on most metrics while incurring a negligible computational burden.
Drawings
Fig. 1 is a diagram of the overall network architecture of the present invention.
Fig. 2 illustrates saliency-based generation of candidate crops. The solid red boxes denote saliency regions, the dashed red boxes denote candidate crops, and the solid blue boxes denote initial crops. (a) The bounding box of the saliency region lies near the boundary of the given image. (b) The salient region covers only a small portion of the source image. (c) The salient region is used directly as the initial crop.
Fig. 3 illustrates saliency-aware deformable position-sensitive ROI pooling.
Fig. 4 illustrates the four ROD modes.
Detailed Description
GAICD _ S dataset: the GAICD dataset first captured about 50K images from the Flickr website and then manually reduced to 10K images with a good composition. For each image, 19 annotators were invited to assign aesthetic scores to the various aspect ratio crop maps using the annotation tool. Among 1,236 images, there are a total of 106,800 candidate cropping patterns. As a condensed version of GAICD, GAICD _ S contains 1,236 photographs containing 100,641 reasonable annotated cutmaps.
For all samples, the short edge is resized to 256 by bilinear interpolation, and data augmentation is performed with several conventional operators (random adjustment of contrast, saturation, brightness, and hue, and horizontal flipping).
In addition, the values of all samples are normalized to [0,1] using the mean and standard deviation computed on the ImageNet dataset. During training, a pretrained MobileNetV2 model is loaded into the feature extraction network of the present invention to mitigate overfitting. The network is trained with the Adam optimizer by minimizing the hybrid loss, with all hyper-parameters set to their default values. The initial learning rate lr is 1e-4, and the maximum number of epochs is set to 100. For saliency maps, PoolNet is used to produce satisfactory saliency bounding boxes. Furthermore, batch normalization and dropout are used in the Siamese evaluation network.
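A sketch of this training configuration in PyTorch/torchvision is shown below; the augmentation strengths and the placeholder regression head are assumptions, while the optimizer, learning rate, epoch budget, ImageNet normalization, and MobileNetV2 initialization follow the description above.

```python
# Sketch of the training setup described above (torchvision >= 0.13 assumed).
import torch
import torch.nn as nn
from torchvision import transforms
from torchvision.models import mobilenet_v2

train_tf = transforms.Compose([
    transforms.Resize(256),                       # short edge resized to 256
    transforms.ColorJitter(0.2, 0.2, 0.2, 0.05),  # brightness/contrast/saturation/hue
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],   # ImageNet mean
                         [0.229, 0.224, 0.225]),  # ImageNet std
])

# Feature extractor initialized from pretrained MobileNetV2 to mitigate overfitting;
# the regression head below is only a placeholder for the full cropping model.
backbone = mobilenet_v2(weights="IMAGENET1K_V1").features
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(1280, 1))
model = nn.Sequential(backbone, head)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
max_epoch = 100
```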
Claims (10)
1. A saliency-aware image cropping method based on potential region pairs, characterized by comprising the following steps:
Step 1) generating saliency-based candidate crop anchor boxes according to the criteria and procedures of professional photography.
Step 2) describing the features of the source image with a multi-scale, lightweight feature extraction network, and then cropping the extracted features using deformable ROI pooling and deformable ROD pooling.
Step 3) training a Siamese aesthetic evaluation network and predicting the aesthetic scores of the candidate crops by minimizing a hybrid loss function.
2. The method according to claim 1, characterized in that an initial crop is first created based on the saliency region, and candidate crops are then generated in a grid-anchor manner.
3. The method according to claim 2, characterized in that the algorithm for creating the initial crop is as follows:
inputting: the size of the image (I) is wide (W) x high (H), and the magnification is lambdalargeReduction ratio lambdasmallArea function area (·), two rectangles Re1And Re2The closest distance between the outlines of (a) and (b) Clo _ Dis (Re)1,Re2)。
Output: the initial crop region S_init_crop.
where s1 ∈ (0,1] and d1 ∈ (0,1] are the thresholds for case (b) and case (a), respectively.
4. The method according to claim 2, characterized in that candidate crops are generated in the grid-anchor manner as shown in Fig. 2:
where the input image of size W×H corresponds to an anchor grid of M×N blocks, and m1, m2, n1, n2 respectively denote the numbers of blocks from the initial crop to the boundaries of the source image, which together determine the total number of candidate crops; and a constraint is set such that a qualified crop must cover at least a certain proportion of the input image, so as to exclude candidates of unsuitable size:
area(S_crop) ≥ ρ · area(I)   (1)
where area(·) is the area function, and S_crop and S_sal denote the crop region and the saliency bounding-box region, respectively; and the aesthetic quality of the crops is improved by constraining the aspect ratio of the image:
where α1 and α2 are set to 0.5 and 2, respectively.
5. The method according to claim 1, characterized in that the features of the source image are described by a multi-scale, lightweight feature extraction network and the cropped regions are attended to by two saliency-guided alignment operators; through the feature extraction network, the source image is converted into an information-rich feature map that represents both the global context and the local context; and the feature extraction network consists of two modules: a base network and a Feature Aggregation Module (FAM).
6. The method according to claim 5, characterized in that the base network may be any effective convolutional neural network (CNN) model that captures image features while preserving a sufficient receptive field; the n-th and (n-1)-th layers are the last two layers of the base network, and skip connections between them provide a degree of global context information.
7. The method according to claim 5, characterized in that the FAM aims to compensate for the loss of global and multi-scale context during feature extraction, and is executed as follows:
Step 1: apply average pooling at different scales to generate several feature maps, each followed by a 3×3 convolutional layer.
Step 2: upsample the low-resolution feature maps directly by bilinear interpolation to the same size as the original feature map of the n-th layer.
Step 3: finally, concatenate the upsampled feature maps from the different sub-branches into the final output feature map.
8. The method according to claim 1, characterized in that saliency-aware deformable position-sensitive ROI align and ROD align are used, combining saliency information with deformable PS ROI (PS ROD) pooling and a lightweight head design to fully exploit the feature representation; saliency-aware deformable PS ROI (ROD) pooling is defined as:
where f'(i, j) and f(i, j) are the output pooled feature map and the original feature map, respectively; (x_lf, y_lf) is the top-left corner of the ROI (ROD); n is the number of pixels in a bin; (Δx, Δy) is the offset learned from a fully connected (fc) layer; S is the saliency map; and S_i,j(x, y) takes the value 0 or 1.
Further, as shown in Fig. 3, C is set to 8 to reduce the computation of the subsequent sub-network, and k is fixed to 3 in accordance with the 3×3 grid composition pattern; bilinear interpolation is used to compute the exact values in ROI (ROD) align, which resolves the rounding and misalignment issues of saliency-aware deformable PS ROI (ROD) pooling, and the resulting operator is named saliency-aware deformable PS ROI (ROD) align.
9. The method according to claim 1, characterized in that, as shown in block 2 of Fig. 1, F denotes the entire feature map generated by the feature extraction network, and F_ROI and F_ROD are the feature maps of the ROI and ROD, respectively; in branch 1, saliency-aware deformable PS ROI align at 8×8 resolution converts F_ROI into an aligned ROI feature map; in branch 2, the ROD is first reconstructed according to mode 4, four separable saliency-aware component ROD aligns are performed to generate the corresponding feature maps, each followed by a 1×1 convolutional layer to reduce the channel size, and all four feature maps are concatenated into an aligned ROD feature map; on the one hand, F_ROI and the aligned ROD feature map are concatenated and fed into the fully connected layers for the final MOS prediction; on the other hand, a copy of this concatenated feature, denoted ROI_D_P4, is fed together with the aligned ROD feature map ROD_P4 into the Siamese evaluation network.
10. The method according to claim 1, characterized in that, as shown in block 3 of Fig. 1, the Siamese network consists of two identical fully-connected networks that share weights when extracting features from ROI_D_P4 and ROD_P4; the Siamese network takes the aligned feature maps as input and outputs the predicted aesthetic scores; ROI_D_P4 and ROD_P4 denote the input feature maps of the ROI and ROD, respectively, and their predicted scores are denoted Φ(ROI_D_P4) and Φ(ROD_P4); and the Siamese aesthetic evaluation network is trained under the following constraint:
here, area(·) denotes the area function and γ is an area ratio, empirically set to 2/3; after the Siamese network, the ranking loss is defined for each potential pair as follows:
l_rank(ROI_D_P4, ROD_P4) = max{0, Φ(ROD_P4) - Φ(ROI_D_P4)}   (7)
let e_ij = g_ij - p_ij, where g_ij and p_ij are the mean opinion score (MOS) and the predicted aesthetic score of the j-th crop of image i, respectively; to enhance robustness to outliers, the Huber loss is defined as follows:
the final overall loss function is:
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010538411.1A CN112381083A (en) | 2020-06-12 | 2020-06-12 | Saliency perception image clipping method based on potential region pair |
| CN202110400578.6A CN113159028B (en) | 2020-06-12 | 2021-04-14 | Saliency-aware image cropping method and apparatus, computing device, and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010538411.1A CN112381083A (en) | 2020-06-12 | 2020-06-12 | Saliency perception image clipping method based on potential region pair |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN112381083A true CN112381083A (en) | 2021-02-19 |
Family
ID=74586331
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010538411.1A Withdrawn CN112381083A (en) | 2020-06-12 | 2020-06-12 | Saliency perception image clipping method based on potential region pair |
| CN202110400578.6A Active CN113159028B (en) | 2020-06-12 | 2021-04-14 | Saliency-aware image cropping method and apparatus, computing device, and storage medium |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110400578.6A Active CN113159028B (en) | 2020-06-12 | 2021-04-14 | Saliency-aware image cropping method and apparatus, computing device, and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (2) | CN112381083A (en) |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113763391B (en) * | 2021-09-24 | 2024-03-19 | 华中科技大学 | Intelligent image cutting method and system based on visual element relation |
| CN115115941B (en) * | 2021-11-09 | 2023-04-18 | 腾晖科技建筑智能(深圳)有限公司 | Laser radar point cloud map rod-shaped target extraction method based on template matching |
| CN116168207A (en) * | 2021-11-24 | 2023-05-26 | 北京字节跳动网络技术有限公司 | Image clipping method, model training method, device, electronic equipment and medium |
| CN114119373A (en) * | 2021-11-29 | 2022-03-01 | 广东维沃软件技术有限公司 | Image cropping method and device and electronic equipment |
| CN114529558B (en) * | 2022-02-09 | 2025-07-11 | 维沃移动通信有限公司 | Image processing method, device, electronic device and readable storage medium |
| CN117152409B (en) * | 2023-08-07 | 2024-09-27 | 中移互联网有限公司 | Image clipping method, device and equipment based on multi-mode perception modeling |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8311364B2 (en) * | 2009-09-25 | 2012-11-13 | Eastman Kodak Company | Estimating aesthetic quality of digital images |
| US10002415B2 (en) * | 2016-04-12 | 2018-06-19 | Adobe Systems Incorporated | Utilizing deep learning for rating aesthetics of digital images |
| WO2020034663A1 (en) * | 2018-08-13 | 2020-02-20 | The Hong Kong Polytechnic University | Grid-based image cropping |
| CN109544524B (en) * | 2018-11-15 | 2023-05-23 | 中共中央办公厅电子科技学院 | Attention mechanism-based multi-attribute image aesthetic evaluation system |
| CN110084284A (en) * | 2019-04-04 | 2019-08-02 | 苏州千视通视觉科技股份有限公司 | Target detection and secondary classification algorithm and device based on region convolutional neural networks |
- 2020-06-12: application CN202010538411.1A filed in CN, published as CN112381083A (status: withdrawn)
- 2021-04-14: application CN202110400578.6A filed in CN, published as CN113159028B (status: active)
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113222904A (en) * | 2021-04-21 | 2021-08-06 | 重庆邮电大学 | Concrete pavement crack detection method for improving PoolNet network structure |
| WO2022256020A1 (en) * | 2021-06-04 | 2022-12-08 | Hewlett-Packard Development Company, L.P. | Image re-composition |
| CN113724261A (en) * | 2021-08-11 | 2021-11-30 | 电子科技大学 | Fast image composition method based on convolutional neural network |
| CN113642710A (en) * | 2021-08-16 | 2021-11-12 | 北京百度网讯科技有限公司 | Network model quantification method, device, equipment and storage medium |
| CN113642710B (en) * | 2021-08-16 | 2023-10-31 | 北京百度网讯科技有限公司 | A quantification method, device, equipment and storage medium for a network model |
| CN113706546A (en) * | 2021-08-23 | 2021-11-26 | 浙江工业大学 | Medical image segmentation method and device based on lightweight twin network |
| CN113706546B (en) * | 2021-08-23 | 2024-03-19 | 浙江工业大学 | Medical image segmentation method and device based on lightweight twin network |
| CN114025099A (en) * | 2021-11-25 | 2022-02-08 | 努比亚技术有限公司 | A method, device and computer-readable storage medium for controlling composition of photographed images |
Also Published As
| Publication number | Publication date |
|---|---|
| CN113159028B (en) | 2022-04-05 |
| CN113159028A (en) | 2021-07-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WW01 | Invention patent application withdrawn after publication | Application publication date: 20210219 |