Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides an image searching method, apparatus, electronic device, and storage medium.
According to a first aspect of embodiments of the present disclosure, there is provided an image search method, the method including:
receiving a search request;
when the search request comprises a first image tag, acquiring a first image in the electronic device that matches the first image tag, and acquiring a target object in the first image that corresponds to the first image tag;
displaying the first image and an identification of the target object;
wherein each image stored in the electronic device corresponds to at least one image tag, and the first image tag represents a tag of an object contained in the image.
In an exemplary embodiment, the acquiring the target object corresponding to the first image tag in the first image includes:
acquiring the target object corresponding to the first image tag in the first image based on a preset image segmentation algorithm.
In an exemplary embodiment, the method further comprises:
when the search request comprises a second image tag, acquiring a second image in the electronic device that matches the second image tag, and displaying the second image, wherein the second image tag represents a tag of image information.
In an exemplary embodiment, the method further comprises:
when the electronic device stores a target image, generating at least one initial image tag of the target image, and correspondingly storing the target image and the at least one initial image tag in a database of the electronic device.
In an exemplary embodiment, the initial image tag includes the first image tag and/or the second image tag, and the generating at least one initial image tag of the target image includes:
generating at least one first image tag of the target image based on a preset image segmentation algorithm and a preset object recognition algorithm; and
generating at least one second image tag of the target image based on a preset classification algorithm and/or shooting information of the target image.
In an exemplary embodiment, when the initial image tag includes the second image tag, the storing the target image and the at least one image tag in the electronic device includes:
correspondingly storing the target image and the at least one second image tag in the electronic device according to a weight order of each second image tag.
In an exemplary embodiment, the method further comprises:
displaying the initial image tag while the target image is being viewed;
determining a target image tag based on user input, and displaying the target image tag; and
modifying the initial image tag in the database to the target image tag, and saving a modification record.
In an exemplary embodiment, the preset image segmentation algorithm is a SAM model.
According to a second aspect of embodiments of the present disclosure, there is provided an image search apparatus, the apparatus including:
a receiving module configured to receive a search request;
a processing module configured to, when the search request comprises a first image tag, acquire a first image in the electronic device that matches the first image tag, and acquire a target object in the first image that corresponds to the first image tag;
a display module configured to display the first image and an identification of the target object;
each image stored in the electronic device corresponds to at least one image tag, and the first image tag represents a tag of an object contained in the image.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device, comprising:
A processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method as described in the first aspect of the disclosed embodiments.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium having stored thereon instructions that, when executed by a processor of an electronic device, cause the electronic device to perform the method as described in the first aspect of embodiments of the present disclosure.
By identifying the target object in the first image, the method helps the user quickly screen, from the search results, the specific image to be found. When a large number of images correspond to the same tag, this reduces the difficulty of finding the specific image, saves the time the user spends searching, and improves the user's image searching experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
In the related art, in order to improve user experience, labels are added to the images in a gallery in three ways, and the labels are stored in the gallery in correspondence with the images, so that the images in the gallery can be classified and sorted based on the labels and a specific image can be searched for:
In the first way, when the user opens the photographing application of the electronic device, a label mode is enabled in advance; after the user captures an image, an input area is presented on the display interface, the user inputs a label, and the label is added to the image according to the input. However, when the user photographs frequently, this affects the photographing experience and makes rapid continuous shooting impossible.
In the second way, an operation control for adding labels is placed on the photographing interface; whether the control is triggered is checked, and when it is triggered, the label description corresponding to the control is added to the currently captured image. However, the control occupies a certain area of the screen on the photographing interface, which directly affects the photographing experience, and the control is prone to accidental operation.
In the third way, a label editing function is selected in the gallery, the user selects images in batches by sliding selection and inputs a label, and the label is added to the batch of images according to the input. However, when the number of pictures is large, setting the labels still takes a lot of time.
All three of these ways of adding labels are manual, so the added labels match user expectations, and specific images can be found accurately when searching by label, but the label adding process itself gives a poor experience. Labels can also be added to images automatically based on an image recognition algorithm; when a specific image is then searched for through a label, the images corresponding to that label are displayed. However, when many images correspond to the same label, the user still needs to spend time finding the specific image; furthermore, automatically added labels may deviate from the labels the user expects, and if the search results do not contain the specific image, the user still has to search for it manually.
In order to solve the problem in the related art that searching for a specific image is difficult when a large number of images correspond to the same image tag, an image searching method is provided. The method includes: receiving a search request; when the search request comprises a first image tag, acquiring a first image in the electronic device that matches the first image tag, and acquiring a target object in the first image that corresponds to the first image tag; and displaying the first image and an identification of the target object, wherein each image stored in the electronic device corresponds to at least one image tag, and the first image tag represents a tag of an object contained in the image. By identifying the target object in the first image, the method helps the user quickly screen out, from the search results, the specific image to be found; when a large number of images correspond to the same tag, this reduces the difficulty of finding the specific image, saves the user's search time, and improves the user's image searching experience.
In an exemplary embodiment of the present disclosure, an image searching method is provided. Fig. 1 is a flowchart illustrating an image searching method according to an exemplary embodiment; as shown in Fig. 1, the method includes the following steps:
Step S101, receiving a search request;
Step S102, when the search request comprises a first image tag, acquiring a first image in the electronic device that matches the first image tag, and acquiring a target object in the first image that corresponds to the first image tag;
Step S103, displaying the first image and an identification of the target object;
each image stored in the electronic device corresponds to at least one image tag, and the first image tag represents a tag of an object contained in the image.
The image searching method in the embodiments of the present disclosure is applied to an electronic device having an image storage function, such as a smartphone, a tablet, a smart wearable device, a smart home device, or a smart in-vehicle device. The method is applied to an application that stores images in the electronic device; for example, the electronic device stores images through a gallery application, and when images are searched for in the gallery, the image searching method in this embodiment is used as the default image searching method.
In step S101, a search interface is displayed in the application that stores images. The search interface may be triggered from a search box on the home page, or may be presented as a separate tab. Fig. 2A is a schematic diagram of a search interface according to an exemplary embodiment. As shown in the two diagrams on the left of Fig. 2A, a search box is displayed on the application home page; tapping the search box displays the search interface, in which some images sorted by tag classification are shown, and the search interface may be displayed together with the image display interface. Alternatively, as shown in the two diagrams on the right, the search interface is displayed in a "Search" tab side by side with the home page, and includes the search box and some images sorted by tag classification. The search request carries the search keyword input by the user: the user enters the keyword to be searched in the search box of the search interface and taps the search button, and the electronic device thereby receives the search request.
In step S102, each image stored in the electronic device corresponds to at least one image tag; the first image tag represents a tag of an object contained in an image, the first image represents an image labeled with the first image tag contained in the search request, and the target object represents the object represented by the first image tag contained in the search request. The electronic device stores a plurality of first image tags and at least one first image corresponding to each first image tag; when the first image tag included in the search request is determined, the first image corresponding to that tag is matched from this mapping relationship, and the target object corresponding to the first image tag in the first image is determined at the same time.
In an example, the gallery includes three images: image 1 contains a tree, so the first image tag of image 1 is "tree"; image 2 and image 3 contain scissors, so the first image tag of image 2 and image 3 is "scissors". When the search keyword corresponding to the search request is "kitchen scissors", the first image tag included in the search request is determined to be "scissors", image 2 and image 3 are matched in the gallery as the first images, and the regions representing the scissors in the first images are acquired at the same time.
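The tag-to-image mapping described above can be sketched as follows. This is a minimal illustration only; the in-memory index, function names, and image identifiers are assumptions rather than part of the disclosed method (in practice the mapping would live in the gallery application's database).

```python
# Minimal sketch (illustrative assumption): an in-memory mapping from first
# image tags to stored images, queried when a search request arrives.
from collections import defaultdict

# tag -> list of image identifiers
tag_index: dict[str, list[str]] = defaultdict(list)

def store_image(image_id: str, first_image_tags: list[str]) -> None:
    """Store an image together with its first image tags."""
    for tag in first_image_tags:
        tag_index[tag].append(image_id)

def search_by_first_tag(keyword: str) -> list[str]:
    """Return the first images whose tag appears in the search keyword."""
    return [img for tag, images in tag_index.items()
            if tag in keyword            # e.g. "scissors" in "kitchen scissors"
            for img in images]

# Reproducing the example above:
store_image("image_1", ["tree"])
store_image("image_2", ["scissors"])
store_image("image_3", ["scissors"])
print(search_by_first_tag("kitchen scissors"))  # ['image_2', 'image_3']
```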
In step S103, the first image and the identification of the target object in the first image are displayed below the search box of the search interface, and other tags of the first image may also be displayed for further screening. The identification of the target object may be displayed in any manner. Fig. 2B is a schematic diagram of a search result according to an exemplary embodiment; as shown in Fig. 2B, the first images are displayed, and the target object in each first image is marked with a dashed box.
In the exemplary embodiments of the present disclosure, when a user searches for an object contained in an image, that is, when the search request comprises a first image tag, the first image in the electronic device that matches the first image tag is acquired, the target object corresponding to the first image tag in the first image is acquired at the same time, and the first image and the identification of the target object are displayed in the search results.
In an exemplary embodiment of the present disclosure, an image searching method is provided. Fig. 3 is a flowchart illustrating an image searching method according to an exemplary embodiment; as shown in Fig. 3, the method includes the following steps:
Step S301, when the electronic device stores a target image, generating at least one initial image tag of the target image, correspondingly storing the target image and the at least one initial image tag in a database of the electronic device, wherein the initial image tag comprises a first image tag and/or a second image tag;
Step S302, receiving a search request;
Step S303, when the search request comprises a first image tag, acquiring a first image matched with the first image tag in the electronic equipment, and acquiring a target object corresponding to the first image tag in the first image based on a preset image segmentation algorithm;
Step S304, displaying the first image and the identification of the target object;
Step S305, when the search request comprises a second image tag, acquiring a second image matched with the second image tag in the electronic equipment;
Step S306, the second image is displayed.
Each image stored in the electronic equipment corresponds to at least one image tag, the first image tag represents a tag of an object contained in the image, and the second image tag represents a tag of image information.
In step S301, the target image may be any image: an image captured with the camera application of the electronic device, an image captured with another application, or an image saved by any application. When the target image is stored in the electronic device, at least one initial image tag of the target image is generated, where the initial image tag may be a first image tag, a second image tag, or both. The first image tag represents a tag of an object in the image, and the second image tag represents a tag of image information; for example, when the target image is obtained by photographing, the second image tag includes shooting information such as the shooting time, the shooting place, and the holiday on which the image was shot, and may also include image information such as document, landscape, food, and sports. The number of initial image tags may be determined according to the actual situation and is not limited by the present disclosure.
In some possible embodiments, at least one first image tag of the target image is generated based on a preset image segmentation algorithm and a preset object recognition algorithm, and at least one second image tag of the target image is generated based on a preset classification algorithm and/or photographing information of the target image.
The preset image segmentation algorithm is a SAM (Segment Anything Model) model. The model can segment any object in any image, even objects and image types not encountered during training, so it covers a wide range of use cases, is applicable to more images, requires no additional training, and can improve the accuracy of image segmentation results. The network structure of the SAM model adopts an encoder-decoder structure: the encoder is composed of several convolution layers and pooling layers for extracting features from an image, and the decoder is composed of several deconvolution layers and up-sampling layers for generating the segmentation result of the image. Fig. 4 is a schematic diagram of a SAM model according to an exemplary embodiment. As shown in Fig. 4, the encoder of the SAM model includes an image encoder and a prompt encoder, and the decoder includes a mask decoder. When an image containing scissors is input, the image encoder embeds the image, mapping it to an image feature space and extracting image embeddings; the prompt encoder embeds the prompt (which may include points, boxes, and text), mapping it to a prompt feature space and extracting prompt embeddings; the mask decoder then fuses the image embeddings and the prompt embeddings, and obtains all possible segmentation results and a confidence score for each segmentation result from the fused features, the segmentation result with the highest confidence score being taken as the final segmentation result.
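As a purely illustrative sketch, the publicly released segment-anything package can be used as follows to obtain candidate masks and their confidence scores. Note that the released library exposes point and box prompts and automatic mask generation rather than the text prompts described above, so automatic mask generation is used here; the model variant and checkpoint path are assumptions.

```python
# Sketch using the open-source segment-anything package (assumption: the
# "vit_b" checkpoint has been downloaded to the path below).
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # one dict per candidate segment

# Keep the segment with the highest predicted confidence as the final result,
# mirroring the confidence-score selection described above.
best = max(masks, key=lambda m: m["predicted_iou"])
segmentation = best["segmentation"]     # boolean HxW mask
bbox = best["bbox"]                     # [x, y, w, h] of the segment
```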
The preset object recognition algorithm may be any object recognition algorithm, such as a CNN model like VGG16 or AlexNet. After the target image is segmented, the object in each segmentation result is recognized based on the preset object recognition algorithm, and the recognition result is used as a first image tag of the target image. In addition, after the object contained in the target image is known, the similarity between the first image tag of the target image and other first image tags stored in advance may be calculated according to a similarity principle, and a tag with higher similarity may be used as a tag of the target image.
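For illustration, a pre-trained classifier from torchvision (torchvision 0.13 or newer) can stand in for the preset object recognition algorithm; the crop path and the use of ImageNet class names as tags are assumptions of this sketch, not requirements of the method.

```python
# Sketch (illustrative): classify a segmented crop with a pre-trained VGG16
# and use the predicted class name as a first image tag.
import torch
from torchvision import models
from torchvision.models import VGG16_Weights
from PIL import Image

weights = VGG16_Weights.IMAGENET1K_V1
model = models.vgg16(weights=weights).eval()
preprocess = weights.transforms()

def first_image_tag(crop: Image.Image) -> str:
    """Return the predicted ImageNet class name for a segmented image region."""
    batch = preprocess(crop).unsqueeze(0)          # 1 x 3 x 224 x 224 tensor
    with torch.no_grad():
        logits = model(batch)
    class_idx = int(logits.argmax(dim=1))
    return weights.meta["categories"][class_idx]   # class name used as the tag

# Hypothetical usage:
# tag = first_image_tag(Image.open("segment_crop.png").convert("RGB"))
```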
The preset classification algorithm may be any classification algorithm, such as a clustering algorithm, which divides a plurality of images into different classes or clusters according to a specific criterion, so that the similarity of data within the same class is as large as possible and the difference between data objects in different classes is as large as possible. For example, the clustering algorithm is a K-Means clustering algorithm, which classifies the target image together with the stored images and generates a corresponding label for each class based on the specific features of that class; for example, a first image tag corresponding to an object is generated for images containing the same object, or second image tags such as landscape, document, party, and holiday are generated.
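A minimal K-Means sketch is shown below; the toy feature vectors and the cluster count are illustrative assumptions. In practice the features would come from an image encoder, and each cluster would be named from the images it contains.

```python
# Sketch (illustrative): group stored images with K-Means on feature vectors
# and reuse one representative label per cluster as a second image tag.
import numpy as np
from sklearn.cluster import KMeans

def cluster_labels(features: np.ndarray, n_clusters: int = 4) -> np.ndarray:
    """features: one row of image features per image (e.g. CNN embeddings)."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return kmeans.fit_predict(features)

# Toy 2-D features standing in for real image embeddings:
features = np.array([[0.10, 0.20], [0.12, 0.18], [0.90, 0.80], [0.88, 0.82]])
print(cluster_labels(features, n_clusters=2))  # e.g. [0 0 1 1]
```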
In addition, when the target image is an image obtained by photographing, a second image tag is generated from the shooting information of the image, for example from the shooting time or the shooting place: a "Beijing" tag may be generated for an image shot in Beijing, and a "National Day" tag may be set for images shot during the National Day holiday.
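A hedged sketch of deriving such a shooting-time tag from EXIF data follows; the EXIF tag ID used (306, DateTime), the date format, and the holiday lookup table are assumptions of this illustration.

```python
# Sketch (illustrative assumption): derive a shooting-time second image tag
# from EXIF data; the holiday table is a hypothetical placeholder.
from datetime import datetime
from PIL import Image

HOLIDAYS = {(10, 1): "National Day"}  # (month, day) -> holiday tag (assumption)

def time_based_tag(path: str) -> str | None:
    exif = Image.open(path).getexif()
    raw = exif.get(306)                  # 306 = DateTime, "YYYY:MM:DD HH:MM:SS"
    if not raw:
        return None
    taken = datetime.strptime(raw, "%Y:%m:%d %H:%M:%S")
    return HOLIDAYS.get((taken.month, taken.day),
                        f"{taken.year}-{taken.month:02d}")

# print(time_based_tag("photo.jpg"))  # e.g. "National Day" or "2023-10"
```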
The target image and the at least one initial image tag are correspondingly stored in a database of the electronic device, so that a matching image can be searched for in the database according to the tag information included in a search request.
In some possible embodiments, the target image and the at least one second image tag are stored in the electronic device according to a weight order of each second image tag. The weight order of the second image tags is preset. For example, the tag with the largest proportion among the images classified into the same class as the target image is denoted as tag A, the shooting location of the target image is denoted as tag B, and the holiday information corresponding to the shooting date of the target image is denoted as tag C; if the preset weight rule is C > B > A, the target image and the at least one second image tag are stored in the database in the order C, B, A.
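The weight-ordered storage can be sketched as follows; the SQLite schema, the A/B/C weight mapping, and the column names are hypothetical and serve only to illustrate storing second image tags in a preset weight order.

```python
# Sketch (illustrative): store the target image together with its second image
# tags ordered by a preset weight rule (here C > B > A); schema is an assumption.
import sqlite3

WEIGHT = {"C": 3, "B": 2, "A": 1}  # preset weight rule: C > B > A

def store_with_weighted_tags(db: sqlite3.Connection,
                             image_path: str,
                             second_tags: dict[str, str]) -> None:
    """second_tags maps a weight class ('A'/'B'/'C') to the tag text."""
    db.execute("CREATE TABLE IF NOT EXISTS image_tags "
               "(image TEXT, tag TEXT, rank INTEGER)")
    ordered = sorted(second_tags.items(),
                     key=lambda kv: WEIGHT[kv[0]], reverse=True)
    for rank, (_, tag) in enumerate(ordered):
        db.execute("INSERT INTO image_tags VALUES (?, ?, ?)",
                   (image_path, tag, rank))
    db.commit()

db = sqlite3.connect(":memory:")
store_with_weighted_tags(db, "IMG_001.jpg",
                         {"A": "party", "B": "Beijing", "C": "National Day"})
# Stored order: National Day (C), Beijing (B), party (A)
```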
In some possible implementations, the initial image tag is displayed while the target image is viewed, a target image tag is determined based on user input and displayed, the initial image tag in the database is modified to the target image tag, and a modification record is saved.
When the user views the target image, the initial image tag is displayed at the same time as the target image is displayed in full-screen view, for example above or below the image. If the user wants to modify the tag, the user taps the initial image tag to enter a tag editing page, where the initial image tag can be deleted and changed to another image tag; the modified image tag may be an image tag already existing in the database or a newly added image tag, and the modified image tag is the target image tag. After the modification is completed, the image tag in the database is modified accordingly: the initial image tag in the database is changed to the target image tag, and a modification record is saved at the same time.
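A sketch of the tag modification and modification record is given below, reusing the hypothetical image_tags table from the previous sketch; the tag_history schema is likewise an assumption.

```python
# Sketch (illustrative assumption): replace an initial image tag with the
# user-chosen target image tag and keep a modification record.
import sqlite3
from datetime import datetime, timezone

def modify_tag(db: sqlite3.Connection, image: str,
               old_tag: str, new_tag: str) -> None:
    # Tables match the hypothetical schema used in the previous sketch.
    db.execute("CREATE TABLE IF NOT EXISTS image_tags "
               "(image TEXT, tag TEXT, rank INTEGER)")
    db.execute("CREATE TABLE IF NOT EXISTS tag_history "
               "(image TEXT, old_tag TEXT, new_tag TEXT, modified_at TEXT)")
    db.execute("UPDATE image_tags SET tag = ? WHERE image = ? AND tag = ?",
               (new_tag, image, old_tag))
    db.execute("INSERT INTO tag_history VALUES (?, ?, ?, ?)",
               (image, old_tag, new_tag,
                datetime.now(timezone.utc).isoformat()))
    db.commit()

# modify_tag(db, "IMG_001.jpg", "scissors", "kitchen scissors")
```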
In an example, Fig. 5 is a schematic diagram of an image tag generation flow according to an exemplary embodiment. As shown in Fig. 5, after the target image is obtained by shooting, the target image is segmented based on the SAM image segmentation model, feature extraction and object recognition are performed on the segmented image to generate the first image tags, second image tags are generated based on a clustering algorithm, the second image tags are sorted according to the preset weight rule of each second image tag, and the target image, the sorted second image tags, and the first image tags are correspondingly stored in the database.
In step S302, upon receiving the search request, tag information included in the search request is determined.
In step S303, when the search request includes the first image tag, the first image matching the first image tag in the electronic device is obtained from the database, and the first image and the first image tag are input into the preset image segmentation algorithm. The preset image segmentation algorithm is a SAM model, which uses the first image tag as the prompt word, so that the segmentation result of the first image, that is, the target object corresponding to the first image tag, is segmented from the first image.
In step S304, the segmented region is marked in the first image as the identification of the target object, and the first image and the identification of the target object are displayed in the search results.
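Putting steps S302 to S304 together, the search-time flow might look like the following sketch; the tag index and the segment() callable (for example, a wrapper around a SAM-based segmenter) are assumed to exist and are not part of the disclosed method itself.

```python
# Sketch (illustrative): search-time flow of steps S302-S304, assuming a tag
# index and a segment() callable are available.
from typing import Callable

def handle_search(keyword: str,
                  tag_index: dict[str, list[str]],
                  segment: Callable[[str, str], tuple[int, int, int, int]]):
    """Return (image_id, bounding box of the target object) pairs for display."""
    results = []
    for tag, image_ids in tag_index.items():
        if tag not in keyword:             # e.g. "scissors" in "kitchen scissors"
            continue
        for image_id in image_ids:
            box = segment(image_id, tag)   # target object region for this tag
            results.append((image_id, box))
    return results
```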
In steps S305 and S306, when the search request includes the second image tag, the second image corresponding to the second image tag is retrieved from the database and displayed in the search results; other image tags of the second image may also be displayed at the same time, so that the user can further filter the search results according to those tags.
In this embodiment, image tags are generated through the SAM model, the preset recognition algorithm, the preset classification algorithm, and the shooting information, so frequent user operations are not required, manually setting tags for a large number of pictures is avoided, the photographing interface and photographing process are not affected in any way, and the classification and organization of images can be improved.
In an exemplary embodiment of the present disclosure, an image search apparatus is provided. Fig. 6 is a block diagram of an image search apparatus according to an exemplary embodiment; as shown in Fig. 6, the apparatus includes:
a receiving module 601 configured to receive a search request;
A processing module 602, configured to obtain a first image in the electronic device that matches the first image tag when the search request includes the first image tag, and obtain a target object in the first image that corresponds to the first image tag;
a display module 603 configured to display the first image and an identification of the target object;
each image stored in the electronic device corresponds to at least one image tag, and the first image tag represents a tag of an object contained in the image.
In an exemplary embodiment, the processing module 602 is further configured to:
acquire the target object corresponding to the first image tag in the first image based on a preset image segmentation algorithm.
In an exemplary embodiment, the processing module 602 is further configured to:
when the search request comprises a second image tag, acquire a second image in the electronic device that matches the second image tag, wherein the second image tag represents a tag of image information;
The display module 603 is further configured to display the second image.
In an exemplary embodiment, the processing module 602 is further configured to:
when the electronic device stores a target image, generate at least one initial image tag of the target image, and correspondingly store the target image and the at least one initial image tag in a database of the electronic device.
In an exemplary embodiment, the initial image tag comprises a first image tag and/or a second image tag, and the processing module 602 is further configured to:
generate at least one first image tag of the target image based on a preset image segmentation algorithm and a preset object recognition algorithm; and
generate at least one second image tag of the target image based on a preset classification algorithm and/or shooting information of the target image.
In an exemplary embodiment, when the initial image tag includes a second image tag, the processing module 602 is further configured to:
correspondingly store the target image and the at least one second image tag in the electronic device according to the weight order of each second image tag.
In an exemplary embodiment, the processing module 602 is further configured to:
display the initial image tag while the target image is being viewed;
determine a target image tag based on user input, and display the target image tag; and
modify the initial image tag in the database to the target image tag, and save a modification record.
In an exemplary embodiment, the preset image segmentation algorithm is a SAM model.
The specific manner in which the various modules perform operations in the apparatus of the above embodiments has been described in detail in the embodiments of the method, and will not be elaborated here.
Fig. 7 is a block diagram of an electronic device 700, according to an example embodiment.
Referring to FIG. 7, an electronic device 700 can include one or more of a processing component 702, a memory 704, a power component 706, a multimedia component 708, an audio component 710, an input/output (I/O) interface 712, a sensor component 714, and a communication component 716.
The processing component 702 generally controls overall operation of the electronic device 700, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 702 may include one or more processors 720 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 702 can include one or more modules that facilitate interaction between the processing component 702 and other components. For example, the processing component 702 may include a multimedia module to facilitate interaction between the multimedia component 708 and the processing component 702.
The memory 704 is configured to store various types of data to support operations at the electronic device 700. Examples of such data include instructions for any application or method operating on the electronic device 700, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 704 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 706 provides power to the various components of the electronic device 700. Power supply components 706 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for electronic device 700.
The multimedia component 708 includes a screen between the electronic device 700 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 708 includes a front-facing camera and/or a rear-facing camera. When the electronic device 700 is in an operational mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 710 is configured to output and/or input audio signals. For example, the audio component 710 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 700 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 704 or transmitted via the communication component 716. In some embodiments, the audio component 710 further includes a speaker for outputting audio signals.
The I/O interface 712 provides an interface between the processing component 702 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to, a home button, a volume button, an activate button, and a lock button.
The sensor assembly 714 includes one or more sensors for providing status assessment of various aspects of the electronic device 700. For example, the sensor assembly 714 may detect an on/off state of the electronic device 700, a relative positioning of the components, such as a display and keypad of the electronic device 700, a change in position of the electronic device 700 or a component of the electronic device 700, the presence or absence of a user's contact with the electronic device 700, an orientation or acceleration/deceleration of the electronic device 700, and a change in temperature of the electronic device 700. The sensor assembly 714 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 714 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 714 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 716 is configured to facilitate communication between the electronic device 700 and other devices, either wired or wireless. The electronic device 700 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 716 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 716 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 704, including instructions executable by processor 720 of electronic device 700 to perform the above-described method. For example, the non-transitory computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
A non-transitory computer-readable storage medium has stored thereon instructions that, when executed by a processor of an electronic device, cause the electronic device to perform an image search method, the method including any of the methods described above.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.