
CN115510336A - Information processing method, information processing device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115510336A
CN115510336A (application CN202110700524.1A)
Authority
CN
China
Prior art keywords
information
entity
vehicle
target
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110700524.1A
Other languages
Chinese (zh)
Inventor
方建伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pateo Network Technology Service Co Ltd
Original Assignee
Shanghai Pateo Network Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pateo Network Technology Service Co Ltd filed Critical Shanghai Pateo Network Technology Service Co Ltd
Priority to CN202110700524.1A priority Critical patent/CN115510336A/en
Publication of CN115510336A publication Critical patent/CN115510336A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/953 Querying, e.g. by the use of web search engines
    • G06F 16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3343 Query execution using phonetics
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 Named entity recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Acoustics & Sound (AREA)
  • Remote Sensing (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to an information processing method, an information processing apparatus, an electronic device, and a storage medium. The method includes: acquiring entities in the environment surrounding a vehicle; acquiring voice information inside the vehicle; extracting key information from the voice information, where the key information describes a target entity; and, if the entities in the surrounding environment include the target entity, generating interaction information for the target entity. Thus, once the user's voice information is detected to be associated with the vehicle's surroundings, that is, the point of interest being discussed inside the vehicle relates to a target entity in those surroundings, interaction information for the target entity can be generated in time to provide further services to the user.

Description

Information processing method, information processing device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of information, and in particular, to an information processing method and apparatus, an electronic device, and a storage medium.
Background
With continuous technological development, vehicles have become widespread and increasingly intelligent. For example, during driving a user can enter a destination into the vehicle-mounted navigation system, which plans a driving route for the user and can even continuously optimize that route according to road conditions.
However, in existing approaches, most related intelligent devices, including in-vehicle navigation systems, provide only simple services such as planning driving routes and cannot provide personalized interactive information for users. Existing approaches are therefore limited in function and cannot fully meet users' needs.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides an information processing method, apparatus, electronic device, and storage medium.
According to a first aspect of the embodiments of the present disclosure, there is provided an information processing method including:
acquiring entities in the vehicle surroundings;
acquiring voice information in the vehicle;
extracting key information from the voice information, wherein the key information is used for describing a target entity;
and if the entities in the environment surrounding the vehicle comprise the target entity, generating interaction information of the target entity.
Optionally, the acquiring an entity in the vehicle surroundings includes:
acquiring multimedia information in the vehicle driving process in real time through a camera device, wherein the multimedia information comprises images and/or videos;
performing real-scene recognition on the multimedia information through a preset recognition algorithm to obtain a first entity recognition result;
acquiring a preset entity contained in the current position area of the vehicle in preset map data;
determining an entity in the vehicle surroundings based on the first entity recognition result and the preset entity.
Optionally, the method further comprises:
if the entities in the vehicle surrounding environment do not comprise the target entity, acquiring historical multimedia information of the vehicle;
performing live-action recognition on the historical multimedia information through a preset recognition algorithm to obtain a second entity recognition result;
determining an entity in the vehicle surroundings based on the second entity recognition result and the preset entity.
Optionally, the method further comprises:
detecting whether the voice information contains question information or not;
and if the voice information contains question information, executing the step of extracting key information from the voice information.
Optionally, the detecting whether the voice information includes question information includes:
converting the voice information into text information, and performing word segmentation on the text information to obtain a plurality of word segments;
judging whether the plurality of word segments contain a preset question word;
and if the plurality of word segments contain a preset question word, determining that the voice information contains question information.
Optionally, the extracting key information from the voice information includes:
acquiring attribute information of each of the plurality of word segments;
and taking, as the key information in the question information, a target word segment whose attribute information is a preset attribute, wherein the preset attribute comprises entity nouns.
Optionally, the target entity is a target location, and the generating of the interaction information of the target entity includes:
acquiring the current position of the vehicle;
inputting the target location into a preset map, and acquiring a target position of the target location;
and generating route information for driving from the current position to the target location based on the current position and the target position, and taking the route information as the interaction information.
Optionally, the target entity is scene information, and the generating of the interaction information of the target entity includes:
obtaining comment information of the scene information in the voice information;
and collecting the scene information, and associating the comment information with the scene information.
Optionally, the method further comprises:
extracting target voiceprint characteristics in the voice information;
and determining a target account corresponding to the target voiceprint characteristics based on a pre-established relationship between voiceprint characteristics and accounts, and associating the interaction information of the target entity with the target account.
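The voiceprint-to-account association described above can be sketched as a nearest-match lookup over pre-registered voiceprint embeddings. This is only a minimal illustration: the account names, embedding vectors, and similarity threshold below are all hypothetical, and a real system would derive embeddings from a speaker-verification model.

```python
import math

# Hypothetical pre-registered voiceprint embeddings mapped to account IDs.
REGISTERED = {
    "alice": [0.9, 0.1, 0.3],
    "bob":   [0.2, 0.8, 0.5],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_account(target_voiceprint, threshold=0.8):
    """Return the account whose registered voiceprint best matches, or None."""
    best_account, best_score = None, threshold
    for account, vp in REGISTERED.items():
        score = cosine(target_voiceprint, vp)
        if score > best_score:
            best_account, best_score = account, score
    return best_account

def associate(interaction_info, target_voiceprint):
    """Attach the generated interaction information to the matched account."""
    return {"account": match_account(target_voiceprint),
            "interaction": interaction_info}
```

An unrecognized voiceprint yields no account, so the interaction information is simply left unassociated rather than attached to the wrong user.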
According to a second aspect of the embodiments of the present disclosure, there is provided an information processing apparatus including:
an entity acquisition module for acquiring entities in the vehicle surroundings;
the voice information acquisition module is used for acquiring the voice information in the vehicle;
the information extraction module is used for extracting key information from the voice information, and the key information is used for describing a target entity;
and the interactive information generation module is used for generating the interactive information of the target entity if the entities in the surrounding environment of the vehicle comprise the target entity.
Optionally, the entity obtaining module includes:
the multimedia information acquisition unit is used for acquiring multimedia information in the driving process of the vehicle in real time through camera equipment, and the multimedia information comprises images and/or videos;
the identification unit is used for carrying out real scene identification on the multimedia information through a preset identification algorithm to obtain a first entity identification result;
an entity obtaining unit configured to obtain a preset entity included in a current position area of the vehicle in preset map data;
an entity determination unit configured to determine an entity in the vehicle surroundings based on the first entity recognition result and the preset entity.
Optionally, the apparatus further comprises:
a history multimedia information obtaining module, configured to obtain history multimedia information of the vehicle when an entity in the vehicle surrounding environment does not include the target entity;
the live-action identification module is used for carrying out live-action identification on the historical multimedia information through a preset identification algorithm to obtain a second entity identification result;
and the entity determining module is used for determining the entities in the vehicle surrounding environment based on the second entity identification result and the preset entities.
Optionally, the apparatus further comprises:
and the information detection module is used for detecting whether the voice information contains question information.
Optionally, the information detecting module includes:
the text processing unit is used for converting the voice information into text information and performing word segmentation on the text information to obtain a plurality of word segments;
the vocabulary judging unit is used for judging whether the plurality of word segments contain a preset question word;
and the question information determining unit is used for determining that the voice information contains question information when the plurality of word segments contain a preset question word.
Optionally, the information extraction module includes:
an attribute information acquisition unit configured to acquire attribute information of each of the plurality of word segments;
and a key information determining unit configured to take, as the key information in the question information, a target word segment whose attribute information is a preset attribute, wherein the preset attribute comprises entity nouns.
Optionally, the target entity is a target location, and the interactive information generating module includes:
a current position acquisition unit configured to acquire a current position of the vehicle;
a target position acquisition unit, configured to input the target location into a preset map and acquire the target position of the target location;
and a route information generating unit, configured to generate, based on the current position and the target position, route information for driving from the current position to the target location, and to take the route information as the interaction information.
Optionally, the target entity is scene information, and the interaction information generating module includes:
the comment information acquisition unit is used for acquiring comment information of the scene information in the voice information;
and the information processing unit is used for collecting the scene information and associating the comment information with the scene information.
Optionally, the apparatus further comprises:
the characteristic extraction unit is used for extracting target voiceprint characteristics in the voice information;
and the account processing unit is used for determining a target account corresponding to the target voiceprint characteristic based on a pre-established relationship between the voiceprint characteristic and the account, and associating the interaction information of the target entity with the target account.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform any of the information processing methods described above.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium whose instructions, when executed by a processor of a mobile terminal, enable the mobile terminal to perform any of the information processing methods described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided an application program/computer program product which, when run on a computer, causes the computer to perform the steps of the information processing method described in any one of the above embodiments.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
according to the information processing method, the information processing device, the electronic equipment and the storage medium, the entity in the surrounding environment of the vehicle and the voice information in the vehicle are obtained, and when the entity in the surrounding environment of the vehicle contains the target entity extracted from the voice information, the interaction information of the target entity is generated. Therefore, once the voice information of the user is detected to be associated with the surrounding environment of the vehicle, the interest point of the user in the vehicle is related to the target entity in the surrounding environment of the vehicle, and the interactive information with the target entity can be generated in time to provide more service needs for the user.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow diagram illustrating an information processing method according to an exemplary embodiment;
FIG. 2 is a flowchart of one embodiment of step S110 of FIG. 1;
FIG. 3 is another flow diagram illustrating an information processing method according to an exemplary embodiment;
FIG. 4 is a block diagram of an information processing apparatus shown in accordance with an exemplary embodiment;
FIG. 5 is a block diagram of an electronic device shown in accordance with an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating an information processing method according to an exemplary embodiment. The method is applied to a terminal and, as shown in Fig. 1, may include the following steps:
in step S110, entities in the vehicle surroundings are acquired.
In the embodiments provided by the present disclosure, multimedia data such as images and videos of the vehicle's surroundings can be acquired by mounting an image acquisition device on the vehicle. For example, a camera or video camera may be mounted on the vehicle, and the captured images or videos can be processed so that the entities in the multimedia data are identified by an existing image recognition algorithm. The entities in the vehicle's surroundings may be buildings, people, animals, natural landscapes, and so on; they can be set as needed or identified purposefully according to the user's preferences. In addition, images and/or videos of the surroundings can be acquired through the user's handheld terminal to identify entities in the vehicle's surroundings. The embodiments of the present disclosure are not limited in this respect.
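Step S110 can be sketched as a small pipeline that runs a recognizer over recent camera frames and collects the entity labels it finds. This is a hedged illustration only: `recognize_frame` stands in for a real image-recognition model, and here each frame is pre-annotated with the labels such a model would produce.

```python
def recognize_frame(frame):
    # Placeholder for a real image-recognition model (e.g. a CNN detector);
    # in this sketch each "frame" dict already carries the labels it contains.
    return frame.get("labels", [])

def entities_around_vehicle(frames):
    """Collect the set of entity labels seen across recent camera frames."""
    entities = set()
    for frame in frames:
        entities.update(recognize_frame(frame))
    return entities
```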
In step S120, voice information in the vehicle is acquired.
Specifically, a sound collection device may be installed in the vehicle, or the voice information in the vehicle may be obtained through a terminal device such as a mobile phone of the user.
In step S130, key information is extracted from the voice information. Wherein the key information is used for describing the target entity.
In an embodiment, the voice information can be processed on the terminal, or it can be uploaded to the cloud and processed by a server.
In step S140, if the entities in the vehicle surroundings include the target entity, interaction information of the target entity is generated.
In an embodiment, users in a vehicle often talk about objects of interest in the vehicle's surroundings during conversations with other users. Therefore, by collecting the voice information of the users in the vehicle, extracting its key information, and matching that key information against the acquired entities in the surroundings, it can be judged whether the entities in the surroundings include the target entity. Moreover, when a single user is driving, interaction information can also be generated from that user's voice. For example, the user sees the scenery outside the vehicle while driving and exclaims: "What a beautiful lake!" By processing this voice information and acquiring the entities around the vehicle, related interaction information, such as the lake's name, area, altitude, and local specialties, can be generated and displayed to the user. In other scenarios, the voice information of a user's phone conversation with others during driving can likewise be processed.
The key information is a description of the target entity. For example, suppose the user sees an imposing building, such as an ancient pagoda, outside the vehicle while driving or while parked somewhere. The users in the vehicle are then likely to talk about that building. By acquiring the voice information in the vehicle and extracting its key information, if the key information contains an entity noun such as "pagoda" or "tower", and the target entity "pagoda" is recognized in the videos or images around the vehicle collected by the camera device, this indicates that the users in the vehicle are talking about the building. At this point, interaction information for the target entity can be generated, for example providing more information about it, making the users in the vehicle more engaged, achieving an interactive effect, and offering the users more services.
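The matching step above can be sketched as a lookup of each extracted key term against the recognized surroundings, returning interaction content when one hits. The knowledge-base contents below are hypothetical placeholders, not data from the disclosure.

```python
def generate_interaction(key_terms, surrounding_entities, knowledge_base):
    """If any key term names an entity around the vehicle, return info about it.

    key_terms: entity nouns extracted from the in-vehicle speech.
    surrounding_entities: labels recognized around the vehicle.
    knowledge_base: hypothetical store of facts about known entities.
    """
    for term in key_terms:
        if term in surrounding_entities:
            return {"entity": term,
                    "info": knowledge_base.get(term, "no details available")}
    return None  # speech is unrelated to the current surroundings
```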
According to the information processing method provided by the embodiments of the present disclosure, the entities in the vehicle's surroundings and the voice information inside the vehicle are acquired, and when those entities include the target entity extracted from the voice information, interaction information for the target entity is generated. Thus, once the user's voice information is detected to be associated with the vehicle's surroundings, that is, the point of interest the users are discussing in the vehicle relates to a target entity in those surroundings, interaction information for the target entity can be generated in time to provide more services to the user.
In order to detail how to acquire the entities in the vehicle surroundings, in another embodiment provided by the present disclosure, in combination with the above embodiment, as shown in fig. 2, the step S110 may further include the following steps:
and step S111, acquiring multimedia information in the driving process of the vehicle in real time through the camera equipment. Wherein the multimedia information comprises images and/or video.
Step S112, carrying out real scene recognition on the multimedia information through a preset recognition algorithm to obtain a first entity recognition result.
In an embodiment, when the multimedia information captured during driving is acquired, real-scene recognition may be performed through an existing image recognition algorithm to obtain a first entity recognition result, for example recognizing an "ancient pagoda" or a "temple" in the real scene. For instance, the images or videos are preprocessed, features in the multimedia information are extracted, and recognition is performed by template matching or the like. Existing recognition algorithms can be used here, and the details are not repeated.
In step S113, a preset entity included in the current position area of the vehicle in the preset map data is acquired.
In an embodiment, relevant entities around the current location may be found in the preset map data. Since the application of the current navigation map is mature and information of related buildings, natural scenic spots and the like is generally recorded in the navigation map, a preset entity contained in the current position area of the vehicle can be acquired through preset map data.
And step S114, determining the entities in the vehicle surrounding environment based on the first entity identification result and the preset entities.
In general, the preset map data records as many existing entities as possible, so a number of preset entities can be found in the current position area of the vehicle. By contrast, the multimedia information obtained in real time by the camera device while driving is limited by the field of view of the surroundings, for example by occlusions, and generally captures only a limited number of entities. Therefore, the entities in the first recognition result can be matched against the preset entities to determine the entities in the vehicle's surroundings, thereby screening out the entities the users in the vehicle may be interested in.
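The screening in step S114 amounts to intersecting the camera-based recognition result with the map's preset entities, keeping only entities confirmed by both sources. A minimal sketch, with all entity labels hypothetical:

```python
def entities_in_surroundings(first_recognition_result, preset_entities):
    """Keep only recognized entities that the preset map also lists nearby.

    Entities seen by the camera but absent from the map (e.g. a passing
    billboard) are dropped, as are map entities outside the field of view.
    """
    preset = set(preset_entities)
    return [e for e in first_recognition_result if e in preset]
```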
In addition, in another embodiment provided by the present disclosure, based on the above embodiment, as shown in fig. 3, the method may further include the following steps:
in step S115, if the entities in the vehicle surroundings do not include the target entity, the history multimedia information of the vehicle is acquired.
And step S116, performing real-scene recognition on the historical multimedia information through a preset recognition algorithm to obtain a second entity recognition result.
In step S117, the entities in the vehicle surroundings are determined based on the second entity recognition result and the preset entities.
If the entities in the vehicle's surroundings do not include the target entity, the target entity of interest to the user is likely no longer within the current position area, and the historical multimedia information of the vehicle needs to be acquired. For example, the vehicle keeps moving, and the target entity the users were talking about has already been passed and is no longer in the current field of view. The second entity recognition result can be obtained by tracing back through previously captured multimedia information, for example to several minutes earlier, and performing real-scene recognition on that historical multimedia information.
When obtaining the second entity recognition result from the historical multimedia information, it can first be checked whether a previously recognized result already exists; if so, that result can be reused directly as the second entity recognition result, improving processing efficiency and avoiding repeated recognition. If not, an existing recognition algorithm can be used. By matching the second recognition result against the preset entities in the preset map data, the entities in the vehicle's surroundings are determined, so that the entities the user may be interested in can be screened out.
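The reuse-before-recognize behavior above can be sketched as a simple result cache keyed by clip: if a clip was already recognized, its stored result is returned without reprocessing. Clip identifiers and frame contents are hypothetical.

```python
recognition_cache = {}  # clip_id -> previously recognized entity labels

def recognize_clip(clip_id, frames):
    """Return recognized entities for a historical clip, reusing cached results."""
    if clip_id in recognition_cache:
        return recognition_cache[clip_id]  # avoid repeated recognition
    # Placeholder recognition: collect labels pre-annotated on each frame.
    result = sorted({label for f in frames for label in f.get("labels", [])})
    recognition_cache[clip_id] = result
    return result
```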
In another embodiment provided by the present disclosure, based on the above embodiment, the method may further include the following steps:
in step S150, it is detected whether the speech information includes question information.
In step S160, if the voice message includes question information, step S130 is performed.
In the embodiments provided by the present disclosure, when question information is detected in the voice information, it indicates that the user needs an answer, and interaction is needed most at that moment, so the user's needs can be better met. Therefore, the presence of question information in the voice information can be used as the trigger condition for processing the user's voice information, avoiding the need to process all of the user's speech and effectively improving processing efficiency.
Specifically, the voice information may be converted into text information, and the text segmented into a plurality of word segments. Whether the word segments contain a preset question word is then judged; if so, it is determined that the voice information contains question information. The embodiments of the present disclosure thus determine whether the voice information contains question information by detecting preset question words. For example, if the user's speech contains words such as "why", "what", "not clear", or "do you know", the user has likely raised a question that needs answering. Therefore, the detected question information can serve as the trigger condition for extracting the key information from the voice information, which greatly improves processing efficiency when there is a large amount of conversation.
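The question-detection trigger can be sketched as segmenting the transcribed text and checking for a preset question word. This illustration uses naive whitespace segmentation and an assumed English word list; a Chinese system would use a real word segmenter and its own question-word lexicon.

```python
QUESTION_WORDS = {"why", "what", "where", "how", "who"}  # illustrative preset list

def contains_question(text):
    """Return True if the transcribed utterance contains a preset question word."""
    # Naive segmentation: split on whitespace, treating '?' as its own token.
    tokens = text.lower().replace("?", " ?").split()
    return any(tok in QUESTION_WORDS or tok == "?" for tok in tokens)
```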
In the embodiment, when extracting the key information from the voice information, the voice information may be converted into text, the text segmented into a plurality of word segments, and the attribute information of each word segment acquired, for example which segments are verbs and which are nouns. The target word segments whose attribute information matches a preset attribute are taken as the key information in the question information, where the preset attribute may be an entity noun. Since the objects users are interested in, such as buildings, scenic spots, other objects, or locations, are typically named by entity nouns, the embodiments of the present disclosure extract the word segments whose attribute is an entity noun as the key information; the extracted key information is then likely to name the entities the user is interested in.
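The attribute-based filtering above can be sketched as selecting the tokens whose part-of-speech tag matches a preset attribute. The `(word, tag)` pairs below stand in for the output of a real POS tagger, and the tag names are assumptions.

```python
def extract_key_info(tagged_tokens, preset_attributes=("noun",)):
    """Keep word segments whose attribute matches a preset attribute.

    tagged_tokens: list of (word, attribute) pairs, e.g. from a POS tagger.
    preset_attributes: attributes that mark an entity noun.
    """
    return [word for word, attr in tagged_tokens if attr in preset_attributes]
```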
In another embodiment provided by the present disclosure, in combination with the above embodiments, the key information extracted from the voice information is likely to be a description of a location; that is, the target entity may be a target location. Therefore, step S140 may include the following steps:
in step S141, the current position of the vehicle is acquired.
Step S142, inputting the target location into a preset map, and acquiring a target position of the target location.
And step S143, generating route information for driving from the current position to the target position based on the current position and the target position, and using the route information as the interaction information.
In an embodiment, if the user needs to reach the target location or is interested in it, the current position of the vehicle is acquired and the target location is input into a preset map to obtain the target position where the target location is located. Route information from the current position to the target position can then be generated through the preset map and displayed to the user in the vehicle as interaction information, so that the user can decide with reference to it whether to go to the target location. The route information may include a route map of the trip, traffic conditions such as whether the route is congested, and the travel time of the trip.
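Steps S141 to S143 can be sketched as below. `GEOCODER` stands in for the preset map's place-name lookup, and the straight-line distance stands in for real route planning; both are assumptions, not an actual map API.

```python
import math

# Hypothetical place-name lookup in place of a preset map service.
GEOCODER = {"city park": (31.24, 121.50)}

def generate_route_info(current_pos, target_location):
    """Look up the target position and build placeholder route information."""
    target_pos = GEOCODER.get(target_location)
    if target_pos is None:
        return None  # target location not found in the preset map
    distance = math.dist(current_pos, target_pos)  # placeholder route metric
    return {"from": current_pos, "to": target_pos, "distance": distance}
```

A production system would replace the distance with a planned route, congestion data, and an estimated travel time from the map provider.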
In another embodiment provided by the present disclosure, the target entity may also be scene information, and the step S140 may further include the following steps:
step S144, obtaining comment information about scene information in the voice information.
And step S145, collecting the scene information, and associating the comment information with the scene information.
In this embodiment, the scene information and the comment information are collected and associated, so that the user or other users can conveniently refer to them later.
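Steps S144 and S145 amount to collecting the scene information and attaching the comment to it. A minimal sketch, assuming an in-memory store in place of whatever persistent storage the vehicle system actually uses:

```python
class SceneCollection:
    """Collect scene information and associate user comments with it."""

    def __init__(self):
        self._items = {}

    def collect(self, scene_id, scene_info, comment):
        # Create the entry on first collection, then append the comment.
        entry = self._items.setdefault(
            scene_id, {"info": scene_info, "comments": []}
        )
        entry["comments"].append(comment)

    def get(self, scene_id):
        return self._items.get(scene_id)
```
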
In addition, in an embodiment provided by the disclosure, a target voiceprint feature in the voice information can be extracted, a target account corresponding to the target voiceprint feature is determined based on a pre-established relationship between voiceprint features and accounts, and the interaction information of the target entity is associated with the target account. Because a voiceprint is unique, in this embodiment the corresponding account can be bound through the voiceprint feature.
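A hedged sketch of the voiceprint-to-account binding: voiceprints are modeled as fixed-length vectors compared by cosine similarity, and the 0.8 threshold is an assumption; real systems derive the vectors from a speaker-embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def match_account(target_print, registry, threshold=0.8):
    """Return the account whose enrolled voiceprint best matches, if any."""
    best_account, best_score = None, 0.0
    for account, enrolled in registry.items():
        score = cosine(target_print, enrolled)
        if score > best_score:
            best_account, best_score = account, score
    return best_account if best_score >= threshold else None
```

Once an account is matched, the generated interaction information can be attached to it so the speaker can retrieve it later.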
Fig. 4 is a block diagram illustrating an information processing apparatus according to an example embodiment. Referring to fig. 4, the apparatus includes an entity acquiring module 10, a voice information acquiring module 20, an information extracting module 30, and an interactive information generating module 40.
An entity acquisition module for acquiring entities in the vehicle surroundings;
the voice information acquisition module is used for acquiring the voice information in the vehicle;
the information extraction module is used for extracting key information from the voice information, and the key information is used for describing a target entity;
and the interactive information generation module is used for generating the interactive information of the target entity if the entities in the surrounding environment of the vehicle comprise the target entity.
Optionally, the entity obtaining module includes:
the system comprises a multimedia information acquisition unit, a display unit and a control unit, wherein the multimedia information acquisition unit is used for acquiring multimedia information in the driving process of the vehicle in real time through a camera device, and the multimedia information comprises images and/or videos;
the identification unit is used for carrying out real scene identification on the multimedia information through a preset identification algorithm to obtain a first entity identification result;
an entity obtaining unit configured to obtain a preset entity included in a current position area of the vehicle in preset map data;
an entity determination unit configured to determine an entity in the vehicle surroundings based on the first entity recognition result and the preset entity.
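The entity determination unit above combines the live recognition result with the preset entities from the map data. A minimal sketch, assuming the two sources are simply merged (the patent does not fix a combination strategy):

```python
def determine_entities(recognition_result, preset_entities):
    """Entities in the vehicle surroundings: union of recognition and map data."""
    return set(recognition_result) | set(preset_entities)
```
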
Optionally, the apparatus further comprises:
a history multimedia information obtaining module, configured to obtain history multimedia information of the vehicle when an entity in the vehicle surrounding environment does not include the target entity;
the live-action identification module is used for carrying out live-action identification on the historical multimedia information through a preset identification algorithm to obtain a second entity identification result;
and the entity determining module is used for determining the entities in the vehicle surrounding environment based on the second entity identification result and the preset entities.
Optionally, the apparatus further comprises:
and the information detection module is used for detecting whether the voice information contains question information.
Optionally, the information detecting module includes:
the text processing unit is used for converting the voice information into text information and performing word segmentation processing on the text information to obtain a plurality of words;
the vocabulary judging unit is used for judging whether the plurality of participles contain preset questioning words or not;
and the questioning information determining unit is used for determining that the voice information contains questioning information when the multiple participles contain preset questioning words.
Optionally, the information extraction module includes:
an attribute information acquisition unit configured to acquire attribute information of each of the plurality of segmented words;
and the key information determining unit is used for taking a target participle, of the plurality of participles, whose attribute information is a preset attribute as the key information in the question information, wherein the preset attribute includes an entity noun.
Optionally, the target entity is a target location, and the interaction information generating module includes:
a current position obtaining unit configured to obtain a current position of the vehicle;
the target position acquisition unit is used for inputting the target location into a preset map and acquiring the target position of the target location;
and the route information generating unit is used for generating route information of driving from the current position to the target position based on the current position and the target position, and taking the route information as the interaction information.
Optionally, the target entity is scene information, and the interaction information generating module includes:
the comment information acquisition unit is used for acquiring comment information of the scene information in the voice information;
and the information processing unit is used for collecting the scene information and associating the comment information with the scene information.
Optionally, the apparatus further comprises:
the characteristic extraction unit is used for extracting target voiceprint characteristics in the voice information;
and the account processing unit is used for determining a target account corresponding to the target voiceprint characteristic based on a pre-established relationship between the voiceprint characteristic and the account, and associating the interaction information of the target entity with the target account.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
According to the information processing apparatus provided by the embodiments of the disclosure, by acquiring the entities in the vehicle surroundings and the voice information in the vehicle, interaction information of a target entity is generated when the entities in the vehicle surroundings contain the target entity extracted from the voice information. Therefore, once the user's voice information is detected to be associated with the vehicle surroundings, that is, the point of interest of the user in the vehicle relates to a target entity in the surroundings, interaction information for the target entity can be generated in time to serve more of the user's needs.
Fig. 5 is a block diagram illustrating an apparatus 800 for information processing according to an example embodiment. For example, the apparatus 800 may be an electronic device such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, or a personal digital assistant.
Referring to fig. 5, the apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power components 806 provide power to the various components of device 800. The power components 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for the apparatus 800. For example, the sensor assembly 814 may detect the open/closed state of the apparatus 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor assembly 814 may also detect a change in position of the apparatus 800 or a component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The apparatus 800 may access a wireless network based on a communication standard, such as WiFi, an operator network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, ultra Wideband (UWB) technology, bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described information processing methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the device 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
The disclosed embodiments also provide a non-transitory computer-readable storage medium, where instructions in the storage medium, when executed by a processor of a mobile terminal, enable the mobile terminal to perform the above-mentioned information processing method.
There is also provided an application program/computer program product according to an embodiment of the present disclosure. In yet another embodiment provided by the present disclosure, there is also provided a computer program product including instructions which, when run on a computer, cause the computer to perform the steps of the information processing method described in any one of the above embodiments.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions described in accordance with the embodiments of the disclosure are, in whole or in part, generated when the computer program instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wire (e.g., coaxial cable, fiber, DSL (Digital Subscriber Line)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., a floppy Disk, a hard Disk, a magnetic tape), an optical medium (e.g., a DVD (Digital Versatile Disk)), or a semiconductor medium (e.g., an SSD (Solid State Disk)), etc.
It should be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (13)

1. An information processing method characterized by comprising:
acquiring entities in the vehicle surroundings;
acquiring voice information in the vehicle;
extracting key information from the voice information, wherein the key information is used for describing a target entity;
and if the entities in the environment surrounding the vehicle comprise the target entity, generating interaction information of the target entity.
2. The method of claim 1, wherein the obtaining entities in the vehicle surroundings comprises:
acquiring multimedia information in the vehicle driving process in real time through a camera device, wherein the multimedia information comprises images and/or videos;
performing real-scene recognition on the multimedia information through a preset recognition algorithm to obtain a first entity recognition result;
acquiring a preset entity contained in the current position area of the vehicle in preset map data;
determining an entity in the vehicle surroundings based on the first entity recognition result and the preset entity.
3. The method of claim 2, further comprising:
if the entities in the vehicle surrounding environment do not comprise the target entity, acquiring historical multimedia information of the vehicle;
performing live-action recognition on the historical multimedia information through a preset recognition algorithm to obtain a second entity recognition result;
determining an entity in the vehicle surroundings based on the second entity recognition result and the preset entity.
4. The method of claim 1, further comprising:
detecting whether the voice information contains question information or not;
and if the voice information contains question information, executing the step of extracting key information from the voice information.
5. The method of claim 4, wherein the detecting whether the voice message includes question information comprises:
converting the voice information into text information, and performing word segmentation processing on the text information to obtain a plurality of word segments;
judging whether the multiple participles contain preset questioning words or not;
and if the multiple participles contain preset questioning words, determining that the voice information contains questioning information.
6. The method of claim 5, wherein extracting key information from the voice information comprises:
acquiring attribute information of each participle in the participles;
and taking a target participle, of the plurality of participles, whose attribute information is a preset attribute as key information in the question information, wherein the preset attribute includes an entity noun.
7. The method of claim 1, wherein the target entity is a target location, and the generating interaction information of the target entity comprises:
acquiring the current position of the vehicle;
inputting the target location into a preset map, and acquiring a target position of the target location;
and generating route information for driving from the current position to the target position based on the current position and the target position, and taking the route information as the interaction information.
8. The method of claim 1, wherein the target entity is scene information, and the generating of the interaction information of the target entity comprises:
obtaining comment information of the scene information in the voice information;
and collecting the scene information, and associating the comment information with the scene information.
9. The method of any of claims 1-8, further comprising:
extracting target voiceprint characteristics in the voice information;
and determining a target account corresponding to the target voiceprint characteristics based on a pre-established relationship between the voiceprint characteristics and the account, and associating the interaction information of the target entity with the target account.
10. An information processing apparatus characterized by comprising:
an entity acquisition module for acquiring entities in the vehicle surroundings;
the voice information acquisition module is used for acquiring the voice information in the vehicle;
the information extraction module is used for extracting key information from the voice information, and the key information is used for describing a target entity;
and the interactive information generation module is used for generating the interactive information of the target entity if the entities in the surrounding environment of the vehicle comprise the target entity.
11. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the information processing method of any one of claims 1 to 9.
12. A non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of a mobile terminal, enable the mobile terminal to perform the information processing method of any one of claims 1 to 9.
13. An application program/computer program product, characterized in that it causes a computer to carry out the steps of the information processing method according to any one of claims 1 to 9 when run on the computer.
CN202110700524.1A 2021-06-23 2021-06-23 Information processing method, information processing device, electronic equipment and storage medium Pending CN115510336A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110700524.1A CN115510336A (en) 2021-06-23 2021-06-23 Information processing method, information processing device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110700524.1A CN115510336A (en) 2021-06-23 2021-06-23 Information processing method, information processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115510336A true CN115510336A (en) 2022-12-23

Family

ID=84499864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110700524.1A Pending CN115510336A (en) 2021-06-23 2021-06-23 Information processing method, information processing device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115510336A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116300092A (en) * 2023-03-09 2023-06-23 北京百度网讯科技有限公司 Control method, device and equipment of intelligent glasses and storage medium
CN116300092B (en) * 2023-03-09 2024-05-14 北京百度网讯科技有限公司 Control method, device, equipment and storage medium of smart glasses

Similar Documents

Publication Publication Date Title
CN109446994B (en) Gesture key point detection method and device, electronic equipment and storage medium
CN109871896B (en) Data classification method and device, electronic equipment and storage medium
CN106557768B (en) Method and device for recognizing text in pictures
CN109543066B (en) Video recommendation method, apparatus and computer-readable storage medium
CN106128478B (en) Voice broadcast method and device
CN109670077B (en) Video recommendation method and device and computer-readable storage medium
CN105869230A (en) Video data management method and device, terminal and server
CN108038102B (en) Recommended method, device, terminal and storage medium for facial expression images
CN110781813B (en) Image recognition method and device, electronic equipment and storage medium
CN110990801B (en) Information verification method and device, electronic equipment and storage medium
US10242678B2 (en) Friend addition using voiceprint analysis method, device and medium
CN111652107B (en) Object counting method and device, electronic equipment and storage medium
CN105100363A (en) Information processing method, information processing device and terminal
CN109543069B (en) Video recommendation method and device and computer-readable storage medium
CN109034150B (en) Image processing method and device
CN111523599A (en) Target detection method and device, electronic equipment and storage medium
CN112381091A (en) Video content identification method and device, electronic equipment and storage medium
CN109388699A (en) Input method, device, equipment and storage medium
CN110909203A (en) Video analysis method and device, electronic equipment and storage medium
CN115510336A (en) Information processing method, information processing device, electronic equipment and storage medium
CN110781975B (en) Image processing method and device, electronic device and storage medium
CN106297408A (en) Information prompt method and device
CN113190725B (en) Object recommendation and model training method and device, equipment, medium and product
CN111401048B (en) Intention identification method and device
CN113127613B (en) Chat information processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination