
CN114067033A - System and method for three-dimensional recording and human history restoration - Google Patents


Info

Publication number
CN114067033A
CN114067033A (application CN202111392210.6A; granted publication CN114067033B)
Authority
CN
China
Prior art keywords
data
recorded
model
interaction
action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111392210.6A
Other languages
Chinese (zh)
Other versions
CN114067033B (en)
Inventor
王冬晨
李旭滨
陈吉胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Shanghai Intelligent Technology Co Ltd
Original Assignee
Unisound Shanghai Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Shanghai Intelligent Technology Co Ltd filed Critical Unisound Shanghai Intelligent Technology Co Ltd
Priority to CN202111392210.6A priority Critical patent/CN114067033B/en
Publication of CN114067033A publication Critical patent/CN114067033A/en
Application granted granted Critical
Publication of CN114067033B publication Critical patent/CN114067033B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention relates to a method and system for three-dimensional recording and human history restoration, wherein the method comprises the following steps: performing video acquisition, image acquisition and audio acquisition on a person to be recorded to obtain corresponding video data, image data and audio data; performing three-dimensional modeling of the person to be recorded based on the video data and the image data to obtain a virtual human model; establishing a corresponding speech synthesis model using the audio data; and, after triggering, projecting the virtual human model using holographic technology, enabling interaction between the virtual human model and a real person using AI technology, and, during the interaction, using the speech synthesis model to synthesize speech imitating the voice of the recorded person for voice interaction. The invention can project a three-dimensional virtual human model through holographic technology, preserving the voice and smile of the recorded person; combined with AI technology it enables face-to-face communication and interaction with a real person, which can bring people greater emotional comfort.

Description

System and method for three-dimensional recording and human history restoration
Technical Field
The invention relates to the technical field of information engineering, and in particular to a system and method for three-dimensional recording and human history restoration.
Background
There are many special moments in a person's lifetime: in childhood, in the prime of life, in old age, or near the end of life. At different times and in different circumstances there is always some insight, some scene, some piece of knowledge or accumulated experience, or something else that one wishes to preserve.
At present, photos and videos are the main means of recording, but the emotional comfort they bring is limited: viewers can only watch the recorded content and cannot interact with it.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a system and method for three-dimensional recording and human history restoration, solving the problems that existing photo and video recording brings people only limited emotional comfort and allows no further interaction.
The technical scheme for achieving this aim is as follows:
the invention provides a method for three-dimensional recording and human history restoration, which comprises the following steps:
performing video acquisition, image acquisition and audio acquisition on a person to be recorded to obtain corresponding video data, image data and audio data;
performing three-dimensional modeling of the person to be recorded based on the video data and the image data to obtain a virtual human model;
establishing a corresponding speech synthesis model using the audio data; and
after triggering, projecting the virtual human model using holographic technology, enabling interaction between the virtual human model and a real person using AI technology, and, during the interaction, using the speech synthesis model to synthesize speech imitating the voice of the person to be recorded for voice interaction.
The method of the invention records a person in the form of video, images and audio, but during restoration it can project a three-dimensional virtual human model through holographic technology, preserving the person's voice and smile; combined with AI technology it enables face-to-face communication and interaction with real people and can bring great emotional comfort.
The method for three-dimensional recording and human history restoration of the invention is further improved in that the method further comprises:
establishing an action behavior library of the person to be recorded based on the video data;
and, when interacting with a real person, controlling the virtual human projected by holographic technology to execute the corresponding action according to the action feedback parameters formed by the AI technology and the action behavior library.
The method for three-dimensional recording and human history restoration of the invention is further improved in that the method further comprises: collecting data of a specific scene of the person to be recorded;
modeling the environment in the collected specific-scene data to obtain a virtual environment model;
and, after the specific scene is triggered, projecting the virtual human model and the virtual environment model using holographic technology to present the corresponding specific scene.
The method for three-dimensional recording and human history restoration is further improved in that the action behaviors of the person to be recorded and the corresponding action conditions are extracted from the video data to form action intention data;
the dialogue language of the person to be recorded and the corresponding dialogue conditions are extracted from the audio data to form dialogue intention data;
model training is performed on the action intention data and the dialogue intention data using AI technology to obtain an intention model;
and, when interacting with a real person, feedback information that conforms to the habits of the person to be recorded is output using the intention model to complete the interaction with the real person.
The method for three-dimensional recording and human history restoration of the invention is further improved in that the method further comprises:
establishing a cloud brain system;
when interacting with a real person, carrying out video acquisition on the real person to obtain interactive video data;
and uploading the interactive video data to the cloud brain system, and recognizing the interactive video data using the cloud brain system to form interactive feedback data, so as to complete the interaction with the real person.
The invention also provides a system for three-dimensional recording and human history restoration, which comprises:
the acquisition unit is used for carrying out video acquisition, image acquisition and audio acquisition on a person to be recorded so as to obtain corresponding video data, image data and audio data;
the character modeling unit, connected to the acquisition unit, for performing three-dimensional modeling of the person to be recorded based on the video data and the image data to obtain a corresponding virtual human model;
the voice modeling unit is connected with the acquisition unit and used for establishing a corresponding voice synthesis model by using the audio data; and
and the processing unit, connected to the acquisition unit, the character modeling unit and the voice modeling unit, for projecting the virtual human model using holographic technology after receiving a trigger signal and enabling interaction between the virtual human model and a real person using AI technology, and also for using the speech synthesis model during the interaction to synthesize speech imitating the voice of the person to be recorded for voice interaction.
The system for three-dimensional recording and human history restoration is further improved in that the system further comprises an action parameter extraction unit, connected to the acquisition unit and the processing unit, for establishing an action behavior library of the person to be recorded based on the video data;
and the processing unit is also used, when interacting with a real person, to control the virtual human projected by holographic technology to execute the corresponding action according to the action feedback parameters formed by the AI technology and the action behavior library.
The system for three-dimensional recording and human history restoration is further improved in that the system further comprises a scene recording unit, connected to the acquisition unit and the processing unit, for collecting data of a specific scene of the person to be recorded through the acquisition unit;
the scene recording unit is used to model the environment in the collected specific-scene data to obtain a virtual environment model;
and the processing unit is used, after receiving a specific-scene trigger, to project the virtual human model and the virtual environment model using holographic technology to present the corresponding specific scene.
The system for three-dimensional recording and human history restoration is further improved in that the system further comprises an intention modeling unit, connected to the acquisition unit and the processing unit, for extracting the action behaviors of the person to be recorded and the corresponding action conditions from the video data to form action intention data;
and for extracting the dialogue language of the person to be recorded and the corresponding dialogue conditions from the audio data to form dialogue intention data;
the intention modeling unit is also used to perform model training on the action intention data and the dialogue intention data using AI technology to obtain an intention model;
and, when interacting with a real person, the processing unit outputs feedback information that conforms to the habits of the person to be recorded using the intention model to complete the interaction with the real person.
The system for three-dimensional recording and human history restoration is further improved in that the system also comprises a cloud brain system;
the processing unit is connected to the cloud brain system; when interacting with a real person, it performs video acquisition of the real person to obtain interactive video data, uploads the interactive video data to the cloud brain system, and receives the interactive feedback data of the cloud brain system to complete the interaction with the real person.
Drawings
Fig. 1 is a flow chart of a method for three-dimensional recording and human history restoration according to the present invention.
Fig. 2 is a system diagram of a system for three-dimensional recording and human history restoration according to the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific examples.
Referring to fig. 1, the invention provides a system and a method for three-dimensional recording and human history restoration, which are used to solve the problems that the existing photo and video recording methods are limited: they can only be watched, they allow no interaction, and the emotional comfort they bring is limited. The system and method of the invention combine holographic technology and AI technology to produce a lifelike virtual human that faithfully imitates the recorded person, can communicate and interact with real people, achieves three-dimensional restoration of the person, preserves the person's voice and smile, can offer warm and familiar communication and interaction, and can bring people greater emotional comfort. The system and method for three-dimensional recording and human history restoration according to the present invention are described below with reference to the accompanying drawings.
Referring to fig. 2, a system diagram of a system for three-dimensional recording and human history restoration according to the present invention is shown. The system for three-dimensional recording and human history restoration according to the present invention will be described with reference to fig. 2.
As shown in fig. 2, the system for three-dimensional recording and human history restoration of the present invention includes an acquisition unit 21, a character modeling unit 22, a voice modeling unit 23 and a processing unit 24, wherein the character modeling unit 22 is connected to the acquisition unit 21, the voice modeling unit 23 is connected to the acquisition unit 21, and the processing unit 24 is connected to the acquisition unit 21, the character modeling unit 22 and the voice modeling unit 23. The acquisition unit 21 is configured to perform video acquisition, image acquisition and audio acquisition on a person to be recorded to obtain corresponding video data, image data and audio data; the character modeling unit 22 is used to perform three-dimensional modeling of the person to be recorded based on the video data and the image data to obtain a corresponding virtual human model; the voice modeling unit 23 is configured to establish a corresponding speech synthesis model using the audio data; and the processing unit 24 is configured, after receiving a trigger signal, to project the virtual human model using holographic technology, enable interaction between the virtual human model and a real person using AI technology, and, during the interaction, use the speech synthesis model to synthesize speech imitating the voice of the person to be recorded for voice interaction.
The system of the invention can establish a lifelike virtual human based on video, images and audio of the person to be recorded; because the virtual human preserves the person's recorded voice, it can communicate and interact with real people and bring greater emotional comfort.
In an embodiment of the present invention, the acquisition unit 21 is connected to a video camera, and the person to be recorded is filmed by the camera to form the corresponding video data; during filming, the person can be captured in different scenes, including sleeping, reading, feeling low, dining and other scenes. The acquisition unit 21 is further connected to a microphone, and the person to be recorded is recorded through the microphone to form the corresponding audio data; the recorded content may be the person's conversations with others, the person reading aloud, or the person expressing their thoughts. The acquisition unit 21 is further connected to a still camera, which captures images of the person to be recorded to form the corresponding image data. Preferably, the more video data, audio data and image data the acquisition unit 21 collects, the more closely the virtual human can resemble the recorded person.
In another embodiment of the present invention, the acquisition unit 21 may also receive existing video data, image data and audio data uploaded by the user; that is, the user may produce the video, image and audio data with other devices and upload them to the acquisition unit 21.
In a specific embodiment of the present invention, the system of the present invention further includes a storage unit, and the acquisition unit 21 is connected to the storage unit, and the storage unit is configured to store the video data, the image data, and the audio data obtained by the acquisition unit.
In an embodiment of the present invention, the character modeling unit 22 is configured to extract images of the recorded person from all angles from the video data, find images from all angles in the image data, further extract point cloud data of the recorded person's whole body from these images, and create a three-dimensional model from the point cloud data through modeling software to form the virtual human model.
In another embodiment of the present invention, the character modeling unit 22 may also receive an uploaded virtual human model; that is, the virtual human model is modeled by hand offline based on images and video of the recorded person, and is then uploaded to the character modeling unit 22.
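The point-cloud step above can be illustrated with a minimal sketch. It assumes per-view point arrays have already been extracted and registered into a common coordinate frame; the function name and voxel size are hypothetical, and a real pipeline would hand the merged cloud to dedicated surface-reconstruction software.

```python
import numpy as np

def merge_point_clouds(views, voxel_size=0.01):
    """Merge per-view (N_i, 3) point arrays into one whole-body cloud.

    Deduplicates points that fall into the same voxel, a common
    preprocessing step before surface reconstruction. The views are
    assumed to be already registered into one coordinate frame.
    """
    points = np.vstack(views)
    # Quantize each point to its voxel index.
    voxels = np.floor(points / voxel_size).astype(np.int64)
    # Keep one representative point per occupied voxel.
    _, keep = np.unique(voxels, axis=0, return_index=True)
    return points[np.sort(keep)]
```

For example, merging two fully overlapping views collapses each occupied voxel to a single representative point, keeping the cloud size manageable for modeling software.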
In an embodiment of the present invention, the voice modeling unit 23 uses AI technology to perform cluster analysis on the audio data, i.e. to analyze the recorded person's speaking habits, habitual phrases and so on, and then combines this with personalized TTS customization of voice characteristics such as timbre and pitch, so as to obtain a speech synthesis model that matches the recorded person's speaking voice. The speech synthesis model can convert text into the voice of the recorded person, so that when interacting with a real person it feels as though the recorded person is holding the conversation, which can bring people better emotional comfort.
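The cluster analysis of speaking habits can be sketched as a toy k-means over per-utterance acoustic feature vectors. This is only a stand-in for the AI analysis the text describes; the feature layout and function name are assumptions, and a production system would use dedicated speaker-embedding models.

```python
import numpy as np

def cluster_voice_features(features, k=2, iters=20):
    """Toy k-means over per-utterance acoustic feature vectors
    (e.g. mean pitch, speaking rate) to summarise speaking habits.
    Deterministic init: the first k utterances seed the clusters."""
    centers = features[:k].astype(float).copy()
    for _ in range(iters):
        # Distance of every utterance to every cluster centre.
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # avoid emptying a cluster
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers
```

The resulting cluster centres could then parameterize the personalized TTS customization step (timbre, pitch) mentioned above.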
In a specific embodiment of the present invention, the system of the present invention further includes an action parameter extraction unit, connected to the acquisition unit 21 and the processing unit 24, for establishing an action behavior library of a person to be recorded based on the video data;
the processing unit 24 is further configured, when interacting with a real person, to control the virtual human projected by holographic technology to execute the corresponding action according to the action feedback parameters formed by the AI technology and the action behavior library.
Based on the above, the system of the invention realizes not only the voice interaction of the virtual human but also its action interaction; the combination of voice and action makes the virtual human more lifelike, achieving a faithful restoration of the recorded person.
Specifically, human movement is formed by the motion of the limbs. Through analysis of the video data, the motion parameters of each part and joint of the body can be obtained, including those of the head, neck, hands, arms, legs, feet and joints, and these are stored to form the action behavior library. When the virtual human model interacts with a real person, corresponding action feedback parameters are formed according to the content of the current interaction, such as waving or nodding; the corresponding motion parameters are looked up in the action behavior library according to the action feedback parameters, and the virtual human model is then controlled through these motion parameters to perform the corresponding movement, thereby forming the action interaction.
Because the actions stored in the action behavior library come from the actual actions of the recorded person, the virtual human's actions during action interaction can closely resemble those of the recorded person, giving a good imitation effect.
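The library lookup just described can be sketched as follows; the action names and joint parameters here are hypothetical placeholders for illustration, not data from the invention.

```python
# Hypothetical action behavior library: per-action motion parameters
# for each body part, extracted offline from the recorded video.
ACTION_LIBRARY = {
    "wave": {"arm": [15.0, 45.0, 15.0], "hand": [0.0, 10.0, 0.0]},
    "nod":  {"head": [0.0, -20.0, 0.0], "neck": [5.0, 0.0, 5.0]},
}

def motion_for_feedback(feedback_action, library=ACTION_LIBRARY):
    """Look up the joint motion parameters for an action feedback
    parameter produced by the AI interaction layer; fall back to an
    idle pose when the action was never observed for this person."""
    return library.get(feedback_action, {"idle": [0.0, 0.0, 0.0]})
```

The fallback to an idle pose is an assumption of this sketch; the text only specifies looking up actions that exist in the library.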
In a specific embodiment of the present invention, the system of the present invention further includes a scene recording unit, connected to the acquisition unit 21 and the processing unit 24, for collecting data of a specific scene of the person to be recorded through the acquisition unit 21;
the scene recording unit is used for modeling the environment in the collected specific scene data to obtain a virtual environment model;
the processing unit 24 is configured, after receiving a specific-scene trigger, to project the virtual human model and the virtual environment model using holographic technology to present the corresponding specific scene.
The scene recording unit can record scenes that particularly need to be preserved and restore them three-dimensionally, such as a philosophy of life, insights on life, meaningful conversations and the like.
Preferably, option keys associated with specific scenes may be provided on the operation interface, and the user can trigger the reproduction of a specific scene by clicking the corresponding option key.
In an embodiment of the present invention, the system of the present invention further includes an intention modeling unit connected to the acquisition unit 21 and the processing unit 24, and configured to extract the action behavior of the person to be recorded and the corresponding action condition from the video data to form action intention data;
and for extracting the dialogue language of the person to be recorded and the corresponding dialogue conditions from the audio data to form dialogue intention data;
the intention modeling unit is also used to perform model training on the action intention data and the dialogue intention data using AI technology to obtain an intention model;
and, when interacting with a real person, the processing unit 24 outputs feedback information that conforms to the habits of the person to be recorded using the intention model to complete the interaction with the real person.
The intention model can reflect the intentions of the recorded person and give intention feedback that conforms to the person's habits. The intention model covers semantic intention, voice intention, image intention and behavior intention, and can give feedback that matches the recorded person according to the input information; for example, when facing someone who is sad in conversation, the virtual human can make tea for them and say some words of comfort. Because the feedback information given by the intention model conforms to the habits of the recorded person, the interlocutor feels as though talking with the real recorded person.
Further, when the intention model is created, a knowledge graph relating to the recorded person is created based on all the data relating to that person, and a mesh-like knowledge structure is constructed, so that feedback conforming to the recorded person's habits can be given based on the input information.
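The mapping from observed conditions to habitual responses could be sketched as a simple overlap match. This is only a toy stand-in for the trained intention model; the entries and cue sets below are invented for illustration.

```python
# Hypothetical intention entries mined from the recordings:
# (observed condition cues, the person's habitual response).
INTENT_DATA = [
    ({"sad", "conversation"}, "make tea and offer words of comfort"),
    ({"greeting"},            "smile and wave"),
    ({"question", "book"},    "quote a favourite passage"),
]

def habitual_response(observed, intents=INTENT_DATA):
    """Pick the entry whose condition cues overlap the observed
    cues the most; return None when nothing matches at all."""
    best, best_overlap = None, 0
    for condition, response in intents:
        overlap = len(condition & observed)
        if overlap > best_overlap:
            best, best_overlap = response, overlap
    return best
```

A trained model would replace the set overlap with learned scoring, but the interface (cues in, habitual feedback out) matches what the text describes.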
In one embodiment of the present invention, the system of the present invention further comprises a cloud-brain system;
the processing unit is connected to the cloud brain system; when interacting with a real person, it performs video acquisition of the real person to obtain interactive video data, uploads the interactive video data to the cloud brain system, and receives the interactive feedback data of the cloud brain system to complete the interaction with the real person.
Preferably, the cloud brain system comprises a speech recognition subsystem, a semantic understanding subsystem, an image recognition subsystem and an encyclopedia knowledge graph. The speech recognition, semantic understanding and image recognition subsystems determine the intention of the real person (i.e. the interlocutor) in the interactive video data, and corresponding feedback is given based on this understanding, so that a conversation can be held with the interlocutor and companionship is achieved.
Further, the intention model of the invention is stored in the cloud. After receiving the interactive video data, the cloud brain system determines the interlocutor's intention and inputs it into the intention model to obtain feedback data conforming to the recorded person, which is then sent to the processing unit for execution. If no feedback data conforming to the recorded person is obtained, corresponding feedback is obtained from the encyclopedia knowledge graph in the cloud and then sent to the processing unit for execution.
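The fallback from the personal intention model to the encyclopedia knowledge graph can be sketched as a small routing function; the dictionary-based models and the default reply are assumptions of this sketch.

```python
def cloud_feedback(intent, personal_model, encyclopedia):
    """Route an understood intent: prefer feedback that matches the
    recorded person's habits; otherwise fall back to the generic
    encyclopedia knowledge graph, as the cloud flow describes."""
    personal = personal_model.get(intent)
    if personal is not None:
        return ("personal", personal)
    return ("encyclopedia", encyclopedia.get(intent, "I'm not sure."))
```

The returned source tag ("personal" vs "encyclopedia") is a convenience for the processing unit to know which kind of feedback it is executing.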
In an embodiment of the invention, the system of the invention further comprises means for implementing holographic projection. The trigger signal received by the processing unit may be formed by starting the system, or formed by collecting surrounding environment information and judging certain specific conditions.
The system of the invention can record infants, children, adults and the elderly, and establish a model that conforms to the habits of the recorded person to enable communication and interaction.
The following describes the workflow of the system of the present invention by taking recording of infants and children as an example.
The first step: video and image recordings are made of the child's behavior, including the child's feedback on external stimuli, such as language, actions and other behavior.
The second step: the behaviors fed back by the child are classified and organized into intentions, and model training is performed through AI (artificial intelligence) technology. The intent includes, without limitation, semantic intent, voice intent, image intent, or behavior-related intent. For example: "the child hears mother's voice and breaks into a smile", "the child sees a toy and reaches out for it", "the child hears loud speech and cries more than usual".
The third step: the knowledge graph is trained to construct a mesh-like knowledge structure.
The fourth step: the image of the child is projected using holographic projection technology.
The fifth step: people can interact with the projection using language and actions. AI technology recognizes the actions and language, and the projection gives feedback using the knowledge graph.
The final effect is that the child can be projected before one's eyes in holographic form, so that the child's image can be seen and interacted with.
The workflow of the system of the present invention will be described below by taking recording of adults and elderly people as an example.
The first step: video and images of the person's behavior are recorded; the behavior can be collected in different scenes, such as sleeping, reading, feeling low, dining and the like. Behavior and action models in these different situations are recorded and trained, making it convenient to perform three-dimensional projection using holographic technology.
The second step: the person's speech is collected; AI technology is used to perform cluster analysis on speaking habits, habitual phrases and so on; and personalized TTS customization is used to customize voice characteristics such as timbre and pitch.
The third step: the language to be saved, such as a philosophy of life, insights on life, or words meant for others, may be recorded and added to the specific scenes described in the first step.
The fourth step: long-duration video recording can be used to capture how the person handles various matters, what they might say and what actions they take; these are organized into intentions, and model training is performed through AI technology. The intent includes, without limitation, semantic intent, voice intent, image intent, or behavior-related intent. For example: "when facing a frustrated child, make a pot of tea and say some words of comfort", or "when the wife is having a hard time, say something to help her keep her mind calm".
The fifth step: the knowledge graph is trained to construct a mesh-like knowledge structure.
The sixth step: the person is projected using holographic technology; the holographic projection can be started actively, or the holographic projection device can be triggered automatically when certain conditions are met. The projected content may be comfort, teaching, instructions and so on.
The seventh step: an encyclopedia-class knowledge graph can also be loaded, enabling conversation and companionship.
The final effect is that when users feel unsettled or miss a departed loved one, they can use the device to holographically project the image they are thinking of, hold a communicative conversation, and relieve negative feelings.
The invention also provides a method for three-dimensional recording and human history restoration, which is explained below.
As shown in fig. 1, the method of the present invention comprises the steps of:
step S11 is executed: video acquisition, image acquisition and audio acquisition are performed on the person to be recorded to obtain corresponding video data, image data and audio data; then step S12 is executed;
step S12 is executed: three-dimensional modeling of the person to be recorded is performed based on the video data and the image data to obtain a virtual human model; then step S13 is executed;
step S13 is executed: a corresponding speech synthesis model is established using the audio data; then step S14 is executed;
and step S14 is executed: after triggering, the virtual human model is projected using holographic technology, interaction between the virtual human model and a real person is enabled using AI technology, and during the interaction the speech synthesis model is used to synthesize speech imitating the voice of the person to be recorded for voice interaction.
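Steps S11 to S14 can be sketched as an orchestration function with the concrete stages injected as callables; every stage here (capture devices, modeling software, TTS engine, projector, interaction layer) is a stand-in, not part of the invention's actual implementation.

```python
def record_and_restore(capture, build_avatar, build_tts, project, interact):
    """Orchestrate steps S11-S14 with the concrete stages injected
    as callables, so each stage can be swapped independently."""
    video, images, audio = capture()       # S11: video/image/audio acquisition
    avatar = build_avatar(video, images)   # S12: 3D modeling -> virtual human
    tts = build_tts(audio)                 # S13: speech synthesis model
    project(avatar)                        # S14: holographic projection
    return interact(avatar, tts)           # S14: AI-driven (voice) interaction
```

Injecting the stages keeps the pipeline testable with stubs before real capture and projection hardware is attached.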
In one embodiment of the present invention, the method further comprises:
establishing an action behavior library of the person to be recorded based on the video data;
and, when interacting with a real person, controlling the virtual human projected by the holographic technology to perform corresponding actions according to the action feedback parameters formed by the AI technology and the action behavior library.
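One minimal way to realize the action behavior library described above is a lookup table from an AI feedback parameter (e.g. a recognized intent label) to a recorded action clip of the person. The parameter names and clip identifiers below are hypothetical.

```python
# Hypothetical action-behavior library: feedback parameter -> recorded action clip.
action_library = {
    "greet": "wave_hand_clip_017",
    "comfort": "gentle_nod_clip_042",
    "farewell": "bow_clip_003",
}

def select_action(feedback_param: str, library: dict, default: str = "idle_clip") -> str:
    """Pick the action the projected virtual human should perform."""
    return library.get(feedback_param, default)

print(select_action("comfort", action_library))  # gentle_nod_clip_042
print(select_action("unknown", action_library))  # idle_clip
```

Falling back to a neutral idle action keeps the projection plausible when the AI emits a feedback parameter the library has no recording for.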
In one embodiment of the present invention, the method further comprises: acquiring data of a specific scene of a person to be recorded;
modeling an environment in the collected specific scene data to obtain a virtual environment model;
after the specific scene is triggered, the virtual human model and the virtual environment model are projected by utilizing the holographic technology to present the corresponding specific scene.
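The scene-triggering embodiment above can be sketched as a condition table: when a named trigger fires, the virtual human model and the matching virtual environment model are handed to the holographic projector together. Scene names, model identifiers, and the return structure are all assumptions for illustration.

```python
# Hypothetical trigger table: scene name -> environment model and settings.
scenes = {
    "birthday": {"environment": "living_room_model", "lighting": "warm"},
    "graduation": {"environment": "auditorium_model", "lighting": "bright"},
}

def on_trigger(trigger: str, virtual_human: str = "person_model"):
    """Return what to project for a trigger, or None if the trigger is unknown."""
    scene = scenes.get(trigger)
    if scene is None:
        return None  # unknown trigger: nothing is projected
    return {"project": [virtual_human, scene["environment"]],
            "lighting": scene["lighting"]}

print(on_trigger("birthday"))
```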
In one embodiment of the invention, action behaviors of the person to be recorded and the corresponding action conditions are extracted from the video data to form action intention data;
the dialogue language of the person to be recorded and the corresponding dialogue conditions are extracted from the audio data to form dialogue intention data;
model training is performed on the action intention data and the dialogue intention data using AI technology to obtain an intention model;
and, when interacting with a real person, the intention model is used to output feedback information matching the habits of the person to be recorded, so as to complete the interaction with the real person.
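The intention-model embodiment pairs conditions with the person's habitual responses. As a deliberately simplified stand-in for the trained model, the sketch below uses exact-match lookup over (condition, response) pairs; a real intention model would generalize beyond exact matches, and all pair values here are invented examples.

```python
# Illustrative (condition, habitual response) pairs mined from video and audio.
action_intents = [("door_opens", "stand_up_and_greet")]
dialog_intents = [("asked_about_day", "tell_a_short_story")]

# Exact-match lookup standing in for model training on the combined data.
intent_model = dict(action_intents + dialog_intents)

def respond(condition: str) -> str:
    """Output feedback matching the recorded person's habits for a condition."""
    return intent_model.get(condition, "default_polite_reply")

print(respond("asked_about_day"))  # tell_a_short_story
```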
In one embodiment of the present invention, the method further comprises:
establishing a cloud brain system;
when interacting with a real person, carrying out video acquisition on the real person to obtain interactive video data;
and uploading the interactive video data to a cloud brain system, and identifying the interactive video data by using the cloud brain system to further form interactive feedback data so as to complete the interaction with the real person.
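The cloud brain loop above — capture interaction video, upload it, recognize it remotely, receive feedback data — can be sketched as a client/server round trip. No real network API is implied; the class, method names, and returned fields are all placeholders.

```python
class CloudBrain:
    """Placeholder for the remote cloud brain system."""

    def recognize(self, interaction_video: bytes) -> dict:
        """Stub recognition: report payload size and a fixed intent label."""
        return {"frames_seen": len(interaction_video), "intent": "greeting"}

def interact_via_cloud(video: bytes, brain: CloudBrain) -> dict:
    """Upload interaction video, get recognition back, form feedback data."""
    result = brain.recognize(video)        # stands in for upload + recognition
    return {"feedback": result["intent"]}  # interaction feedback data

print(interact_via_cloud(b"\x00" * 16, CloudBrain()))
```

Keeping recognition behind a single `recognize` call is the design point: the local device only captures and projects, while all heavy inference stays in the cloud.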
In life, it is very necessary to effectively preserve, and later restore and reproduce, cherished memories and warm scenes: beloved people, the entreaties of elders, the vows of lovers, or the keepsakes of departed relatives. Doing so can bring great emotional comfort to people.
While the present invention has been described in detail with reference to the embodiments illustrated in the accompanying drawings, it will be apparent to those skilled in the art that various changes and modifications can be made therein. Therefore, certain details of the embodiments are not to be interpreted as limiting, and the scope of the invention is to be determined by the appended claims.

Claims (10)

1. A method for three-dimensional recording and restoring a life course, comprising the following steps: performing video acquisition, image acquisition, and audio acquisition on a person to be recorded to obtain corresponding video data, image data, and audio data; performing three-dimensional modeling of the person to be recorded based on the video data and the image data to obtain a virtual human model; building a corresponding speech synthesis model using the audio data; and, after triggering, projecting the virtual human model using holographic technology and realizing interaction between the virtual human model and a real person using AI technology, wherein during the interaction the speech synthesis model synthesizes speech imitating the voice of the person to be recorded for voice interaction.

2. The method for three-dimensional recording and restoring a life course according to claim 1, further comprising: building an action behavior library of the person to be recorded based on the video data; and, when interacting with a real person, controlling the virtual human projected by the holographic technology to perform corresponding actions according to the action feedback parameters formed by the AI technology and the action behavior library.

3. The method for three-dimensional recording and restoring a life course according to claim 1, further comprising: collecting data of specific scenes of the person to be recorded; modeling the environment in the collected specific-scene data to obtain a virtual environment model; and, after a specific scene is triggered, projecting the virtual human model and the virtual environment model using holographic technology to present the corresponding specific scene.

4. The method for three-dimensional recording and restoring a life course according to claim 1, wherein action behaviors of the person to be recorded and the corresponding action conditions are extracted from the video data to form action intention data; the dialogue language of the person to be recorded and the corresponding dialogue conditions are extracted from the audio data to form dialogue intention data; model training is performed on the action intention data and the dialogue intention data using AI technology to obtain an intention model; and, when interacting with a real person, the intention model is used to output feedback information matching the habits of the person to be recorded, so as to complete the interaction with the real person.

5. The method for three-dimensional recording and restoring a life course according to claim 1, further comprising: building a cloud brain system; when interacting with a real person, performing video acquisition on the real person to obtain interaction video data; and uploading the interaction video data to the cloud brain system, which recognizes the interaction video data and forms interaction feedback data, so as to complete the interaction with the real person.

6. A system for three-dimensional recording and restoring a life course, comprising: an acquisition unit for performing video acquisition, image acquisition, and audio acquisition on a person to be recorded to obtain corresponding video data, image data, and audio data; a character modeling unit, connected to the acquisition unit, for performing three-dimensional modeling of the person to be recorded based on the video data and the image data to obtain a corresponding virtual human model; a speech modeling unit, connected to the acquisition unit, for building a corresponding speech synthesis model using the audio data; and a processing unit, connected to the acquisition unit, the character modeling unit, and the speech modeling unit, for projecting the virtual human model using holographic technology after receiving a trigger signal and realizing interaction between the virtual human model and a real person using AI technology, the processing unit further synthesizing, via the speech synthesis model, speech imitating the voice of the person to be recorded for voice interaction during the interaction.

7. The system for three-dimensional recording and restoring a life course according to claim 6, further comprising an action parameter extraction unit, connected to the acquisition unit and the processing unit, for building an action behavior library of the person to be recorded based on the video data; the processing unit being further configured to, when interacting with a real person, control the virtual human projected by the holographic technology to perform corresponding actions according to the action feedback parameters formed by the AI technology and the action behavior library.

8. The system for three-dimensional recording and restoring a life course according to claim 6, further comprising a scene recording unit, connected to the acquisition unit and the processing unit, wherein data of specific scenes of the person to be recorded is collected via the acquisition unit; the scene recording unit is configured to model the environment in the collected specific-scene data to obtain a virtual environment model; and the processing unit is configured to, after a specific scene is triggered, project the virtual human model and the virtual environment model using holographic technology to present the corresponding specific scene.

9. The system for three-dimensional recording and restoring a life course according to claim 6, further comprising an intention modeling unit, connected to the acquisition unit and the processing unit, for extracting from the video data the action behaviors of the person to be recorded and the corresponding action conditions to form action intention data, for extracting from the audio data the dialogue language of the person to be recorded and the corresponding dialogue conditions to form dialogue intention data, and for performing model training on the action intention data and the dialogue intention data using AI technology to obtain an intention model; the processing unit, when interacting with a real person, using the intention model to output feedback information matching the habits of the person to be recorded, so as to complete the interaction with the real person.

10. The system for three-dimensional recording and restoring a life course according to claim 6, further comprising a cloud brain system; the processing unit being connected to the cloud brain system and configured to, when interacting with a real person, perform video acquisition on the real person to obtain interaction video data, upload the interaction video data to the cloud brain system, and receive interaction feedback data from the cloud brain system, so as to complete the interaction with the real person.
CN202111392210.6A 2021-11-19 2021-11-19 System and method based on interaction between virtual human model and real person Active CN114067033B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111392210.6A CN114067033B (en) 2021-11-19 2021-11-19 System and method based on interaction between virtual human model and real person

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111392210.6A CN114067033B (en) 2021-11-19 2021-11-19 System and method based on interaction between virtual human model and real person

Publications (2)

Publication Number Publication Date
CN114067033A true CN114067033A (en) 2022-02-18
CN114067033B CN114067033B (en) 2025-02-28

Family

ID=80279238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111392210.6A Active CN114067033B (en) 2021-11-19 2021-11-19 System and method based on interaction between virtual human model and real person

Country Status (1)

Country Link
CN (1) CN114067033B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445563A (en) * 2022-03-15 2022-05-06 深圳市爱云信息科技有限公司 3D holographic image interaction method, device, system and medium
CN115081488A (en) * 2022-07-11 2022-09-20 西安财经大学 A scene control method based on holographic projection technology
CN115494941A (en) * 2022-08-22 2022-12-20 同济大学 Neural Network-based Metaverse Emotional Escort Virtual Human Realization Method and System
CN115826745A (en) * 2022-11-17 2023-03-21 中图云创智能科技(北京)有限公司 Image broadcasting method based on real person in meta-space virtual environment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107562195A (en) * 2017-08-17 2018-01-09 英华达(南京)科技有限公司 Man-machine interaction method and system
CN107765852A (en) * 2017-10-11 2018-03-06 北京光年无限科技有限公司 Multi-modal interaction processing method and system based on visual human
CN107797663A (en) * 2017-10-26 2018-03-13 北京光年无限科技有限公司 Multi-modal interaction processing method and system based on visual human
CN108804698A (en) * 2018-03-30 2018-11-13 深圳狗尾草智能科技有限公司 Man-machine interaction method, system, medium based on personage IP and equipment
CN110647636A (en) * 2019-09-05 2020-01-03 深圳追一科技有限公司 Interaction method, interaction device, terminal equipment and storage medium


Also Published As

Publication number Publication date
CN114067033B (en) 2025-02-28

Similar Documents

Publication Publication Date Title
JP6888096B2 (en) Robot, server and human-machine interaction methods
US12204513B2 (en) Artificial intelligence platform with improved conversational ability and personality development
CN114067033B (en) System and method based on interaction between virtual human model and real person
JP6902683B2 (en) Virtual robot interaction methods, devices, storage media and electronic devices
JP6019108B2 (en) Video generation based on text
JP2022534708A (en) A Multimodal Model for Dynamically Reacting Virtual Characters
JP2014519082A5 (en)
KR102701578B1 (en) Method and system for remembering the activities of patients with physical difficulties and memories of the deceased on the metaverse platform
CN117523088A (en) Personalized three-dimensional digital human holographic interaction forming system and method
CN117462130A (en) Mental health assessment method and system based on digital person
CN110427099A (en) Information recording method, device, system, electronic equipment and information acquisition method
CN116524791A (en) A Lip Language Learning Auxiliary Training System Based on Metaverse and Its Application
JP7130290B2 (en) information extractor
KR20210108565A (en) Virtual contents creation method
CN112508161A (en) Control method, system and storage medium for accompanying digital substitution
Bryer et al. Re‐animation: multimodal discourse around text
Gaspers et al. A multimodal corpus for the evaluation of computational models for (grounded) language acquisition
KR101900684B1 (en) Apparatus and method for virtual reality call
CN119942291B (en) Multi-source perception-based multi-dimensional pixel fusion method and system
CN119311242B (en) Human-computer interaction method and related device suitable for young children
TWI859084B (en) Pet reconstruction system based on virtual world and implementation method thereof
KR20090001681A (en) Contents / Service Scenario Development Chart Modeling Method for Intelligent Robots
Dhanushkodi et al. SPEECH DRIVEN 3D FACE ANIMATION.
JP2025049264A (en) system
JP2025049213A (en) system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant