
CN119228991A - Method, device, equipment and medium for three-dimensional reconstruction of human face based on artificial intelligence

Method, device, equipment and medium for three-dimensional reconstruction of human face based on artificial intelligence

Info

Publication number
CN119228991A
Authority
CN
China
Prior art keywords
dimensional
model
loss
face
reconstruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411157941.6A
Other languages
Chinese (zh)
Inventor
赵泽政
徐昊鑫
葛昊
李茜萌
陈远旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202411157941.6A
Publication of CN119228991A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776 Validation; Performance evaluation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Processing Or Creating Images (AREA)

Abstract


The present application belongs to the fields of artificial intelligence and financial technology, and relates to an artificial-intelligence-based method, device, computer equipment and storage medium for three-dimensional face reconstruction. The method includes: extracting a region of interest from two-dimensional face image data to obtain a face image, detecting key points in the face image to obtain two-dimensional key points, and performing face segmentation on the face image to obtain a face mask image; processing the face image with an initial reconstruction model to obtain a three-dimensional face model; generating a first loss based on the three-dimensional face model and the face image; generating a second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image; generating a comprehensive loss based on the first loss and the second loss; optimizing the initial reconstruction model based on the comprehensive loss to obtain a target reconstruction model; and performing three-dimensional reconstruction on the face image based on the target reconstruction model to obtain a target three-dimensional face model. The present application improves the accuracy of three-dimensional face reconstruction through the use of the target reconstruction model.

Description

Artificial intelligence-based face three-dimensional reconstruction method, device, equipment and medium
Technical Field
The application relates to the technical field of artificial intelligence development and the technical field of finance, in particular to a face three-dimensional reconstruction method, a device, computer equipment and a storage medium based on artificial intelligence.
Background
With the rapid development of fields such as human-computer interaction, film and television production, live video streaming and financial applications, three-dimensional face reconstruction technology, as a core component of these fields, has become increasingly important and promising in its applications. Traditional high-precision three-dimensional face model reconstruction methods rely on expensive three-dimensional scanning equipment and post-processing by professionals, which greatly limits the popularization and application of personal digital images.
At present, the mainstream approach is the model-free three-dimensional face reconstruction method, which attempts to directly compute feature points in a two-dimensional image through a neural network so as to regress three-dimensional face coordinates. Although this approach is very direct, it can lose facial features to a certain extent and cannot meet the requirements of high-precision, high-fidelity three-dimensional face reconstruction, resulting in lower reconstruction accuracy.
Disclosure of Invention
The embodiments of the present application aim to provide an artificial-intelligence-based method, device, computer equipment and storage medium for three-dimensional face reconstruction, so as to solve the technical problem that the existing model-free three-dimensional face reconstruction method loses facial features to a certain extent and cannot meet the requirements of high-precision and high-fidelity three-dimensional face reconstruction, resulting in lower accuracy of three-dimensional face reconstruction.
In order to solve the technical problems, the embodiment of the application provides a face three-dimensional reconstruction method based on artificial intelligence, which adopts the following technical scheme:
Acquiring pre-constructed two-dimensional face image data;
Extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points of the face image to obtain corresponding two-dimensional key points, and carrying out face segmentation on the face image to obtain a corresponding face mask image;
information extraction is carried out on the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and the model parameters are processed based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model;
generating a corresponding first loss based on the three-dimensional face model and the face image;
Generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image;
generating a composite loss based on the first loss and the second loss;
optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model;
and carrying out three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model.
Further, the step of generating a corresponding first loss based on the three-dimensional face model and the face image specifically includes:
Acquiring a three-dimensional scanning face model corresponding to the face image;
Acquiring first vertex information of the three-dimensional scanning face model;
Acquiring second vertex information of the three-dimensional face model;
calculating a first Euclidean distance between the first vertex information and the second vertex information;
and taking the first Euclidean distance as the first loss.
Further, the step of generating the corresponding second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image specifically includes:
Performing projection processing on the three-dimensional face model to obtain a corresponding target two-dimensional key point;
Rendering the three-dimensional face model to obtain a corresponding two-dimensional face image;
Calculating a second Euclidean distance between the target two-dimensional key point and the two-dimensional key point, and taking the second Euclidean distance as a key point loss;
Calculating pixel loss between the two-dimensional face image and the face mask image;
Calculating the similarity loss between the two-dimensional face image and the face mask image;
The second loss is generated based on the keypoint loss, the pixel loss, and the similarity loss.
Further, the step of generating a comprehensive loss based on the first loss and the second loss specifically includes:
Acquiring a first weight, a second weight, a third weight and a fourth weight which respectively correspond to the first loss, the key point loss, the pixel loss and the similarity loss;
acquiring a preset loss calculation formula;
Calculating the first loss, the key point loss, the pixel loss, the similarity loss, the first weight, the second weight, the third weight and the fourth weight based on the loss calculation formula to obtain a corresponding calculation result;
And taking the calculation result as the comprehensive loss.
Further, the step of obtaining the pre-constructed two-dimensional face image data specifically includes:
acquiring initial two-dimensional face image data acquired in advance;
performing data cleaning processing on the initial two-dimensional face image data to obtain corresponding first face image data;
performing data clipping processing on the first face image data to obtain corresponding second face image data;
normalizing the second face image data to obtain corresponding third face image data;
and taking the third face image data as the two-dimensional face image data.
Further, after the step of optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model, the method further includes:
constructing verification data based on the face image data;
performing performance verification on the target reconstruction model based on the verification data to obtain performance index data of the target reconstruction model;
Performing data analysis on the performance index data to generate a performance evaluation result corresponding to the target reconstruction model;
and carrying out corresponding model adjustment processing on the target reconstruction model based on the performance evaluation result.
Further, after the step of performing three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model, the method further includes:
performing smoothing treatment on the target three-dimensional face model to obtain a corresponding first target three-dimensional face model;
Performing texture mapping processing on the first target three-dimensional face model to obtain a corresponding second target three-dimensional face model;
And storing the second target three-dimensional face model.
In order to solve the technical problems, the embodiment of the application also provides a face three-dimensional reconstruction device based on artificial intelligence, which adopts the following technical scheme:
the acquisition module is used for acquiring the pre-constructed two-dimensional face image data;
The first processing module is used for extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points of the face image to obtain corresponding two-dimensional key points, and carrying out face segmentation on the face image to obtain a corresponding face mask image;
the second processing module is used for extracting information from the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and processing the model parameters based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model;
the first generation module is used for generating a corresponding first loss based on the three-dimensional face model and the face image;
The second generation module is used for generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image;
a third generation module for generating a composite loss based on the first loss and the second loss;
The optimization module is used for carrying out optimization treatment on the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model;
and the reconstruction module is used for carrying out three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model.
In order to solve the above technical problems, the embodiment of the present application further provides a computer device, which adopts the following technical schemes:
Acquiring pre-constructed two-dimensional face image data;
Extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points of the face image to obtain corresponding two-dimensional key points, and carrying out face segmentation on the face image to obtain a corresponding face mask image;
information extraction is carried out on the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and the model parameters are processed based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model;
generating a corresponding first loss based on the three-dimensional face model and the face image;
Generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image;
generating a composite loss based on the first loss and the second loss;
optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model;
and carrying out three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model.
In order to solve the above technical problems, an embodiment of the present application further provides a computer readable storage medium, which adopts the following technical schemes:
Acquiring pre-constructed two-dimensional face image data;
Extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points of the face image to obtain corresponding two-dimensional key points, and carrying out face segmentation on the face image to obtain a corresponding face mask image;
information extraction is carried out on the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and the model parameters are processed based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model;
generating a corresponding first loss based on the three-dimensional face model and the face image;
Generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image;
generating a composite loss based on the first loss and the second loss;
optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model;
and carrying out three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
The method comprises the steps of: firstly acquiring pre-constructed two-dimensional face image data; then extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points in the face image to obtain corresponding two-dimensional key points, and performing face segmentation on the face image to obtain a corresponding face mask image; then extracting information from the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and processing the model parameters based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model; subsequently generating a corresponding first loss based on the three-dimensional face model and the face image, and generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image; further generating a comprehensive loss based on the first loss and the second loss, and optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model; and finally performing three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model. According to the application, the target reconstruction model is constructed through mixed supervision training that combines the first loss at the two-dimensional level and the second loss at the three-dimensional level, so that the target reconstruction model can be better guided to learn, and its reconstruction accuracy is improved.
Drawings
In order to more clearly illustrate the solution of the present application, a brief description will be given below of the drawings required for the description of the embodiments of the present application, it being apparent that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without the exercise of inventive effort for a person of ordinary skill in the art.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of an artificial intelligence based face three-dimensional reconstruction method in accordance with the present application;
FIG. 3 is a schematic structural view of one embodiment of an artificial intelligence based face three-dimensional reconstruction device according to the present application;
FIG. 4 is a schematic structural diagram of one embodiment of a computer device in accordance with the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs, the terms used in the description herein are used for the purpose of describing particular embodiments only and are not intended to limit the application, and the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the above description of the drawings are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to make the person skilled in the art better understand the solution of the present application, the technical solution of the embodiment of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the three-dimensional face reconstruction method based on artificial intelligence provided by the embodiment of the application is generally executed by a server/terminal device, and correspondingly, the three-dimensional face reconstruction device based on artificial intelligence is generally arranged in the server/terminal device.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow chart of one embodiment of an artificial intelligence based face three-dimensional reconstruction method in accordance with the present application is shown. The order of the steps in the flowchart may be changed and some steps may be omitted according to various needs. The artificial-intelligence-based face three-dimensional reconstruction method provided by the embodiment of the application can be applied to any scenario requiring three-dimensional face reconstruction, and to products in such scenarios, for example three-dimensional face reconstruction processing in financial applications. The method comprises the following steps:
step S201, two-dimensional face image data constructed in advance is acquired.
In this embodiment, the electronic device (for example, the server/terminal device shown in FIG. 1) on which the artificial-intelligence-based face three-dimensional reconstruction method operates may acquire two-dimensional face image data through a wired connection or a wireless connection. It should be noted that the wireless connection may include, but is not limited to, 3G/4G/5G connections, Wi-Fi connections, Bluetooth connections, WiMAX connections, ZigBee connections, UWB (Ultra Wideband) connections, and other now known or later developed wireless connection means. The specific implementation process of acquiring the pre-constructed two-dimensional face image data will be described in further detail in the following specific embodiments and is not repeated here.
Step S202, extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points from the face image to obtain corresponding two-dimensional key points, and carrying out face segmentation on the face image to obtain a corresponding face mask image.
In this embodiment, the face region of interest is extracted from the two-dimensional face image data so as to detect the face position in the two-dimensional face image data and obtain a corresponding face image. In addition, face key point detection is performed on the face image to obtain the target number of two-dimensional key points. The target number is not particularly limited; for example, 68 may be selected. The resulting two-dimensional key points are used to construct the key point loss of the 2D part. Furthermore, face segmentation is performed on the face image to obtain a corresponding face mask image, which does not contain the ears and neck. The obtained face mask image is used to subsequently construct the pixel loss and similarity loss of the 2D part.
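For illustration only (this sketch is not part of the patent text), the preprocessing step can be approximated with off-the-shelf tools. Here dlib's 68-point landmark model file is an assumed external asset, and a convex hull over the landmarks stands in for the patent's unspecified face-segmentation network:

```python
import cv2
import dlib
import numpy as np

PREDICTOR_PATH = "shape_predictor_68_face_landmarks.dat"  # assumed model file

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(PREDICTOR_PATH)

def preprocess(image_bgr: np.ndarray):
    """Return (face crop, 68 2D keypoints, face mask) for the first detected face."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None
    box = faces[0]  # face region of interest
    top, left = max(box.top(), 0), max(box.left(), 0)
    crop = image_bgr[top:box.bottom(), left:box.right()]

    shape = predictor(gray, box)
    keypoints = np.array([[p.x, p.y] for p in shape.parts()], dtype=np.float32)

    # Crude stand-in for a segmentation model: fill the landmark hull.
    # A real system would use a dedicated face-parsing network that, as the
    # text notes, excludes the ears and neck.
    mask = np.zeros(gray.shape, dtype=np.uint8)
    cv2.fillConvexPoly(mask, cv2.convexHull(keypoints.astype(np.int32)), 255)
    return crop, keypoints, mask
```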
And step S203, carrying out information extraction on the face image based on a convolution neural network in a preset initial reconstruction model to obtain corresponding model parameters, and processing the model parameters based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model.
In this embodiment, the initial reconstruction model at least includes a convolutional neural network and a processing network. The input of the convolutional neural network is the face image extracted from the face region of interest; feature information is extracted from the face image by the convolutional neural network, and the corresponding model parameters are output, where the model parameters include deformation parameters for controlling the deformation of a three-dimensional deformable model (3D Morphable Model, 3DMM), camera parameters required for subsequent rendering, and texture parameters required for texture generation. The input of the processing network is the model parameters output by the convolutional neural network, from which a three-dimensional face model corresponding to the face image is generated. The generated three-dimensional face model is used to calculate the Euclidean distance between its vertices and those of the three-dimensional scanned face model obtained by scanning the person in the input face image, so as to obtain the loss of the 3D part.
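A minimal PyTorch sketch of such an initial reconstruction model follows. The backbone depth, the parameter dimensions (80/64/27/80) and the linear 3DMM deformation are illustrative assumptions; the patent fixes none of them:

```python
import torch
import torch.nn as nn

class ParamRegressor(nn.Module):
    """CNN that regresses 3DMM deformation, camera and texture parameters."""
    def __init__(self, n_id=80, n_exp=64, n_cam=27, n_tex=80):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, n_id + n_exp + n_cam + n_tex)
        self.dims = (n_id, n_exp, n_cam, n_tex)

    def forward(self, img):
        # Returns (identity coeffs, expression coeffs, camera params, texture params).
        return torch.split(self.head(self.backbone(img)), self.dims, dim=-1)

def linear_3dmm(mean_shape, id_basis, exp_basis, alpha, beta):
    """Processing-network core: deform the mean face with predicted coefficients.
    mean_shape: (3V,); bases: (3V, n); alpha/beta: (B, n); returns (B, V, 3)."""
    verts = mean_shape + alpha @ id_basis.T + beta @ exp_basis.T
    return verts.view(alpha.shape[0], -1, 3)
```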
Step S204, generating a corresponding first loss based on the three-dimensional face model and the face image.
In this embodiment, the foregoing specific implementation process of generating the corresponding first loss based on the three-dimensional face model and the face image will be described in further detail in the following specific embodiments, which will not be described herein.
Step S205, generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image.
In this embodiment, the foregoing specific implementation process of generating the corresponding second loss based on the three-dimensional face model, the two-dimensional key points, and the face mask image will be described in further detail in the following specific embodiments, which will not be described herein.
Step S206, generating a comprehensive loss based on the first loss and the second loss.
In this embodiment, the foregoing implementation process of generating the integrated loss based on the first loss and the second loss will be described in further detail in the following embodiments, which will not be described herein.
And step S207, optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model.
In this embodiment, the training process of the initial reconstruction model includes forward propagation: the face image is input into the initial reconstruction model, and the predicted three-dimensional face model parameters are calculated through forward propagation of the initial reconstruction model; a three-dimensional face model is then generated from the predicted parameters, and the loss function values at the 2D level and the 3D level are calculated. Back propagation and optimization include: calculating the gradient of the loss function with respect to the model parameters using the back-propagation algorithm, and updating the model parameters according to the gradient information so as to minimize the loss function value. The forward- and back-propagation processes are repeated until a preset number of training rounds is reached or another stop condition is met, thereby completing the training and optimization of the initial reconstruction model and obtaining the target reconstruction model.
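The forward/backward cycle described above might look as follows. `loader`, `num_epochs`, `total_loss` and the 3DMM tensors are assumed to exist (the individual loss terms are sketched later in this section):

```python
model = ParamRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(num_epochs):
    for batch in loader:  # face images plus their supervision targets
        alpha, beta, cam, tex = model(batch["image"])              # forward pass
        verts = linear_3dmm(mean_shape, id_basis, exp_basis, alpha, beta)
        loss = total_loss(verts, cam, tex, batch)                  # 2D- and 3D-level losses
        optimizer.zero_grad()
        loss.backward()                                            # gradients via backpropagation
        optimizer.step()                                           # update model parameters
```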
And step S208, carrying out three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model.
In this embodiment, the face image to be processed is a two-dimensional image. And inputting the face image to be processed into the target reconstruction model, performing three-dimensional reconstruction processing on the face image to be processed through the target reconstruction model, and outputting a target three-dimensional face model corresponding to the face image to be processed.
The method comprises the steps of: firstly acquiring pre-constructed two-dimensional face image data; then extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points in the face image to obtain corresponding two-dimensional key points, and performing face segmentation on the face image to obtain a corresponding face mask image; then extracting information from the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and processing the model parameters based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model; subsequently generating a corresponding first loss based on the three-dimensional face model and the face image, and generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points and the face mask image; further generating a comprehensive loss based on the first loss and the second loss, and optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model; and finally performing three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model. According to the application, the target reconstruction model is constructed through mixed supervision training that combines the first loss at the two-dimensional level and the second loss at the three-dimensional level, so that the target reconstruction model can be better guided to learn, and its reconstruction accuracy is improved.
In some alternative implementations, step S204 includes the steps of:
and acquiring a three-dimensional scanning face model corresponding to the face image.
In this embodiment, the three-dimensional scanned face model is three-dimensional scanned data obtained after performing a real scanning process on the face image.
And acquiring first vertex information of the three-dimensional scanning face model.
In this embodiment, vertex labeling processing is performed on the three-dimensional scanned face model to extract first vertex information of the three-dimensional scanned face model.
And obtaining second vertex information of the three-dimensional face model.
In this embodiment, vertex labeling processing is performed on the three-dimensional face model to extract second vertex information of the three-dimensional face model.
A first euclidean distance between the first vertex information and the second vertex information is calculated.
In this embodiment, Euclidean distance is a common concept in mathematics and computer science used to measure the true distance between two points in an m-dimensional space. This definition is based on the "ordinary" (i.e. straight-line) distance between two points in Euclidean space; in two-dimensional and three-dimensional space, the Euclidean distance is simply the distance between the two points. The first Euclidean distance between the first vertex information and the second vertex information may be calculated using the Euclidean distance formula.
And taking the first Euclidean distance as the first loss.
In this embodiment, the euclidean distance error between the vertices is calculated by calculating the three-dimensional face model predicted by the initial reconstruction model and the three-dimensional scanned face model obtained by real scanning on the 3D layer, and is used as the first loss, and the first loss can be used to monitor the appearance of the initial reconstruction model later.
The method comprises the steps of obtaining a three-dimensional scanning face model corresponding to the face image, obtaining first vertex information of the three-dimensional scanning face model, obtaining second vertex information of the three-dimensional face model, and subsequently calculating a first Euclidean distance between the first vertex information and the second vertex information, wherein the first Euclidean distance is used as the first loss. According to the method and the device for generating the first loss, the first Euclidean distance between the first vertex information of the three-dimensional scanning face model corresponding to the face image and the second vertex information of the three-dimensional face model is calculated, so that the first loss corresponding to the initial reconstruction model is generated rapidly and conveniently, and the generation efficiency of the obtained first loss is improved.
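Assuming the predicted mesh and the scanned ground truth share a vertex topology and are stored as (B, V, 3) tensors, the first loss reduces to a one-function sketch; averaging rather than summing over vertices is a free normalization choice, not fixed by the patent:

```python
def first_loss(pred_verts: torch.Tensor, scan_verts: torch.Tensor) -> torch.Tensor:
    """Mean per-vertex Euclidean distance between predicted and scanned meshes."""
    return torch.linalg.norm(pred_verts - scan_verts, dim=-1).mean()
```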
In some alternative implementations of the present embodiment, step S205 includes the steps of:
and carrying out projection processing on the three-dimensional face model to obtain a corresponding target two-dimensional key point.
In this embodiment, the three-dimensional face model may be projected onto a two-dimensional plane and rendered to generate a corresponding 2D picture. During projection, factors such as illumination, viewing angle and camera parameters are taken into account to ensure that the rendered 2D picture is as close to the real situation as possible. The target number of key points, namely the target two-dimensional key points, are then predicted on the rendered 2D picture using deep learning or other algorithms. These key points typically include the locations of facial feature points such as the eyes, nose and mouth, which are critical for tasks such as face recognition and expression analysis. The target number is not particularly limited; for example, 68 may be selected.
And rendering the three-dimensional face model to obtain a corresponding two-dimensional face image.
In this embodiment, the three-dimensional face model is projected onto a two-dimensional plane and rendered to generate a corresponding 2D picture, and the 2D picture is used as the two-dimensional face image.
And calculating a second Euclidean distance between the target two-dimensional key point and the two-dimensional key point, and taking the second Euclidean distance as a key point loss.
In this embodiment, the second euclidean distance between the target two-dimensional keypoint and the two-dimensional keypoint may be calculated by using a euclidean distance formula.
And calculating pixel loss between the two-dimensional face image and the face mask image.
In this embodiment, the pixel loss may be obtained by calculating the pixel difference (color information error at the pixel level) between the rendered two-dimensional face image and the real image, that is, the face mask image.
And calculating the similarity loss between the two-dimensional face image and the face mask image.
In this embodiment, the similarity between the rendered two-dimensional face image and the real image, that is, the face mask image, may be calculated by using a face recognition algorithm, so as to obtain the similarity loss.
The second loss is generated based on the keypoint loss, the pixel loss, and the similarity loss.
In this embodiment, the key point loss, the pixel loss and the similarity loss are integrated to obtain integrated data, and the integrated data is used as the second loss. That is, on the 2D level, the key point loss between the target two-dimensional key points and the two-dimensional key points, the pixel loss between the two-dimensional face image and the face mask image, and the similarity loss between the two-dimensional face image and the face mask image are calculated so as to construct the second loss, which can be used to supervise the similarity of the initial reconstruction model to the original 2D image.
The method comprises the steps of obtaining a corresponding target two-dimensional key point through projection processing of the three-dimensional face model, obtaining a corresponding two-dimensional face image through rendering processing of the three-dimensional face model, then calculating a second Euclidean distance between the target two-dimensional key point and the two-dimensional key point, taking the second Euclidean distance as key point loss, calculating pixel loss between the two-dimensional face image and the face mask image, calculating similarity loss between the two-dimensional face image and the face mask image, and generating the second loss based on the key point loss, the pixel loss and the similarity loss. According to the method, the corresponding key point loss, pixel loss and similarity loss are obtained through calculation based on the three-dimensional face model, the two-dimensional key points and the face mask image, so that the second loss corresponding to the initial reconstruction model is quickly and conveniently generated based on the key point loss, the pixel loss and the similarity loss, and the generation efficiency of the obtained second loss is improved.
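The three 2D-level components might be sketched as below. The weak-perspective camera model and the pretrained `face_net` embedding network are assumptions the patent leaves open:

```python
import torch

def project(verts, scale, R, t2d):
    """Weak-perspective projection: (B, V, 3) vertices -> (B, V, 2) points.
    R is a (3, 3) rotation; scale and t2d come from the camera parameters."""
    cam = torch.einsum("ij,bvj->bvi", R, verts)   # rotate into the camera frame
    return scale * cam[..., :2] + t2d             # scale and translate in-plane

def keypoint_loss(proj_kpts, gt_kpts):
    # Second Euclidean distance between projected and detected 2D keypoints.
    return torch.linalg.norm(proj_kpts - gt_kpts, dim=-1).mean()

def pixel_loss(rendered, target, mask):
    # Pixel-level colour error, restricted to the face-mask region.
    return (mask * (rendered - target).abs()).sum() / mask.sum().clamp(min=1)

def similarity_loss(rendered, target, face_net):
    # 1 - cosine similarity between identity embeddings produced by an
    # assumed pretrained face-recognition network.
    emb_r, emb_t = face_net(rendered), face_net(target)
    return 1.0 - torch.cosine_similarity(emb_r, emb_t, dim=-1).mean()
```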
In some alternative implementations, step S206 includes the steps of:
And acquiring a first weight, a second weight, a third weight and a fourth weight which respectively correspond to the first loss, the key point loss, the pixel loss and the similarity loss.
In this embodiment, the numerical selection of the first weight, the second weight, the third weight, and the fourth weight is not specifically limited, and may be set according to the actual service usage requirement. In addition, a weight generation algorithm may be further used to generate a first weight, a second weight, a third weight, and a fourth weight corresponding to the first loss, the keypoint loss, the pixel loss, and the similarity loss, respectively, so as to improve accuracy of the generated weight value.
And acquiring a preset loss calculation formula.
In this embodiment, the preset loss calculation formula is specifically S = a×H + b×I + c×J + d×K, where H is the first loss and a the first weight, I is the key point loss and b the second weight, J is the pixel loss and c the third weight, and K is the similarity loss and d the fourth weight.
And calculating the first loss, the key point loss, the pixel loss, the similarity loss, the first weight, the second weight, the third weight and the fourth weight based on the loss calculation formula to obtain a corresponding calculation result.
In this embodiment, the first loss, the keypoint loss, the pixel loss, the similarity loss, the first weight, the second weight, the third weight, and the fourth weight are substituted into corresponding positions in the above-mentioned loss calculation formula to perform calculation processing, so as to obtain a corresponding calculation result.
And taking the calculation result as the comprehensive loss.
According to the method, based on the obtained first weight, second weight, third weight and fourth weight which correspond to the first loss, the key point loss, the pixel loss and the similarity loss respectively, the first loss, the key point loss, the pixel loss and the similarity loss are calculated by using a preset loss calculation formula, so that corresponding comprehensive losses are generated rapidly and accurately, the generation efficiency of the comprehensive losses is improved, and the accuracy of the generated comprehensive losses is guaranteed.
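In code the composite loss is a direct transcription of the formula; the unit default weights below are placeholders to be set per deployment or produced by a weight-generation algorithm, as the text notes:

```python
def composite_loss(H, I, J, K, a=1.0, b=1.0, c=1.0, d=1.0):
    """S = a*H + b*I + c*J + d*K over the first, keypoint, pixel and similarity losses."""
    return a * H + b * I + c * J + d * K
```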
In some alternative implementations, step S201 includes the steps of:
and acquiring initial two-dimensional face image data acquired in advance.
In this embodiment, the initial two-dimensional face image data is two-dimensional face image data for different scenes acquired in advance.
And performing data cleaning processing on the initial two-dimensional face image data to obtain corresponding first face image data.
In this embodiment, the above data cleaning refers to removing outliers, repeated items, or low-quality images in the initial two-dimensional face image data.
And performing data clipping processing on the first face image data to obtain corresponding second face image data.
In this embodiment, the data clipping processing refers to clipping the first face image data with a preset size to obtain second face image data meeting the input size requirement of the convolutional neural network in the initial reconstruction model. The numerical selection of the preset size can be set according to actual model construction requirements.
And carrying out normalization processing on the second face image data to obtain corresponding third face image data.
In this embodiment, the normalization process is to normalize the pixel value of the second face image data, and scale it to a range of [0,1] or [ -1,1] in general.
And taking the third face image data as the two-dimensional face image data.
According to the application, the pre-acquired initial two-dimensional face image data is acquired, and then the data cleaning treatment, the data clipping treatment and the normalization treatment are carried out on the initial two-dimensional face image data, so that the pre-processing of the initial two-dimensional face image data is rapidly and intelligently completed, the corresponding third face image data is obtained, the generation efficiency of the third face image data is effectively improved, and the data normalization of the obtained third face image data is improved.
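One possible shape for this preprocessing pipeline is sketched below; the 224x224 input size, the byte-prefix duplicate filter and the [-1, 1] range are illustrative choices the patent leaves unspecified:

```python
import cv2
import numpy as np

def build_dataset(paths, size=224):
    """Clean, crop/resize and normalise a list of image files."""
    out, seen = [], set()
    for p in paths:
        img = cv2.imread(p)
        if img is None:                      # cleaning: drop unreadable files
            continue
        key = img.tobytes()[:4096]           # cleaning: crude duplicate filter
        if key in seen:
            continue
        seen.add(key)
        img = cv2.resize(img, (size, size))  # clipping/resizing to the CNN input size
        img = img.astype(np.float32) / 127.5 - 1.0  # normalise to [-1, 1]
        out.append(img)
    return np.stack(out) if out else np.empty((0, size, size, 3), np.float32)
```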
In some optional implementations of this embodiment, after step S207, the electronic device may further perform the following steps:
And constructing verification data based on the face image data.
In this embodiment, a specified proportion of the face image data may be randomly selected and used as the verification data. The value of the specified proportion is not particularly limited and may be determined according to actual use requirements; for example, it may be set to 0.3.
And performing performance verification on the target reconstruction model based on the verification data to obtain performance index data of the target reconstruction model.
In this embodiment, the verification data are respectively input into the target reconstruction model, so that the three-dimensional reconstruction processing is performed on the verification data through the target reconstruction model, and then the performance verification processing is performed on the target reconstruction model according to the reconstruction result output by the target reconstruction model, that is, the performance evaluation is performed on the target reconstruction model to obtain the performance index data of the target reconstruction model. The performance index data may include calculated accuracy values of the target reconstruction model for evaluating the performance of the model, such as distance errors between vertices, pixel differences, and facial similarity.
And carrying out data analysis on the performance index data to generate a performance evaluation result corresponding to the target reconstruction model.
In this embodiment, the performance index data specifically include the distance error between vertices, the pixel difference, and the face similarity. The calculated distance error between vertices is compared with a preset error threshold, the pixel difference with a preset difference threshold, and the face similarity with a preset similarity threshold. If the distance error between vertices is smaller than the corresponding error threshold, the pixel difference is smaller than the corresponding difference threshold, and the face similarity is larger than the corresponding similarity threshold, it is judged that the target reconstruction model passes the performance verification, and a first performance evaluation result indicating that the target reconstruction model passes the performance verification is generated. If any one of the performance index data fails to meet its corresponding threshold requirement, it is judged that the target reconstruction model does not pass the performance verification, and a second performance evaluation result indicating that the target reconstruction model does not pass the performance verification is generated. The values of the error threshold, the difference threshold and the similarity threshold are not particularly limited and may be set according to actual model verification requirements.
And carrying out corresponding model adjustment processing on the target reconstruction model based on the performance evaluation result.
In this embodiment, if the content of the performance evaluation result is that the target reconstruction model passes the performance verification, the target reconstruction model is determined to be a model meeting the required performance requirement, and then no adjustment is required for the target reconstruction model. In addition, if the content of the performance evaluation result is that the target reconstruction model fails performance verification, a preset model improvement strategy can be adopted to carry out improvement treatment on the target reconstruction model. The model improvement strategy can comprise strategies such as adjusting model parameters or training parameters, replacing a layer structure of a model, modifying a model framework, and enhancing data strategies.
The method comprises the steps of constructing verification data based on the face image data, performing performance verification on the target reconstruction model based on the verification data to obtain performance index data of the target reconstruction model, performing data analysis on the performance index data to generate a performance evaluation result corresponding to the target reconstruction model, and performing corresponding model adjustment processing on the target reconstruction model based on the performance evaluation result. After the construction of the target reconstruction model is completed, the performance of the target reconstruction model is intelligently verified by using verification data constructed based on the face image data, and the target reconstruction model is subjected to corresponding model adjustment processing according to the performance evaluation result after the analysis of the performance index data of the obtained target reconstruction model, so that the finally obtained target reconstruction model can meet the corresponding performance requirement expectation, the processing accuracy of the three-dimensional reconstruction processing of the input face image to be processed by using the target reconstruction model is ensured, and the accuracy of the generated target three-dimensional face model corresponding to the face image to be processed is improved.
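A threshold-check sketch consistent with the evaluation logic above; all three threshold values are illustrative assumptions:

```python
def verify(vertex_error: float, pixel_diff: float, similarity: float,
           thr_err: float = 2.0, thr_diff: float = 0.05, thr_sim: float = 0.8) -> str:
    """Pass only if both error metrics fall below their thresholds and the
    face similarity exceeds its threshold; otherwise flag for adjustment."""
    if vertex_error < thr_err and pixel_diff < thr_diff and similarity > thr_sim:
        return "pass"
    return "fail: apply a model improvement strategy"
```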
In some optional implementations of this embodiment, after step S208, the electronic device may further perform the following steps:
and carrying out smoothing treatment on the target three-dimensional face model to obtain a corresponding first target three-dimensional face model.
In this embodiment, the smoothing process is performed on the target three-dimensional face model to eliminate noise and unnecessary details in the target three-dimensional face model, so as to obtain a corresponding first target three-dimensional face model.
And performing texture mapping processing on the first target three-dimensional face model to obtain a corresponding second target three-dimensional face model.
In this embodiment, the texture mapping process refers to mapping texture information on a two-dimensional face image to be processed onto the first target three-dimensional face model, so as to improve the reality of the first target three-dimensional face model, and obtain a corresponding second target three-dimensional face model.
And storing the second target three-dimensional face model.
In this embodiment, the storage manner of the second target three-dimensional face model is not specifically limited, and for example, a storage manner such as blockchain storage, local database storage, cloud server storage, etc. may be used.
The method comprises the steps of carrying out smoothing treatment on the target three-dimensional face model to obtain a corresponding first target three-dimensional face model, carrying out texture mapping treatment on the first target three-dimensional face model to obtain a corresponding second target three-dimensional face model, and subsequently storing the second target three-dimensional face model. According to the application, after the input face image to be processed is subjected to three-dimensional reconstruction processing based on the target reconstruction model to obtain the target three-dimensional face model, the target three-dimensional face model is further subjected to smoothing processing and texture mapping processing to obtain the second target three-dimensional face model, so that the optimization processing of the target three-dimensional face model is realized, and the authenticity of the second target three-dimensional face model is improved. In addition, the second target three-dimensional face model is stored, so that the privacy and the safety of the second target three-dimensional face model can be ensured.
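As one concrete (assumed) choice for the smoothing step, simple Laplacian averaging over mesh neighbours; the patent does not prescribe a particular algorithm:

```python
import numpy as np

def laplacian_smooth(verts: np.ndarray, faces: np.ndarray,
                     lam: float = 0.5, iters: int = 5) -> np.ndarray:
    """Pull each vertex toward the mean of its neighbours to suppress noise."""
    neighbors = [set() for _ in range(len(verts))]
    for a, b, c in faces:  # faces: (F, 3) vertex-index triangles
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))
    v = verts.astype(np.float64).copy()
    for _ in range(iters):
        avg = np.stack([v[list(nb)].mean(axis=0) if nb else v[i]
                        for i, nb in enumerate(neighbors)])
        v += lam * (avg - v)
    return v
```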
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
It is emphasized that, to further ensure the privacy and security of the target reconstruction model, the target reconstruction model may also be stored in a blockchain node.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The blockchain (Blockchain), essentially a decentralized database, is a string of data blocks that are generated in association using cryptographic methods, each of which contains information from a batch of network transactions for verifying the validity (anti-counterfeit) of its information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The embodiments of the application may acquire and process related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
Those skilled in the art will appreciate that all or part of the processes of the methods described above may be implemented by computer readable instructions stored in a computer readable storage medium; when executed, the instructions may include the flows of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in an order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include a plurality of sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments, and their execution order is not necessarily sequential; they may be performed in turn or alternately with other steps, or with at least a portion of the sub-steps or stages of other steps.
With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a face three-dimensional reconstruction apparatus based on artificial intelligence, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 3, the artificial intelligence-based face three-dimensional reconstruction device 300 according to the present embodiment includes an acquisition module 301, a first processing module 302, a second processing module 303, a first generating module 304, a second generating module 305, a third generating module 306, an optimizing module 307, and a reconstruction module 308. Wherein:
An acquisition module 301, configured to acquire two-dimensional face image data constructed in advance;
The first processing module 302 is configured to extract a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detect a face key point from the face image to obtain a corresponding two-dimensional key point, and perform face segmentation on the face image to obtain a corresponding face mask image;
The second processing module 303 is configured to extract information from the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and process the model parameters based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model;
a first generation module 304, configured to generate a corresponding first loss based on the three-dimensional face model and the face image;
A second generation module 305, configured to generate a corresponding second loss based on the three-dimensional face model, the two-dimensional key points, and the face mask image;
a third generation module 306 for generating a composite loss based on the first loss and the second loss;
an optimization module 307, configured to perform optimization processing on the initial reconstruction model based on the comprehensive loss, so as to obtain a corresponding target reconstruction model;
the reconstruction module 308 is configured to perform three-dimensional reconstruction processing on the input face image to be processed based on the target reconstruction model, so as to obtain a corresponding target three-dimensional face model.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the artificial intelligence-based face three-dimensional reconstruction method in the foregoing embodiment, and are not described herein again.
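To make the division of labor inside the second processing module concrete, the following sketch (assuming PyTorch) pairs a small convolutional encoder (the information-extraction network) with a linear morphable-model layer standing in for the processing network. The layer sizes, the coefficient count, and the 3DMM-style formulation shape = mean + basis · coefficients are illustrative assumptions, not details fixed by the embodiment.

```python
import torch
import torch.nn as nn

class FaceReconSketch(nn.Module):
    """Toy pairing of the two networks in the second processing module:
    a CNN that extracts model parameters from the face image, and a
    linear morphable-model layer standing in for the processing network."""

    def __init__(self, n_vertices: int = 1000, n_coeffs: int = 80):
        super().__init__()
        # Information extraction: face image -> model parameters.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, n_coeffs),
        )
        # Processing network: coefficients -> vertex positions, here a
        # 3DMM-style linear model around a learnable mean face.
        self.n_vertices = n_vertices
        self.mean_shape = nn.Parameter(torch.zeros(n_vertices * 3))
        self.basis = nn.Parameter(0.01 * torch.randn(n_vertices * 3, n_coeffs))

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        coeffs = self.encoder(image)                     # (B, n_coeffs)
        verts = self.mean_shape + coeffs @ self.basis.T  # (B, 3 * V)
        return verts.view(-1, self.n_vertices, 3)        # (B, V, 3)
```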
In some alternative implementations of the present embodiment, the first generating module 304 includes:
the first acquisition sub-module is used for acquiring a three-dimensional scanning face model corresponding to the face image;
The second acquisition sub-module is used for acquiring first vertex information of the three-dimensional scanning face model;
the third acquisition sub-module is used for acquiring second vertex information of the three-dimensional face model;
a first calculation sub-module for calculating a first euclidean distance between the first vertex information and the second vertex information;
and the first determining submodule is used for taking the first Euclidean distance as the first loss.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the artificial intelligence-based face three-dimensional reconstruction method in the foregoing embodiment, and are not described herein again.
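Under the assumption that the scanned and predicted meshes share a common vertex order, the first loss reduces to a per-vertex Euclidean distance, sketched below with PyTorch; the mean reduction is an assumption, since the embodiment does not specify how per-vertex distances are aggregated.

```python
import torch

def first_loss(pred_vertices: torch.Tensor,
               scan_vertices: torch.Tensor) -> torch.Tensor:
    # Per-vertex Euclidean distance between the predicted mesh and the
    # 3D-scanned mesh, averaged over vertices (the reduction is assumed).
    return torch.linalg.vector_norm(pred_vertices - scan_vertices, dim=-1).mean()
```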
In some alternative implementations of the present embodiment, the second generating module 305 includes:
the projection sub-module is used for carrying out projection processing on the three-dimensional face model to obtain a corresponding target two-dimensional key point;
The rendering sub-module is used for rendering the three-dimensional face model to obtain a corresponding two-dimensional face image;
The second calculation sub-module is used for calculating a second Euclidean distance between the target two-dimensional key point and the two-dimensional key point, and taking the second Euclidean distance as a key point loss;
a third calculation sub-module, configured to calculate a pixel loss between the two-dimensional face image and the face mask image;
A fourth calculation sub-module, configured to calculate a similarity loss between the two-dimensional face image and the face mask image;
a generation sub-module for generating the second loss based on the keypoint loss, the pixel loss, and the similarity loss.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the artificial intelligence-based face three-dimensional reconstruction method in the foregoing embodiment, and are not described herein again.
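The three components of the second loss might look as follows, again assuming PyTorch. The keypoint term follows the embodiment directly (a second Euclidean distance); the embodiment does not spell out how the face mask image enters the pixel and similarity terms, so the masked L1 and masked cosine-similarity forms below are one plausible interpretation rather than the claimed formulas.

```python
import torch
import torch.nn.functional as F

def keypoint_loss(proj_kpts: torch.Tensor, gt_kpts: torch.Tensor) -> torch.Tensor:
    # Mean Euclidean distance between projected and detected 2D keypoints.
    return torch.linalg.vector_norm(proj_kpts - gt_kpts, dim=-1).mean()

def pixel_loss(rendered: torch.Tensor, target: torch.Tensor,
               mask: torch.Tensor) -> torch.Tensor:
    # Per-pixel L1 difference restricted to the face region of the mask.
    diff = (rendered - target).abs() * mask
    return diff.sum() / mask.sum().clamp(min=1.0)

def similarity_loss(rendered: torch.Tensor, target: torch.Tensor,
                    mask: torch.Tensor) -> torch.Tensor:
    # 1 - cosine similarity of the masked images, averaged over the batch.
    a = (rendered * mask).flatten(1)
    b = (target * mask).flatten(1)
    return (1.0 - F.cosine_similarity(a, b, dim=1)).mean()
```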
In some optional implementations of this embodiment, the third generating module 306 includes:
a fourth obtaining sub-module, configured to obtain a first weight, a second weight, a third weight, and a fourth weight corresponding to the first loss, the keypoint loss, the pixel loss, and the similarity loss, respectively;
a fifth obtaining sub-module, configured to obtain a preset loss calculation formula;
A fifth calculation sub-module, configured to perform calculation processing on the first loss, the keypoint loss, the pixel loss, the similarity loss, the first weight, the second weight, the third weight, and the fourth weight based on the loss calculation formula, so as to obtain a corresponding calculation result;
and the second determining submodule is used for taking the calculation result as the comprehensive loss.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the artificial intelligence-based face three-dimensional reconstruction method in the foregoing embodiment, and are not described herein again.
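Since the preset loss calculation formula itself is not disclosed, a natural reading is a weighted sum of the four terms; the sketch below uses placeholder weights of 1.0, which are assumptions rather than values from the embodiment.

```python
import torch

def comprehensive_loss(l_first: torch.Tensor, l_kpt: torch.Tensor,
                       l_pix: torch.Tensor, l_sim: torch.Tensor,
                       w1: float = 1.0, w2: float = 1.0,
                       w3: float = 1.0, w4: float = 1.0) -> torch.Tensor:
    # Weighted sum of the four terms; the actual preset formula and
    # weight values are not disclosed, so the defaults are placeholders.
    return w1 * l_first + w2 * l_kpt + w3 * l_pix + w4 * l_sim
```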
In some optional implementations of this embodiment, the acquiring module 301 includes:
A sixth acquisition sub-module, configured to acquire initial two-dimensional face image data acquired in advance;
The cleaning submodule is used for carrying out data cleaning processing on the initial two-dimensional face image data to obtain corresponding first face image data;
The cutting sub-module is used for carrying out data cutting processing on the first face image data to obtain corresponding second face image data;
The normalization sub-module is used for carrying out normalization processing on the second face image data to obtain corresponding third face image data;
and the third determining submodule is used for taking the third face image data as the two-dimensional face image data.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the artificial intelligence-based face three-dimensional reconstruction method in the foregoing embodiment, and are not described herein again.
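A compact sketch of the cleaning, cropping, and normalization sub-modules, assuming OpenCV for the image operations and a detector-supplied bounding box; the cleaning criterion, the 224-pixel crop size, and the [0, 1] scaling are illustrative choices, not details fixed by the embodiment.

```python
from typing import Optional

import cv2  # assumption: OpenCV is used for the image operations
import numpy as np

def build_training_image(raw: Optional[np.ndarray],
                         box: tuple[int, int, int, int],
                         size: int = 224) -> Optional[np.ndarray]:
    """Cleaning -> cropping -> normalization, mirroring the sub-modules."""
    # Data cleaning: discard unusable samples (here: missing or non-color images).
    if raw is None or raw.ndim != 3:
        return None
    # Data cropping: keep the face region given by a detector box (x, y, w, h).
    x, y, w, h = box
    face = raw[y:y + h, x:x + w]
    face = cv2.resize(face, (size, size))
    # Normalization: scale pixel values to the [0, 1] range.
    return face.astype(np.float32) / 255.0
```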
In some optional implementations of this embodiment, the artificial intelligence based face three-dimensional reconstruction device further includes:
the construction module is used for constructing verification data based on the face image data;
The verification module is used for performing performance verification on the target reconstruction model based on the verification data to obtain performance index data of the target reconstruction model;
The fourth generation module is used for carrying out data analysis on the performance index data and generating a performance evaluation result corresponding to the target reconstruction model;
And the adjusting module is used for carrying out corresponding model adjusting processing on the target reconstruction model based on the performance evaluation result.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the artificial intelligence-based face three-dimensional reconstruction method in the foregoing embodiment, and are not described herein again.
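As one example of performance index data that the verification module could compute, the sketch below measures the mean per-vertex error of reconstructions against ground-truth scans over the verification data; the choice of metric is an assumption, since the embodiment does not name specific indicators.

```python
import numpy as np

def mean_vertex_error(pred_vertices: np.ndarray,
                      gt_vertices: np.ndarray) -> float:
    """Average per-vertex Euclidean error over a batch of reconstructions,
    with shapes (num_samples, num_vertices, 3) and matched vertex order."""
    return float(np.linalg.norm(pred_vertices - gt_vertices, axis=-1).mean())
```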
In some optional implementations of this embodiment, the artificial intelligence based face three-dimensional reconstruction device further includes:
The smoothing module is used for carrying out smoothing treatment on the target three-dimensional face model to obtain a corresponding first target three-dimensional face model;
The mapping module is used for carrying out texture mapping processing on the first target three-dimensional face model to obtain a corresponding second target three-dimensional face model;
and the storage module is used for storing the second target three-dimensional face model.
In this embodiment, the operations performed by the modules or units respectively correspond to the steps of the artificial intelligence-based face three-dimensional reconstruction method in the foregoing embodiment, and are not described herein again.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 4, fig. 4 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42, and a network interface 43, which are communicatively connected to each other via a system bus. It should be noted that only a computer device 4 having components 41-43 is shown in the figure, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 4. Of course, the memory 41 may also comprise both an internal storage unit and an external storage device of the computer device 4. In this embodiment, the memory 41 is generally used to store the operating system and the various application software installed on the computer device 4, such as computer readable instructions of the artificial intelligence-based face three-dimensional reconstruction method. Furthermore, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may, in some embodiments, be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute computer readable instructions stored in the memory 41 or to process data, for example, to execute the computer readable instructions of the artificial intelligence-based face three-dimensional reconstruction method.
The network interface 43 may comprise a wireless network interface or a wired network interface, which network interface 43 is typically used for establishing a communication connection between the computer device 4 and other electronic devices.
The present application also provides another embodiment, namely, a computer readable storage medium, where computer readable instructions are stored, where the computer readable instructions are executable by at least one processor to cause the at least one processor to perform the steps of the artificial intelligence based face three-dimensional reconstruction method as described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform; they may, of course, also be implemented by hardware, but in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods of the embodiments of the present application.
It is apparent that the above-described embodiments are only some, rather than all, of the embodiments of the present application; the preferred embodiments presented in the drawings do not limit the scope of the claims. The application may be embodied in many different forms; these embodiments are provided so that the disclosure of the application will be thorough and complete. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. Any equivalent structure made using the contents of the specification and drawings of the application, applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the application.

Claims (10)

1. An artificial intelligence-based face three-dimensional reconstruction method, characterized by comprising the following steps:
acquiring pre-constructed two-dimensional face image data;
extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points in the face image to obtain corresponding two-dimensional key points, and performing face segmentation on the face image to obtain a corresponding face mask image;
extracting information from the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and processing the model parameters based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model;
generating a corresponding first loss based on the three-dimensional face model and the face image;
generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points, and the face mask image;
generating a comprehensive loss based on the first loss and the second loss;
optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model;
performing three-dimensional reconstruction on an input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model.
2. The artificial intelligence-based face three-dimensional reconstruction method according to claim 1, characterized in that the step of generating a corresponding first loss based on the three-dimensional face model and the face image specifically comprises:
acquiring a three-dimensional scanned face model corresponding to the face image;
acquiring first vertex information of the three-dimensional scanned face model;
acquiring second vertex information of the three-dimensional face model;
calculating a first Euclidean distance between the first vertex information and the second vertex information;
taking the first Euclidean distance as the first loss.
3. The artificial intelligence-based face three-dimensional reconstruction method according to claim 1, characterized in that the step of generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points, and the face mask image specifically comprises:
projecting the three-dimensional face model to obtain corresponding target two-dimensional key points;
rendering the three-dimensional face model to obtain a corresponding two-dimensional face image;
calculating a second Euclidean distance between the target two-dimensional key points and the two-dimensional key points, and taking the second Euclidean distance as a key point loss;
calculating a pixel loss between the two-dimensional face image and the face mask image;
calculating a similarity loss between the two-dimensional face image and the face mask image;
generating the second loss based on the key point loss, the pixel loss, and the similarity loss.
4. The artificial intelligence-based face three-dimensional reconstruction method according to claim 3, characterized in that the step of generating a comprehensive loss based on the first loss and the second loss specifically comprises:
acquiring a first weight, a second weight, a third weight, and a fourth weight corresponding respectively to the first loss, the key point loss, the pixel loss, and the similarity loss;
acquiring a preset loss calculation formula;
calculating the first loss, the key point loss, the pixel loss, the similarity loss, the first weight, the second weight, the third weight, and the fourth weight based on the loss calculation formula to obtain a corresponding calculation result;
taking the calculation result as the comprehensive loss.
5. The artificial intelligence-based face three-dimensional reconstruction method according to claim 1, characterized in that the step of acquiring pre-constructed two-dimensional face image data specifically comprises:
acquiring pre-collected initial two-dimensional face image data;
performing data cleaning on the initial two-dimensional face image data to obtain corresponding first face image data;
performing data cropping on the first face image data to obtain corresponding second face image data;
normalizing the second face image data to obtain corresponding third face image data;
taking the third face image data as the two-dimensional face image data.
6. The artificial intelligence-based face three-dimensional reconstruction method according to claim 1, characterized in that, after the step of optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model, the method further comprises:
constructing verification data based on the face image data;
performing performance verification on the target reconstruction model based on the verification data to obtain performance index data of the target reconstruction model;
performing data analysis on the performance index data to generate a performance evaluation result corresponding to the target reconstruction model;
performing corresponding model adjustment on the target reconstruction model based on the performance evaluation result.
7. The artificial intelligence-based face three-dimensional reconstruction method according to claim 1, characterized in that, after the step of performing three-dimensional reconstruction on the input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model, the method further comprises:
smoothing the target three-dimensional face model to obtain a corresponding first target three-dimensional face model;
performing texture mapping on the first target three-dimensional face model to obtain a corresponding second target three-dimensional face model;
storing the second target three-dimensional face model.
8. An artificial intelligence-based face three-dimensional reconstruction device, characterized by comprising:
an acquisition module for acquiring pre-constructed two-dimensional face image data;
a first processing module for extracting a face region of interest from the two-dimensional face image data to obtain a corresponding face image, detecting face key points in the face image to obtain corresponding two-dimensional key points, and performing face segmentation on the face image to obtain a corresponding face mask image;
a second processing module for extracting information from the face image based on a convolutional neural network in a preset initial reconstruction model to obtain corresponding model parameters, and processing the model parameters based on a processing network in the initial reconstruction model to obtain a corresponding three-dimensional face model;
a first generation module for generating a corresponding first loss based on the three-dimensional face model and the face image;
a second generation module for generating a corresponding second loss based on the three-dimensional face model, the two-dimensional key points, and the face mask image;
a third generation module for generating a comprehensive loss based on the first loss and the second loss;
an optimization module for optimizing the initial reconstruction model based on the comprehensive loss to obtain a corresponding target reconstruction model;
a reconstruction module for performing three-dimensional reconstruction on an input face image to be processed based on the target reconstruction model to obtain a corresponding target three-dimensional face model.
9. A computer device, characterized by comprising a memory and a processor, the memory storing computer readable instructions, wherein the processor, when executing the computer readable instructions, implements the steps of the artificial intelligence-based face three-dimensional reconstruction method according to any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that computer readable instructions are stored on the computer readable storage medium, and the computer readable instructions, when executed by a processor, implement the steps of the artificial intelligence-based face three-dimensional reconstruction method according to any one of claims 1 to 7.
CN202411157941.6A 2024-08-21 2024-08-21 Method, device, equipment and medium for three-dimensional reconstruction of human face based on artificial intelligence Pending CN119228991A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411157941.6A CN119228991A (en) 2024-08-21 2024-08-21 Method, device, equipment and medium for three-dimensional reconstruction of human face based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411157941.6A CN119228991A (en) 2024-08-21 2024-08-21 Method, device, equipment and medium for three-dimensional reconstruction of human face based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN119228991A true CN119228991A (en) 2024-12-31

Family

ID=94039412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411157941.6A Pending CN119228991A (en) 2024-08-21 2024-08-21 Method, device, equipment and medium for three-dimensional reconstruction of human face based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN119228991A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120147559A (en) * 2025-05-15 2025-06-13 之江实验室 A three-dimensional reconstruction method, device, equipment and medium based on image set
CN120147559B (en) * 2025-05-15 2025-09-05 之江实验室 A three-dimensional reconstruction method, device, equipment and medium based on image set

Similar Documents

Publication Publication Date Title
US12062249B2 (en) System and method for generating image landmarks
CN113420719B (en) Method and device for generating motion capture data, electronic equipment and storage medium
CN112395979B (en) Image-based health state identification method, device, equipment and storage medium
CN111814620B (en) Face image quality evaluation model establishment method, optimization method, medium and device
CN114973349B (en) Facial image processing method and facial image processing model training method
CN113763249B (en) Text image super-resolution reconstruction method and related equipment
CN113177892B (en) Method, apparatus, medium and program product for generating image restoration model
CN109685873B (en) Face reconstruction method, device, equipment and storage medium
CN113240071B (en) Method and device for processing graph neural network, computer equipment and storage medium
WO2024098685A1 (en) Face driving method and apparatus for virtual character, and terminal device and readable storage medium
WO2024198747A1 (en) Processing method and apparatus for motion capture data, and device and storage medium
CN112669244B (en) Face image enhancement method, device, computer equipment and readable storage medium
CN119228991A (en) Method, device, equipment and medium for three-dimensional reconstruction of human face based on artificial intelligence
CN114792355A (en) Virtual image generation method and device, electronic equipment and storage medium
CN114399497A (en) Text image quality detection method and device, computer equipment and storage medium
CN113362249A (en) Text image synthesis method and device, computer equipment and storage medium
CN119383289A (en) Video generation method, device, equipment and medium
CN119229535A (en) A behavior identification method, device, computer equipment and storage medium
CN112541436B (en) Concentration analysis method and device, electronic equipment and computer storage medium
CN119296151A (en) A face recognition method, device, computer equipment and storage medium
CN117152352B (en) Image processing method, deep learning model training method and device
CN112966150A (en) Video content extraction method and device, computer equipment and storage medium
CN117455883A (en) Identification methods, devices, computer equipment and media for steel bar acceptance
CN114972008A (en) A coordinate restoration method, device and related equipment
CN118379586B (en) Training method, device, equipment, medium and product of key point prediction model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination