
CN118283182A - AI-based intelligent voice call prediction method, program product, device and medium - Google Patents


Info

Publication number
CN118283182A
CN118283182A (application CN202410703069.4A)
Authority
CN
China
Prior art keywords
call
user
data
detection
detection result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410703069.4A
Other languages
Chinese (zh)
Inventor
王小东
徐志华
吕文勇
周智杰
康钰于
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu New Hope Finance Information Co Ltd
Original Assignee
Chengdu New Hope Finance Information Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu New Hope Finance Information Co Ltd filed Critical Chengdu New Hope Finance Information Co Ltd
Priority to CN202410703069.4A
Publication of CN118283182A
Legal status: Pending


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/50: Centralised arrangements for answering calls; centralised arrangements for recording messages for absent or busy subscribers; centralised arrangements for recording messages
    • H04M3/51: Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H04M3/5166: Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing, in combination with interactive voice response systems or voice portals, e.g. as front-ends
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/16: Speech classification or search using artificial neural networks
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G10L25/63: Speech or voice analysis techniques for estimating an emotional state
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/50: Centralised arrangements for answering calls; centralised arrangements for recording messages for absent or busy subscribers; centralised arrangements for recording messages
    • H04M3/51: Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H04M3/5175: Call or contact centers supervision arrangements
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/50: Centralised arrangements for answering calls; centralised arrangements for recording messages for absent or busy subscribers; centralised arrangements for recording messages
    • H04M3/51: Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing
    • H04M3/523: Centralised call answering arrangements with call distribution or queueing
    • H04M3/5238: Centralised call answering arrangements with call distribution or queueing, with waiting time or load prediction arrangements
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/50: Centralised arrangements for answering calls; centralised arrangements for recording messages for absent or busy subscribers; centralised arrangements for recording messages
    • H04M3/527: Centralised call answering arrangements not requiring operator intervention

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Marketing (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Child & Adolescent Psychology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Psychiatry (AREA)
  • Hospice & Palliative Care (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application provides an AI-based intelligent voice call prediction method, program product, device and medium. The method comprises: receiving user call data in the current call; performing detection processing on the user call data to obtain a call detection result, the detection processing comprising at least one of emotion detection, audio detection, speech rate detection, keyword detection and dialogue flow detection; predicting, based on the call detection result, the remaining duration of the current call; and determining, based on that remaining duration, whether to dial the next call. By obtaining at least one call detection result, predicting when the current call will end, and confirming from that prediction whether to dial the next call, the agent can answer the next call sooner, reducing waiting time and improving working efficiency.

Description

AI-based intelligent voice call prediction method, program product, device and medium

Technical Field

The present application relates to the field of artificial intelligence, and in particular to an AI-based intelligent voice call prediction method, program product, device and medium.

Background

At present, customer service work relies mainly on human agents making telephone calls one by one. After a call connects, the agent converses with the customer, and only after the call ends does the agent dial the next one. With this approach the agent spends a great deal of time waiting for users to answer, resulting in low working efficiency.

Summary of the Invention

The purpose of the embodiments of the present application is to provide an AI-based intelligent voice call prediction method, program product, device and medium that solve the problem of much time being spent waiting for users to answer, resulting in low working efficiency.

In a first aspect, an embodiment of the present application provides an AI-based intelligent voice call prediction method, comprising: receiving user call data in a current call; performing detection processing on the user call data to obtain a call detection result, the detection processing comprising at least one of emotion detection, audio detection, speech rate detection, keyword detection and dialogue flow detection; predicting, based on the call detection result, the remaining duration of the current call; and determining, based on the remaining duration, whether to dial the next call.

In the above implementation, the user call data in the current call is detected in real time, the detection processing comprising at least one of emotion detection, audio detection, speech rate detection, keyword detection and dialogue flow detection; at least one corresponding call detection result is obtained, the remaining duration of the current call is predicted, and whether to dial the next call is confirmed from that duration. The agent can thus answer the next call sooner, reducing waiting time and improving working efficiency.

Optionally, on the basis of any embodiment, the user call data comprises user call audio data and/or user call text data, and the call detection result comprises an emotion detection result. Performing detection processing on the user call data to obtain the call detection result comprises: inputting the user call data into a preset emotion detection model to obtain the emotion detection result, the emotion detection model being trained on preset emotion data and the emotion labels corresponding to that data.

In the above implementation, the emotion detection model performs emotion detection on the user call audio data and/or the user call text data to obtain the emotion detection result. Emotional state reflects, to some degree, the user's willingness to continue the call or hang up; using the emotion detection result as one factor in predicting the remaining duration of the current call improves the accuracy of the prediction.
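The patent does not disclose the emotion detection model's architecture. As an illustrative stand-in for the trained model, the sketch below classifies call text with a small cue-word lexicon; all cue words and labels are hypothetical.

```python
# Minimal stand-in for the emotion detection step: a lexicon-based
# classifier over the user's call text. The real model is trained on
# labelled emotion data; these cue words are illustrative only.
NEGATIVE = {"烦", "不要", "别打了", "投诉"}
POSITIVE = {"好的", "可以", "谢谢", "愿意"}

def detect_emotion(text: str) -> str:
    """Return a coarse emotion label: positive / negative / neutral."""
    neg = sum(w in text for w in NEGATIVE)
    pos = sum(w in text for w in POSITIVE)
    if neg > pos:
        return "negative"
    if pos > neg:
        return "positive"
    return "neutral"
```

A trained classifier (rule-based, machine-learning or deep-learning, as the description later notes) would replace the lexicon lookup, but the input/output contract stays the same: call text in, emotion label out.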

Optionally, on the basis of any embodiment, the call detection result comprises an audio detection result. Performing detection processing on the user call data to obtain the call detection result comprises: performing audio detection on the user call audio data in the user call data to obtain the audio detection result, the audio detection comprising tone detection and/or decibel detection.

In the above implementation, audio detection is performed on the user call audio data to obtain an audio detection result comprising a tone detection result and/or a decibel detection result. Using the audio detection result as one factor in predicting the remaining duration of the current call improves the accuracy of the prediction.
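Decibel detection can be sketched as an RMS level computation over one audio frame; this is a generic formulation, not the patent's specific method.

```python
import math

def decibel_level(samples: list[float]) -> float:
    """RMS level of an audio frame in dBFS (0 dB = full scale).
    A sustained drop in level can signal the call winding down."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(rms)
```

Tone (pitch) detection would need a pitch estimator such as autocorrelation over the same frames; the decibel path above already yields one numeric input for the prediction step.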

Optionally, on the basis of any embodiment, the call detection result comprises a call step detection result. Performing detection processing on the user call data to obtain the call detection result comprises: obtaining the agent call data in the current call, and inputting the agent call data and the user call data into a preset call flow model to obtain the call step detection result. The call flow model is obtained by extracting features from the question-and-answer data corresponding to each step in the call flow to obtain step features, adding the corresponding step label to each step feature, and training on the step features and their labels.

In the above implementation, the call flow model derives the call step detection result from the agent call data and the user call data. The call step detection result indicates which step the current call has reached, which largely reflects how much time remains before the call ends; predicting the remaining duration from the call step detection result improves the accuracy of the prediction.

Optionally, on the basis of any embodiment, predicting the remaining duration of the current call based on the call detection result comprises: using a preset fitted prediction function to obtain the remaining duration of the current call from the call detection result.

In the above implementation, the preset fitted prediction function yields the remaining duration of the current call from the call detection result, and the next call is dialed when that duration falls below a time threshold. The agent can thus answer the next call sooner, reducing waiting time and improving working efficiency.

Optionally, on the basis of any embodiment, the call detection result comprises an emotion detection result, an audio detection result, a speech rate detection result, a keyword detection result and a call step detection result, and the fitted prediction function comprises:
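The formula itself appears as an image in the original publication and is not reproduced in the text. From the parameter description that follows, it is presumably a linear combination of the detection results; the additive form below, with k as an intercept, is an assumption:

```latex
y = k + a \cdot \mathrm{emotion.pt} + b \cdot \mathrm{Tone.pt} + c \cdot \mathrm{Speed.pt} + d \cdot \mathrm{Key.pt} + e \cdot \mathrm{Process.pt}
```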

where y is the remaining duration of the current call; k is the first parameter; emotion.pt is the emotion detection result and a the emotion detection parameter; Tone.pt is the audio detection result and b the audio detection parameter; Speed.pt is the speech rate detection result and c the speech rate detection parameter; Key.pt is the keyword detection result and d the keyword detection parameter; and Process.pt is the dialogue flow detection result and e the dialogue flow detection parameter.
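The prediction-and-threshold decision of the preceding paragraphs can be sketched as a weighted combination of detection scores. The coefficient values, threshold, and score encoding below are illustrative assumptions, not values from the patent.

```python
# Sketch of the fitted prediction: a weighted combination of the
# detection scores gives the predicted remaining duration y in seconds;
# the next call is dialed once y drops below a threshold.
# All coefficients here are illustrative, not from the patent.
COEFFS = {"emotion": -8.0, "tone": -3.0, "speed": -2.0,
          "key": -20.0, "process": -15.0}
INTERCEPT = 120.0  # k: baseline remaining duration, seconds

def predict_remaining(scores: dict[str, float]) -> float:
    """Map detection scores (e.g. process = current step index) to seconds."""
    y = INTERCEPT + sum(COEFFS[name] * scores.get(name, 0.0) for name in COEFFS)
    return max(y, 0.0)

def should_dial_next(scores: dict[str, float], threshold: float = 30.0) -> bool:
    return predict_remaining(scores) < threshold
```

In practice the coefficients would be fitted against logged calls (detection scores paired with the observed time to hang-up), which is what "fitted prediction function" suggests.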

Optionally, on the basis of any embodiment, after it is confirmed that the next call should be dialed, the method further comprises: dialing the next call and using an intelligent outbound-call robot to converse with the second user corresponding to that call; performing real-time speech recognition on the second user's call voice data to obtain the user's real-time call text and its timestamps; obtaining the robot's real-time call text and its timestamps, the robot's real-time call text being generated from preset text and/or the second user's call voice data; sorting the user's and the robot's real-time call texts in chronological order to obtain the conversation between the robot and the second user; and displaying the conversation so that the agent can decide, based on its content, whether to take over from the robot and continue the call with the second user.

In the above implementation, the conversation between the intelligent outbound-call robot and the second user is displayed in real time, so that the agent can decide from its content whether to take over the next call and serve the second user. This reduces the agent's waiting time and improves working efficiency while also improving the user's experience and the quality of service.
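The chronological merging of the two transcript streams described above can be sketched as follows; the (timestamp, text) tuple shape is an assumed representation.

```python
# Merge the user's real-time transcript with the robot's transcript by
# timestamp, producing the conversation view shown to the agent.
def merge_transcripts(user_turns, robot_turns):
    """Each turn is a (timestamp_seconds, text) tuple; returns one
    chronologically ordered list of (speaker, text) pairs."""
    tagged = [(t, "user", txt) for t, txt in user_turns] + \
             [(t, "robot", txt) for t, txt in robot_turns]
    return [(speaker, txt) for _, speaker, txt in sorted(tagged)]
```

A production system would stream turns incrementally rather than sorting a full list, but the ordering rule is the same.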

Optionally, on the basis of any embodiment, the method further comprises: obtaining the dialogue data of the current call, the dialogue data comprising robot-user dialogue data and/or agent-user dialogue data; and inputting the dialogue data into a preset text summarization model to obtain a call summary. The text summarization model is obtained by LoRA fine-tuning of a pre-trained model, which comprises: loading the pre-trained model and keeping its preset weight parameters unchanged; adding trainable layers to the pre-trained model; training the trainable layers on a preset dialogue sample dataset using the low-rank adaptation method; and updating the pre-trained model with the trained layers to obtain the text summarization model.

In the above implementation, the dialogue data is fed into the preset text summarization model to obtain a call summary, reducing the time spent writing summaries by hand and improving summarization efficiency. Because the text summarization model is obtained by LoRA fine-tuning of a pre-trained model, the pre-trained model's weak text summarization is improved: fine-tuning strengthens its summarization in the professional domain and increases the accuracy of the summaries.
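The LoRA procedure described above (freeze the pre-trained weights, train an added low-rank layer, then merge) can be sketched numerically. The dimensions, rank, and zero initialization below are illustrative; a real implementation would apply this to a transformer's weight matrices, e.g. via a library such as PEFT.

```python
import numpy as np

# LoRA idea: the pre-trained weight W stays frozen, and a trainable
# low-rank update B @ A is added alongside it.
d_out, d_in, rank = 64, 64, 4
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))   # frozen pre-trained weight

A = np.zeros((rank, d_in))               # trainable low-rank factors;
B = np.zeros((d_out, rank))              # zero init: update starts as a no-op

def forward(x):
    return W @ x + B @ (A @ x)           # frozen path + low-rank path

# Only A and B are trained: far fewer parameters than retraining W.
full_params = W.size                     # 64 * 64 = 4096
lora_params = A.size + B.size            # 4 * 64 + 64 * 4 = 512
```

After training, the update can be merged as W + B @ A, which is the "update the pre-trained model with the trained layers" step in the description.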

Optionally, on the basis of any embodiment, the method further comprises: recognizing the user call audio data in the user call data to obtain user call text data; performing punctuation prediction on the user call text data and segmenting it according to the prediction result to obtain target user sentences; and using a preset script recommendation model to obtain the corresponding reply script for each target user sentence. The script recommendation model is built by obtaining knowledge base data and a pre-built large language model, integrating the large language model with a language-chain tool, and using the language-chain tool to vectorize the knowledge base data.

In the above implementation, the preset script recommendation model yields the corresponding reply script for each target user sentence, improving the accuracy and efficiency of script recommendation.
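The punctuation-prediction model is not specified in the patent. As a stand-in, the sketch below assumes punctuation has already been predicted on the raw ASR text and shows the subsequent segmentation into target user sentences; the set of sentence-ending marks is an assumption.

```python
import re

# Split punctuated ASR text into target user sentences by cutting
# after sentence-ending punctuation (full-width and ASCII marks).
def split_sentences(punctuated_text: str) -> list[str]:
    parts = re.split(r"(?<=[。？！.?!])", punctuated_text)
    return [p.strip() for p in parts if p.strip()]
```

Each resulting sentence would then be embedded and matched against the vectorized knowledge base to retrieve a reply script.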

In a second aspect, an embodiment of the present application further provides an AI-based intelligent voice call prediction apparatus, comprising: a user data receiving module for receiving user call data in a current call; a detection module for performing detection processing on the user call data to obtain a call detection result, the detection processing comprising at least one of emotion detection, audio detection, speech rate detection, keyword detection and dialogue flow detection; and a call prediction module for predicting, based on the call detection result, the remaining duration of the current call and determining, based on that duration, whether to dial the next call.

In a third aspect, an embodiment of the present application further provides an electronic device comprising a processor and a memory, the memory storing machine-readable instructions executable by the processor; when executed by the processor, the instructions perform the method described above.

In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when run by a processor, the computer program performs the method described above.

With the AI-based intelligent voice call prediction method, program product, device and medium provided by the present application, the user call data in the current call is detected in real time, the detection processing comprising at least one of emotion detection, audio detection, speech rate detection, keyword detection and dialogue flow detection; at least one corresponding call detection result is obtained, the remaining duration of the current call is predicted, and whether to dial the next call is confirmed from that duration. The agent can answer the next call sooner, reducing waiting time and improving working efficiency.

Brief Description of the Drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present application and should therefore not be regarded as limiting its scope; a person of ordinary skill in the art may derive other related drawings from them without inventive effort.

FIG. 1 is a flow chart of an AI-based intelligent voice call prediction method provided in an embodiment of the present application;

FIG. 2 is a flow chart of the detection processing provided in an embodiment of the present application;

FIG. 3 is a schematic diagram of obtaining a call summary with the text summarization model provided in an embodiment of the present application;

FIG. 4 is a flow chart of the script recommendation shown in the present application;

FIG. 5 is a flow chart of the intelligent calling shown in the present application;

FIG. 6 is a schematic structural diagram of an AI-based intelligent voice call prediction apparatus 200 provided in an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.

Detailed Description

Embodiments of the technical solution of the present application are described in detail below with reference to the drawings. The following embodiments serve only to illustrate the technical solution more clearly; they are examples and do not limit the scope of protection of the present application.

Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by persons skilled in the technical field of the present application; the terms used herein are intended only to describe specific embodiments, not to limit the application.

In the description of the embodiments, terms such as "first" and "second" merely distinguish different objects and are not to be understood as indicating or implying relative importance or the number, specific order or hierarchy of the technical features concerned. Unless otherwise expressly and specifically defined, "multiple" means two or more.

At present, telephone customer service positions exist in many fields; agents provide professional consulting services to customers mainly by telephone. The typical workflow is that a human agent dials a number, waits for the user to answer, and after the call ends dials the next one; after ending the current call the agent may also record it to produce a call summary. In other words, every call a human agent makes requires waiting for the user to answer, so a large part of the agent's working time is spent waiting, resulting in low working efficiency.

The present application provides an AI-based intelligent voice call prediction method that detects the user call data in the current call in real time, the detection processing comprising at least one of emotion detection, audio detection, speech rate detection, keyword detection and dialogue flow detection; at least one corresponding call detection result is obtained, the remaining duration of the current call is predicted, and whether to dial the next call is confirmed from that duration. The agent can answer the next call sooner, reducing waiting time and improving working efficiency.

Please refer to FIG. 1, a flow chart of an AI-based intelligent voice call prediction method provided in an embodiment of the present application. The method can be applied to an electronic device equipped with an intelligent calling system. The electronic device may comprise a terminal and a server; the terminal may specifically be a smartphone, tablet computer, computer, personal digital assistant (PDA), etc., and the server may specifically be an application server or a web server. The AI-based intelligent voice call prediction method may comprise:

Step S110: receive the user call data in the current call.

Step S120: perform detection processing on the user call data to obtain a call detection result; the detection processing comprises at least one of emotion detection, audio detection, speech rate detection, keyword detection and dialogue flow detection.

Step S130: predict, based on the call detection result, the remaining duration of the current call, and determine, based on that duration, whether to dial the next call.

In step S110, the user's phone is dialed through the intelligent outbound calling system, which can provide a standardized automatic outbound service and, with the user's consent, record the conversation during the call.

After the current call connects, a human agent or an intelligent outbound-call robot converses with the user. Before the business conversation begins, the user must be asked whether they consent to the call being recorded; once the user consents, the conversation proceeds and the user call data is recorded. The user call data is what the user expresses during the call, comprising the user's audio data in the current call and the text data obtained by recognizing that audio.

It will be understood that the user call data may be obtained directly, or the complete conversation between the human agent or robot and the user may be obtained and the user call data extracted from it.

In step S120, emotion detection refers to assessing the emotional state expressed in the user call audio data and/or user call text data. It can be implemented with rule-based, machine-learning or deep-learning methods. Emotion detection results include, for example, positive, negative and neutral, and may also cover states such as doubt.

音频检测是指对用户通话数据中的用户通话音频数据的语调以及音量等进行检测,音频检测结果例如语调的分类或者音量分贝等。Audio detection refers to detecting the intonation and volume of user call audio data in the user call data. The audio detection result may be, for example, the classification of the intonation or the decibel of the volume.

语速检测是指对用户通话数据中的用户通话音频数据的讲话节奏以及讲话速度等进行检测。Speech rate detection refers to detecting the speech rhythm and speech speed of the user call audio data in the user call data.

关键字检测是指对用户通话数据中的用户通话文本数据进行预设关键字的匹配检测,关键字检测结果可以表征用户通话文本数据是否匹配预设关键字;以及若匹配到关键字,关键字检测结果还可以表征匹配到的关键字。例如匹配到再见等关键词,可能表征电话即将挂断。Keyword detection refers to the matching detection of preset keywords on the user call text data in the user call data. The keyword detection result can indicate whether the user call text data matches the preset keyword; and if the keyword is matched, the keyword detection result can also indicate the matched keyword. For example, matching keywords such as goodbye may indicate that the call is about to be hung up.
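
As a minimal illustration of the keyword detection described above (the keyword list and the result format are assumptions for this sketch, not part of the method's claims), the matching step can be sketched as a substring check over the recognized user call text:

```python
# Hypothetical sketch of keyword detection: match preset keywords
# against the recognized user call text. The keyword list and the
# result dictionary format are illustrative assumptions.
PRESET_KEYWORDS = ["再见", "不需要", "挂了"]

def detect_keywords(call_text, keywords=PRESET_KEYWORDS):
    """Return whether any preset keyword matched, and which ones."""
    matched = [kw for kw in keywords if kw in call_text]
    return {"matched": bool(matched), "keywords": matched}
```

A production system would typically also normalize the ASR text and support regular-expression patterns rather than plain substrings.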

对话流程检测是指对当前通话中的用户通话数据以及客服通话数据进行实时检测,确定目前的通话在整个流程中所属的步骤,对话流程检测结果为当前的通话数据所属的步骤检测结果。例如在一些外呼业务中,有较为固定的流程步骤,在电话接通之后,在正常通话中可以包括以下步骤:第一步获得用户授权;第二步向用户介绍本次通话的目的;第三步介绍具体的业务项目,以及用户反馈意向;第四步可以询问对本次的服务是否满意,以及进行结束话术等。每一个步骤可以具有特定的模式或话术,通过对话流程检测即可确定当前通话所进行的步骤。Dialogue process detection refers to real-time detection of user call data and customer service call data in the current call, and determines the steps of the current call in the entire process. The dialogue process detection result is the step detection result of the current call data. For example, in some outbound call services, there are relatively fixed process steps. After the call is connected, a normal call may include the following steps: the first step is to obtain user authorization; the second step is to introduce the purpose of this call to the user; the third step is to introduce specific business projects and user feedback intentions; the fourth step can be to ask whether the service is satisfactory, and end the conversation, etc. Each step can have a specific pattern or conversation, and the steps of the current call can be determined through dialogue process detection.

可以理解的是,对用户通话数据的检测处理可以从情绪检测、音频检测、语速检测、关键字检测以及对话流程检测中确定一项或多项检测处理;在需要进行多项检测时,可以根据需求确定检测顺序,当然也可以并发检测。It is understandable that the detection and processing of user call data can determine one or more detection processes from emotion detection, audio detection, speech rate detection, keyword detection and dialogue flow detection; when multiple detections are required, the detection order can be determined according to needs, and of course concurrent detection is also possible.

在另一个实施例中，不仅可以对用户通话数据进行检测，也可以对完整对话内容进行检测处理，获得通话检测结果。In another embodiment, not only the user call data but also the complete conversation content may be detected and processed to obtain the call detection result.

在步骤S130中,基于通话检测结果,预测当前通话的结束时长,通话检测结果包括情绪检测结果、音频检测结果、语速检测结果、关键字检测结果以及通话步骤检测结果中的至少一项。可以利用预设的函数或者深度学习模型,基于通话检测结果,预测当前通话的结束时长,当前通话的结束时长是指当前通话距离结束通话所需的时间。In step S130, based on the call detection result, the end time of the current call is predicted, and the call detection result includes at least one of the emotion detection result, the audio detection result, the speech speed detection result, the keyword detection result, and the call step detection result. A preset function or deep learning model can be used to predict the end time of the current call based on the call detection result, and the end time of the current call refers to the time required for the current call to end.

在获取到当前通话的结束时长之后,可以基于当前通话的结束时长以及预设的时间阈值,确定是否呼叫下一通电话,例如若当前通话的结束时长小于时间阈值,则通过智能外呼系统呼叫下一通电话。时间阈值可以根据实际需求设置,例如15秒、20秒或30秒等,本申请对此不做限定。After obtaining the end time of the current call, it can be determined whether to call the next call based on the end time of the current call and the preset time threshold. For example, if the end time of the current call is less than the time threshold, the next call is called through the intelligent outbound calling system. The time threshold can be set according to actual needs, such as 15 seconds, 20 seconds or 30 seconds, etc., and this application does not limit this.
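
The dispatch decision above reduces to a single comparison; as a sketch, with the function name and the 20-second default threshold being illustrative assumptions:

```python
# Illustrative sketch of the dispatch decision in step S130: dial the
# next call once the predicted remaining duration of the current call
# falls below a configurable threshold. The name and the 20-second
# default are assumptions, not values fixed by the method.
def should_dial_next(predicted_end_seconds, threshold_seconds=20.0):
    """True when the current call is predicted to end soon enough."""
    return predicted_end_seconds < threshold_seconds
```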

在上述的实现过程中,通过对当前通话中的用户通话数据进行实时检测,检测处理包括情绪检测、音频检测、语速检测、关键字检测以及对话流程检测中的至少一项,获得检测处理对应的至少一项通话检测结果,预测当前通话的结束时长,根据结束时长确认是否呼叫下一通电话。客服可以更快的接听到下一通电话,减少客服等待的时间,提高工作效率。In the above implementation process, by performing real-time detection on the user's call data in the current call, the detection process includes at least one of emotion detection, audio detection, speech speed detection, keyword detection and dialogue flow detection, obtaining at least one call detection result corresponding to the detection process, predicting the end time of the current call, and confirming whether to call the next call based on the end time. The customer service can answer the next call faster, reduce the waiting time of the customer service, and improve work efficiency.

请参见图2示出的本申请实施例提供的检测处理的流程示意图。Please refer to FIG. 2 which is a schematic diagram of the detection process provided in an embodiment of the present application.

可选的,在任一实施例的基础上,用户通话数据包括用户通话音频数据和/或用户通话文本数据;通话检测结果包括情绪检测结果;对用户通话数据进行检测处理,获得通话检测结果,包括:将用户通话数据输入预设的情绪检测模型,获得情绪检测结果;情绪检测模型通过对预设的情绪数据和情绪数据对应的情绪标签训练获得。Optionally, based on any embodiment, the user call data includes user call audio data and/or user call text data; the call detection result includes an emotion detection result; the user call data is detected and processed to obtain the call detection result, including: inputting the user call data into a preset emotion detection model to obtain the emotion detection result; the emotion detection model is obtained by training the preset emotion data and the emotion labels corresponding to the emotion data.

用户在通话过程中的情绪可以较好的反映出用户的通话意向,例如情绪积极或者中立的情况下,接听电话的意向度更高;而情绪消极的情况下挂断电话的意向度更高,即通话的结束时长会较短。因此,可以对用户在通话过程中的用户通话数据进行情绪检测,获得用户的情绪检测结果,将情绪检测结果作为预测当前通话的结束时长的因素之一。The user's emotions during a call can better reflect the user's call intention. For example, when the emotion is positive or neutral, the intention to answer the call is higher; when the emotion is negative, the intention to hang up the call is higher, that is, the call will end shorter. Therefore, the user's call data during the call can be emotionally detected to obtain the user's emotion detection results, and the emotion detection results can be used as one of the factors to predict the end time of the current call.

下面对生成情绪检测模型的过程进行描述。首先获得情绪数据,情绪数据包括情绪文本数据和/或情绪音频数据,根据训练数据的不同,情绪检测模型可以包括文本情绪检测模型、音频情绪检测模型以及多模态情绪检测模型。其中,文本情绪检测模型可以通过训练情绪文本数据获得,音频情绪检测模型通过训练情绪音频数据获得,以及多模态情绪检测模型通过训练情绪文本数据和情绪音频数据两种模态的信息获得。The process of generating an emotion detection model is described below. First, emotion data is obtained. The emotion data includes emotion text data and/or emotion audio data. Depending on the training data, the emotion detection model may include a text emotion detection model, an audio emotion detection model, and a multimodal emotion detection model. Among them, the text emotion detection model can be obtained by training emotion text data, the audio emotion detection model can be obtained by training emotion audio data, and the multimodal emotion detection model can be obtained by training information of both emotion text data and emotion audio data.

对于文本情绪检测模型,情绪文本数据可以通过对用户的语音做实时语音ASR识别获得。对情绪文本数据添加情绪标签,情绪标签可以为积极、消极、中立等,还可以是正常或异常等,情绪标签取决于所需的情绪分类粒度。还可以对情绪标签进行标记,例如异常为1,正常为0。对情绪文本数据进行预处理,使用词嵌入(如Word2Vec、GloVe等)将文本转换为固定长度的向量表示。之后选择合适的机器学习算法或深度学习模型,将预处理后的情绪文本数据和对应的情绪标签输入到模型中,进行训练,通过调整模型的参数之后获得文本情绪检测模型。For the text emotion detection model, the emotion text data can be obtained by real-time speech ASR recognition of the user's voice. Emotion labels are added to the emotion text data. The emotion labels can be positive, negative, neutral, normal or abnormal, etc. The emotion labels depend on the required emotion classification granularity. The emotion labels can also be marked, for example, abnormal is 1 and normal is 0. The emotion text data is preprocessed and the text is converted into a fixed-length vector representation using word embedding (such as Word2Vec, GloVe, etc.). Then, a suitable machine learning algorithm or deep learning model is selected, and the preprocessed emotion text data and the corresponding emotion labels are input into the model for training. The text emotion detection model is obtained by adjusting the parameters of the model.
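
A toy sketch of the preprocessing described above, with a tiny hand-made embedding table standing in for Word2Vec/GloVe (all vectors and labels below are illustrative assumptions): labels are encoded as integers, and a sentence becomes the average of its word vectors.

```python
# Toy illustration of the described preprocessing: encode emotion
# labels as integers and turn a tokenized sentence into a fixed-length
# vector by averaging per-word embeddings. The 3-dim embedding table
# is a stand-in for Word2Vec/GloVe, purely for illustration.
LABELS = {"normal": 0, "abnormal": 1}

EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "bad":  [0.0, 0.9, 0.1],
    "call": [0.2, 0.2, 0.2],
}

def sentence_vector(tokens, dim=3):
    """Average the word embeddings into one fixed-length vector."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
```

The resulting fixed-length vectors, paired with the integer labels, are what would be fed to the chosen classifier during training.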

对于音频情绪检测模型，可以对情绪音频数据添加情绪标签，情绪标签参见上述。还可以对情绪音频数据进行降噪、采样率转换或分帧等处理。之后提取音频特征，并根据需求筛选合适的音频特征作为音频情绪检测模型的训练数据。选择合适的机器学习算法或深度学习模型，如卷积神经网络(CNN)和循环神经网络(RNN)等，将预处理后的音频特征和对应的情绪标签输入到模型中，进行训练，通过调整模型的参数之后获得音频情绪检测模型。For the audio emotion detection model, emotion labels can be added to the emotion audio data (see above for emotion labels). The emotion audio data can also be preprocessed with noise reduction, sample-rate conversion, or framing. Audio features are then extracted, and suitable features are selected as training data for the audio emotion detection model as needed. A suitable machine learning algorithm or deep learning model is chosen, such as a convolutional neural network (CNN) or a recurrent neural network (RNN); the preprocessed audio features and corresponding emotion labels are input into the model for training, and the audio emotion detection model is obtained after adjusting the model's parameters.

对于多模态情绪检测模型,可以对文本特征和音频特征进行融合,获得多模态特征,融合方式包括通过将文本特征和音频特征进行拼接或加权平均等。选择一个能够处理多模态输入的模型,例如基于深度学习的多模态融合模型,将多模态特征输入到模型中,并使用带有情绪标签的数据集进行训练,通过调整模型的参数之后获得多模态情绪检测模型。For the multimodal emotion detection model, text features and audio features can be fused to obtain multimodal features. The fusion methods include concatenating or weighted averaging text features and audio features. Select a model that can handle multimodal input, such as a multimodal fusion model based on deep learning, input multimodal features into the model, and train it with a dataset with emotion labels. After adjusting the parameters of the model, a multimodal emotion detection model is obtained.
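
The two fusion strategies mentioned (concatenation and weighted averaging) can be sketched as follows; the weighting scheme and vector contents are illustrative assumptions:

```python
# Sketch of the two feature-fusion strategies described above:
# concatenating the text and audio feature vectors, or taking their
# weighted average (which requires equal dimensions).
def fuse_concat(text_feat, audio_feat):
    """Concatenation: dimensions simply add up."""
    return list(text_feat) + list(audio_feat)

def fuse_weighted(text_feat, audio_feat, w_text=0.5):
    """Weighted average: both modalities must share one dimension."""
    assert len(text_feat) == len(audio_feat)
    return [w_text * t + (1 - w_text) * a
            for t, a in zip(text_feat, audio_feat)]
```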

在进行情绪检测时,将用户通话数据输入预设的情绪检测模型,获得情绪检测结果,例如将用户通话音频数据输入音频情绪检测模型,获得情绪检测结果;或者将用户通话文本数据输入文本情绪检测模型,获得情绪检测结果;再或者将用户通话音频数据和用户通话文本数据输入多模态情绪检测模型,获得情绪检测结果。When performing emotion detection, the user call data is input into a preset emotion detection model to obtain an emotion detection result. For example, the user call audio data is input into an audio emotion detection model to obtain an emotion detection result; or the user call text data is input into a text emotion detection model to obtain an emotion detection result; or the user call audio data and the user call text data are input into a multimodal emotion detection model to obtain an emotion detection result.

在上述的实现过程中，利用情绪检测模型对用户通话数据中的用户通话音频数据和/或用户通话文本数据进行情绪检测，获得情绪检测结果。情绪状态可以一定程度反映电话用户接听或挂断电话的意愿，将情绪检测结果作为预测当前通话的结束时长的因素之一，提高结束时间预测的准确性。In the above implementation process, the emotion detection model is used to perform emotion detection on the user call audio data and/or the user call text data in the user call data to obtain the emotion detection result. The emotional state can reflect, to a certain extent, the phone user's willingness to answer or hang up the call. Using the emotion detection result as one of the factors for predicting the end time of the current call improves the accuracy of the end-time prediction.

可选的,在任一实施例的基础上,通话检测结果包括音频检测结果,对用户通话数据进行检测处理,获得通话检测结果,包括:将用户通话数据中的用户通话音频数据进行音频检测,获得音频检测结果;音频检测包括声调检测和/或分贝检测。Optionally, based on any embodiment, the call detection result includes an audio detection result, and the user call data is detected and processed to obtain the call detection result, including: performing audio detection on the user call audio data in the user call data to obtain the audio detection result; the audio detection includes tone detection and/or decibel detection.

声调检测或分贝检测在一定程度上可以反映出用户的通话意向,例如声调多为疑问句的声调,那么可能表征用户对通话内容较为感兴趣,想进一步咨询更多信息。Tone detection or decibel detection can reflect the user's call intention to a certain extent. For example, if the tone is mostly that of a question sentence, it may indicate that the user is more interested in the content of the call and wants to inquire for more information.

基于此考虑,可以对用户通话音频数据进行声调检测和/或分贝检测,相应的音频检测结果包括声调检测结果和/或分贝检测结果。声调检测例如提取用户通话音频数据的频率特征,基于频率特征识别声调,获得声调检测结果。分贝检测例如对用户通话音频数据的声波进行测量,获得分贝检测结果。Based on this consideration, tone detection and/or decibel detection can be performed on the user's call audio data, and the corresponding audio detection result includes a tone detection result and/or a decibel detection result. Tone detection, for example, extracts the frequency characteristics of the user's call audio data, identifies the tone based on the frequency characteristics, and obtains the tone detection result. Decibel detection, for example, measures the sound waves of the user's call audio data to obtain the decibel detection result.
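
A hedged sketch of the decibel measurement step, assuming audio frames of normalized samples in [-1, 1] and reporting the level in dBFS (dB relative to full scale); the frame format and reference are assumptions for illustration:

```python
# Sketch of decibel detection: compute the RMS level of one frame of
# normalized audio samples and convert it to dBFS. Silence maps to
# negative infinity.
import math

def frame_dbfs(samples):
    """RMS level of samples in [-1, 1], expressed in dBFS."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms)
```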

在上述的实现过程中,对用户通话数据中的用户通话音频数据进行音频检测,获得音频检测结果,音频检测结果包括声调检测结果和/或分贝检测结果。将音频检测结果作为预测当前通话的结束时长的因素之一,提高结束时间预测的准确性。In the above implementation process, audio detection is performed on the user call audio data in the user call data to obtain an audio detection result, which includes a tone detection result and/or a decibel detection result. The audio detection result is used as one of the factors for predicting the end time of the current call, thereby improving the accuracy of the end time prediction.

可选的,在任一实施例的基础上,通话检测结果包括通话步骤检测结果,对用户通话数据进行检测处理,获得通话检测结果,包括:获取当前通话中的客服通话数据;将客服通话数据和用户通话数据输入预设的通话流程模型,获得通话步骤检测结果。例如通话流程模型将客服通话数据和用户通话数据分别进行特征提取,获得待匹配特征;利用通话流程模型将待匹配特征与通话流程模型中步骤特征进行分类匹配,获得目标匹配特征,将目标匹配特征对应的步骤作为通话步骤检测结果。Optionally, based on any embodiment, the call detection result includes a call step detection result, and the user call data is detected and processed to obtain the call detection result, including: obtaining the customer service call data in the current call; inputting the customer service call data and the user call data into a preset call process model to obtain the call step detection result. For example, the call process model extracts features from the customer service call data and the user call data to obtain features to be matched; the call process model is used to classify and match the features to be matched with the step features in the call process model to obtain the target matching features, and the steps corresponding to the target matching features are used as the call step detection results.

下面对生成通话流程模型的过程进行描述。对通话流程中每个步骤对应的问答数据进行特征提取,获得步骤特征。该步骤特征能够反映出文本所属流程步骤的特征。对步骤特征添加对应的步骤标签,步骤标签用于表征该问答数据所属的流程步骤。根据问题的特点和数据的规模,选择合适的机器学习或深度学习模型,使用标注好的数据训练模型,将步骤特征以及步骤特征对应的步骤标签输入到模型中进行训练,通过调整模型的参数和结构,优化模型的性能,获得最终的通话流程模型。The following describes the process of generating a call flow model. Feature extraction is performed on the question and answer data corresponding to each step in the call flow to obtain step features. The step features can reflect the characteristics of the process step to which the text belongs. Corresponding step labels are added to the step features, and the step labels are used to characterize the process step to which the question and answer data belongs. According to the characteristics of the problem and the scale of the data, a suitable machine learning or deep learning model is selected, and the model is trained using labeled data. The step features and the step labels corresponding to the step features are input into the model for training. By adjusting the parameters and structure of the model, the performance of the model is optimized to obtain the final call flow model.
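
As a toy stand-in for the trained call-flow model (the cue-phrase table below is purely illustrative, not the model described above), the step-classification idea can be sketched as scoring each step's cue phrases against the current utterance:

```python
# Toy stand-in for the call-flow step classifier: score the current
# utterance against per-step cue phrases and return the best-scoring
# step. The cue table is an illustrative assumption.
STEP_CUES = {
    "authorization": ["同意", "授权"],
    "purpose":       ["目的", "来电"],
    "business":      ["产品", "利率"],
    "closing":       ["满意", "再见"],
}

def detect_step(utterance):
    """Return the process step whose cue phrases match best."""
    scores = {step: sum(cue in utterance for cue in cues)
              for step, cues in STEP_CUES.items()}
    return max(scores, key=scores.get)
```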

在上述的实现过程中,利用通话流程模型,根据客服通话数据和用户通话数据获得通话步骤检测结果。通话步骤检测结果表征当前通话进行到哪一步骤,较大程度上反映出当前通话距离结束呼叫还需要的时间,如当通话进行到较靠后的步骤,那么结束时长可能较短。根据通话步骤检测结果预测当前通话的结束时长,提高结束时间预测的准确性。In the above implementation process, the call process model is used to obtain the call step detection results based on the customer service call data and the user call data. The call step detection results represent the step of the current call, and to a large extent reflect the time required for the current call to end. For example, when the call is at a later step, the end time may be shorter. The end time of the current call is predicted based on the call step detection results to improve the accuracy of the end time prediction.

可选的,在任一实施例的基础上,基于通话检测结果,预测当前通话的结束时长,包括:利用预设的拟合预测函数,根据通话检测结果,获得当前通话的结束时长。Optionally, based on any embodiment, the end duration of the current call is predicted based on the call detection result, including: using a preset fitting prediction function to obtain the end duration of the current call according to the call detection result.

预设的拟合预测函数的函数表达式为指数类函数。The function expression of the preset fitting prediction function is an exponential function.

通话检测结果包括情绪检测结果、音频检测结果、语速检测结果、关键字检测结果以及通话步骤检测结果,拟合预测函数,包括:The call detection results include emotion detection results, audio detection results, speech speed detection results, keyword detection results, and call step detection results, and the fitting prediction function includes:

其中,y为当前通话的结束时长,k为第一参数,emotion.pt为情绪检测结果,a为情绪检测参数,Tone.pt为音频检测结果,b为音频检测参数,Speed.pt为语速检测结果,c为语速检测参数,Key.pt为关键字检测结果,d为关键字检测参数,Process.pt为对话流程检测结果,e为对话流程检测参数。Among them, y is the end time of the current call, k is the first parameter, emotion.pt is the emotion detection result, a is the emotion detection parameter, Tone.pt is the audio detection result, b is the audio detection parameter, Speed.pt is the speech speed detection result, c is the speech speed detection parameter, Key.pt is the keyword detection result, d is the keyword detection parameter, Process.pt is the dialogue process detection result, and e is the dialogue process detection parameter.

拟合预测函数中的第一参数、情绪检测参数、音频检测参数、语速检测参数、关键字检测参数以及对话流程检测参数可以根据历史数据确定。The first parameter, the emotion detection parameter, the audio detection parameter, the speech rate detection parameter, the keyword detection parameter, and the dialogue flow detection parameter in the fitting prediction function can be determined based on historical data.
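
The exact functional form of the fitting prediction function is not spelled out in the text above; as an assumption, one plausible exponential-class form using the named parameters (k, a, b, c, d, e) can be sketched as:

```python
# One assumed exponential-class form of the fitting prediction
# function. The structure and all coefficient defaults below are
# illustrative assumptions; in practice the parameters would be fit
# from historical data as the text describes.
import math

def predict_end_duration(emotion_pt, tone_pt, speed_pt, key_pt, process_pt,
                         k=60.0, a=0.5, b=0.3, c=0.2, d=1.0, e=0.8):
    """Predicted seconds until the current call ends (assumed form)."""
    score = (a * emotion_pt + b * tone_pt + c * speed_pt
             + d * key_pt + e * process_pt)
    return k * math.exp(-score)
```

Under this assumed form, larger detection scores (e.g. a matched "goodbye" keyword or a late process step) shrink the predicted remaining duration toward zero.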

在上述的实现过程中,利用预设的拟合预测函数,根据通话检测结果,获得当前通话的结束时长,在当前通话的结束时长小于时间阈值的情况下呼叫下一通电话。客服可以更快的接听到下一通电话,减少客服等待的时间,提高工作效率。In the above implementation process, the preset fitting prediction function is used to obtain the end time of the current call according to the call detection result, and the next call is called when the end time of the current call is less than the time threshold. The customer service can answer the next call faster, reduce the waiting time of the customer service, and improve work efficiency.

可选的,在任一实施例的基础上,可能存在客服的上一通电话还未挂断,智能外呼系统呼叫下一通电话,且下一通电话已经接通的情况,那么可以利用智能外呼机器人与下一通电话的用户进行通话,并将智能外呼机器人与下一通电话的用户的通话内容进行实时展示,以使人工客服或坐席根据通话内容决定是否要接管下一通电话,服务下一通电话的用户。Optionally, based on any embodiment, there may be a situation where the customer service's previous call has not been hung up, the intelligent outbound call system calls the next call, and the next call has been connected. In this case, the intelligent outbound call robot can be used to talk to the user of the next call, and the content of the conversation between the intelligent outbound call robot and the user of the next call can be displayed in real time, so that the manual customer service or agent can decide whether to take over the next call and serve the user of the next call based on the content of the call.

在确认呼叫下一通电话之后,方法还包括:通过智能外呼系统自动呼叫下一通电话,并利用智能外呼机器人与下一通电话对应的第二用户进行通话。智能外呼机器人是智能外呼系统提供的基于人工智能的客户服务功能,智能外呼机器人可以使用语音与第二用户进行交流。After confirming the next call, the method further includes: automatically calling the next call through the intelligent outbound calling system, and using the intelligent outbound calling robot to talk to the second user corresponding to the next call. The intelligent outbound calling robot is an artificial intelligence-based customer service function provided by the intelligent outbound calling system, and the intelligent outbound calling robot can communicate with the second user using voice.

对第二用户的通话语音数据进行实时语音识别，获得用户实时通话文本，以及用户实时通话文本的时间。例如在第二用户的说话过程中，对通话语音数据做实时语音端点检测(Voice Activity Detection，VAD)，VAD检测可以在语音中准确地定位出语音的开始和结束点。在第二用户说话时，则对通话语音数据开启实时流ASR识别服务，ASR识别服务的作用是将人类语音转换为计算机可读的文本或命令。记录通话语音数据中用户实时通话文本，以及用户实时通话文本中每一句语音的时间，这里的时间是指通话过程的时间，例如在通话过程中的第10秒钟，或在通话过程中的第1分钟等。Perform real-time speech recognition on the second user's call voice data to obtain the user's real-time call text and the time of each utterance. For example, while the second user is speaking, real-time Voice Activity Detection (VAD) is applied to the call voice data; VAD can accurately locate the start and end points of speech. When the second user speaks, a real-time streaming ASR recognition service is started on the call voice data; the ASR service converts human speech into computer-readable text or commands. The user's real-time call text and the time of each sentence in it are recorded; the time here refers to the elapsed time of the call, such as the 10th second or the 1st minute of the call.

示例性的,第二用户的用户实时通话文本,以及用户实时通话文本的时间可以表示为:{“00.09”:“你找我什么事?”,“00.27”:“你打错了我没贷款”,“......”},其中,“00.09”为用户实时通话文本“你找我什么事?”该句语音的时间,可以表示通话开始的第9秒。Exemplarily, the real-time call text of the second user and the time of the real-time call text of the user can be expressed as: {"00.09": "What do you want from me?", "00.27": "You called the wrong number. I don't have a loan", "..."}, where "00.09" is the time of the voice of the real-time call text of the user "What do you want from me?", which can indicate the 9th second from the start of the call.

获取智能外呼机器人的智能外呼机器人实时通话文本,以及智能外呼机器人实时通话文本的时间;智能外呼机器人实时通话文本根据预设文本和/或第二用户的通话语音数据生成。在一种情况下,智能外呼机器人对第二用户发出的语音是按照文本合成的,因此无需对智能外呼机器人进行语音识别,而是可以直接获取智能外呼机器人的实时通话文本,以及智能外呼机器人对第二用户对话的时间。示例性的智能外呼机器人实时通话文本,以及智能外呼机器人实时通话文本的时间可以表示为:{“00.00”:“请问你是XX先生吗?”,“00.10”:“你在我行的贷款已逾期”,“......”}。Obtain the real-time call text of the intelligent outbound call robot and the time of the real-time call text of the intelligent outbound call robot; the real-time call text of the intelligent outbound call robot is generated according to the preset text and/or the call voice data of the second user. In one case, the voice issued by the intelligent outbound call robot to the second user is synthesized according to the text, so there is no need to perform voice recognition on the intelligent outbound call robot, but the real-time call text of the intelligent outbound call robot and the time of the conversation between the intelligent outbound call robot and the second user can be directly obtained. Exemplary real-time call text of the intelligent outbound call robot and the time of the real-time call text of the intelligent outbound call robot can be expressed as: {"00.00": "Excuse me, are you Mr. XX?", "00.10": "Your loan in our bank is overdue", "..."}.

按照时间顺序将用户实时通话文本和智能外呼机器人实时通话文本进行排序,获得智能外呼机器人与第二用户的对话内容。接上述实施例,对话内容可以为:{“00.00”智能外呼机器人:“请问你是XX先生吗?”,“00.09”第二用户:“你找我什么事?”,“00.10” 智能外呼机器人:“你在我行的贷款已经逾期了”,“00.27” 第二用户:“你打错了我没贷款”,“......”}。The real-time call text of the user and the real-time call text of the intelligent outbound call robot are sorted in chronological order to obtain the conversation content between the intelligent outbound call robot and the second user. Continuing with the above embodiment, the conversation content can be: {“00.00” intelligent outbound call robot: “Excuse me, are you Mr. XX?”, “00.09” second user: “What do you want to talk to me about?”, “00.10” intelligent outbound call robot: “Your loan in our bank has been overdue”, “00.27” second user: “You called the wrong number. I don’t have a loan”, “…”}.
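
Given timestamped transcripts in the format shown above, the chronological merge can be sketched as follows (string comparison of the keys assumes zero-padded "MM.SS"-style timestamps, as in the examples):

```python
# Sketch of merging the user's and the robot's timestamped transcripts
# into one dialogue, ordered by the "MM.SS"-style time keys shown in
# the examples above.
def merge_dialogue(user_text, bot_text):
    """Return (time, speaker, sentence) tuples in chronological order."""
    merged = [(t, "robot", s) for t, s in bot_text.items()]
    merged += [(t, "user", s) for t, s in user_text.items()]
    merged.sort(key=lambda item: item[0])  # lexicographic == temporal here
    return merged
```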

将对话内容进行展示，可以在人工客服的显示界面展示，以使人工客服根据智能外呼机器人与第二用户的对话内容，确认是否接管智能外呼机器人，继续与第二用户进行通话。例如，在智能外呼机器人无法确定回答的情况下，人工客服可以接管智能外呼机器人，解答第二用户的问题，提高用户的体验感。The conversation content can be displayed on the display interface of the human customer service, so that the human customer service can confirm, based on the conversation between the intelligent outbound call robot and the second user, whether to take over from the robot and continue the call with the second user. For example, when the intelligent outbound call robot cannot determine an answer, the human customer service can take over from the robot to answer the second user's questions and improve the user experience.

在上述的实现过程中,将智能外呼机器人与第二用户的通话内容进行实时展示,以使客服根据通话内容决定是否要接管下一通电话,服务第二用户。减少客服等待的时间,提高工作效率的同时,提高用户的体验感,以及服务质量。In the above implementation process, the conversation content between the intelligent outbound call robot and the second user is displayed in real time, so that the customer service can decide whether to take over the next call and serve the second user according to the conversation content. This reduces the waiting time of the customer service, improves work efficiency, and improves the user experience and service quality.

图2中的语音情绪模型用于进行情绪检测,声调和分贝检测对应音频检测,还包括语速检测、关键字检测和对话流程检测,这些检测用于进行AI机器人指数预测,即用于预测当前通话的结束时长,并基于当前通话的结束时长,确定是否呼叫下一通电话。The speech emotion model in Figure 2 is used for emotion detection. Tone and decibel detection correspond to audio detection. It also includes speech rate detection, keyword detection and dialogue flow detection. These detections are used to predict the AI robot index, that is, to predict the end time of the current call and determine whether to make the next call based on the end time of the current call.

请参见图3示出的本申请实施例提供的基于文本总结模型获得通话总结的示意图。Please refer to FIG. 3 , which is a schematic diagram of obtaining a call summary based on a text summary model provided in an embodiment of the present application.

可选的,在任一实施例的基础上,方法还包括:获取当前通话的对话数据;对话数据包括智能外呼机器人与用户的对话数据和/或客服与用户的对话数据;将对话数据输入预设的文本总结模型,获得通话总结。作为一种实施方式可以调用文本总结模型对应的接口生成通话总结。Optionally, based on any embodiment, the method further includes: obtaining conversation data of the current call; the conversation data includes conversation data between the intelligent outbound call robot and the user and/or conversation data between the customer service and the user; inputting the conversation data into a preset text summary model to obtain a call summary. As an implementation method, an interface corresponding to the text summary model can be called to generate a call summary.

示例性的,对话数据可以为:{“坐席A”:“你好,先生”,“客户B”:“找我啥事”,“坐席A”:“我们是机构的,我们机构近期推出一款贷款产品,请问你有兴趣没?”,“客户B”:“哦,你们这个利率有点高啊”,“坐席A”:“我们在整个行业来看利率是很低的了”,“.......”}For example, the conversation data may be: {“Agent A”: “Hello, sir”, “Customer B”: “What can I do for you?”, “Agent A”: “We are from an institution. Our institution has recently launched a loan product. Are you interested?”, “Customer B”: “Oh, your interest rate is a bit high”, “Agent A”: “Our interest rate is very low in the entire industry”, “….”}

通话总结可以为:{“业务”:“贷款产品”;“是否解决”:“是”;“是否回访”:“是”;“通话内容总结”:“客户目前无意愿,同时也觉得利率高,如果利率降低,希望去营业厅了解”}。The call summary can be: {“Business”: “Loan product”; “Is it resolved”: “Yes”; “Is there a follow-up visit”: “Yes”; “Summary of call content”: “The customer currently has no intention and also feels that the interest rate is high. If the interest rate is lowered, he hopes to go to the business hall to find out”}.

在一个可选的实施例中,通过LORA微调生成通话总结之后,可以通过调用接口API,并进行语音转换生成目标用户语句。同时,还可以在语音转换获取目标用户语句之后,通过调用接口API生成通话总结。In an optional embodiment, after the call summary is generated by LORA fine-tuning, the target user's sentence can be generated by calling the interface API and performing voice conversion. At the same time, the call summary can also be generated by calling the interface API after the target user's sentence is obtained by voice conversion.

下面对生成文本总结模型的过程进行描述。LORA是指低秩自适应方法(Low-Rank Adaptation)，文本总结模型是使用LORA对预训练模型进行微调获得的。预训练模型包括基于Qwen-72b或ChatGLM搭建的大语言模型。LORA通过添加一小部分可训练参数来优化预训练模型，而不需要修改原始的预训练模型的全部参数。The following describes the process of generating the text summary model. LORA refers to Low-Rank Adaptation; the text summary model is obtained by fine-tuning a pre-trained model with LORA. The pre-trained model includes a large language model built on Qwen-72b or ChatGLM. LORA optimizes the pre-trained model by adding a small number of trainable parameters, without modifying all the parameters of the original pre-trained model.

对预训练模型进行LORA微调，包括：加载预训练模型，并保持预训练模型的预设权重参数不变。例如加载预训练的大模型到GPU显卡A100上，确保模型可以在A100上正确运行，并且冻结预训练模型的预设权重参数。Fine-tuning the pre-trained model with LORA includes: loading the pre-trained model and keeping its preset weight parameters unchanged. For example, load the pre-trained large model onto an A100 GPU, ensure that the model runs correctly on the A100, and freeze the preset weight parameters of the pre-trained model.

之后在预训练模型中添加可训练层。例如根据模型架构和微调需求,在预训练模型中的每个Transformer块中注入LORA层。这些可训练层将用于替代部分原始参数,以及设置合适的秩大小,这决定了可训练层的规模。Then add trainable layers to the pre-trained model. For example, inject LORA layers into each Transformer block in the pre-trained model according to the model architecture and fine-tuning requirements. These trainable layers will be used to replace some of the original parameters and set the appropriate rank size, which determines the scale of the trainable layer.

准备用于微调的对话样本数据集,并将对话样本数据集加载到GPU显卡A100上,对话样本数据集应以适当的格式进行预处理,以便能够输入到预训练模型中。对话样本数据集的格式参照上述的“坐席A”和“客户B”的对话数据以及通话总结。若对话样本数据集为用户的真实数据,需在记录之前获得用户的同意。Prepare a conversation sample dataset for fine-tuning and load it onto the GPU graphics card A100. The conversation sample dataset should be preprocessed in an appropriate format so that it can be input into the pre-trained model. The format of the conversation sample dataset refers to the conversation data and call summary of "Agent A" and "Customer B" mentioned above. If the conversation sample dataset is the user's real data, the user's consent must be obtained before recording.

通过低秩自适应方法,使用预设的对话样本数据集对可训练层进行训练;利用训练好的可训练层更新预训练模型,获得文本总结模型。配置训练参数,如学习率、批量大小、训练轮数等,这些参数将影响微调的效果和速度。在GPU显卡A100上启动训练过程。只有LORA层的参数会被更新,而原始的预训练模型的预设权重参数保持不变。Through the low-rank adaptive method, the trainable layer is trained using the preset dialogue sample dataset; the pre-trained model is updated with the trained trainable layer to obtain the text summarization model. Configure the training parameters, such as learning rate, batch size, number of training rounds, etc. These parameters will affect the effect and speed of fine-tuning. Start the training process on the GPU graphics card A100. Only the parameters of the LORA layer will be updated, while the preset weight parameters of the original pre-trained model remain unchanged.
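
The core LoRA idea described above, i.e. keeping the pre-trained weight matrix W frozen while training only a low-rank update B·A, can be illustrated numerically; the shapes and values below are toy assumptions, not the actual Qwen-72b/ChatGLM setup:

```python
# Numeric toy of the LoRA decomposition: the frozen weight matrix W is
# left untouched, and the low-rank product B @ A (rank r much smaller
# than the matrix size) holds the only trainable parameters, so the
# effective weight is W + B @ A.
def matmul(X, Y):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B):
    """y = x @ (W + B @ A): frozen W plus the trainable low-rank delta."""
    delta = matmul(B, A)                      # d x d update from rank-r factors
    W_eff = [[w + d for w, d in zip(wr, dr)]  # effective weight matrix
             for wr, dr in zip(W, delta)]
    return matmul([x], W_eff)[0]
```

With B initialized to zero (as is standard for LoRA), the forward pass initially reproduces the frozen pre-trained model exactly.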

在上述的实现过程中,将对话数据输入预设的文本总结模型,获得通话总结,减少人工记录通话总结的时间,提高总结效率。并且文本总结模型是对预训练模型进行LORA微调获得,改善预训练模型文本内容总结能力弱的问题,微调之后强化其在专业领域的总结能力,提高总结的准确性。In the above implementation process, the conversation data is input into the preset text summary model to obtain the call summary, which reduces the time of manually recording the call summary and improves the summary efficiency. In addition, the text summary model is obtained by fine-tuning the pre-trained model with LORA to improve the problem of the weak text content summary ability of the pre-trained model. After fine-tuning, its summary ability in professional fields is strengthened and the accuracy of the summary is improved.

在一个可选的实施例中，还可以对对话内容进行质量检查，在客服和客户对话结束时，需要对客户侧和客服侧语音进行语音质检，识别客户是否有投诉倾向，识别客服是否按照流程要求作业，客服是否合规作业等，质检该对话内容是否存在问题需要立即处理或者告警。In an optional embodiment, the conversation content can also be quality checked. At the end of the conversation between the customer service and the client, voice quality inspection is performed on both the client-side and the customer-service-side audio, to identify whether the client shows a tendency to complain, whether the customer service follows the required process and operates in compliance, and whether the conversation contains problems that require immediate handling or an alarm.

质检包括客户侧语音质检和客服侧语音质检。客户侧语音质检:对于客户侧是为了识别投诉,失联等情况,在获得用户同意之后,可以使用关键字匹配,如“投诉”等词语识别用户是否投诉;如果关键字未命中,则使用Fasttext建立一个投诉语义预测模型,基于文本对话内容去识别用户是否投诉,做到语义识别。客服侧语音质检:对于客服侧,需要识别坐席是否按照流程来操作,客服是否存在情绪不稳定,大声怒吼或抢插话等,基于关键字匹配和正则匹配识别坐席侧的异常行为,通话VAD检测对语音分段,基于分段后的语音时间定位是否存在抢插话等。Quality inspection includes voice quality inspection on the client side and voice quality inspection on the customer service side. Voice quality inspection on the client side: For the client side, it is to identify complaints, loss of connection, etc. After obtaining the user's consent, keyword matching can be used, such as "complaint" and other words to identify whether the user has complained; if the keyword is not hit, Fasttext is used to establish a complaint semantic prediction model to identify whether the user has complained based on the text conversation content to achieve semantic recognition. Voice quality inspection on the customer service side: For the customer service side, it is necessary to identify whether the agent is operating according to the process, whether the customer service is emotionally unstable, yelling loudly or interrupting, etc., identify abnormal behavior on the agent side based on keyword matching and regular matching, segment the voice with call VAD detection, and locate whether there is interruption based on the time of the segmented voice.
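
A minimal sketch of the client-side quality check described above, with an illustrative complaint-pattern list and a stub standing in for the Fasttext semantic model that the text names as the fallback:

```python
# Sketch of the client-side QC step: try keyword/regex matching first
# (e.g. "投诉"), and only fall back to a semantic model when no
# pattern hits. The pattern list is an illustrative assumption; the
# default fallback is a stub for the Fasttext complaint model.
import re

COMPLAINT_PATTERNS = [r"投诉", r"举报", r"曝光"]

def qc_client_side(text, semantic_model=lambda t: False):
    """True if the client utterance looks like a complaint."""
    if any(re.search(p, text) for p in COMPLAINT_PATTERNS):
        return True
    return semantic_model(text)  # semantic fallback when keywords miss
```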

See FIG. 4, a schematic flow chart of the script recommendation process of the present application.

Optionally, on the basis of any embodiment, the method further includes: recognizing the user call audio data in the user call data to obtain user call text data; performing punctuation prediction on the user call text data and splitting the text according to the punctuation prediction result to obtain a target user sentence. The target user sentence may be any sentence in the sentence set split from the call text; to improve timeliness, it may also be the most recent sentence in that set, i.e., the latest question raised by the user.
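The punctuation-based splitting step can be sketched as follows; the terminal-punctuation set and the sample transcript are illustrative assumptions:

```python
import re

def split_sentences(text_with_punct: str) -> list[str]:
    """Split ASR text (after punctuation prediction) into sentences on
    terminal punctuation; keep non-empty segments only."""
    parts = re.split(r"[。！？!?.]", text_with_punct)
    return [p.strip() for p in parts if p.strip()]

def latest_sentence(text_with_punct: str) -> str:
    """Return the most recent sentence, i.e. the user's latest question."""
    sentences = split_sentences(text_with_punct)
    return sentences[-1] if sentences else ""

transcript = "你好。我想查一下账单。什么时候还款？"
print(split_sentences(transcript))
print(latest_sentence(transcript))
```

Only the last sentence would be forwarded to the script recommendation model when timeliness matters.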

The target user sentence is input into a preset script recommendation model, which returns the corresponding reply script according to the target user sentence. The reply script can be combined with a preset prompt template and displayed to the agent, who then answers the user.

The process of generating the script recommendation model is described below. Obtain knowledge base data and a pre-built large language model, as shown in FIG. 4. The knowledge base data may be a local question-and-answer knowledge base, and the large language model may be used for local model-based question answering, i.e., using the large language model to perform local question-and-answer queries. The knowledge base data may be question-and-answer data for a business domain; it can be cleaned, labeled, and preprocessed, where preprocessing includes converting the knowledge base data to text, splitting the text, and generating text chunks. The large language model may be an LLM with a suitable model architecture, for example the Transformer architecture.

Integrate the large language model with a language-chain tool: the language-chain tool may be LangChain, a framework for building complex text-processing pipelines that can interact with specific types of API interfaces. A wrapper can therefore be created for the LLM so that it conforms to LangChain's expected formats, which include an input format and an output format: the input may be a string or an instruction; the output may likewise be a string, or JSON, XML, or another structured data format. The wrapper should be able to receive requests sent by LangChain and convert them into a format the LLM can understand, after which the large language model and the language-chain tool are integrated.

Vectorize the knowledge base data with the language-chain tool. Use the tools or APIs provided by LangChain to import the processed documents. In this process, LangChain converts the document content into vector form using some vector representation method (such as word embeddings or TF-IDF), recorded as a vector store. These vectors capture the semantic information of the documents, making subsequent vector similarity calculation and question-answer matching more accurate.

With the vectorized knowledge base, the script recommendation model is obtained. The target user sentence is vectorized (the query vectorization in FIG. 4) to obtain a user vector, which is the query vector. A vector similarity query is performed between the user vector and the vectorized knowledge base (the vector store) in the script recommendation model to obtain the reply script corresponding to the user vector, marked as the relevant chunks. A prompt template is obtained, prompt words are extracted from the user's query and filled into the template to generate the question of the question-and-answer query, after which the large language model generates the script.
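The vector similarity query over the vector store can be illustrated with a toy cosine-similarity retrieval; the chunks and embedding values below are made up for demonstration and do not come from LangChain:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy vector store: chunk text -> embedding (values invented for illustration).
vector_store = {
    "repayment is due on the 10th of each month": [0.9, 0.1, 0.0],
    "you can raise your credit limit in the app": [0.1, 0.8, 0.2],
}

def top_chunk(query_vec, store):
    """Return the stored chunk whose embedding is most similar to the query."""
    return max(store, key=lambda chunk: cosine(query_vec, store[chunk]))

query = [0.85, 0.15, 0.05]  # hypothetical embedding of "when do I repay?"
print(top_chunk(query, vector_store))
```

The retrieved chunk would then be filled into the prompt template before the LLM generates the final reply script.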

In the above implementation, the preset script recommendation model obtains the corresponding reply script from the target user sentence, improving the accuracy and efficiency of script recommendation.

See FIG. 5, a schematic flow chart of the intelligent call process of the present application.

In an optional embodiment, when the intelligent call system is used for the first time, i.e., on the first AI call, a user can select an intelligent outbound robot relevant to their business to dial the call. After obtaining the mobile phone number and customer information, the robot uses the active outbound function to dial through the SIP of the call platform. When the call connects, the transfer-to-human function of the intelligent call system inserts the call into the human agent queue. Once the call enters the queue, an agent immediately picks up and talks to the customer; the automatic dialing process is imperceptible to the user, improving the user experience.

The agent then conducts a real-time conversation with script recommendation: after picking up, the agent communicates with the customer in real time and can use the preset script recommendation model to obtain the corresponding reply script from the target user sentence. During the call, real-time end-of-call prediction on the dialogue determines whether to dial the next call, i.e., AI predictive calling. If the intelligent outbound robot dials the next call in advance, the agent can decide, based on the robot's conversation with the second user, whether to take over from the robot and continue the call, realizing collaboration between the human agent and the intelligent outbound robot, i.e., collaboration between agents and AI predictive outbound calling.

See FIG. 6, a schematic structural diagram of an AI-based intelligent voice call prediction device 200 provided by an embodiment of the present application. The embodiment provides an AI-based intelligent voice call prediction device 200, including:

a user data receiving module 210, configured to receive user call data of a current call;

a detection module 220, configured to perform detection processing on the user call data to obtain a call detection result, the detection processing including at least one of emotion detection, audio detection, speech rate detection, keyword detection, and dialogue flow detection; and

a call prediction module 230, configured to predict the end duration of the current call based on the call detection result and to determine, based on the end duration of the current call, whether to dial the next call.

Optionally, on the basis of any embodiment of the AI-based intelligent voice call prediction device 200, the user call data includes user call audio data and/or user call text data, and the call detection result includes an emotion detection result; the detection module 220 is further configured to input the user call data into a preset emotion detection model to obtain the emotion detection result, the emotion detection model being trained on preset emotion data and the emotion labels corresponding to that data.

Optionally, on the basis of any embodiment of the device 200, the call detection result includes an audio detection result; the detection module 220 is further configured to perform audio detection on the user call audio data in the user call data to obtain the audio detection result, the audio detection including tone detection and/or decibel detection.

Optionally, on the basis of any embodiment of the device 200, the call detection result includes a call step detection result; the detection module 220 is further configured to obtain agent call data of the current call and to input the agent call data and the user call data into a preset call flow model to obtain the call step detection result. The call flow model is obtained by extracting features from the question-and-answer data corresponding to each step of the call flow to obtain step features, adding corresponding step labels to the step features, and training on the step features and the labels corresponding to the step features.

Optionally, on the basis of any embodiment of the device 200, the call prediction module 230 is configured to obtain the end duration of the current call from the call detection result using a preset fitting prediction function.

Optionally, on the basis of any embodiment of the device 200, the call detection result includes an emotion detection result, an audio detection result, a speech rate detection result, a keyword detection result, and a call step detection result, and the fitting prediction function is:

y = k + a·emotion.pt + b·Tone.pt + c·Speed.pt + d·Key.pt + e·Process.pt

where y is the end duration of the current call, k is the first parameter, emotion.pt is the emotion detection result and a is the emotion detection parameter, Tone.pt is the audio detection result and b is the audio detection parameter, Speed.pt is the speech rate detection result and c is the speech rate detection parameter, Key.pt is the keyword detection result and d is the keyword detection parameter, and Process.pt is the dialogue flow detection result and e is the dialogue flow detection parameter.
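Assuming the fitting prediction function is the linear combination implied by the parameter list (an intercept k plus each detection result weighted by its parameter; the printed formula itself is not reproduced in the text), a minimal sketch with illustrative scores and weights:

```python
def predict_end_duration(detections, params, k):
    """Linear fitting prediction: y = k + sum(weight_i * detection_i).
    `detections` and `params` map the same detection names to scores and
    weights; all numeric values here are illustrative assumptions."""
    return k + sum(params[name] * score for name, score in detections.items())

# Hypothetical normalized detection scores for the current call.
detections = {"emotion": 0.2, "tone": 0.5, "speed": 0.4, "keyword": 1.0, "process": 0.8}
# Hypothetical fitted weights (a, b, c, d, e) in seconds per unit score.
params = {"emotion": 10.0, "tone": 5.0, "speed": 5.0, "keyword": 20.0, "process": 30.0}

y = predict_end_duration(detections, params, k=15.0)
print(y)  # predicted seconds until the current call ends
```

In practice k and the five weights would be fitted on historical calls whose actual end times are known.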

Optionally, on the basis of any embodiment, the device 200 further includes a call collaboration module, configured to: dial the next call and have the intelligent outbound robot talk with the second user corresponding to the next call; perform real-time speech recognition on the second user's call voice data to obtain the user real-time call text and its timestamps; obtain the intelligent outbound robot's real-time call text and its timestamps, the robot's real-time call text being generated from preset text and/or the second user's call voice data; sort the user real-time call text and the robot real-time call text in chronological order to obtain the conversation between the robot and the second user; and display the conversation so that the agent can confirm, based on it, whether to take over from the robot and continue the call with the second user.
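The chronological sorting of user and robot transcripts can be sketched as follows; the timestamps and turn texts are invented for illustration:

```python
def merge_dialogue(user_turns, bot_turns):
    """Merge user and robot real-time transcripts by timestamp to reconstruct
    the conversation shown to the human agent.
    Each turn is a (timestamp_seconds, text) pair."""
    tagged = [(t, "user", txt) for t, txt in user_turns] + \
             [(t, "bot", txt) for t, txt in bot_turns]
    return [(role, txt) for _, role, txt in sorted(tagged)]

user_turns = [(1.2, "hello?"), (7.8, "yes, that's me")]
bot_turns = [(0.0, "hi, this is the service line"), (4.5, "am I speaking with Mr. Li?")]
for role, txt in merge_dialogue(user_turns, bot_turns):
    print(role, ":", txt)
```

The merged list is what would be displayed to the agent when deciding whether to take over the call.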

Optionally, on the basis of any embodiment, the device 200 further includes a call summary module, configured to: obtain the dialogue data of the current call, the dialogue data including dialogue between the intelligent outbound robot and the user and/or dialogue between the agent and the user; and input the dialogue data into a preset text summarization model to obtain a call summary. The text summarization model is obtained by LoRA fine-tuning of a pre-trained model, which includes: loading the pre-trained model and keeping its preset weight parameters unchanged; adding trainable layers to the pre-trained model; training the trainable layers on a preset dialogue sample data set using the low-rank adaptation method; and updating the pre-trained model with the trained layers to obtain the text summarization model.
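The low-rank idea behind the LoRA fine-tuning described above (a frozen pretrained weight plus a trainable low-rank update) can be illustrated with a tiny forward pass; the matrices below are toy values, not a real training loop:

```python
def matmul(A, B):
    """Naive matrix multiply for small nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_forward(x, W, A, B, scale=1.0):
    """Compute y = x @ (W + scale * A @ B): W is the frozen pretrained
    weight; only the low-rank factors A (d x r) and B (r x d) are trained."""
    delta = matmul(A, B)  # low-rank update, rank r
    W_eff = [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
    return matmul([x], W_eff)[0]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 weight (identity for illustration)
A = [[1.0], [0.0]]            # rank-1 factor, 2x1
B = [[0.0, 0.5]]              # rank-1 factor, 1x2
print(lora_forward([2.0, 3.0], W, A, B))
```

Because only A and B carry gradients, the number of trainable parameters is far smaller than the full weight matrix, which is the point of the low-rank adaptation method.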

Optionally, on the basis of any embodiment, the device 200 further includes a script recommendation module, configured to: recognize the user call audio data in the user call data to obtain user call text data; perform punctuation prediction on the user call text data and split it according to the punctuation prediction result to obtain a target user sentence; and use a preset script recommendation model to obtain the corresponding reply script according to the target user sentence. The script recommendation model is obtained by acquiring knowledge base data and a pre-built large language model, integrating the large language model with a language-chain tool, and vectorizing the knowledge base data with the language-chain tool.

It should be understood that this device corresponds to the AI-based intelligent voice call prediction method embodiments above and can execute the steps involved in those method embodiments; for the specific functions of the device, refer to the description above, and the detailed description is omitted here to avoid repetition. The device includes at least one software functional module that can be stored in memory in the form of software or firmware or solidified in the operating system (OS) of the device.

See FIG. 7, a schematic structural diagram of an electronic device provided by an embodiment of the present application. An electronic device 300 provided by an embodiment of the present application includes a processor 310 and a memory 320; the memory 320 stores machine-readable instructions executable by the processor 310, and when the machine-readable instructions are executed by the processor 310, the above method is performed.

An embodiment of the present application further provides a storage medium on which a computer program is stored; when the computer program is run by a processor, the above method is performed.

The storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.

In the several embodiments provided by this application, it should be understood that the disclosed devices and methods can also be implemented in other ways. The device embodiments described above are merely illustrative; for example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of devices, methods, and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.

In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part.

The above description covers only optional implementations of the embodiments of the present application, but the protection scope of the embodiments is not limited thereto; any changes or substitutions that a person skilled in the art could readily conceive within the technical scope disclosed by the embodiments of the present application shall be covered by their protection scope.

Claims (12)

1. An AI-based intelligent voice call prediction method, comprising:
receiving user call data of a current call;
performing detection processing on the user call data to obtain a call detection result, the detection processing including at least one of emotion detection, audio detection, speech rate detection, keyword detection, and dialogue flow detection; and
predicting an end duration of the current call based on the call detection result, and determining, based on the end duration of the current call, whether to dial a next call.

2. The method according to claim 1, wherein the user call data includes user call audio data and/or user call text data, the call detection result includes an emotion detection result, and performing detection processing on the user call data to obtain a call detection result comprises:
inputting the user call data into a preset emotion detection model to obtain the emotion detection result, the emotion detection model being trained on preset emotion data and the emotion labels corresponding to the emotion data.

3. The method according to claim 1, wherein the call detection result includes an audio detection result, and performing detection processing on the user call data to obtain a call detection result comprises:
performing audio detection on the user call audio data in the user call data to obtain the audio detection result, the audio detection including tone detection and/or decibel detection.

4. The method according to claim 1, wherein the call detection result includes a call step detection result, and performing detection processing on the user call data to obtain a call detection result comprises:
obtaining agent call data of the current call; and
inputting the agent call data and the user call data into a preset call flow model to obtain the call step detection result, the call flow model being obtained by extracting features from the question-and-answer data corresponding to each step of the call flow to obtain step features, adding corresponding step labels to the step features, and training on the step features and the labels corresponding to the step features.

5. The method according to claim 1, wherein predicting the end duration of the current call based on the call detection result comprises:
obtaining the end duration of the current call from the call detection result using a preset fitting prediction function.

6. The method according to claim 5, wherein the call detection result includes an emotion detection result, an audio detection result, a speech rate detection result, a keyword detection result, and a call step detection result, and the fitting prediction function is:
y = k + a·emotion.pt + b·Tone.pt + c·Speed.pt + d·Key.pt + e·Process.pt
where y is the end duration of the current call, k is a first parameter, emotion.pt is the emotion detection result, a is the emotion detection parameter, Tone.pt is the audio detection result, b is the audio detection parameter, Speed.pt is the speech rate detection result, c is the speech rate detection parameter, Key.pt is the keyword detection result, d is the keyword detection parameter, Process.pt is the dialogue flow detection result, and e is the dialogue flow detection parameter.

7. The method according to claim 1, wherein, after it is confirmed that a next call is to be dialed, the method further comprises:
dialing the next call, and having an intelligent outbound robot talk with a second user corresponding to the next call;
performing real-time speech recognition on the call voice data of the second user to obtain user real-time call text and the time of the user real-time call text;
obtaining robot real-time call text of the intelligent outbound robot and the time of the robot real-time call text, the robot real-time call text being generated from preset text and/or the call voice data of the second user;
sorting the user real-time call text and the robot real-time call text in chronological order to obtain the conversation between the intelligent outbound robot and the second user; and
displaying the conversation, so that an agent can confirm, based on the conversation between the intelligent outbound robot and the second user, whether to take over from the intelligent outbound robot and continue the call with the second user.

8. The method according to claim 1, further comprising:
obtaining dialogue data of the current call, the dialogue data including dialogue data between an intelligent outbound robot and the user and/or dialogue data between an agent and the user; and
inputting the dialogue data into a preset text summarization model to obtain a call summary, the text summarization model being obtained by LoRA fine-tuning of a pre-trained model, wherein the LoRA fine-tuning comprises: loading the pre-trained model and keeping its preset weight parameters unchanged; adding trainable layers to the pre-trained model; training the trainable layers on a preset dialogue sample data set using a low-rank adaptation method; and updating the pre-trained model with the trained layers to obtain the text summarization model.

9. The method according to any one of claims 1 to 8, further comprising:
recognizing the user call audio data in the user call data to obtain user call text data;
performing punctuation prediction on the user call text data and splitting the user call text data according to the punctuation prediction result to obtain a target user sentence; and
using a preset script recommendation model to obtain a corresponding reply script according to the target user sentence, the script recommendation model being obtained by acquiring knowledge base data and a pre-built large language model, integrating the large language model with a language-chain tool, and vectorizing the knowledge base data with the language-chain tool.

10. A computer program product comprising computer program instructions, wherein, when the computer program instructions are executed by a processor, the method according to any one of claims 1 to 9 is performed.

11. An electronic device, comprising a processor and a memory, wherein the memory stores machine-readable instructions executable by the processor, and when the machine-readable instructions are executed by the processor, the method according to any one of claims 1 to 9 is performed.

12. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is run by a processor, the method according to any one of claims 1 to 9 is performed.
CN202410703069.4A 2024-06-03 2024-06-03 AI-based intelligent voice call prediction method, program product, device and medium Pending CN118283182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410703069.4A CN118283182A (en) 2024-06-03 2024-06-03 AI-based intelligent voice call prediction method, program product, device and medium


Publications (1)

Publication Number Publication Date
CN118283182A true CN118283182A (en) 2024-07-02

Family

ID=91643978


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111885272A (en) * 2020-07-24 2020-11-03 南京易米云通网络科技有限公司 Intelligent call-out method for supporting telephone by call center seat and intelligent call center system
CN113688221A (en) * 2021-09-08 2021-11-23 中国平安人寿保险股份有限公司 Model-based dialect recommendation method and device, computer equipment and storage medium
CN116189713A (en) * 2021-11-29 2023-05-30 上海畅跃信息技术有限公司 A voice recognition-based outbound call management method and device
CN116248798A (en) * 2023-02-17 2023-06-09 深度好奇(杭州)科技有限公司 Call processing method and device based on call duration prediction
CN117112731A (en) * 2023-08-28 2023-11-24 江苏科技大学 A customer service call assistance method and system based on natural language processing
CN117763119A (en) * 2023-12-26 2024-03-26 北京声智科技有限公司 Intelligent voice customer service dialogue method and device, electronic equipment and storage medium
CN118052907A (en) * 2024-02-19 2024-05-17 腾讯科技(深圳)有限公司 Text map generation method and related device

Similar Documents

Publication Publication Date Title
CN112804400B (en) Customer service call voice quality inspection method and device, electronic equipment and storage medium
CN111028827B (en) Interaction processing method, device, equipment and storage medium based on emotion recognition
US20220044679A1 (en) Speech communication system and method with human-machine coordination
US20240354054A1 (en) Natural Language Processing Platform For Automated Event Analysis, Translation, and Transcription Verification
US9742912B2 (en) Method and apparatus for predicting intent in IVR using natural language queries
US12374321B2 (en) Reducing biases of generative language models
US11989514B2 (en) Identifying high effort statements for call center summaries
US12374324B2 (en) Transcript tagging and real-time whisper in interactive communications
US20230298615A1 (en) System and method for extracting hidden cues in interactive communications
CN113782022B (en) Communication method, device, equipment and storage medium based on intention recognition model
US20150220618A1 (en) Tagging relations with n-best
CN115186051A (en) Sensitive word detection method and device and computer readable storage medium
CN113407677B (en) Method, apparatus, device and storage medium for evaluating consultation dialogue quality
US20240386883A1 (en) Systems and methods for intent prediction and usage
CN114969195B (en) Dialogue content mining method and dialogue content evaluation model generation method
CN116246632A (en) Method and device for guiding external call operation
CN117975937A (en) Multi-tone word voice processing method and device and readable storage medium
CN117634471A (en) NLP quality inspection method and computer readable storage medium
CN118283182A (en) AI-based intelligent voice call prediction method, program product, device and medium
CN116127011A (en) Intent recognition method, device, electronic device and storage medium
CN115831154A (en) Emotion recognition method, device, equipment and storage medium
CN114254088A (en) The Construction Method of Auto-responder Model and Auto-responder Method
US12347418B2 (en) Systems and methods for training natural language processing models in a contact center
CN117668150A (en) Dialogue quality inspection method, medium and equipment
CN119578521A (en) Text information generation method, device, equipment and storage medium based on Prompt model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20240702