CN108115678B - Robot and its motion control method and device - Google Patents

Robot and its motion control method and device

Info

Publication number
CN108115678B
CN108115678B CN201611069128.9A CN201611069128A CN108115678B CN 108115678 B CN108115678 B CN 108115678B CN 201611069128 A CN201611069128 A CN 201611069128A CN 108115678 B CN108115678 B CN 108115678B
Authority
CN
China
Prior art keywords
robot
behavioral
emotion
predetermined action
thing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611069128.9A
Other languages
Chinese (zh)
Other versions
CN108115678A (en
Inventor
Inventor name not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xinhui Yongtai Technology Co.,Ltd.
Original Assignee
Shenzhen Kuang Chi Hezhong Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Kuang Chi Hezhong Technology Ltd
Priority to CN201611069128.9A (CN108115678B)
Priority to PCT/CN2017/092038 (WO2018095041A1)
Publication of CN108115678A
Application granted
Publication of CN108115678B
Current legal status: Active
Anticipated expiration

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1656: Programme controls characterised by programming, planning systems for manipulators
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Manipulator (AREA)

Abstract

The invention discloses a robot and a motion control method and device for it. The method comprises: receiving a control instruction, wherein the control instruction instructs the robot to perform a predetermined action; inputting the control instruction into a preset model to obtain an output result, wherein the output result comprises a behavioral emotion mode corresponding to the predetermined action performed by the robot, and the preset model is trained at least on emotion evaluation parameters fed back by a user; and controlling the robot, according to the output result, to perform the predetermined action in the behavioral emotion mode. The invention addresses the problem that a robot with only a single style of behavior cannot meet users' needs.

Description

Robot and its motion control method and device

Technical field

The present invention relates to the field of robots, and in particular to a robot and its motion control method and device.

Background

As robots have developed, their functions have become more complete and the demands placed on them have grown. Existing reinforcement learning techniques are widely applied to robots, but most applications concentrate on motion decisions, such as balance control and walking control, while decision-making modes remain underdeveloped. Reinforcement learning (RL) itself is also still developing and is far from mature. In principle, reinforcement learning is inspired by behavioral psychology, first seen in experiments such as Pavlov's dog and the Skinner box, and it is supported in biology by theories such as neuroplasticity. However, reinforcement learning methods have not addressed training models for emotion, and no comparable implementation exists on robots. Work has stopped at the level of computer vision, natural language processing and the like, so a robot cannot behave differently according to a user's preferences.

No effective solution has yet been proposed for the problem in the related art that a robot with only a single style of behavior cannot meet users' needs.

Summary of the invention

The main purpose of the present invention is to provide a robot and its motion control method and device, so as to solve the problem that a single style of behavior cannot meet users' needs.

To achieve the above object, according to one aspect of the present invention, a robot motion control method is provided. The method includes: receiving a control instruction, wherein the control instruction instructs the robot to perform a predetermined action; inputting the control instruction into a preset model to obtain an output result, wherein the output result includes a behavioral emotion mode corresponding to the predetermined action performed by the robot, and the preset model is trained at least on emotion evaluation parameters fed back by the user; and controlling the robot, according to the output result, to perform the predetermined action in the behavioral emotion mode.

Further, the behavioral emotion mode is used to determine the robot's behavioral emotional state for the predetermined action; when the robot performs the predetermined action, the robot's state is that behavioral emotional state.

Further, before the control instruction is input into the preset model to obtain the output result, the method also includes: receiving behavioral emotion indices corresponding to multiple things in a thing set, wherein a behavioral emotion index indicates the robot's degree of preference for a thing in the thing set; and building the preset model from the behavioral emotion indices corresponding to the multiple things in the thing set.

Further, the multiple things in the thing set are things corresponding to the predetermined actions performed by the robot.

Further, the thing set contains n things, where n is an integer greater than 1, and the calculation formula of the preset model is

$$w_x + \sum_{i=1}^{n} (w_i x_i + b_i)$$

where w_x is a preset base parameter, w_i is the influence parameter of the i-th thing in the thing set, x_i is the currently set switch state parameter of the i-th thing in the thing set, and b_i is the evaluation result value of the i-th thing in the thing set.

Further, the output result also includes an expected value of the predetermined action. After the robot is controlled, according to the output result, to perform the predetermined action in the behavioral emotion mode, the method also includes: receiving a feedback result on the robot's performance of the predetermined action; and updating the preset model according to the feedback result and the expected value.

Further, the control instruction includes at least one of the following: an image control instruction; a voice control instruction; a biological signal control instruction.

To achieve the above object, according to another aspect of the present invention, a robot motion control device is also provided. The device includes: a first receiving unit, configured to receive a control instruction, wherein the control instruction instructs the robot to perform a predetermined action; an input unit, configured to input the control instruction into a preset model to obtain an output result, wherein the output result includes a behavioral emotion mode corresponding to the predetermined action performed by the robot, and the preset model is trained at least on emotion evaluation parameters fed back by the user; and a control unit, configured to control the robot, according to the output result, to perform the predetermined action in the behavioral emotion mode.

Further, the behavioral emotion mode is used to determine the robot's behavioral emotional state for the predetermined action; when the robot performs the predetermined action, the robot's state is that behavioral emotional state.

Further, the device also includes: a second receiving unit, configured to receive, before the control instruction is input into the preset model to obtain the output result, behavioral emotion indices corresponding to multiple things in a thing set, wherein a behavioral emotion index indicates the robot's degree of preference for a thing in the thing set; and a building unit, configured to build the preset model from the behavioral emotion indices corresponding to the multiple things in the thing set.

Further, the multiple things in the thing set are things corresponding to the predetermined actions performed by the robot.

Further, the thing set contains n things, where n is an integer greater than 1, and the calculation formula of the preset model is

$$w_x + \sum_{i=1}^{n} (w_i x_i + b_i)$$

where w_x is a preset base parameter, w_i is the influence parameter of the i-th thing in the thing set, x_i is the currently set switch state parameter of the i-th thing in the thing set, and b_i is the evaluation result value of the i-th thing in the thing set.

Further, the output result also includes an expected value of the predetermined action, and the device also includes: a third receiving unit, configured to receive a feedback result on the robot's performance of the predetermined action after the robot has been controlled, according to the output result, to perform the predetermined action in the behavioral emotion mode; and an update unit, configured to update the preset model according to the feedback result and the expected value.

Further, the control instruction includes at least one of the following: an image control instruction; a voice control instruction; a biological signal control instruction.

To achieve the above object, according to another aspect of the present invention, a robot is also provided. The robot includes the robot motion control device of the embodiment of the present invention.

The present invention receives a control instruction, wherein the control instruction instructs the robot to perform a predetermined action; inputs the control instruction into a preset model to obtain an output result, wherein the output result includes a behavioral emotion mode corresponding to the predetermined action performed by the robot, and the preset model is trained at least on emotion evaluation parameters fed back by the user; and controls the robot, according to the output result, to perform the predetermined action in the behavioral emotion mode. This solves the problem that a robot with only a single style of behavior cannot meet users' needs, and achieves the effect of the robot behaving differently according to what the user wants.

Description of the drawings

The accompanying drawings, which form a part of this application, are provided for further understanding of the present invention. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:

FIG. 1 is a flowchart of a robot motion control method according to a first embodiment of the present invention;

FIG. 2 is a flowchart of a robot motion control method according to a second embodiment of the present invention; and

FIG. 3 is a schematic diagram of a robot motion control device according to an embodiment of the present invention.

Detailed description

It should be noted that, provided they do not conflict, the embodiments of this application and the features of those embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

To help those skilled in the art better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of this application without creative effort shall fall within the scope of protection of this application.

It should be noted that the terms "first", "second" and so on in the description, the claims and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate for the embodiments of the application described herein. Furthermore, the terms "comprising" and "having", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.

An embodiment of the present invention provides a robot motion control method.

FIG. 1 is a flowchart of a robot motion control method according to a first embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:

Step S102: Receive a control instruction, wherein the control instruction instructs the robot to perform a predetermined action.

Step S104: Input the control instruction into a preset model to obtain an output result, wherein the output result includes a behavioral emotion mode corresponding to the predetermined action performed by the robot, and the preset model is trained at least on emotion evaluation parameters fed back by the user.

Step S106: Control the robot, according to the output result, to perform the predetermined action in the behavioral emotion mode.

In this embodiment, a control instruction is received and input into a preset model to obtain an output result, and the robot is then controlled according to that result to perform the predetermined action in the behavioral emotion mode. Because the preset model is trained on emotion evaluation parameters fed back by the user, it can determine the behavioral emotion mode in which the robot should perform the predetermined action. Controlling the robot according to the model output therefore solves the problem that a robot with only a single style of behavior cannot meet users' needs, and achieves the effect of the robot behaving differently according to what the user wants.
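Steps S102 to S106 can be sketched as a three-stage pipeline. This is only an illustrative sketch: the names `ModelOutput`, `control_robot`, `toy_model` and `toy_execute` are hypothetical and do not come from the patent, and the toy model simply hard-codes a mode instead of being trained.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelOutput:
    """Hypothetical container for the output result of the preset model."""
    emotion_mode: str   # behavioral emotion mode, e.g. "happy"
    expectation: float  # expected value of the predetermined action


def control_robot(instruction: str,
                  model: Callable[[str], ModelOutput],
                  execute: Callable[[str, str], str]) -> str:
    """S102: receive the instruction; S104: query the model; S106: act in the mode."""
    output = model(instruction)                       # S104
    return execute(instruction, output.emotion_mode)  # S106


def toy_model(instruction: str) -> ModelOutput:
    # Assumption for illustration: a fixed mapping stands in for the trained model.
    mode = "happy" if instruction == "sweep" else "neutral"
    return ModelOutput(emotion_mode=mode, expectation=1.0)


def toy_execute(action: str, mode: str) -> str:
    # Stand-in for the actuator layer: just describe what would be performed.
    return f"{action} ({mode})"


result = control_robot("sweep", toy_model, toy_execute)
print(result)
```

Passing the model and the executor as callables keeps the sketch independent of any particular model or robot platform.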

The robot in the embodiment of the present invention may be an intelligent service robot, a companion robot, or the like. Optionally, the control instruction includes at least one of an image control instruction, a voice control instruction and a biological signal control instruction; it may also be another type of instruction. Receiving a control instruction may mean that the robot detects, through a camera, an action made by its owner and thereby receives an image control instruction, or that it receives a voice control instruction, for example a "dance" voice instruction issued by the owner. The robot may also receive biological signal control instructions from wearable devices: for example, a biological signal detection device worn by the owner detects an abnormality in the owner's physical state and issues a control instruction, which the robot receives. The control instruction directs the robot to perform a predetermined action, such as dancing, sweeping the floor or turning in circles. After the control instruction is received, it is input into the preset model, the model is evaluated to obtain an output result, and the robot is controlled according to that result to perform the predetermined action in the behavioral emotion mode. With the technical solution of the embodiment of the present invention, the robot can interact with the user in multiple emotion modes.

Optionally, before the control instruction is input into the preset model to obtain the output result, behavioral emotion indices corresponding to multiple things in a thing set are received, wherein a behavioral emotion index indicates the robot's degree of preference for a thing in the thing set; the preset model is then built from the behavioral emotion indices corresponding to the multiple things in the thing set.

Building the preset model requires the indices corresponding to the things in the thing set. The things in the thing set may be things that can reflect the robot's behavioral emotion, such as singing, dancing or sweeping the floor. The index corresponding to a thing may be the user's degree of preference for that thing: for example, the index is relatively high when the user cares a lot about a thing and relatively low when the user cares little about it. After the indices corresponding to the multiple things in the thing set are received, the preset model is built from the corresponding behavioral emotion indices. The behavioral emotion mode is used to determine the robot's behavioral emotional state for the predetermined action; when the robot performs the predetermined action, the robot's state is that behavioral emotional state. Behavioral emotional states may correspond to behavioral emotion modes, with one behavioral emotional state per behavioral emotion mode.

Optionally, the output result includes an expected value of the predetermined action. After the robot is controlled, according to the output result, to perform the predetermined action in the behavioral emotion mode, a feedback result on the robot's action is received, and the preset model is updated according to the feedback result and the expected value.

The output result obtained by inputting the control instruction into the preset model includes the expected value of the predetermined action. This expected value may represent the expected behavioral emotion of the robot when it performs the predetermined action, for example whether the robot performs the action happily. After the robot performs the predetermined action in the behavioral emotion mode, a feedback result on the robot's action is received. The feedback result comes from the user; for example, the owner scores the robot's behavior. The preset model is then updated according to the feedback result and the expected value of the predetermined action, for example according to the difference between the two.
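The text says only that the model may be updated from the difference between the feedback result and the expected value; it does not give an update rule. A minimal sketch, assuming a simple proportional update of an evaluation value b with a hypothetical `learning_rate` (both the function name and the rule are illustrative assumptions, not the patent's method):

```python
def update_evaluation(b: float,
                      feedback: float,
                      expectation: float,
                      learning_rate: float = 0.1) -> float:
    """Move b in the direction of the feedback error.

    Assumed rule: b <- b + learning_rate * (feedback - expectation).
    A score above the expectation raises b; a score below it lowers b.
    """
    return b + learning_rate * (feedback - expectation)


# The owner scores 5.0 where the model expected 3.0, so b increases.
new_b = update_evaluation(b=0.0, feedback=5.0, expectation=3.0, learning_rate=0.5)
print(new_b)
```

When the feedback equals the expectation the difference is zero and b is left unchanged, which matches the intuition that a correctly anticipated score carries no new information.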

Optionally, the thing set contains n things, where n is an integer greater than 1, and the calculation formula of the preset model is

$$w_x + \sum_{i=1}^{n} (w_i x_i + b_i)$$

where w_x is a preset base parameter, w_i is the influence parameter of the i-th thing in the thing set, x_i is the currently set switch state parameter of the i-th thing in the thing set, and b_i is the evaluation result value of the i-th thing in the thing set.

The calculation formula of the preset model is

$$w_x + \sum_{i=1}^{n} (w_i x_i + b_i)$$

where w_x is a preset base parameter, which may be a fixed value set by the user or at the factory. w_i is the influence parameter of the i-th thing in the thing set, and may be the importance of the i-th thing to the owner. x_i is the currently set switch state parameter of the i-th thing in the thing set and takes one of two values: the first value, for example x_i = 1, means that the influence of the i-th thing on the owner is switched on, and the second value, for example x_i = 0, means that it is switched off. When the influence of the i-th thing on the owner is switched off, the robot's behavioral emotion toward that thing does not affect the owner. b_i is the evaluation result value of the i-th thing in the thing set, and may be the owner's score for the robot's behavioral emotion when it performs that thing.
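The formula can be evaluated directly once the parameters are known. The sketch below follows the expanded form given later in the description, w_x + (w_1 x_1 + b_1) + (w_2 x_2 + b_2) + …; the `Thing` class, the function name and the numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Thing:
    w: float  # influence parameter w_i
    x: int    # switch state x_i: 1 = on, 0 = off
    b: float  # evaluation result value b_i


def model_expectation(w_x: float, things: Iterable[Thing]) -> float:
    """Evaluate w_x + sum_i (w_i * x_i + b_i) over the thing set."""
    return w_x + sum(t.w * t.x + t.b for t in things)


things = [
    Thing(w=2.0, x=1, b=0.5),    # switched on: contributes 2.0 * 1 + 0.5
    Thing(w=-1.0, x=0, b=0.25),  # switched off: w_i * x_i vanishes, b_i remains
]
value = model_expectation(w_x=1.0, things=things)
print(value)
```

Note that, read literally, the formula still adds b_i for a thing whose switch is off; only the w_i x_i term is gated by x_i. Whether b_i should also be gated is not stated in the text.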

In one optional application scenario, after receiving the owner's voice instruction "sweep the floor", the robot obtains from the preset model the behavioral emotion to show while sweeping and then performs the sweeping action. If the behavioral emotion mode output by the preset model is the happy mode, the robot sweeps happily, for example by moving briskly and playing music at the same time. If the owner is satisfied with this behavioral emotion and gives a high score, the robot can infer from the score that the owner likes the happy behavioral emotional state it showed while sweeping, so the next time it sweeps it again shows happy behavioral emotion. If the owner has grown tired of the happy behavioral emotion shown while sweeping and gives a low score, the robot infers from the score that the owner dislikes that state, and the next time it sweeps it shows an unhappy behavioral emotional state.

FIG. 2 is a flowchart of a robot motion control method according to a second embodiment of the present invention. This embodiment may serve as a preferred implementation of the first embodiment above. As shown in FIG. 2, the robot motion control method includes the following steps:

Step S201: Receive a control instruction.

The control instruction may be one or more of a vision input, a language input and a wearable biosignal input.

Step S202: Online reinforcement training.

Online reinforcement training (Soul Model Reinforcement Learning Core) may consist of inputting the control instruction into the preset model for calculation, obtaining an output result, and determining the robot's behavior output from that result. The preset model can also be updated from the feedback of the reward and punishment mechanism, which is what makes the reinforcement training online.

Step S203: Behavior output.

After the output result is obtained from the preset model, the robot's behavior output is controlled according to the output result.

Step S204: User evaluation feedback.

Human evaluation feedback is received; for example, the user scores the robot's behavior.

Step S205: Feedback, reward and punishment mechanism.

A reward and punishment feedback mechanism (Feedback Reward/punishment block) is introduced to correct the preset model and achieve online reinforcement training.

Step S206: Expected value.

An expected value (Expectation) is obtained from the preset model. The expected value and the user's feedback evaluation serve as the basis for the reward and punishment feedback mechanism; for example, the difference between the user's feedback evaluation and the expected value may be used as the feedback parameter.
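The loop of steps S201 to S206 can be sketched as follows. This is not the patent's model: for simplicity it swaps in a small value table indexed by (action, mode), picks the mode with the highest stored value, and nudges that value by the difference between the user's score and the expectation. The class name, the two mode labels, the learning rate and the simulated user are all assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple


class OnlineEmotionModel:
    """Toy value table standing in for the preset model."""
    MODES = ("happy", "unhappy")

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.learning_rate = learning_rate
        self.values: Dict[Tuple[str, str], float] = {}  # (action, mode) -> expected score

    def predict(self, action: str) -> Tuple[str, float]:
        # S202/S206: choose the mode with the highest expectation (ties go to the first mode).
        mode = max(self.MODES, key=lambda m: self.values.get((action, m), 0.0))
        return mode, self.values.get((action, mode), 0.0)

    def update(self, action: str, mode: str, score: float, expectation: float) -> None:
        # S205: reward/punishment feedback driven by (score - expectation).
        self.values[(action, mode)] = expectation + self.learning_rate * (score - expectation)


def run_round(model: OnlineEmotionModel, action: str,
              rate: Callable[[str, str], float]) -> str:
    mode, expectation = model.predict(action)       # S201 -> S202, S206
    score = rate(action, mode)                      # S203 behavior shown, S204 user scores it
    model.update(action, mode, score, expectation)  # S205
    return mode


def tired_of_happy(action: str, mode: str) -> float:
    # Simulated owner who has grown tired of happy sweeping.
    return -1.0 if mode == "happy" else 1.0


model = OnlineEmotionModel()
history = [run_round(model, "sweep", tired_of_happy) for _ in range(3)]
print(history)
```

In this run the robot starts in the happy mode, is scored down, and switches to the unhappy mode for the following rounds, mirroring the sweeping scenario described for the first embodiment.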

本发明实施例的技术方案在原理上采用了行为心理学的原理,最早应用在巴甫洛夫的狗,斯金纳箱等实验中,生物学上也有神经可塑性等理论支持。但是现有技术的强化学习并未触及情感的训练模型,机器人上也并未有类似的实现方式,停留在计算机视觉、自然语言处理等程度,本发明实施例首次提出情感模型的构建以及实现。The technical solution of the embodiment of the present invention adopts the principle of behavioral psychology in principle, and was first applied in experiments such as Pavlov's dog and Skinner box, and is also supported by theories such as neuroplasticity in biology. However, the reinforcement learning of the prior art does not involve the training model of emotion, and there is no similar implementation on the robot, which stays at the level of computer vision, natural language processing, etc. The embodiment of the present invention proposes the construction and implementation of the emotion model for the first time.

本发明是关于如何让机器生命更好的与人类进行情感交流,使用在线的强化学习加入心理学概念正负强化和正负惩罚进行长时间的训练,目的在于用户与机器人的交互过程中能感受到接近人类的情感反馈。The present invention is about how to make the machine life better communicate with human beings, using online reinforcement learning to add psychological concepts of positive and negative reinforcement and positive and negative punishment for long-term training, the purpose is that the user can feel the interaction process with the robot. to close to human emotional feedback.

The reinforcement and punishment theory of behavioral psychology covers positive and negative reinforcement and positive and negative punishment. A brief introduction follows:

1. Positive reinforcement: giving a good stimulus. To establish an adaptive behavior pattern, rewards are used so that the behavior pattern recurs and is maintained. For example, a company awards bonuses to employees who actively put forward suggestions for improvement.

2. Negative reinforcement: removing a bad stimulus. It is set up to bring about the desired behavior. For example, a company does not allow personal phone calls during working hours. An employee who has this habit is reprimanded as soon as the behavior occurs, but once the employee stops the behavior, the reprimand should stop immediately.

3. Positive punishment: applying a bad stimulus. This is a way of imposing a penalty when inappropriate behavior occurs.

4. Negative punishment: removing a good stimulus. This kind of punishment is more commonly used than positive punishment. When inappropriate behavior occurs, the reward previously given is withheld.
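The four categories above differ along two axes: whether the stimulus is good or bad, and whether it is given or removed. A minimal sketch of that classification follows; the enum and function names are hypothetical and are not part of the described embodiment.

```python
from enum import Enum


class Stimulus(Enum):
    GOOD = "good"
    BAD = "bad"


class Change(Enum):
    GIVEN = "given"
    REMOVED = "removed"


# Lookup table that mirrors the four numbered definitions above.
_CATEGORIES = {
    (Stimulus.GOOD, Change.GIVEN): "positive reinforcement",
    (Stimulus.BAD, Change.REMOVED): "negative reinforcement",
    (Stimulus.BAD, Change.GIVEN): "positive punishment",
    (Stimulus.GOOD, Change.REMOVED): "negative punishment",
}


def categorize(stimulus: Stimulus, change: Change) -> str:
    """Return the behavioral-psychology category for a stimulus change."""
    return _CATEGORIES[(stimulus, change)]
```

For instance, withholding a bonus (a good stimulus that is removed) maps to negative punishment.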

Based on the basic idea of reinforcement learning (RL), the feedback loop is reward-driven, and the expectation can be expressed as $wx + b$, where $b$ is the reward parameter whose value is assigned by an evaluation function. Psychology, however, holds that human behavior and emotion arise from two sources, good stimuli and bad stimuli, and that a person's behavior can be understood as the multiply-accumulate effect of multiple good stimuli and multiple bad stimuli. Existing RL does not include such a classification mechanism; the technical solution of the embodiment of the present invention adds it and applies it to a human-computer interaction robot. The expectation model of the RL therefore becomes $wx + (w_1x_1 + b_1) + (w_2x_2 + b_2) + \dots$, which extends the dimensions of the original reinforcement learning.

Part of each newly added dimension comes from presets. In the formula, $w_1x_1 + b_1$ is the expected contribution model for one thing: $w_1$ is a fixed value preset by the designer and is conceptually equivalent to how much a human likes or dislikes that thing, while $b_1$ is the reward and punishment parameter of that thing. For example, if mopping the floor is one kind of thing, a child is likely to dislike it, so $w_1$ is negative; $x_1$ is 0 or 1 and represents the on/off state of the thing; and $b_1$ represents the current degree of reward or punishment for the thing. The thing set can be extended, but the degree of preference for each added thing must be preset. Combining the expected contribution models of all things gives the individual's degree of reward and punishment at the current point in time, which is fed back to the individual's input.

The change to the model directly affects how the model is trained. Although a human-computer interaction robot can use a pre-trained transfer model, such robots are often criticized for lacking a human touch. The technical solution of the embodiment of the present invention therefore replaces the pre-trained transfer model with online learning, letting the user train the robot personally so that the robot's behavior pattern gradually approaches what the user wants and ultimately serves the user.
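The extended expectation model can be sketched as a sum of a base term and one contribution per thing. This is an illustration under stated assumptions, not the patented implementation: the `Thing` class, the `expectation` function, the treatment of the base term $wx$ as a single number, and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Thing:
    name: str
    w: float  # preference weight, preset by the designer (negative = disliked)
    x: int    # switch state of the thing: 0 (off) or 1 (on)
    b: float  # reward/punishment parameter for the thing


def expectation(base: float, things: Iterable[Thing]) -> float:
    """Base term plus the expected contribution w_i * x_i + b_i of every thing."""
    return base + sum(t.w * t.x + t.b for t in things)


# Hypothetical example: mopping is disliked (negative w) and currently switched on.
mopping = Thing(name="mopping", w=-0.5, x=1, b=0.25)
value = expectation(1.0, [mopping])
```

Following the formula literally, $b_i$ is added even when $x_i = 0$; whether an inactive thing should contribute its reward and punishment parameter is not specified in the text.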

It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

An embodiment of the present invention provides a robot motion control device, which can be used to execute the robot motion control method of the embodiment of the present invention.

FIG. 3 is a schematic diagram of a robot motion control device according to an embodiment of the present invention. As shown in FIG. 3, the device includes:

a first receiving unit 10, configured to receive a control instruction, wherein the control instruction instructs the robot to perform a predetermined action;

an input unit 20, configured to input the control instruction into a preset model to obtain an output result, wherein the output result includes a behavioral emotion mode corresponding to the predetermined action performed by the robot, and the preset model is trained at least on emotion evaluation parameters fed back by the user;

a control unit 30, configured to control, according to the output result, the robot to perform the predetermined action in the behavioral emotion mode.

This embodiment uses the first receiving unit to receive a control instruction, where the control instruction instructs the robot to perform a predetermined action; the input unit to input the control instruction into a preset model and obtain an output result, where the preset model determines the behavioral emotion mode in which the robot performs the predetermined action and is trained at least on emotion evaluation parameters fed back by the user; and the control unit to control the robot, according to the output result, to perform the predetermined action in the behavioral emotion mode. This solves the problem that a robot's behavior is monotonous and cannot meet user needs, and achieves the effect of exhibiting different behaviors according to the user's needs.
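The cooperation of the three units can be sketched as a small pipeline. Everything below is a hypothetical illustration: the class and method names, the toy lookup-table model, the "neutral" fallback mode, and the string-returning actuator are assumptions and are not taken from the embodiment.

```python
from typing import Callable


class MotionControlDevice:
    """Hypothetical composition of the receiving, input, and control units."""

    def __init__(self, preset_model: Callable[[str], str],
                 actuator: Callable[[str, str], str]) -> None:
        self.preset_model = preset_model
        self.actuator = actuator

    def receive(self, raw_instruction: str) -> str:
        # First receiving unit: accept and normalize the control instruction.
        return raw_instruction.strip().lower()

    def infer_mode(self, instruction: str) -> str:
        # Input unit: feed the instruction to the preset model to get a mode.
        return self.preset_model(instruction)

    def control(self, instruction: str, mode: str) -> str:
        # Control unit: perform the predetermined action in the given mode.
        return self.actuator(instruction, mode)

    def handle(self, raw_instruction: str) -> str:
        instruction = self.receive(raw_instruction)
        return self.control(instruction, self.infer_mode(instruction))


def toy_model(instruction: str) -> str:
    # Stand-in for a trained preset model; "neutral" is an assumed default.
    return {"sweep": "happy"}.get(instruction, "neutral")


def toy_actuator(action: str, mode: str) -> str:
    # Stand-in for the robot hardware; only describes what would be done.
    return f"{action} [{mode}]"


device = MotionControlDevice(toy_model, toy_actuator)
result = device.handle(" Sweep ")
```

Passing the model and the actuator in as callables keeps the sketch independent of any particular model or robot hardware.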

Optionally, the behavioral emotion mode is used to determine the robot's behavioral emotion state toward the predetermined action; when the robot performs the predetermined action, the robot is in that behavioral emotion state.

Optionally, the device further includes: a second receiving unit, configured to receive, before the control instruction is input into the preset model to obtain the output result, behavioral emotion indexes corresponding to a plurality of things in a thing set, wherein the behavioral emotion indexes indicate the robot's degree of preference for the plurality of things in the thing set; and an establishing unit, configured to establish the preset model according to the behavioral emotion indexes corresponding to the plurality of things in the thing set.

Optionally, the plurality of things in the thing set are things corresponding to the predetermined action performed by the robot.

Optionally, the number of things in the thing set is n, where n is an integer greater than 1, and the calculation formula of the preset model is

$$wx + \sum_{i=1}^{n} (w_i x_i + b_i)$$

where $wx$ is the preset basic parameter, $w_i$ is the influence parameter of the i-th thing in the thing set, $x_i$ is the currently set switch state parameter of the i-th thing in the thing set, and $b_i$ is the evaluation result value of the i-th thing in the thing set.

Optionally, the output result includes an expected value of the predetermined action, and the device further includes: a third receiving unit, configured to receive a feedback result on the robot's action after the robot has been controlled, according to the output result, to perform the predetermined action in the behavioral emotion mode; and an updating unit, configured to update the preset model according to the feedback result and the expected value.
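One way the updating unit could work is sketched below. The text does not specify an update rule; the proportional rule, the learning rate, and the function name are all assumptions chosen for illustration.

```python
def update_reward_parameter(b: float, feedback: float, expected: float,
                            rate: float = 0.5) -> float:
    """Move the reward/punishment parameter b toward the user's feedback.

    Assumed rule: b is shifted by a fraction (rate) of the gap between the
    feedback result and the expected value.
    """
    return b + rate * (feedback - expected)


# Hypothetical values: feedback 4.0 against an expectation of 3.0.
new_b = update_reward_parameter(0.0, feedback=4.0, expected=3.0)
```

Feedback above the expectation raises the parameter and feedback below it lowers the parameter, which matches the reward and punishment reading used throughout the description.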

Optionally, the control instruction includes at least one of the following: an image control instruction; a voice control instruction; a biological signal control instruction.

An embodiment of the present invention further provides a robot, which includes the robot motion control device of the embodiment of the present invention.

In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.

Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented by a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be fabricated as an individual integrated circuit module, or multiple modules or steps among them can be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (13)

1. A robot motion control method, characterized by comprising:
receiving a control instruction, wherein the control instruction instructs a robot to perform a predetermined action;
inputting the control instruction into a preset model to obtain an output result, wherein the output result includes a behavioral emotion mode corresponding to the predetermined action performed by the robot, and the preset model is trained at least on emotion evaluation parameters fed back by a user;
controlling, according to the output result, the robot to perform the predetermined action in the behavioral emotion mode;
wherein, before inputting the control instruction into the preset model to obtain the output result, the method further comprises:
receiving behavioral emotion indexes corresponding to a plurality of things in a thing set, wherein the behavioral emotion indexes indicate the robot's degree of preference for the plurality of things in the thing set;
establishing the preset model according to the behavioral emotion indexes corresponding to the plurality of things in the thing set,
wherein the behavioral emotion modes output by the preset model include a happy mode and a behavioral emotion mode corresponding to an unhappy behavioral emotion state, and the predetermined action includes sweeping the floor; if the behavioral emotion mode output by the preset model is the happy mode, the robot sweeps the floor happily and determines from the owner's score whether the owner likes the happy behavioral emotion mode exhibited during sweeping; if so, the robot again exhibits the happy behavioral emotion mode the next time it sweeps the floor, and if not, the robot exhibits the unhappy behavioral emotion state the next time it sweeps the floor.

2. The method according to claim 1, characterized in that the behavioral emotion mode is used to determine the robot's behavioral emotion state toward the predetermined action, and when the robot performs the predetermined action, the robot is in the behavioral emotion state.

3. The method according to claim 1, characterized in that the plurality of things in the thing set are things corresponding to the predetermined action performed by the robot.

4. The method according to claim 1, characterized in that the number of things in the thing set is n, where n is an integer greater than 1, and the calculation formula of the preset model is

$$wx + \sum_{i=1}^{n} (w_i x_i + b_i)$$

where $wx$ is the preset basic parameter, $w_i$ is the influence parameter of the i-th thing in the thing set, $x_i$ is the currently set switch state parameter of the i-th thing in the thing set, and $b_i$ is the evaluation result value of the i-th thing in the thing set.

5. The method according to claim 1, characterized in that the output result further includes an expected value of the predetermined action, and after controlling, according to the output result, the robot to perform the predetermined action in the behavioral emotion mode, the method further comprises:
receiving a feedback result on the robot's performance of the predetermined action;
updating the preset model according to the feedback result and the expected value.

6. The method according to claim 1, characterized in that the control instruction includes at least one of the following:
an image control instruction;
a voice control instruction;
a biological signal control instruction.

7. A robot motion control device, characterized by comprising:
a first receiving unit, configured to receive a control instruction, wherein the control instruction instructs a robot to perform a predetermined action;
an input unit, configured to input the control instruction into a preset model to obtain an output result, wherein the output result includes a behavioral emotion mode corresponding to the predetermined action performed by the robot, and the preset model is trained at least on emotion evaluation parameters fed back by a user;
a control unit, configured to control, according to the output result, the robot to perform the predetermined action in the behavioral emotion mode;
a second receiving unit, configured to receive, before the control instruction is input into the preset model to obtain the output result, behavioral emotion indexes corresponding to a plurality of things in a thing set, wherein the behavioral emotion indexes indicate the robot's degree of preference for the plurality of things in the thing set;
an establishing unit, configured to establish the preset model according to the behavioral emotion indexes corresponding to the plurality of things in the thing set,
wherein the behavioral emotion modes output by the preset model include a happy mode and a behavioral emotion mode corresponding to an unhappy behavioral emotion state, and the predetermined action includes sweeping the floor; if the behavioral emotion mode output by the preset model is the happy mode, the robot sweeps the floor happily and determines from the owner's score whether the owner likes the happy behavioral emotion mode exhibited during sweeping; if so, the robot again exhibits the happy behavioral emotion mode the next time it sweeps the floor, and if not, the robot exhibits the unhappy behavioral emotion state the next time it sweeps the floor.

8. The device according to claim 7, characterized in that the behavioral emotion mode is used to determine the robot's behavioral emotion state toward the predetermined action, and when the robot performs the predetermined action, the robot is in the behavioral emotion state.

9. The device according to claim 7, characterized in that the plurality of things in the thing set are things corresponding to the predetermined action performed by the robot.

10. The device according to claim 7, characterized in that the number of things in the thing set is n, where n is an integer greater than 1, and the calculation formula of the preset model is

$$wx + \sum_{i=1}^{n} (w_i x_i + b_i)$$

where $wx$ is the preset basic parameter, $w_i$ is the influence parameter of the i-th thing in the thing set, $x_i$ is the currently set switch state parameter of the i-th thing in the thing set, and $b_i$ is the evaluation result value of the i-th thing in the thing set.

11. The device according to claim 7, characterized in that the output result further includes an expected value of the predetermined action, and the device further comprises:
a third receiving unit, configured to receive a feedback result on the robot's performance of the predetermined action after the robot has been controlled, according to the output result, to perform the predetermined action in the behavioral emotion mode;
an updating unit, configured to update the preset model according to the feedback result and the expected value.

12. The device according to claim 7, characterized in that the control instruction includes at least one of the following:
an image control instruction;
a voice control instruction;
a biological signal control instruction.

13. A robot, characterized by comprising the robot motion control device according to any one of claims 7 to 12.
CN201611069128.9A 2016-11-28 2016-11-28 Robot and its motion control method and device Active CN108115678B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201611069128.9A CN108115678B (en) 2016-11-28 2016-11-28 Robot and its motion control method and device
PCT/CN2017/092038 WO2018095041A1 (en) 2016-11-28 2017-07-06 Robot, and action control method and device therefor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611069128.9A CN108115678B (en) 2016-11-28 2016-11-28 Robot and its motion control method and device

Publications (2)

Publication Number Publication Date
CN108115678A CN108115678A (en) 2018-06-05
CN108115678B true CN108115678B (en) 2020-10-23

Family

ID=62194768

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611069128.9A Active CN108115678B (en) 2016-11-28 2016-11-28 Robot and its motion control method and device

Country Status (2)

Country Link
CN (1) CN108115678B (en)
WO (1) WO2018095041A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110576433B (en) * 2018-06-08 2021-05-18 香港商女娲创造股份有限公司 Robot action generation method
CN109074502A (en) * 2018-07-26 2018-12-21 深圳前海达闼云端智能科技有限公司 Method, apparatus, storage medium and the robot of training artificial intelligence model
JP2021094677A (en) * 2019-12-19 2021-06-24 本田技研工業株式会社 Robot control device, robot control method, program and learning model
CN114578720B (en) * 2020-12-01 2023-11-07 合肥欣奕华智能机器股份有限公司 Control method and control system
CN112861804B (en) * 2021-03-17 2025-03-07 上海创屹科技有限公司 Human body motion evaluation method, evaluation device and evaluation system
CN116935497B (en) * 2023-09-19 2024-01-05 广州中鸣数码科技有限公司 Game control method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1463215A (en) * 2001-04-03 2003-12-24 索尼公司 Leg type moving robot, its motion teaching method and storage medium
JP2004098252A (en) * 2002-09-11 2004-04-02 Ntt Docomo Inc Communication terminal, control method of lip robot, and control device of lip robot
CN103456314A (en) * 2013-09-03 2013-12-18 广州创维平面显示科技有限公司 Emotion recognition method and device
CN104350541A (en) * 2012-04-04 2015-02-11 奥尔德巴伦机器人公司 Robot capable of incorporating natural dialogues with a user into the behaviour of same, and methods of programming and using said robot
CN105930374A (en) * 2016-04-12 2016-09-07 华南师范大学 Emotion robot conversation method and system based on recent feedback, and robot
CN105945949A (en) * 2016-06-01 2016-09-21 北京光年无限科技有限公司 Information processing method and system for intelligent robot
CN105988591A (en) * 2016-04-26 2016-10-05 北京光年无限科技有限公司 Intelligent robot-oriented motion control method and intelligent robot-oriented motion control device

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6445978B1 (en) * 1999-05-10 2002-09-03 Sony Corporation Robot device and method for controlling the same
US6519506B2 (en) * 1999-05-10 2003-02-11 Sony Corporation Robot and control method for controlling the robot's emotions
KR101014321B1 (en) * 2009-02-24 2011-02-14 한국전자통신연구원 Emotion Recognition Method Using Minimum Classification Error Method
CN101618280B (en) * 2009-06-30 2011-03-23 哈尔滨工业大学 Humanoid-head robot device with human-computer interaction function and behavior control method thereof
KR20110002757A (en) * 2009-07-02 2011-01-10 삼성전자주식회사 Emotion model device, apparatus and method for learning disposition of emotion model
US9117168B2 (en) * 2012-09-28 2015-08-25 Korea Institute Of Industrial Technology Apparatus and method for calculating internal state for artificial emotion
CN105260745A (en) * 2015-09-30 2016-01-20 西安沧海网络科技有限公司 Information push service system capable of carrying out emotion recognition and prediction based on big data
CN105807933B (en) * 2016-03-18 2019-02-12 北京光年无限科技有限公司 A kind of man-machine interaction method and device for intelligent robot

Also Published As

Publication number Publication date
CN108115678A (en) 2018-06-05
WO2018095041A1 (en) 2018-05-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20250120

Address after: Room B1-2862, Building 3, No. 20 Yong'an Road, Shilong Economic Development Zone, Mentougou District, Beijing, 102300

Patentee after: Beijing Xinhui Yongtai Technology Co.,Ltd.

Country or region after: China

Address before: 518000 Guangdong, Shenzhen, Nanshan District, Nanhai Road, West Guangxi Temple Road North Sunshine Huayi Building 1 15D-02F

Patentee before: SHEN ZHEN KUANG-CHI HEZHONG TECHNOLOGY Ltd.

Country or region before: China